I've fiddled with my blog template because I decided I wanted more horizontal viewing space, given that it was using less than a third of my 1920 horizontal pixels. If it feels too spread out for you, I added a drag-and-drop handle over to the left to let you resize the main content column. The JavaScript is pretty primitive. If it breaks, drop me a comment.

Saturday, October 25, 2008

On a Quest for Social Lending

(Jump right to the list.)
This "global economic crisis" isn't all bad. If you've got some cash piled up, like Warren Buffett--even if it's not quite the $44 billion he started this year with--you're in a great position to be hunting for opportunities while others are just trying to make ends meet as their credit lines shrink. One such opportunity, in my opinion, is something that recently came to my attention: the relatively new movement called "social lending".
The shortest way to explain it is that people borrow money from other people instead of from banks. Why social lending? Consider how banks make money. First, they buy your money by giving you interest on money you deposit with them in the form of savings and checking accounts, MMA's, CD's, or whatever. Then they turn around and sell your money to someone else at a higher rate: by giving a home loan for 6%, a car loan for 7%, a credit card for 13%, etc. (These are just rough guesses at today's rates.) This is what banks do. It's their purpose for existence and how they stay in business. It's called arbitrage. The question on the lips of social lenders is, "Why should the banks have a monopoly on this?" Social lenders want to get in on this action.
Social lending is also known as P2P (peer-to-peer or person-to-person) lending. As described in the Wikipedia article linked above, it can refer to a "marketplace" type of lending, where lenders browse for borrowers they would like to lend to, or a "friend/family" type, where you are lending to or borrowing from a friend or family member and would like a service to help make the loan "official" with appropriate legal paperwork. In this post, I'm referring to the "marketplace" type of lending, in which you, as a lender, actively search for someone to lend money to as an investment. The model I'm primarily interested in is one where borrowers list their loans and lenders "bid" on them with a dollar amount they're willing to give and the interest rate they want for it. In essence, it becomes an interest rate "auction", with lenders bidding down the rate until the "auction period" expires.
Social lending allows anybody with a little bit of cash--as little as $50 in some cases (maybe less?)--to lend that cash out, thereby putting the money to work and getting a higher return than they'd get from a typical bank account. Before you get too excited, social lending sites have taken something of a beating in the past couple of weeks. A quick Google search turned up a WSJ blog and a Washington Post article about how these sites have been affected by recent events: tighter regulation by the SEC and higher-than-normal default rates are putting a dent in their business. Briefly, the sites have been forced to suspend all new business until certain SEC filings are complete--a process that can take months. Unfortunately, this is happening just at the moment when their businesses should be growing by leaps and bounds, as more and more loan-seekers are denied by banks and turn elsewhere to get the funding they need.
A word of caution wouldn't be out of order here: when you loan money like a bank, you're taking on risk like a bank. Don't lend indiscriminately or you'll find yourself in the same position as Wachovia:
"October 23, 2008--Wachovia Corp. reported a $23.9-billion third-quarter loss Wednesday, the largest loss at any bank since the financial crisis began..."
Wachovia, of course, won't be Wachovia for much longer. Lehman Brothers is another good example of how not to lend. I can't find any one good article, but looking over a news search for "Lehman Brothers" for the past month brings up words like "carnage", "fire sale", "disaster", "subpoenaed", "death", and other unpleasant things. When you lend, choose your borrower wisely, and, as with any financial endeavor, diversify! Split your money across many borrowers to limit your exposure to defaults.
One other thing to check for... To loan money, you have to first have the money somewhere. Make sure that whatever service you go with keeps your money in an FDIC-protected account. If it doesn't, well... I won't tell you what to do with your money, but make sure you know exactly what you're getting into, and weigh the risks appropriately.
If all this doomsaying hasn't scared you off, if you're still interested in the potential investment opportunities offered by social lending, as I am, then you can start by looking through this list of social lending sites I've found. Not all of these are "marketplace" lending sites, but I'm listing them anyway so that it's a complete reference. Further, some of these sites serve particular countries exclusively, so make sure you check that particular detail. I've tried to keep at the top of the list all the sites where a US citizen could go and sign up for an account right now to start lending.

The List

Fynanz - http://www.fynanz.com/ - Social lending that specializes in student loans. This one was going to be further down in the middle of the list (arbitrarily), but I moved it to the top because it seems like a really good prospect to me. Take a look around their site. One thing that makes them stand out is that they recently gave referral and lending bonuses. It looks like that's over, but it seems promising for the future. They also have a guarantee system that protects some or all of your investment, which I haven't seen elsewhere. Finally, there's the fact that you're investing in someone's education.
LendingClub - http://www.lendingclub.com/ - A high-profile US social lending site. They very recently reopened lending after completing the aforementioned SEC registration process.
Loanio - https://www.loanio.com/ - Another site that is currently open for business to US residents. I'm not sure if they've had to or will have to register with the SEC. This is something to check on before you decide to go with them, since they might have to shut down lending for weeks or months to get the registration done.
Prosper - http://www.prosper.com/ - Probably the most prominent social lending organization serving the United States. As of this writing, they aren't accepting new loans because of the SEC registration process I mentioned above. They began the process on 15 Oct 2008, I believe, and it's unknown when they'll open back up to lenders.
CommunityLend - http://www.communitylend.com/ - A new player on the field: they're not open for business yet. It seems they intend to have a public beta, and you can be notified when it launches by entering an email address on the "I want to Invest" part of their site. Note: This is a Canadian company, and I can't find any information about who is eligible to be a lender. I've emailed them to find out. Another note: Before I even finished this post, the Chief Technology Officer of CommunityLend wrote me back--on a Saturday!--to say that because of financial service regulations, the service will only be available to Canadians. That's unfortunate. Their website looks promising.
Yadyap - http://yadyap.com/ - Another not-quite-launched site, Yadyap will reportedly be the social lending equivalent of payday loans. I don't know much else about it. You can be notified when they launch by entering your email address on their home page.
Zopa - https://us.zopa.com/ - Included here for completeness, Zopa recently stopped doing business in the US. They apparently still have a booming business in the UK and also offer services in Italy and Japan. Maybe they'll come back to the US. Who knows? They seem to have had a different business model than the typical auction of a "marketplace" social lending company.
GlobeFunder - https://www.globefunder.com/ - I'm not sure if this is the same kind of company as Prosper or LendingClub. They're not operational yet, and the site doesn't give a lot of detail about what they're trying to do. It does mention "individual lenders" on their Lenders page, so maybe it will be what I'm looking for.
Fosik Lending - http://www.fosik.com.au/ - An Australian lending site that does both "friends and family" and "marketplace" lending. The "marketplace" portion is currently in beta testing and will reportedly be made a public beta. I'm fairly certain that only Australian citizens are eligible, but it isn't stated on their website. I've sent an email requesting confirmation.
iGrin - https://www.igrin.com.au/ - Another Australian lending site that plainly states it's open only to Australians. It's too bad. It looks like a well-executed website.
"Friends and family" loans only:
Virgin Money - http://www.virginmoneyus.com/ - No marketplace here. Their whole thing is about formalizing and managing loans between friends and family members.
LoanBack - http://www.loanback.com/ - About the same as Virgin Money, from what I can tell.
Finally, something a little different: "marketplace" social lending targeted at entrepreneurs in third-world or poverty-stricken areas. "Entrepreneur" here doesn't necessarily mean someone looking for $10,000 to rent some office space and set up shop. It could just as easily be someone looking for $300 to buy some pigs for a farm. That's not intended to be derogatory. I just want you to know what to expect:
Kiva - http://www.kiva.org/ - Just from poking around the websites to write this post, Kiva seems like the leader in this section. They have a polished site and seem to have a lot of traffic. Kiva appears to be open to lenders from all over the world, US included. They target poor entrepreneurs all over the world, and the result is a strong tendency toward agricultural borrowing.
MyC4 - http://www.myc4.com/ - This site focuses on Africa. Borrowers seem to tend more toward light industry--textiles, light manufacturing, and the like. It's not clear whether US citizens are eligible, but the front page claims that they have investors from 75 countries. All monetary amounts are in euros.
United Prosperity - http://www.unitedprosperity.org/ - This one isn't operational yet, but seems like it will be similar to Kiva.

Monday, October 20, 2008

How to Process a File Line-By-Line in Linux

Being a Linux noob, I had to look around for how to process a text file line by line in a shell script. I have no idea if this is the best way, but here's what I figured out:
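cat movies | awk '{ system("echo " $0) }'
where "movies" is the file. That sends the file to awk, and the stuff in the single quotes tells it to invoke the system command "echo" for each line of the file, passing the line to it ($0 is the whole line; $1 would be only the first whitespace-separated field). You can obviously substitute other commands for "echo", like "wget" to download a bunch of stuff if the file is a list of URLs. Each line is pasted into a shell command unquoted, so lines containing characters the shell treats specially (spaces, &, ?, quotes) need extra care.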
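For comparison, the shell can also walk a file line by line on its own with a while/read loop, which avoids building a command string at all. This is only a minimal sketch, and I can't claim it's the best way either: the helper name each_line is made up for illustration, and "movies" is the same example file as in the command above.

```shell
#!/bin/sh
# each_line FILE COMMAND [ARGS...]
# Runs COMMAND once per line of FILE, passing the whole line as the last argument.
# (each_line is a hypothetical helper name, not a standard utility.)
each_line() {
    file=$1
    shift
    # IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
    # The '|| [ -n "$line" ]' part handles a last line with no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        "$@" "$line"
    done < "$file"
}
```

With that defined, each_line movies echo prints every line of the file, and swapping echo for wget would fetch each line as a URL. Because the line is passed as a single quoted argument, spaces and characters like & don't get reinterpreted by the shell.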

Wednesday, October 15, 2008

How To: Run Selenium Tests with Hudson on a Headless Linux Server, Part Three--Configuring Hudson

So you've got Xvfb running, and you've tested it by taking a screenshot or three of xclock. Now to get it working with Selenium tests in your Hudson builds. First, let me say that my work was done with Selenium RC, in which you run a standalone Selenium server which is responsible for launching browsers and receives commands from your test scripts to run in the browsers. I'm not highly familiar with the other varieties of Selenium, so I can't say how similar the setup would be for them.
First, Selenium has to know which browser to start and/or how to start it and which display to use. If you're already using Selenium RC, you'll know that you have to pass a browser command to Selenium to indicate what browser to use. However, if you normally work in Windows, and Selenium can't find Firefox or IE on your Linux box, you might need to do a little more configuration here. A typical browser command to launch Firefox is "*firefox". Selenium has a list of "default locations" where it looks for the Firefox launcher. If it can't find it, you can specify it manually, like *firefox /usr/bin/firefox-bin. This tells Selenium that it's starting a Firefox instance and to use the given path. You must provide the path to firefox-bin and not just to the firefox script. Selenium checks to see if it's been given a script or an executable binary, and it will throw an exception if it finds a script. There's also an option to just pass a path and arguments to Selenium, leaving off the "*firefox" designator, but as the docs say, "If you specify your own custom browser, it's up to you to configure it correctly. At a minimum, you'll need to configure your browser to use the Selenium Server as a proxy, and disable all browser-specific prompting."
It's simple to tell Selenium to use the virtual display. Setting an environment variable named "DISPLAY" in Linux tells any graphical app that starts to start on the specified display, so it's just a matter of getting that variable set properly for the Selenium server process. Remember that it's the server that's responsible for starting the browser, so that's where the DISPLAY variable has to be available. If you're launching the server from a shell, you can just do export DISPLAY=:5.0 before launching the server. Naturally, you'll need to make sure the numbers match up with the display and screen that you configured in Xvfb. (See the first post in this series for details on that.) If you're launching the server with Ant, just add a nested element to the <java> task (it has to fork the JVM for this to take effect) that looks like this: <env key="DISPLAY" value=":5.0" />. However, keep in mind that if you add this to your build script, then you're tying all Linux users to that display. That's probably not good. (Been there, done that.)
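One way to avoid hard-coding the display for everybody is to set DISPLAY only for the one process being launched and let each user or build job override the value. The following is a rough sketch of that idea, not a tested part of my setup: the function name run_on_display and the SELENIUM_DISPLAY variable are invented for this example (Selenium itself knows nothing about them), and :5.0 is the display/screen from the Xvfb post.

```shell
#!/bin/sh
# run_on_display COMMAND [ARGS...]
# Runs COMMAND with DISPLAY set for that one process only, leaving the
# calling shell's own DISPLAY untouched.
# SELENIUM_DISPLAY is a hypothetical override; :5.0 is the fallback.
run_on_display() {
    env DISPLAY="${SELENIUM_DISPLAY:-:5.0}" "$@"
}
```

Usage would look something like run_on_display java -jar selenium-server.jar, with the jar path adjusted to wherever your Selenium server lives. Someone who runs Xvfb on a different display can set SELENIUM_DISPLAY=:7.0 in their own environment without touching the script.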
Finally, make sure that Hudson is orchestrating everything correctly. This might actually be the smallest part of the whole thing since Hudson can invoke shell scripts, Ant targets, Maven goals, and a ton of other things. Make sure that Xvfb is running or that Hudson starts it. Make sure your application server/web container is running or that Hudson starts it. Make sure that any other stuff your application depends on is available, like a database. Make sure Hudson builds your web application and deploys it as appropriate. Make sure the Selenium server is running or that Hudson starts it. Then just have Hudson invoke the target, goal, or whatever that starts your Selenium test suite. Since Selenium tests are written as normal unit tests with JUnit or TestNG or whatever, there's really nothing to that. The only tricky part here is that you need to make sure the tests don't start until your application has fully started up. It's possible that you could start running tests before the URL for your application is even available on your web server.
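For the "don't start the tests too early" problem, one option is a small polling step that Hudson runs as a shell build step before the test target. This is a sketch under assumptions rather than what my build actually does: wait_until is a made-up helper name, and the URL in the usage note is a placeholder for wherever your application answers.

```shell
#!/bin/sh
# wait_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND repeatedly until it succeeds (returns 0) or MAX_TRIES
# attempts have failed. Returns 0 on success, 1 if it gave up.
# (wait_until is a hypothetical helper name.)
wait_until() {
    max_tries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}
```

A build step could then call wait_until 60 5 curl --silent --fail --output /dev/null http://localhost:8080/myapp/ and fail the build if it returns non-zero. The --fail flag makes curl return an error for HTTP error responses, so a server that is up but still answering 404 or 503 for the application URL keeps the loop waiting.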
That should be it! When the Selenium client starts up, it will communicate the start command to the server, which will start a browser in the virtual display. Then the tests will run just like they always do, sending commands to the Selenium server, and in turn to the browser, which makes HTTP requests to your web server, which is running the application that Hudson just built. You can use xwd and xwud, explained in the second post, to take and see screenshots of the browser as your tests are running. A cool idea that I've implemented in our environment is to set up a listener in your test framework (I used an ITestListener in TestNG) that will take a screenshot any time a test fails. This gives you extremely useful feedback to use when checking out the failures.

Tuesday, October 14, 2008

How To: Run Selenium Tests with Hudson on a Headless Linux Server, Part Two--xwd and xwud

In Part One, I explained how to start Xvfb to give you a virtual display to run graphical applications on. In this post, we'll verify that it's working properly by starting xclock and taking a screenshot of it. Being able to take a screenshot of a virtual display is a pretty useful skill all by itself. To take and view the screenshot, you'll use two different programs: xwd and xwud. The former takes the screenshot; the latter shows it. These should have been installed with your X server (I think).
First, start xclock on the system where Xvfb is running with xclock -display :5.0 so that it will run on the virtual display. Next, also on the system with Xvfb, take the screenshot with
xwd -root -display :5.0 -out xwdout
The breakdown:
-root
Tells xwd to capture the "root" window. This means it captures the entire screen. You can tell it to only capture a certain window if you want. See the -id and -name parameters in the man page if you want to try that.
-display :5.0
Tells xwd to look at screen 0 of display 5. This is why you have to know that info from when you started Xvfb!
-out xwdout
Gives the name of the output file to write the screenshot to. Xwd uses a custom binary format for its files. They can be read by xwud (next), and I understand that they can be converted to other formats by ImageMagick (free!), though I haven't tried it yet.
After taking the screenshot, you'll need to get the output file to a system that has a real X server running, so you can see the image. Some options for this are ftp or sftp, scp, or good old Sneakernet. Finally, view the screenshot you took: xwud -in xwdout. This shows the image stored in the file named xwdout. It should open a window where you can see a happy little xclock showing the time it was whenever you took the screenshot. Click anywhere in the window to close it.
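For the ImageMagick route mentioned above, something along these lines should do the conversion on the headless box itself, so you can copy a PNG around instead of the xwd file. Treat it as a sketch: it assumes ImageMagick's convert tool is installed and on the PATH, and xwd_to_png is just a helper name invented for this example.

```shell
#!/bin/sh
# xwd_to_png FILE
# Converts an xwd dump to a PNG next to the original and prints the new name.
# (xwd_to_png is a hypothetical helper; convert is ImageMagick's command-line tool.)
xwd_to_png() {
    src=$1
    # Strip a trailing ".xwd" if there is one, then add ".png".
    dest="${src%.xwd}.png"
    # The "xwd:" prefix names the input format explicitly, which matters
    # here because a file called "xwdout" has no extension to guess from.
    convert "xwd:$src" "$dest" && printf '%s\n' "$dest"
}
```

Running xwd_to_png xwdout after the xwd command above would leave a file named xwdout.png that any image viewer or browser can open.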
That's it for this post. In Part 3: how to make Hudson, Selenium, and your build play with all this other stuff you've learned.

How To: Run Selenium Tests with Hudson on a Headless Linux Server, Part One--Xvfb

I've recently set up Hudson as the continuous integration server for my project at work. I chose Hudson over Cruise Control and Continuum for two reasons: Hudson was highly recommended by a former coworker (thanks Mike!), and, when I was choosing, the Hudson site was much friendlier and easier to navigate. I'm not going to cover setting up Hudson, because it really is as easy as the site makes it sound, and it has tons of built-in tooltips for help. What this series of posts is going to cover is getting Hudson to run your Selenium tests on a headless Linux server. I'll cover the Linux portion in fairly high detail, since I only had some basic Linux experience when I undertook this adventure, and I had to put all this info together myself from my own research. This isn't for total Linux noobs, though. You should at least know how to install packages and navigate around the file system from the command line before reading this.
A note for the uninitiated: Selenium is an open source library for testing web applications at the UI level. It uses JavaScript to interact with web pages, so you can script a series of user actions and ensure that your application functions as expected in the browser. But just as this isn't a series about setting up Hudson, it's also not a series about setting up Selenium.
Let's get down to Part One. Assume you have some Selenium tests in your test suite and you want to get them working from a Hudson-initiated build on your headless server. Headless means there's a good chance you have no X server running, and you can't run Firefox or your browser of choice without an X server, which means you can't run browser-based tests against your web application. Instead of starting up a full-fledged X server just to run some UI tests, how about a virtual display instead? That's what a nifty tool named Xvfb gives you. Xvfb starts a very basic, virtual display in memory so that applications requiring graphical capabilities can run on a machine that doesn't routinely run an X server.
Step one is to make sure that Xvfb is installed. Note that this means you have to have an X server installed, too. Yes, Xvfb makes a lightweight, virtual display, but it still uses the X server to do it. Once Xvfb is installed, you should have an executable binary named Xvfb installed somewhere: probably in /usr/bin. I'll assume that's where it is. Next, we'll start it up:
/usr/bin/Xvfb :5 -ac -screen 0 1024x768x8
How this breaks down:
/usr/bin/Xvfb
The executable that actually starts Xvfb. Duh.
:5
Tells Xvfb to use display 5. X servers can have multiple "displays" with multiple "screens" per display. The default display is 0. I chose to use 5, because sometimes a real X server is started on the machine in question (for VNC connections), and I didn't want to find out what would happen if there were a collision. I don't know what the limit is. It doesn't really matter what you use, just as long as you remember what it is. You'll need it later.
-ac
Disables access control to the X server, enabling access by any host. This is Linux-speak for, "You can use this server from any host, local or remote." Per the Xserver man page, "Use with extreme caution. This option exists primarily for running test suites remotely." Since we're doing testing, we're fine.
-screen 0 1024x768x8
Creates screen number 0 on display 5 at resolution 1024x768 and 8-bit color depth. Obviously these numbers can be whatever you want.
Now Xvfb should be running and ready to invisibly display stuff for you. The values you set for "display" and "screen number" are important, and are combined in the form ":<display>.<screen>" to specify to Linux which display/screen combination to use for displaying things. In the example above, you would use ":5.0" to tell Linux to use the single screen that is configured. I should also add that it doesn't seem to matter which user starts Xvfb, so you can create a startup script in /etc/init.d and let it run as root if you like. You can test your Xvfb by starting a graphical application and taking a screenshot. That's what I'll cover in Part 2--xwd and xwud.
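Since the display and screen numbers have to match everywhere they're used, it can help to keep them in one place. The snippet below is only a sketch of that idea: the variable and function names are invented for this example, and the Xvfb flags are the same ones used in the command above.

```shell
#!/bin/sh
XVFB_DISPLAY=5   # display number, passed to Xvfb as ":5"
XVFB_SCREEN=0    # screen number, passed to Xvfb as "-screen 0 ..."

# Prints the ":display.screen" string that graphical programs expect,
# either in the DISPLAY variable or after a -display option.
display_spec() {
    printf ':%s.%s\n' "$1" "$2"
}

# Starts Xvfb in the background using the numbers defined above.
start_xvfb() {
    /usr/bin/Xvfb ":$XVFB_DISPLAY" -ac -screen "$XVFB_SCREEN" 1024x768x8 &
}
```

After calling start_xvfb, a line like xclock -display "$(display_spec "$XVFB_DISPLAY" "$XVFB_SCREEN")" points a program at the virtual display without repeating the numbers by hand.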

Friday, October 10, 2008

Unlocked

Well, this blog got automatically locked as a potential "spam blog", or I would've posted a couple of things over the past few days. I suspect the robots noticed the total disregard for the English language in my last post and locked me down for it. I'll post something interesting later today maybe.

Thursday, October 2, 2008

An Unexpected Chapter

I certainly didn't expect to be awake now, but a coworker is out of the country and working from a very different time zone, and he IM'd me with a question at about midnight, my time. The question was about a problem we'd been seeing with Apache Camel 1.4. There was a NullPointerException coming from inside Camel somewhere. This is the exception and stacktrace:
Execution of JMS message listener failed                                                                                 WARN  DefaultMessageListenerContainer(634) 01:25:03,534 DefaultMessageListenerContainer-10
org.apache.camel.RuntimeCamelException: java.lang.NullPointerException
at org.apache.camel.component.jms.EndpointMessageListener.onMessage(EndpointMessageListener.java:71)
at org.springframework.jms.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:531)
at org.springframework.jms.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:466)
at org.springframework.jms.listener.AbstractMessageListenerContainer.doExecuteListener(AbstractMessageListenerContainer.java:435)
at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.doReceiveAndExecute(AbstractPollingMessageListenerContainer.java:316)
at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveAndExecute(AbstractPollingMessageListenerContainer.java:255)
at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.invokeListener(DefaultMessageListenerContainer.java:887)
at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:822)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.NullPointerException
at org.apache.camel.component.jms.JmsMessage.createMessageId(JmsMessage.java:203)
at org.apache.camel.impl.MessageSupport.getMessageId(MessageSupport.java:127)
at org.apache.camel.impl.MessageSupport.copyFrom(MessageSupport.java:95)
at org.apache.camel.impl.DefaultExchange.safeCopy(DefaultExchange.java:98)
at org.apache.camel.impl.DefaultExchange.copyFrom(DefaultExchange.java:81)
at org.apache.camel.impl.DefaultEndpoint.createExchange(DefaultEndpoint.java:145)
at org.apache.camel.component.file.FileProducer.process(FileProducer.java:54)
at org.apache.camel.impl.converter.AsyncProcessorTypeConverter$ProcessorToAsyncProcessorBridge.process(AsyncProcessorTypeConverter.java:43)
at org.apache.camel.processor.SendProcessor.process(SendProcessor.java:75)
at org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:57)
at org.apache.camel.processor.DeadLetterChannel.process(DeadLetterChannel.java:155)
at org.apache.camel.processor.DeadLetterChannel.process(DeadLetterChannel.java:91)
at org.apache.camel.processor.Pipeline.process(Pipeline.java:101)
at org.apache.camel.processor.Pipeline.process(Pipeline.java:85)
at org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:57)
at org.apache.camel.processor.UnitOfWorkProcessor.process(UnitOfWorkProcessor.java:39)
at org.apache.camel.util.AsyncProcessorHelper.process(AsyncProcessorHelper.java:41)
at org.apache.camel.processor.DelegateAsyncProcessor.process(DelegateAsyncProcessor.java:66)
at org.apache.camel.component.jms.EndpointMessageListener.onMessage(EndpointMessageListener.java:68)
... 8 more
He IM'd me to say he'd discovered that trying to use two file endpoints in the same Camel route caused this exception. After some testing, I narrowed his theory to "using a file endpoint anywhere but the end of a route". I should say that this is in a route that processes JMS messages. Here's the problem: Camel pulls a message from a JMS destination and sends it along the route as an org.apache.camel.component.jms.JmsMessage packaged inside an org.apache.camel.component.jms.JmsExchange. When you get to a file endpoint, it does this:
1 public void process(Exchange exchange) throws Exception {
2     FileExchange fileExchange =
          endpoint.createExchange(exchange);
3     process(fileExchange);
4     ExchangeHelper.copyResults(exchange, fileExchange);
5 }
That's in org.apache.camel.component.file.FileProducer, for those following along. This is what creates the file from the incoming message/exchange. Line 2 there creates a FileExchange containing FileMessages based on the incoming JmsExchange, which contains JmsMessages. Line 3 is what writes the file using the contents of the newly created exchange. Then, for whatever reason, Camel copies the state of the FileExchange back onto the incoming exchange (yes, that method is copyResults(destination, source)), and that's the exchange that gets passed on down the route. In ExchangeHelper.copyResults, the message from the source exchange (the out message if it exists, the in message if out doesn't exist) is copied to the destination's out message. The destination doesn't have an out yet, so one is created, and then the state of the source message is copied to it.
The problem is that JmsMessage has a property of type javax.jms.Message where it stores the original JMS message that it pulled from the JMS broker. FileMessage obviously does not have this. Basically what the above code does is copy from JmsMessage -> new FileMessage -> new JmsMessage. Plainly, then, any state in the original JmsMessage that doesn't fit in FileMessage will be lost in the new JmsMessage. One of those is the javax.jms.Message property. Unfortunately, this property is an integral piece of Camel's JmsMessage. One place it's used is in generating a message id in the createMessageId method, which you'll notice is the top line in the stack trace for the NullPointerException above, and the message id is used all over the place.
What this boils down to is that you may not use a file endpoint in a route where the message comes from a JMS server except as the very last endpoint.* However, you can fudge this a little using Camel's implementation of the multicast pattern. This will break:
from("someJmsEndpoint")
    .to("file://foo").to("file://bar");
But this works:
from("someJmsEndpoint")
    .multicast().to("file://foo").to("file://bar");
Each file endpoint still has to be at the very end, but multicast() effectively lets you have many "ends" to a single route.
*This may be a problem with mixing other message and exchange types as well, but JMS/File is the only one I've seen break myself.

Wednesday, October 1, 2008

The First Chapter

Here's an interesting problem I came across the other day related to transaction demarcation.
I'm writing some message-driven code using a JMS broker--specifically, ActiveMQ: http://activemq.apache.org/. A lot of the processing is actually running in the broker JVM itself using Apache Camel to move messages around: http://activemq.apache.org/camel. We recently had trouble with some of the processing code when we first deployed the message broker to a staging environment and tried to push several thousand messages through it. It ran through a bunch of the messages just fine, then it stopped--no errors in the log, no crash of the broker, no CPU use... nothing. It just sat there grinning stupidly at me. Here's how the message processing works...
New message comes in. Get the important details out of it and invoke a processMessage method that does everything needed to put the message into our database. The database is Oracle in staging and production but Apache Derby in development. The processMessage method is wrapped with a typical transaction using Spring's declarative transaction management. Inside that method, another method is invoked that’s configured with PROPAGATION_REQUIRES_NEW. There are a few factors working together that make this necessary, but they’re not relevant here.
It goes something like this:
  1. Begin outer transaction/open Hibernate session 1
  2. Do some transformations and creation of object A to represent message
  3. Get related object B from database or an external source
  4. If object B came from outside, save it with session.save(B) (session 1)
  5. Associate object A to object B
  6. Begin inner transaction/open session 2
  7. Save object A with session.save(A) (session 2)
  8. Commit inner transaction/close session 2
  9. Attach object A to session 1 for continued processing
  10. Maybe make other changes to object A
  11. Commit outer transaction/close session 1
Here's how the problem manifested...
When the broker froze up, the logs stopped right after a SOAP call was made for object B and we tried to save it. Significantly, it was the first time since the broker was started that we’d had to make a SOAP call and save the result to the database. The last thing the logs record is that B was retrieved and saved successfully, but it never made it into our database. We ruled out the possibility that the problem was in the data being retrieved. It was definitely a problem with the code running in the broker.
That’s it. Can you figure out what’s wrong? Read on for the answer.
Hint 1: Object B has unique constraints on some of its fields in the database.
Hint 2: I’ve left out some details about how objects A and B are mapped. I didn’t have the information on hand when I was analyzing the problem and was forced to make some assumptions that allowed me to reach a solution. I still don’t know exactly how they’re mapped, because I already fixed the problem without looking at the mappings. Assume that the relationship from A to B is mapped at least as cascade="save-update", and therefore object B will be saved when object A is saved in step 7 above.
Hint 3: It’s all in the transactions. Look there.
What’s going on: I made one other assumption to explain things related to the session.save(B) call in step 4 above. That call will cause an insert statement to be issued for B at some point. My assumption was that, for some reason, the session is being flushed and this insert is happening before the inner transaction is begun in step 6. (I can’t give you any more information on this right now, as I never verified this assumption, either.)
Given these two assumptions, here’s what’s happening at the database level:
  1. Open connection 1, autocommit off
  2. Insert B on connection 1
  3. Open connection 2, autocommit off
  4. Insert A on connection 2
  5. Insert B on connection 2
  6. Commit on connection 2
  7. Commit on connection 1
Do you see the problem now? If not, you probably need one more crucial piece of information: earlier uncommitted statements against an Oracle database will block later, conflicting statements from completing until the earlier transaction is committed or rolled back. More specifically, if you try to insert object B on two different connections simultaneously, the second insert can’t know if it should succeed or fail until the first one is committed--remember B has unique fields! If you look back at my sequence of database events, you’ll see that connection 2 inserts B and tries to commit before connection 1 has committed.
Tie all this together, and you have a deadlock between a single thread and the database. Connection/transaction 2 waits for connection/transaction 1 to commit or roll back, and connection/transaction 1 is unable to do either because execution is stuck waiting for connection/transaction 2 to finish doing what it’s going to do.
Nice, huh?
Solution: Make sure object B only gets associated to one session, the one tied to the inner transaction, so that it’s inserted only one time--along with object A.
Final note: This problem never cropped up in development because Derby apparently doesn’t care about database locks. It seems to blow right past many of them without batting an eye. I discovered this while experimenting with ActiveMQ using JDBC persistence in a Master/Slave failover configuration.