Evaluating Amazon EC2 for Scientific Computation

I’ve written some fairly heavy Markov chain Monte Carlo code in Java and would like to let it run for a while. The cluster in the Inference Group is not terribly up-to-date, and it’s obnoxious to run computationally significant jobs on colleagues’ desktops anyway. This seemed like a good opportunity to try out Amazon Elastic Compute Cloud (EC2). The idea with EC2 is that you can fire up an “instance” whenever you want, and you pay only for the time it’s running and for the bandwidth. For scientific computation this is very appealing. I don’t always have jobs to run, so it would be nice to just pay for what I need.

They have the “normal” instances that are meant for hosting web applications, and they also have “high-CPU” instances that would seem well-suited for scientific computation. The high-CPU instances come in a “medium” type (c1.medium) and an “extra large” type (c1.xlarge). The medium instance has two virtual cores and 1.7GB of memory, and costs $0.20/hour. It seems roughly equivalent to a Core 2 Duo. The extra large instance has eight cores and 7GB of memory, is 64-bit, and costs $0.80/hour.
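To make the pricing concrete, here is a minimal Java sketch of the arithmetic using the two hourly prices quoted above. The class and method names are made up for illustration, and rounding every started hour up to a full billed hour is an assumption, not something confirmed here. Bandwidth and storage charges are ignored.

```java
// Sketch only: the prices are the figures quoted in this post.
// Whole-hour billing and all names here are assumptions for illustration.
final class Ec2Cost {
    static final int C1_MEDIUM_CENTS_PER_HOUR = 20; // 2 virtual cores
    static final int C1_XLARGE_CENTS_PER_HOUR = 80; // 8 virtual cores

    /** Compute charge in cents, assuming every started hour is billed in full. */
    static long computeCents(int centsPerHour, double hoursRunning) {
        return centsPerHour * (long) Math.ceil(hoursRunning);
    }

    /** Price of one core for one hour, in cents. */
    static double centsPerCoreHour(int centsPerHour, int cores) {
        return (double) centsPerHour / cores;
    }

    public static void main(String[] args) {
        // A 47.5-hour run on c1.medium is billed as 48 hours under the assumption above.
        System.out.println(computeCents(C1_MEDIUM_CENTS_PER_HOUR, 47.5)); // 960
        System.out.println(centsPerCoreHour(C1_MEDIUM_CENTS_PER_HOUR, 2)); // 10.0
        System.out.println(centsPerCoreHour(C1_XLARGE_CENTS_PER_HOUR, 8)); // 10.0
    }
}
```

Both types work out to the same $0.10 per core-hour, so the extra large instance is only the better deal if the code actually keeps all eight cores busy.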

It is really easy to get one of these up and going. You can just follow the instructions in the Getting Started Guide. You need to set up various security aspects (which you can do easily from the command line) and then fire up an instance. You need to pick one of the virtual machine images (AMIs). I suggest using one of the Ubuntu images here.

The frustrating part is that, unlike with web hosting, in scientific computation I want to fire up an instance and take it down easily. Unfortunately, any data on the instance is lost when you take it down. So, to get the tools required for your work, you will need to make a custom image. You start with a base image, set it all up and then save your custom image to Amazon S3. From there you can start an instance from your private image just like you could with the public images. Whatever results you compute will need to be stored somewhere else before you shut down the instance. If your computation needs some starting state that isn’t baked into the image, then you’ll need to upload it whenever you fire up a new instance.
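Since nothing on the instance survives a shutdown, it helps to write the sampler’s state to a single file at regular intervals so there is one obvious thing to copy off the machine. Below is a minimal sketch using plain Java serialization. The `Checkpoint` class and the idea of storing the chain state as a serializable object are assumptions for illustration, and getting the file off the instance (to S3 or anywhere else) is deliberately left out.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: saves and restores any Serializable chain state.
final class Checkpoint {
    /** Write the state to a temporary file, then move it over the real one. */
    static void save(Serializable state, Path file) throws IOException {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(tmp))) {
            out.writeObject(state);
        }
        // Moving into place last means a crash mid-write leaves the old checkpoint alone.
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }

    /** Read back whatever object was last saved; the caller casts it. */
    static Object load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return in.readObject();
        }
    }
}
```

A sampler could call `Checkpoint.save(currentState, path)` every few thousand iterations, and a freshly started instance could call `Checkpoint.load(path)` on an uploaded file to resume where the last run left off.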

So, it’s not a magic bullet, but it seems to have some nice potential. The 8-core machine is kind of like renting a Ferrari for a day. Even though my code parallelizes quite well, I don’t get anything near linear speedup. I think this may be due to limited cache coherency. Nonetheless, it seems about twice as fast as the dual-core machine.
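To put rough numbers on that, here is a small sketch of what “about twice as fast” means for cost. It assumes the dual-core machine is priced like the c1.medium, so the extra large instance costs four times as much per hour. Amdahl’s law appears purely as an illustrative model of sub-linear speedup and is not a diagnosis of the cache-coherency guess above; the class and method names are hypothetical.

```java
// Sketch only: Amdahl's law is an illustrative stand-in, not a measurement.
final class SpeedupSketch {
    /** Amdahl's law: speedup on `cores` cores when a fraction p of the work parallelizes. */
    static double amdahl(double p, int cores) {
        return 1.0 / ((1.0 - p) + p / cores);
    }

    /** Cost of one job on machine B relative to A, given B's price ratio and speed ratio. */
    static double relativeJobCost(double priceRatio, double speedRatio) {
        return priceRatio / speedRatio;
    }

    public static void main(String[] args) {
        // Under this model, an 80% parallel fraction makes 8 cores about 2x faster than 2.
        double ratio = amdahl(0.8, 8) / amdahl(0.8, 2);
        System.out.println(ratio);
        // Four times the hourly price for twice the speed: each job costs twice as much.
        System.out.println(relativeJobCost(0.80 / 0.20, 2.0));
    }
}
```

In other words, at a 2x speedup the eight-core instance buys a result in half the time for roughly twice the money, which is worth it only when turnaround matters more than the bill.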
