Drupal and Amazon EC2

I am seriously considering moving my host to Amazon's Elastic Compute Cloud (EC2) service. Amazon is turning virtualization into a mass-market product. It uses Xen (for the I-live-under-a-proprietary-rock people: think of an open-source, paravirtualizing version of VMware on steroids; faster, better) to deliver a Linux instance that you have complete control over and that is billed by use: the ultimate pay-per-use model (or, in Dutch, Pay-per-duur :-) ). The cool thing about EC2 is that it is truly elastic: you can have more or less CPU power depending on the needs of your application, and you can even roll out new servers on the spot, within minutes.

Classic ASP (application service provider) models use "eye to server" communication for this: a sysadmin, or sometimes the customer, rolls out a new server via a browser. The cool thing about EC2 is that this doesn't have to be restricted to "eye to server" communication; thanks to its APIs it can be "server to server". This is big! If a human has to roll out a new server, he first has to know that the current capacity is too low. Humans mostly find that out by being paged by a server or through some other communication, but it is always triggered by output from a server. So why not let the server or application itself decide that it needs more capacity? Server-to-server communication via an API.
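To make the "server to server" idea a bit more concrete, here is a minimal Python sketch of a monitor that asks for a new server once the load has stayed too high for a few samples in a row. Everything in it is an assumption for illustration: the CapacityMonitor class, the load threshold, the sample window and the launch_server hook (which would wrap whatever provisioning API you use) are hypothetical names, not part of EC2 or Drupal.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class CapacityMonitor:
    """Decides, without a human in the loop, when to request another server."""

    launch_server: Callable[[], str]  # hypothetical hook around a provisioning API
    load_threshold: float = 4.0       # assumed load limit per sample
    window: int = 3                   # consecutive high samples needed before acting
    _recent: List[float] = field(default_factory=list)

    def record(self, load: float) -> Optional[str]:
        """Store one load sample; launch a server if the whole window is too high."""
        self._recent.append(load)
        self._recent = self._recent[-self.window:]
        window_full = len(self._recent) == self.window
        if window_full and min(self._recent) > self.load_threshold:
            self._recent.clear()  # start a fresh window after scaling up
            return self.launch_server()
        return None
```

Requiring several high samples in a row keeps a single spike from triggering a launch, which is roughly the judgment a paged sysadmin would apply by hand.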

EC2 is also reliable, as reliable as Amazon.com itself. And most of all, it is rather cheap. If you fully utilize an "instance" (1.7 GHz x86 processor, 1.75 GB of RAM, 160 GB of local disk, and 250 Mb/s of network bandwidth) you will pay around 70 dollars a month. But most likely you will have some peak and off-peak hours, so in most cases you will pay less per server.

As Roland said, it was only a matter of time before someone successfully used EC2 for a Drupal site, and sure enough there is one. In fact, only one month after Roland's prediction there was a Drupal site on EC2 ... about EC2! And with a cool name as well: elastic8.com/.

Yes, Elastic8.com is running on AWS:

[root@tug ~]# whois `host www.elastic8.com | grep "address" | awk '{print $4}'` | grep -i netname
NetName: AMAZON-AES

Now here is where it gets interesting. Drupal, the leading CMS, has something we call throttling. It means that when your site's load exceeds certain thresholds, the site becomes more static and you lose some functionality. This gives your site a better chance of surviving a temporary surge of users. However, it does so by serving more pages with less functionality. This is good, but not the best option. What one wants (budget permitting) is for more resources to become available under load, so you can deliver more pages with the same functionality!

Think about it: using the throttle API not to get less functionality but more servers! I don't think it would be very hard to build. It would make a great feature request, and a big hit for Amazon if it gets in.

You could, for example, start with one server running Apache/PHP/Drupal and MySQL together and, when the load increases, dynamically roll out a second server, migrate the database to it, make a new connection and continue serving pages. I don't know, however, how (if at all) this would scale to n front-end web servers and n clustered MySQL servers. But it sure is nice to dream about "reverse throttling"!
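The difference between the two responses to load can be sketched in a few lines of Python. This is only an illustration under stated assumptions: classic_throttle and reverse_throttle are hypothetical function names, "load" is taken to be the total load on the site (concurrent users, say), and "threshold" is what one server can comfortably handle. It is not how Drupal's throttle module is implemented.

```python
import math


def classic_throttle(load: float, threshold: float) -> bool:
    """Classic throttling: return True when features should be switched off."""
    return load > threshold


def reverse_throttle(load: float, threshold: float, servers: int) -> int:
    """Reverse throttling: return how many servers are needed to keep full functionality.

    The count never drops below the current number of servers; scaling back
    down is deliberately left out of this sketch.
    """
    if load <= threshold * servers:
        return servers
    return math.ceil(load / threshold)
```

For example, with a threshold of 100 users per server and 250 users on a single box, classic_throttle says "degrade", while reverse_throttle asks for 3 servers instead.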


Interesting!

Migrating/syncing a running database is one problem, having multiple Drupal instances writing to the same (clustered) database another.

But once those are solved this would serve as very interesting news on Slashdot and Digg. "Please slashdot us!", the article would be the test itself!

Now I wonder: if the Drupal construction would contain an error, leading to the number of servers going wild, would it be possible to DoS EC2? ;-)

dos

I don't think you can DoS Amazon, but you are free to try :-)

However, throttling this way needs an additional safeguard: don't go beyond n servers, or wait x time before deploying another, since you don't want to be billed for dozens of servers when someone DoS-es your Drupal site :-)
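A rough Python sketch of such a safeguard, assuming both limits are applied together: a hard cap of n servers and a cooldown of x seconds between launches. ScaleGuard and its field names are made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScaleGuard:
    """Refuses launches beyond a server cap or inside a cooldown period."""

    max_servers: int                     # "n": never run more servers than this
    cooldown_seconds: float              # "x": minimum wait between two launches
    last_launch: Optional[float] = None  # timestamp of the last approved launch

    def may_launch(self, current_servers: int, now: float) -> bool:
        """Return True (and remember the time) only if a launch is allowed now."""
        if current_servers >= self.max_servers:
            return False
        if (self.last_launch is not None
                and now - self.last_launch < self.cooldown_seconds):
            return False
        self.last_launch = now
        return True
```

With max_servers=3 and cooldown_seconds=600, a second launch request five minutes after the first is refused, and nothing is launched once three servers are running, no matter how hard the site is being hit.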

--
groets, bert boerland

why not?

Dozens of servers won't be that much of a problem, as you get billed by the hour. This presupposes of course that the number of servers automatically scales back once the slashdotting is over.

Going the other way: think about a Drupal site that automatically shuts itself down (n=0) after x time without any attention from visitors. ;-)

(The name "Marvin" pops up in my head now ;-))

out

just spoke to him, he is still a bit down :-)

--
groets, bert boerland

great idea

love the idea to use throttle that way.

Costs are a bit more than $70

In my personal situation, my site may be a bit higher on the bandwidth side because I serve a lot of videos. But if you are reading this article, you have to factor in all of the EC2 costs. True, for an average site such as a standard blog, you may end up paying something in line with a co-located or hosted server.

However, with one of the many low-cost, shared host accounts, you get off with much less in terms of cost.

If you have even decent bandwidth utilization, the costs may creep up and the financial proposition isn't that great. Notice that the highest of the costs is bandwidth, at $0.20 per GB. The biggest advantages are the scalability and the reliability of the AWS data center. For those factors alone you are going to pay a somewhat higher premium (cumulatively).

My numbers are based on the EC2 specs and an average monthly network use of 216 GB. With a server equivalent to 1.7 GHz I would come out around $120/month, depending on instance storage.
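For anyone who wants to run their own numbers, the estimate can be sketched in Python as below. The $0.20 bandwidth rate is the figure quoted above; the $0.10 hourly instance rate is an assumption, chosen because it matches the roughly $70/month mentioned in the post for a fully used instance. Storage is left out, and monthly_cost is just a name made up for this example.

```python
def monthly_cost(hours: float, gb_transferred: float,
                 hourly_rate: float = 0.10,   # assumed instance price per hour
                 per_gb_rate: float = 0.20) -> float:
    """Rough monthly bill in dollars: instance hours plus bandwidth, no storage."""
    return round(hours * hourly_rate + gb_transferred * per_gb_rate, 2)


# One instance running all month (30 days) with 216 GB of traffic.
estimate = monthly_cost(hours=24 * 30, gb_transferred=216)
print(f"Estimated bill: ${estimate:.2f}/month")
```

Under these assumptions the instance alone is about $72 and the bandwidth about $43, so the total lands a little under the $120 figure before storage is added.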

Compare that to what I have now, which is $130/month for a hosted, dedicated rack server with 1500 GB of monthly transfer, a 2.8 GHz P4, 1.5 GB of RAM and 160 GB of disk (split across two 80 GB drives for redundancy).

As with all things, you have to run the numbers. For sites that only get the occasional (if ever) digg or slashdot, considering EC2 from a cost standpoint isn't necessarily going to win you over. You really have to want or need the scalability and reliability.

But, as the EC2 promo page states, you only pay for what you use. What's going to REALLY be cool is when there's more competition! Just like the old days compared to now. You can get a pretty fast shared host for practically pennies. Comparing that to when you had to lease your own T1, buy expensive hardware and pay someone to manage it (if you didn't know how), the future looks awesome for running a server!

Any takers?

This is something we're interested in, and it doesn't seem too difficult to do as a throttle_ec2 module of some sort. Is anyone interested in sharing the effort, or has somebody already started on this?

-RobRoy

It's a shame that there is no static IP

It's a shame that there is no way to obtain static IPs with EC2. Static IPs (even just one) could bring the usability of EC2 to the next level.
Dynamic DNS could be a solution, but a static IP is definitely the better one.

BTW, basically your idea is excellent.