Scaling the peaks of Hacker News

Daniele Procida

Aug. 19, 2016

In recent months, we’ve found ourselves at the very top of Hacker News a few times. It’s impossible to know when it’s going to happen, but every so often the interest gathering around something we have done or written will send it climbing to the top of the first page - the first thing that a few hundred thousand daily programmer visitors will read about, and visit.

On those days, our web analytics graphs redraw all their vertical scales, and look like the fantasies of a sadist planning a mountain stage in a bicycle race.

It’s fun to have 15 minutes (or 24 hours) of Internet fame, and it also gives your web servers a good workout - a nice real-life performance test. We can watch the requests pour in, and the load balancers responding. As the traffic continues to climb, we see the system scaling to meet demand, launching more application instances - two, four, eight or even more - so that response times remain exactly the same, no matter how many requests we receive each second.

Response times remain low despite the climbing load

 

Business as usual

Our website is hosted, just like our customers’ sites, on our Aldryn Cloud platform. Aldryn was made for this - if it failed to meet this kind of demand, that would be embarrassing indeed.

A platform like Aldryn is expected to ramp up resource allocations smoothly and automatically to meet anything that’s thrown at the websites it hosts, even if the traffic they receive multiplies hundreds or thousands of times over in just a couple of hours. It is also expected to scale back down again afterwards because, after all, these resources cost money.

So in a way, even a day out on Hacker News is just business as usual, as far as handling web traffic is concerned. Automatic, flexible and massively scalable hosting is now what people expect by default.

A thousand new websites in four hours

Sometimes, however, what gets to pole position on Hacker News doesn’t just send more visitors to our website to read a news item. If the news item is about a major new release of django CMS for example, it will invite readers to try out a new demo, and often a very large proportion of our visitors will take up the invitation.

A django CMS live demo isn’t just a few pages on our website. Each instance is a new installation, a complete system. They are containerised Ubuntu systems running in Docker, and each one has to be deployed. A deployment creates new Docker containers, building each from standard and custom images, and finally hooking it into our deployment architecture so that the django CMS instance it hosts can be made available on the web.

Every single visitor who hits the Try a live demo link gets their own individual demo site, running on its own containerised operating system.

As far as resources are concerned, each demo is much the same as a customer project on our system: everything required to deploy, host and manage a Django-based website. The only real differences are that we don’t provide separate Live and Staging server instances for demo users, and that demos last for 15 minutes, not indefinitely.

What’s more, we deploy these demos on exactly the same infrastructure as our regular customer projects - despite the sudden multiplication of load, we can afford to do that without a risk that performance will be affected, because the infrastructure scales so efficiently and seamlessly.

When we launch a new project, whether a demo or an actual customer project, Aldryn:

  • acquires the images

  • builds the containers

  • runs the automated processes to install application software - Django, django CMS, and numerous other Python packages (from our own pip servers), and also various binaries

  • runs through the Django deployment process, creating the new project, running all of its migrations, collecting static files and so on

  • runs the frontend deployment process, using Gulp and npm to install packages, compile code, pre-process CSS and so on
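As a rough illustration, the steps above can be modelled as an ordered pipeline that every new project passes through. This is only a sketch: the `Deployment` class, the step functions and `deploy()` are hypothetical names invented for the example, and each step merely records that it ran rather than doing any real work.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Deployment:
    """Records which steps have run for one project (hypothetical model)."""
    project: str
    completed: List[str] = field(default_factory=list)


# Each step is a stand-in: a real one would call Docker, pip, Django, npm, etc.
def acquire_images(d: Deployment) -> None:
    d.completed.append("acquire images")


def build_containers(d: Deployment) -> None:
    d.completed.append("build containers")


def install_software(d: Deployment) -> None:
    d.completed.append("install application software")


def run_django_deployment(d: Deployment) -> None:
    d.completed.append("django deployment")


def run_frontend_deployment(d: Deployment) -> None:
    d.completed.append("frontend deployment")


# The order mirrors the list in the text.
STEPS: List[Callable[[Deployment], None]] = [
    acquire_images,
    build_containers,
    install_software,
    run_django_deployment,
    run_frontend_deployment,
]


def deploy(project: str) -> Deployment:
    """Run every step in order for a demo or a customer project."""
    deployment = Deployment(project)
    for step in STEPS:
        step(deployment)
    return deployment


result = deploy("demo-site")
```

Because demos and customer projects go through the same sequence, a saving in any one step benefits both.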

Now let’s say that over a four-hour period one hundred thousand hackers see us mentioned at the top of Hacker News, that one in ten of them actually hit the link to visit our website, and that one in ten of those go on to try out the demo.

That’s one thousand demos to be served - one thousand complete systems, the equivalent in effect of a thousand new websites and a thousand new servers to be built for a thousand new customers - over four hours.
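The arithmetic behind that figure is easy to check. The numbers below are the illustrative ones from the scenario above, not measurements, and the variable names are ours.

```python
# Illustrative funnel from the scenario above (not measured data).
hn_readers = 100_000           # people who see the Hacker News entry
site_visitors = hn_readers // 10   # one in ten click through to the website
demos = site_visitors // 10        # one in ten of those launch a demo

period_minutes = 4 * 60
demos_per_minute = demos / period_minutes

print(demos)                        # 1000
print(round(demos_per_minute, 2))   # 4.17
```

Averaged over the four hours, that is a little over four new demo systems every minute, and real traffic arrives in bursts rather than evenly.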

Soaring demand for demos

 

Bear in mind that everything we consume in order to provide these demos costs us money. It’s critical for us that our demos, like all our other operations, consume the minimum resources possible. We want as much as possible of the CPU, bandwidth and memory we use to go towards making users’ interactions with the system faster and their sites more resilient, not towards unnecessarily hungry deployment processes.

To give you an idea how aggressively we chase efficiency gains: just over a year ago, an Aldryn deployment would take around 14 minutes. Since then, without throwing more resources at the system, and just by implementing efficiency improvements, we’ve been able to lower that to under four minutes. For demos, we’ve been able to take advantage of additional optimisation, and cut each demo deployment to around 50 seconds.

In our four-hour period at the height of Hacker News stardom, that’s a saving of 13 thousand minutes of processing time.
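For anyone who wants to verify the saving, here is the calculation, assuming the 1,000 demos from the scenario above and comparing the old 14-minute deployment with the current 50-second demo deployment.

```python
# Figures taken from the text: 1,000 demos, 14 minutes before, ~50 seconds now.
demos = 1000
old_seconds = 14 * 60   # 840 seconds per deployment a year ago
new_seconds = 50        # approximate time for an optimised demo deployment

saved_minutes = demos * (old_seconds - new_seconds) / 60

print(round(saved_minutes))   # 13167, i.e. roughly 13 thousand minutes
```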

But still, from start to finish, an Aldryn deployment of a django CMS demo site takes nearly a minute. The patience of a web visitor is measured in fractions of a second, not minutes. Quite obviously, a visitor from Hacker News is not going to wait even one minute to try out a demo, and in fact, our demos are available instantly.

How we do it

So, what we do is have a pool of demos ready to be allocated to visitors, and a system that will deploy new ones to add to the pool. But we don’t maintain a standing pool of a thousand demos, just waiting to be used.
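The idea can be sketched in a few lines. This is a minimal illustration under our own assumptions, not Aldryn's implementation: `DemoPool`, `target_size` and `fake_deploy` are hypothetical names, the pool is topped up synchronously for simplicity (a real system would do that in the background), and `fake_deploy` stands in for the deployment that takes around 50 seconds.

```python
from collections import deque
from itertools import count
from typing import Callable, Deque


class DemoPool:
    """Keeps a small number of pre-deployed demos ready to hand out."""

    def __init__(self, deploy: Callable[[], str], target_size: int = 3) -> None:
        self._deploy = deploy
        self._target_size = target_size
        self._ready: Deque[str] = deque()
        self.top_up()

    def top_up(self) -> None:
        """Deploy new demos until the pool is back at its target size."""
        while len(self._ready) < self._target_size:
            self._ready.append(self._deploy())

    def claim(self) -> str:
        """Hand a ready demo to a visitor, then replace it in the pool."""
        if self._ready:
            demo = self._ready.popleft()   # fast path: already deployed
        else:
            demo = self._deploy()          # slow path: pool ran dry
        self.top_up()
        return demo

    @property
    def ready(self) -> int:
        return len(self._ready)


_ids = count(1)


def fake_deploy() -> str:
    """Stand-in for a real deployment; just returns a new demo name."""
    return f"demo-{next(_ids)}"


pool = DemoPool(fake_deploy, target_size=3)
first = pool.claim()
print(first, pool.ready)   # demo-1 3
```

The visitor only ever waits for the `popleft()`, while the slow deployment happens before the demo is needed. The pool size only has to cover the demand that arrives during one deployment, which is why a standing pool of a thousand is unnecessary.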

Thanks to the efficiency improvements our engineering team have been able to achieve in the system, we’re now in the happy position of being able to provide almost unlimited numbers of free demos without needing to worry about either meeting the demand or the financial cost of the resources required to meet it.

The next time news of a django CMS release gets to the top of Hacker News and a thousand Django fans launch their demos, the web analytics mountains can peak as savagely as they like, because behind the scenes the infrastructure can spawn new instances to match them without breaking into a sweat.

Try it for yourself

In the meantime,
