(TL;DR: using an SSD cache in front of EBS can give a massive I/O throughput boost)
Internally, Swrve is built around an event-processing pipeline, processing data sent from 100 million devices around the world each month, in real time, with an average events-per-second throughput in the tens of thousands.
Each event processor (or EP) stores its aggregated, per-user state as BDB-JE (Berkeley DB Java Edition) files on disk, and we use EBS volumes to store these files. We have a relatively large number of beefy machines in EC2 that perform this task, so we’re quite price-sensitive where these are concerned.
We gain several advantages from using EBS:
- easy snapshotting: EBS supports this nicely.
- easy resizing of volumes: it’s pretty much impossible to run out of disk space with a well-monitored EBS setup; just snapshot and resize.
- reliability in the face of host failure: just detach the volumes and reattach elsewhere.
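The snapshot-and-restore workflow behind the resizing point above can be sketched against the method names of a boto3-style EC2 client. This is a minimal illustration, not our production tooling; the volume ID, target size, and availability zone are placeholders.

```python
# Hedged sketch: grow an EBS volume by snapshotting it and restoring the
# snapshot onto a larger volume. Written against boto3's EC2 client method
# names; all identifiers below are illustrative placeholders.

def grow_volume(ec2, volume_id, new_size_gb, availability_zone):
    """Snapshot `volume_id`, then restore the snapshot to a larger volume.

    `ec2` is a boto3 EC2 client (or any object exposing the same methods).
    Returns the new volume's ID; the caller then attaches it in place of
    the old volume and grows the filesystem.
    """
    snap = ec2.create_snapshot(VolumeId=volume_id,
                               Description="pre-resize snapshot")
    # Block until the snapshot is fully captured before restoring from it.
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])
    vol = ec2.create_volume(SnapshotId=snap["SnapshotId"],
                            Size=new_size_gb,
                            AvailabilityZone=availability_zone)
    return vol["VolumeId"]
```

The same detach/attach calls (`detach_volume`, `attach_volume`) cover the host-failure case: the volume outlives the instance, so its data can simply be reattached elsewhere.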
EBS hasn’t had a blemish-free reliability record, but this doesn’t pose a problem for us. We store our primary backups off-EBS, as compressed files on S3, so we are reasonably resilient to any data corruption risks with EBS, and in another major EBS outage we could fire up replacement ephemeral-storage-based hosts from these backups. In the worst-case scenario, we can even wipe out the BDB and regenerate it from scratch, since we keep the raw input files as an ultimate “source of truth”.
One potential downside of EBS is variable performance; Provisioned IOPS (PIOPS) volumes address this, but at a relatively high price.
However, our architecture allows EPs to slow down without major impact elsewhere, and in extreme circumstances will safely fall back to a slower, non-realtime transmission system. If things get that slow, they generally recover quickly enough, but in the worst-case scenario our ops team are alerted and can choose to reprovision on another host, or split up the EP’s load, as appropriate. This allows us to safely use the (cheaper) variable-performance EBS volumes instead of PIOPS; it turns out these can actually perform very well, albeit in a “spiky” manner, with occasional periods of slowness.
As we were looking at new EC2 instances several months back, we noticed that decent-spec, high-RAM, high-CPU instances with attached SSDs were starting to appear as the “c3” instance class. These SSDs (two 40GB devices per instance) were, unfortunately, far too small to fit our BDB files with enough headroom for safety. But could they be used as a cache?
Our data model tends to see lots of accesses to recent data, with a very long tail of accesses to anything older than a few days, so there was a good chance caching would have a significant effect on performance. Let’s find out!
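To build intuition for why that access pattern suits a small cache, here is a toy simulation (not Swrve’s actual workload; the 90/10 split and sizes are assumptions for illustration): an LRU cache sized at 10% of the keyspace, with 90% of accesses concentrated on a “recent” 10% of users and the rest spread over the long tail.

```python
# Toy model: estimate the hit rate an SSD cache layer might see when
# accesses are heavily skewed toward recent users. The traffic split and
# sizes are illustrative assumptions, not measured Swrve numbers.
from collections import OrderedDict
import random


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = self.misses = 0

    def access(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            self.entries[key] = True
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict least recently used


def hit_rate(num_users=100_000, cache_fraction=0.1,
             accesses=500_000, seed=1):
    """90% of accesses hit the 'recent' 10% of users; 10% hit the tail."""
    rng = random.Random(seed)
    cache = LRUCache(int(num_users * cache_fraction))
    recent = int(num_users * 0.1)
    for _ in range(accesses):
        if rng.random() < 0.9:
            key = rng.randrange(recent)             # hot, recent users
        else:
            key = rng.randrange(recent, num_users)  # long-tail users
        cache.access(key)
    return cache.hits / (cache.hits + cache.misses)
```

Under these assumptions the hot working set stays resident and the cache absorbs the large majority of accesses, despite holding only a tenth of the data, which is what makes a pair of small SSDs interesting as a cache tier.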