I often encounter the misconception that EFDs (enterprise flash drives) aren't beneficial unless you either need to reduce latencies below what traditional disks can deliver, or you're short-stroking your disks to maintain performance. So I figured I'd go through the three general use cases I talk about with EFDs:
- Do more stuff
- Do the same stuff faster
- Do the same stuff, but with less gear
These are not mutually exclusive. In most cases, EFDs allow people to do more stuff, faster, with less gear. But your goals for EFDs will certainly flavor how to best deploy them.
Do more stuff (increase scale)
Let's use an entirely contrived order-processing system (like a trading desk). Say this system can support 1,000 trades a minute, but during peak trading times you're getting more orders than you can process.
The business case for EFDs here would be that you can increase revenue by processing more orders. Here's an example where six EFDs supported seven times the transactions of six traditional Fibre Channel drives. And this is a perfect example of how increased scale and reduced latency are not mutually exclusive: response times on the EFDs were one-seventh of the response times on the spinning disks.
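A quick back-of-envelope sketch can show the mechanics behind a result like that. The seek, rotation, and flash service-time figures below are illustrative assumptions, not measurements from the test described above; the real-world multiplier depends on workload and drive models.

```python
# Back-of-envelope: why a handful of EFDs can absorb far more random I/O
# than the same number of spinning drives. All per-drive figures here are
# assumed for illustration.

AVG_SEEK_MS = 3.5             # assumed average seek time, performance HDD
ROTATIONAL_LATENCY_MS = 2.0   # half a revolution at 15,000 RPM (60000/15000/2)
EFD_SERVICE_TIME_MS = 0.5     # assumed average flash service time

def random_iops(service_time_ms):
    """Rough ceiling on random IOPS for one drive: 1 second / service time."""
    return 1000 / service_time_ms

hdd_iops = random_iops(AVG_SEEK_MS + ROTATIONAL_LATENCY_MS)  # ~182 IOPS
efd_iops = random_iops(EFD_SERVICE_TIME_MS)                  # ~2,000 IOPS

print(f"Six HDDs: ~{6 * hdd_iops:,.0f} random IOPS")
print(f"Six EFDs: ~{6 * efd_iops:,.0f} random IOPS")
```

With these assumed numbers, six flash drives offer roughly an order of magnitude more random IOPS headroom than six spinning drives, which is why the transaction ceiling moves so dramatically.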
Note that both this case and the next really depend on how much your EFDs cost, and how much productivity improvement you're going to see from their deployment. The third case is significantly different:
Do the same stuff faster (reduce latency)
Let's take a large manual order-entry system, where user wait time for a query is 5 seconds and users run about one query per minute. Let's say the performance gate in this scenario is storage, which is delivering about 5-7 ms latency (about as good as you can get with a performance HDD at scale, due to seek and rotational latency).
The business case for EFDs here would be that employees in this role spend about 8% of their time waiting on the database. If you can reduce that to 1.6%, you realize massive productivity improvements.
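The arithmetic behind those percentages works out like this. The 1-second post-EFD wait is an assumption for illustration (roughly what you'd expect if storage latency drops about fivefold); the article's own figures are ~8% and 1.6%.

```python
# Productivity math from the scenario above: 5 seconds of wait per query,
# one query per minute. The post-EFD wait time is an assumed figure.

SECONDS_PER_QUERY_INTERVAL = 60   # one query per minute
wait_before_s = 5.0               # user-visible wait per query on HDDs
wait_after_s = 1.0                # assumed wait once storage latency drops ~5x

frac_before = wait_before_s / SECONDS_PER_QUERY_INTERVAL   # ~8% of the day
frac_after = wait_after_s / SECONDS_PER_QUERY_INTERVAL     # ~1.7% of the day

print(f"Time spent waiting before: {frac_before:.1%}")
print(f"Time spent waiting after:  {frac_after:.1%}")
```

That's roughly 6-7% of every employee's working time handed back, which is where the productivity case comes from once you multiply across headcount.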
Here's some data (note that the graphs are not on the same scale):
Do the same stuff, but with less gear (decrease footprint)
Let's say that you've got an application that's fat and happy residing on ninety 10k performance HDDs. You're not short-stroking them too badly, but it's still taking about $6,000 a year to power them, $2,000 a year to cool them, and about 18U of rack space to house them, not to mention the associated maintenance cost. But the fact is that in most environments, only part of the data set is actually frequently accessed. An example of this might be an order processing or inventory system whose records go back years or even decades. The old data is rarely accessed, while the newer data is getting hammered.
Using either manual or automated tiering, you put the older data on cheaper, denser, and more power-efficient NL-SAS drives, while the most frequently accessed data resides on faster, less dense (but still more power-efficient) EFDs.
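A rough footprint model makes the tradeoff concrete. The baseline per-drive figures below are derived from the article's totals ($6,000 power, $2,000 cooling, 18U across 90 drives); the tiered drive counts and their per-drive costs are hypothetical assumptions, not sizing guidance.

```python
# Rough annual footprint model: 90-HDD baseline vs. a hypothetical
# NL-SAS + EFD tiered replacement. Tiered figures are assumptions.

def annual_footprint(drives, power_per_drive, cool_per_drive, u_per_drive):
    """Total yearly power/cooling dollars and rack units for a drive pool."""
    return {
        "power_$": drives * power_per_drive,
        "cooling_$": drives * cool_per_drive,
        "rack_U": drives * u_per_drive,
    }

# Baseline: ninety 10k HDDs (per-drive figures derived from the article's totals)
baseline = annual_footprint(90, 6000 / 90, 2000 / 90, 18 / 90)

# Hypothetical tiers: 12 big NL-SAS drives for cold data, 6 EFDs for hot data
nl_sas = annual_footprint(12, 50, 17, 0.2)   # assumed per-drive figures
efd = annual_footprint(6, 20, 7, 0.2)        # assumed per-drive figures
tiered = {k: nl_sas[k] + efd[k] for k in baseline}

print("Baseline:", baseline)
print("Tiered:  ", tiered)
```

Even with generous assumptions, the spindle count drops from 90 to 18, and power, cooling, and rack space fall with it; that reduction, not raw performance, is the business case here.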
Here's some supporting data. Performance was slightly improved, but the bulk of the savings came in the form of footprint: acquisition, power, management, and so forth. Had the goal been to increase scale or reduce response time, the deployment method would have differed.