Microsoft SQL Server is introducing a good number of new features with the upcoming "SQL14" release. Most notably, it implements an in-memory OLTP engine (code named Hekaton), but there are other features included as well. The SQL14 Community Technology Preview has been available for some time, and the CTP1 media was used for this exercise, running on Windows Server 2012. I decided to look at one of these new features, called Buffer Pool Extension (BPE), and wondered how it might compare to some similar EMC technology that we've used and validated previously for SQL Server environments.
Buffer Pool Extension
As the name suggests, BPE is a mechanism to extend the buffer pool space for SQL Server. The buffer pool is where data pages reside as they are being processed to execute a query, and it's generally limited by the main memory (DRAM) available on the server itself. While available memory in servers has been on the increase, so have database sizes. Having an adequately sized buffer pool helps keep data pages around, and keeping them around means that you don't have to go to disk to execute a subsequent query that references those same pages. In general, the less disk I/O you have to generate, the better the performance.
Performance is, of course, based on the speeds and feeds. DRAM is typically very, very fast; disks, on the other hand, are orders of magnitude slower than DRAM. More recently, a rash of new "server flash" solutions have come to market – these are generally PCIe-based. In terms of performance, these server flash solutions fall between DRAM and disks: because they sit on the PCIe bus, they have a more efficient path to service I/O, and it helps that they are also flash based (no moving parts). These devices can deliver very large throughput and generally have pretty low latencies because of the performance characteristics of the PCIe bus. They also deliver large amounts of storage at a lower cost than something like DRAM. Arguably, Solid State Disks (SSDs, or even Enterprise Flash Drives) have some of these characteristics, but drives of this type are much slower than server flash, because they live behind IDE or SAS controllers, FibreChannel controllers, or some other HBA.
So if you want to expand the SQL Server buffer pool so as to keep more data pages around (beyond what fits in DRAM), you will be able to use SQL14 BPE to help with that. Effectively, you define a Buffer Pool Extension as a physical file. You specify where the file lives (so the storage needs to be presented as an NTFS volume), and once defined, SQL14 will start to use this space to keep data pages around. Which data pages, and for how long, depends on the active dataset size and the space defined. One interesting rule is that "dirty pages" cannot exist on the BPE device. A dirty page is a page that has been updated but has not yet been flushed to disk (of course, the change is always written to the log file). Dirty pages are flushed to disk by something like the lazy writer, or by a checkpoint operation. Once a page no longer has changes to flush, it can be moved to the BPE storage. Equally, a page that is read to satisfy a query, but is not updated, may be placed on the BPE storage – if you've already read the page, then it's trivial to push it to the BPE.
Configuring BPE is done via the ALTER SERVER CONFIGURATION T-SQL statement; for example, in this environment:
ALTER SERVER CONFIGURATION
SET BUFFER POOL EXTENSION ON
(FILENAME = 'F:\BPE\EMCExtremeSF.BPE', SIZE = 200GB);
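The configuration can then be verified by querying the sys.dm_os_buffer_pool_extension_configuration DMV that ships with CTP1 (a sketch; column output may well change between CTP1 and release):

```sql
-- Check the state, path and size of the Buffer Pool Extension.
-- state_description should report the extension as enabled once the
-- ALTER SERVER CONFIGURATION statement above has taken effect.
SELECT [path],
       state_description,
       current_size_in_kb / 1024 / 1024 AS current_size_gb
FROM sys.dm_os_buffer_pool_extension_configuration;

-- BPE can be disabled again (removing the file) with:
-- ALTER SERVER CONFIGURATION SET BUFFER POOL EXTENSION OFF;
```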
You can also imagine that there are various algorithms in place to age the pages on the BPE storage, and to discard the oldest/unused pages in preference to data that is just read and might be re-read. It's a complex beast when you consider the various activities going on. But the goal is simple ... keep more data available on high-performance (and, I would argue, low-latency) storage, such that you can improve the overall efficiency of the database environment.
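One way to watch this paging behavior (at least as CTP1 exposes it) is the is_in_bpool_extension column of sys.dm_os_buffer_descriptors, which indicates whether a given page currently resides in DRAM or out on the BPE file:

```sql
-- Count buffer pool pages by location (DRAM vs. the BPE file).
-- Each page is 8 KB, so the counts also give a rough size estimate.
SELECT is_in_bpool_extension,
       COUNT(*)            AS page_count,
       COUNT(*) * 8 / 1024 AS approx_mb
FROM sys.dm_os_buffer_descriptors
GROUP BY is_in_bpool_extension;
```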
EMC Server Flash
For the testing, EMC's ExtremSF device was used – in this instance, a 300 GB SLC version of the ExtremSF line, which now also includes eMLC versions that provide over 1 TB of storage. The ExtremSF card was used in two ways to look at optimizing the SQL14 environment. First, it was used as a traditional storage device, which, through Disk Management, was partitioned to provide a 200 GB storage allocation (actually, the volume was a little larger than 200 GB; the BPE file itself was created at 200 GB). In the second set of tests, the ExtremSF card was used in combination with the ExtremSW product to cache the SQL14 data files – more on that later.
The Test System
Because the testing needed to put some pressure on the SQL14 environment, I did what I would probably not recommend in any production environment: I reduced the available amount of memory for the SQL14 instance to 20 GB. That severely limits buffer pool space, and limits scale as a result of requiring much more I/O. It also forces behavioral changes in SQL Server, such as more writing of data pages ... again, I would never recommend doing this in practice! But these limits were constant across the tests, as was the workload; the only variables ended up being the use of BPE, and subsequently ExtremSW.
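For reference, capping instance memory like this is done through sp_configure (a sketch of the kind of setting used; the 20480 MB value is simply 20 GB expressed in MB):

```sql
-- Limit the instance to roughly 20 GB of memory.
-- NOT recommended for production; used here only to pressure the buffer pool.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 20480;
RECONFIGURE;
```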
The database itself contained around 750 GB of data and index; thus the dataset was 37.5 times larger than the total amount of memory allocated to SQL Server. Of the 20 GB allocated to SQL Server, only a portion is used by the instance for the buffer pool, so the ratio of data to buffer pool would actually be a little more extreme. But what's more important to consider is the "active" dataset. For example, you could have a 1 TB set of data and index, but if you are only actively accessing a very small portion of it, then it doesn't really matter how large the dataset is ... what matters is the data you are actively touching. In this case, the OLTP workload was fairly random across all of the data/index.
Another aspect that remained constant throughout all testing was the underlying storage used for the database itself. This was, unsurprisingly, an EMC array. Here I even limited the total number of spindles, thus forcing more I/O to the data files, and subsequently increasing latency.
Prior to each test, the database was restored from a backup. Multiple runs were executed for each configuration (BaseLine, BPE and ExtremSW), and the average was used in the presentation of results (unless stated otherwise).
So what did the relative performance look like?
The performance is presented in terms of relative difference, since the actual numbers themselves do not matter – just how the workload changed across the various configurations. The metric used is transactions per minute (tpm).
So for the same workload and the same configuration of the database, but varying the use of the server flash – as SQL14 Buffer Pool Extension in one case, and as EMC ExtremSW in the other – the system processed 1.72 times more transactions when using Buffer Pool Extension, and 2.32 times more when using the same infrastructure with ExtremSW.
But there’s more ….
Efficiency Vs Time
How long it takes to make effective use of the performance-enhancing server flash is also interesting. So here's a quick comparison of the workloads running against the ExtremSF card, for both the BPE implementation and that of ExtremSW.
SQL Server Batch Requests/sec is one metric that may be used to identify how much "work" is being done by SQL Server, as it measures the number of batches being executed. Given that the workload is the same in every run, it provides a useful comparison, and doing so, we see the following (in this case, these are the numbers from two specific runs – not averaged across runs).
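Batch Requests/sec is exposed both in Performance Monitor and via sys.dm_os_performance_counters. The counter value in the DMV is cumulative, so to get a per-second rate you sample it twice and divide by the elapsed time – a sketch of that sampling approach (the 10-second interval is arbitrary):

```sql
-- Sample the cumulative Batch Requests/sec counter twice, 10 seconds
-- apart, and compute the effective rate over the interval.
DECLARE @first BIGINT, @second BIGINT;

SELECT @first = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Batch Requests/sec';

WAITFOR DELAY '00:00:10';

SELECT @second = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Batch Requests/sec';

SELECT (@second - @first) / 10 AS batch_requests_per_sec;
```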
The X-axis in this case shows the time, in Hrs:Min:Sec, from the start of each run. Since the two runs executed for different periods of time (the test with BPE enabled was run for much longer, to allow the utilization to reach steady state), you can see that the ExtremSW test was terminated after about 7 hours, although steady state was attained after only about 3 hours. The test run with BPE reached its steady state after about 10 hours. Also worth noting is the slope of the change: ExtremSW showed a much more aggressive improvement over a shorter period of time. Overall, at steady state, the ExtremSW environment was processing more batches/sec than the Buffer Pool Extension implementation.
If it’s the same card, why is there such a difference?
Given that the hardware used was the same, there are implementation characteristics that will alter the performance – not the least of which is the aforementioned fact that the BPE storage can only hold non-dirty pages. As a result, any updated page will need to be pushed to durable media before being moved into the BPE. That's likely to be a small, but not necessarily trivial, impact.
ExtremSW, on the other hand, is a rather different beast. In the Windows environment, ExtremSW is effectively implemented as a filter driver. When configured as cache, the storage allocation on the ExtremSF card is used as a central storage (cache) pool by the driver. In the following image, the ExtremSF card is seen as HardDisk0 (Disk0 in the GUI). The "ExtremeSF" NTFS volume was created to consume space from the device, such that when the ExtremSW implementation was activated, it would only use 200 GB – the "OEM Partition" seen on that device.
Individual LUNs (HardDisks as seen by Windows) are then bound to this cache pool. As data is read from the disks, that data is stored in the cache immediately, and remains there until it becomes stale at which point it effectively gets dismissed. Data that is updated is also stored in the cache pool, and as of this release of ExtremSW, all writes are implemented as Pass-Thru, so the write has to go to the backing disk in all cases … but the updated state is retained in cache (you don’t need to re-read what you have written).
Thus all data is cached on both reads and writes, so there's a tendency to be more efficient – at least when compared to mechanisms that need to destage data out first.
Again, in this configuration, the ExtremSW cache size was limited to 200 GB, so it was effectively the same space on the ExtremSF card as the BPE file. There were 12 data LUNs in use (HardDisk4 thru HardDisk15, having NTFS volumes DATA01 thru DATA12, as seen in the previous image), and these were bound to the ExtremSW cache pool by executing the following calls to the VFCMT utility (the management tool for ExtremSW):
vfcmt add -source_dev harddisk4 -cache_dev harddisk0
vfcmt add -source_dev harddisk5 -cache_dev harddisk0
vfcmt add -source_dev harddisk6 -cache_dev harddisk0
vfcmt add -source_dev harddisk7 -cache_dev harddisk0
vfcmt add -source_dev harddisk8 -cache_dev harddisk0
vfcmt add -source_dev harddisk9 -cache_dev harddisk0
vfcmt add -source_dev harddisk10 -cache_dev harddisk0
vfcmt add -source_dev harddisk11 -cache_dev harddisk0
vfcmt add -source_dev harddisk12 -cache_dev harddisk0
vfcmt add -source_dev harddisk13 -cache_dev harddisk0
vfcmt add -source_dev harddisk14 -cache_dev harddisk0
vfcmt add -source_dev harddisk15 -cache_dev harddisk0
SQL Server transaction log devices don't really benefit from ExtremSW, and including the transaction log is not really recommended in such configurations. In this environment, the transaction log was on a separate LUN (HardDisk16), and that was left out of the ExtremSW configuration.
It's clear that Buffer Pool Extension has a positive impact on this SQL Server workload. Its performance impact is definitely related to the characteristics of the storage used for the BPE file. Server-based flash storage devices, like ExtremSF, have the performance characteristics to improve the throughput of SQL Server environments. This testing was based on CTP1 of the SQL14 product, and much could change in the intervening time before launch; as a result, performance may change with respect to the efficiencies of BPE.
ExtremSW is definitely very efficient at improving the performance of SQL Server databases – there are a number of papers that cover solutions using SQL Server 2008, etc. It's also true that ExtremSW is not specifically tied to SQL Server. As mentioned, it's a filter driver that binds to LUNs. What those LUNs are used for is irrelevant to ExtremSW, because the implementation is simply going to cache the data on those devices. So any application that re-reads the same data will see a benefit.
SQL Server Buffer Pool Extension is obviously a SQL Server feature, so its benefits are limited to that application. Conversely, BPE is included in the appropriate editions of SQL Server, so you get it with the product. ExtremSW is an incremental cost, as it is a separately licensed solution. ExtremSF (the server flash card) is assumed to be common to both the BPE and ExtremSW implementations.
In the end, the overall efficiency is also tied to the application and the overall active dataset size. Again, in this case, the dataset size was around 750 GB, and the cache size (both BPE and ExtremSW) was 200 GB. As the ExtremSF card size, the dataset size, and/or the active portion of the data changes, so will the results, and the overall effect on any given environment. Alas, it is the great "It depends" .. because it does.