vSpecialists and an Exchange 2010 Private Cloud Webcast

So, I’ve been lucky enough to get invited to participate in an amazing webcast series with some really incredible speakers in August.

I’ll be talking about “Getting Microsoft Exchange 2010 into the Private Cloud” – a recommended new approach to your company’s Exchange strategy, with technical best practices to help you understand the term “Private Cloud” and turn it into something actionable and real.

Please sign up today!  Other sessions include:

  • Building Blocks for a 100% Virtualized and Protected Data Center
  • Disaster Avoidance with VMware vSphere and EMC VPLEX
  • vCenter Plug-ins from EMC


The other three webcasts are being delivered by some really smart folks: Scott Baker, Scott Lowe, and Tee Glasgow – all vSpecialists!  The level of talent on this webcast series is daunting.  How can I measure up to these vSpecialists?  It’s like the Karate Kid trying to take on three ninjas.

Since vSpecialists are pretty much driving the virtualization agenda at EMC… and there are three of them and one of me… I thought it would be proper not to try to beat ’em, but to join ’em, and become a “virtual” vSpecialist for the month of August.

Check it out (gotta love Photoshop):

I’ve been working out – with Windows!

Oh yeah, read more about these vSpecialists here.    They’re doing amazing things and they are always looking for a few good geeks.

SQL Data Layout Best Practices

Hello and welcome to the first installment of SQL Velocity!


Over the coming weeks and months I will be documenting best practices for high-performance database installations.  The focus of this material will be running SQL Server on EMC Symmetrix arrays.


This first installment covers separating SQL Server functions to optimize storage I/O.  For the purposes of this installment we will separate databases into three basic types:

  • Online Transaction Processing (OLTP)
  • Data Warehousing
  • Analysis Services (OLAP Cubes)

These database types are separated because they generally access their data files in very different ways. 

  • Online Transaction Processing systems are assumed to issue random read/write I/O to their data devices.
  • Data warehousing systems (often called Business Intelligence systems) typically load data into the warehouse using sequential writes, and access the data using sequential reads.
  • SQL Server Analysis Services (SSAS) utilizes the NTFS file system directly, reading and writing many files in a highly random pattern.  SSAS will often randomly read many files while creating new files sequentially, which makes it an excellent candidate for high-performance Enterprise Flash Drives.

Each database utilizes four distinct file types.  Each file type has its own data access pattern:

  • Data – Each type of database performs either random or sequential I/O against its data files, as noted above.
  • Log – Log access is mostly sequential.  The database writes data to the log first, and then writes the data into the database.  Note that SSAS does not use log files.
  • Temporary database (TempDB) – TempDB is used when the system needs to perform temporary database operations.  TempDB access is assumed to be highly random.  SSAS does not use a TempDB device.  As a side note, TempDB can often benefit from high-performance Enterprise Flash Drives (EFDs).
  • Backup – Database backup files should be stored on a separate backup volume.  Backup devices are accessed sequentially.

Best practices:

  • Aggregate random I/O:  Random disk drive performance is bounded by rotational latency and seek time.  For example, a 10,000 RPM drive spins about 166 times per second, and it takes additional time to physically move the drive’s read/write head (measured as seek time), so I conservatively use 100 I/Os per second (IOPS) per drive when calculating how many drives are needed to support a given workload.  If I need to perform 1,000 IOPS, then I need 10 drives.  (Note: there are a significant number of variables in these calculations; that detail is beyond the scope of this post.)  Issuing I/Os one at a time will not reach this threshold; to achieve optimal performance, many I/Os should be outstanding against the RAID set at once (how many the system can sustain depends on many factors, again beyond our scope today).
  • Separate sequential I/O:  A single physical hard disk can perform about 10 times as many sequential IOPS as random IOPS.  Sending two sequential data streams to the same drive turns the access pattern into a random one, conceivably reducing performance by a factor of ten.  By isolating sequential operations, fewer drives are needed.  The same 10 drives listed above will yield 10,000 sequential IOPS vs. 1,000 random IOPS.  Separating two sequential files onto 4 total drives (two each) will yield 4,000 IOPS, vs. only 400 IOPS when the two sequential streams are combined on those drives.
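As a quick sanity check on those two rules, here is a small Python sketch.  The 100-IOPS-per-drive figure and the 10x sequential multiplier are the rules of thumb used above, not measured values:

```python
# Rules of thumb from the text above (conservative, not measured values).
RANDOM_IOPS_PER_DRIVE = 100   # per-drive random IOPS used for sizing
SEQ_MULTIPLIER = 10           # a lone sequential stream runs ~10x faster

def drives_for_random(iops_needed):
    """Drives required to sustain a purely random workload."""
    return -(-iops_needed // RANDOM_IOPS_PER_DRIVE)   # ceiling division

def sequential_iops(total_drives, streams, isolated):
    """Total IOPS from sequential streams on a set of drives.

    If the streams are isolated (each on its own drives) they keep the
    sequential advantage; if they share drives, the interleaved access
    degrades to random performance.
    """
    if isolated:
        drives_per_stream = total_drives // streams
        return streams * drives_per_stream * RANDOM_IOPS_PER_DRIVE * SEQ_MULTIPLIER
    return total_drives * RANDOM_IOPS_PER_DRIVE

print(drives_for_random(1000))                # 10 drives for 1,000 random IOPS
print(sequential_iops(4, 2, isolated=True))   # 4000 - two streams separated
print(sequential_iops(4, 2, isolated=False))  # 400 - two streams mixed
```

The numbers match the examples above; in practice, cache, RAID overhead, and queue depth all shift these figures, so treat this as sizing arithmetic, not a performance model.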

Based on these rules the log and backup devices should be isolated. 

That sounds really easy.  Dedicate a few drives to a small database log file and a whole lot of drives to the backup volume, ensure that nothing else shares those drives, and you are home free.

Note:  There is one other really good reason to separate database functions, and that is business continuance.  If all database functions are placed on the same physical disks, then a single disaster can destroy all of the data.  If, on the other hand, the backup and log files are on separate physical media, there is a much better chance of recovery.

Imagine that someone manages to drive a forklift through your storage array.  If you have separated the backup and log files, there is an excellent chance of recovering all of the data: since SQL Server writes to the transaction log before updating the database, keeping the log files along with the backup files enables a full recovery.  Shame on the forklift driver – lucky for you, you planned for that.  As for your datacenter falling into the ocean, catching fire, or losing power; well, you should implement remote storage replication, but that is a topic for another time!

This is where things get a bit complicated.

One easy way to separate I/O is to simply create separate storage devices (called Logical Units, or LUNs), and on each LUN create a single NTFS volume, such as the following:

  • E$ – Backup
  • H$ – Data
  • O$ – Log
  • T$ – Temp DB
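A layout like this is easy to verify in a script.  The sketch below follows the drive-letter scheme above; the file names and paths are hypothetical examples, not a real deployment:

```python
import ntpath  # Windows path handling; importable on any platform

# Hypothetical file placement following the drive-letter layout above.
layout = {
    "data":   r"H:\SQL\MyDB.mdf",
    "log":    r"O:\SQL\MyDB_log.ldf",
    "tempdb": r"T:\SQL\tempdb.mdf",
    "backup": r"E:\Backup\MyDB.bak",
}

def volume(path):
    """Return the drive portion of a Windows path, e.g. 'H:'."""
    drive, _tail = ntpath.splitdrive(path)
    return drive.upper()

# The sequential devices (log and backup) must not share a volume with
# any other file type, per the "separate sequential I/O" rule.
for role in ("log", "backup"):
    other_volumes = {volume(p) for r, p in layout.items() if r != role}
    assert volume(layout[role]) not in other_volumes, f"{role} shares a volume!"

print("layout ok")  # no assertion fired: each sequential device is isolated
```

Of course, separate drive letters only help if the underlying physical disks are also separate, which is exactly where the next section picks up.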

A Symmetrix array has two technologies that greatly enhance performance:

  • Hyper Volumes
  • Virtual Storage (often called Virtual Provisioning)

While these technologies will enhance performance they can make it more challenging for the average user to clearly identify where their storage actually resides.

The Symmetrix array splits each physical disk into pieces called hyper volumes, which offer an excellent performance boost over using whole physical disks.  Using 400 GB physical disks, a single 1 TB volume would utilize only 8 physical disks (assuming the disks are configured with RAID mirroring).  Using 50 GB hyper volumes, that same 1 TB volume can be spread across 40 mirrored hyper volumes – 5 times the performance compared to the basic 8 disks.


Figure 1: Hyper volumes
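A quick back-of-the-envelope check of those numbers.  The 8-disk figure is taken from the example above; the rest follows from the stated sizes:

```python
# Back-of-the-envelope check of the hyper-volume example above.
VOLUME_GB = 1000      # the ~1 TB volume from the example
HYPER_GB = 50         # hyper volume size
MIRROR = 2            # RAID 1 mirroring doubles the spindle count

whole_disk_spindles = 8   # figure quoted above for whole 400 GB disks

# 1 TB / 50 GB = 20 hyper volumes, each mirrored -> 40 spindles
hyper_spindles = (VOLUME_GB // HYPER_GB) * MIRROR

print(hyper_spindles)                         # 40
print(hyper_spindles // whole_disk_spindles)  # 5 -> the "5 times" claim
```

The point of the exercise: performance scales with how many spindles the volume touches, and carving disks into hypers is what lets the array spread one volume that widely.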

Understanding a specific volume layout is important.  An administrator may separate data files onto unique volumes, only to discover later that the underlying storage shares the same physical disks.

Figure 2 shows an example of SQL Server sharing all physical disks.  This configuration induces a condition known as excessive head seek, where the disk read/write heads move all over the disk surface whenever an operation is initiated.  A worst-case example would be running a production backup: the backup operation causes the disk heads to thrash back and forth as data is read from the Data and Log volumes and written to the Backup volume.  This head seeking produces I/O that is much slower than even ordinary random I/O.


Figure 2: Poorly configured SQL layout

Whenever possible do not allow a SQL data file to “wrap” back onto the same physical disk.  Figure 3 is an example of a configuration that will cause excessive head seeking.  Ensure that enough physical disks are allocated to each volume so that it can support the necessary I/O load.  Balance highly utilized hyper volumes with less frequently accessed data.


Figure 3: Data volume wrapping

Many Symmetrix deployments will take advantage of Virtual Provisioning, also known as Thin Provisioning.  A thin data pool can greatly improve storage utilization and significantly increase performance.  Ideally, the administrator creates one pool for the data and temporary database files, and a second pool for the log and backup files.

Best Practices

  • Utilize SQL Server best practices to separate data files.  Visit www.SQLCAT.com for best practices from the Microsoft SQL Customer Advisory Team, and www.EMC.com/collateral/software/solution-overview/h2203-ms-sql-svr-symm-ldv.pdf for the EMC SQL Server best practices TechBook.
  • Using standard hyper volumes:
      • Ensure that heavily used SQL volumes are paired with less frequently accessed data (such as an archive).
      • This dedicates maximum performance to the SQL workload while ensuring storage capacity is well utilized.
  • Using Virtual Provisioning:
      • When possible, create enough thin pools to ensure that backup and log data are separated from the database files.  This should boost performance and add an extra layer of reliability to the design.
      • Size the thin pool for both performance and capacity.  Add enough physical disk resources to the pool to absorb the entire peak IOPS load it will see.  I like to use a worst-case scenario, assuming all I/O is random, to determine the number of physical disks needed in each pool.
      • Balance data files that have high performance requirements with less frequently accessed data.
  • For the Symmetrix array, ensure that workloads are balanced across:
      • Physical disks
      • Disk groups
      • Front-end ports


Figure 4: Well-balanced configuration

I hope you enjoyed installment number one.  Please feel free to post feedback, good or bad!



Exchange Server 2007 and 2010 Comparison Chart

Here’s a handy comparison chart that has been floating around a few presentations here at EMC (credit to M. Jones).  I figured it was time to share this one, as many companies are asking us to help with their upgrade plans.  Clicking on it makes it bigger… but calling it interactive would be a stretch!

Great chart, eh?

Don’t you just love charts?? Me too.

Check out this great tune, dedicated to charts – it’s a rocker!

SQL Data Warehousing with Fast Track

Put simply, Fast Track is a program launched by Microsoft that targets Data Warehousing workloads with a goal of achieving incredible performance with low-cost hardware.

These prescriptive configurations include a blend of resources and best practices that really leverage the latest features in SQL 2008 Enterprise:

  • CPU/Processors
  • Memory
  • Networking
  • Storage

EMC has helped to build out solutions that leverage the CLARiiON AX4 series.  You can read more details here.

EMC Consulting also does a great deal of work with Fast Track, advising clients on how to optimize their data warehouses.  See this excellent presentation from EMC’s James Rowland-Jones.  Just look out for slide 14.

I’d love to see him do this live and the slides are clearly written with enough space to allow his personality to shine through.

Guess what?

You can see him speak at SQL Bits (Sept 30 – Oct 2) at York University in the UK.

James has also written books about SQL, and has also decoded the magic -E SQL startup parameter for serious SQL gurus.

A Very Interesting “Community-Focused” Project

As a Microsoft Most Valuable Professional (MVP), I’ve had the opportunity to work with some pretty amazing people over the years, and have always been impressed with how “Community-Focused” this group of people can be.

I was, however, recently blown away by a friend of mine who’s also an MVP, and who I’ve had the pleasure of working with on numerous occasions. Arnie Rowland, who runs a consulting company based in the Pacific Northwest called "Westwood Consulting, Inc.” came up with what I think is a brilliant idea. (More on this in a second, but let me set the stage first)

One of the perks of being awarded the MVP award from Microsoft is that you receive special attention from the product teams from time to time. Once in a while a product team will decide to do something very nice for MVPs, like sending some special SWAG, putting on a special chat session, or offering an invite to take part in face-to-face meetings. Sometimes, in conjunction with the marketing teams, they offer SWAG that can be very impressive. Well, this year the Developer Division decided that developer MVPs would receive Not-For-Resale MSDN Subscription vouchers that they could give out any way they chose. (Since many MVPs spend a fair amount of time at public speaking engagements, offering one as a giveaway at an event is likely what they had in mind.) These things retail for just over $12K each, so it was certainly a very generous giveaway. Of course, it raises the question: how do you maximize the value of these things and get them to people who could really benefit from them?

Here’s where Arnie comes in. He came up with this great idea, which in a nutshell says, “If you’re an unemployed or underemployed developer, we’ll give you free software and the information you need to use it, if you’re willing to use it to help out a non-profit agency – oh, and you have to prove that you are willing to treat this seriously by submitting a proposal for the work you’ll do.” Arnie discusses this all in a blog post here: http://sqlblog.com/blogs/arnie_rowland/archive/2010/07/12/while-you-don-t-get-a-free-lunch-you-will-get-your-just-deserts.aspx All in all, this is an amazing “win-win” project: the un- or under-employed developer receives over $12K worth of software (there’s more than just the MSDN subscription on the table), and a deserving non-profit organization gets a problem solved!

When I talked to Arnie about this, I realized that I definitely wanted to be involved, so I donated the MSDN subscriptions I had been given to him for this project. As it turns out, several other MVPs have decided to do the same, so this is starting to go almost viral. The guys on the Ping show over on MSDN Channel 9 picked it up in a recent episode, and it was also mentioned in a recent MSDN Flash.

So, if you’re reading this and you’re interested in helping out a non-profit organization, head on over to Arnie’s blog and submit an idea. Who knows, you may end up with a very cool pack of software.