Windows File Server or EMC File Services? You decide…

I was just asked if EMC Marketing had a competitive slide deck covering the differences between Windows File Services in Windows Server 2012 R2 and the EMC file portfolio of products. Hmmm, I thought… What ARE the differences? There are so many, I'm not sure where to start. So let's do this: let's start with an old expression, "necessity is the mother of invention". In this context, that means, "what was the initial set of requirements?" Let's take a look at a potential list:

  1. Store "home" directories
  2. Store high-performance VHDx files in a Hyper-V Server Cluster
  3. Replicate advertising content (videos, PowerPoint docs, and graphics files used for placing ads in our magazines and web sites) to another site 120 miles away so that site can use the unified file space
  4. Link a bunch of file servers together using DFS-R (Distributed File System - Replication) Services
  5. Store a single department's files in a single remote location
  6. Store user files so that they can access them on any mobile device anywhere in the world without using VPN

Ok… that's a diverse set of requirements. So many use cases, and all of them rely on file servers to reach their goals. What should we do?

Here's where it gets easy, but also interesting. By clearly defining what you need to accomplish, you can easily match needs to features or offerings. Take #1, storing home directories: How large are they? Are there capacity quotas? Are they scanned for corporate governance compliance (data that might contain credit card numbers)? Are the directories synchronized to laptops using My Documents redirection? If all of these are true, you may need the advanced features offered by an Isilon scale-out file system. Check out Isilon Family - Big Data Storage, Scale-out NAS Storage - EMC
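To make the quota and governance questions concrete, here is a toy Python sketch that totals each home directory against a quota and flags files containing something that looks like a credit card number. This is purely illustrative and not an EMC or Windows tool; the share path, the 10 GB quota, the file-type filter, and the regex are all assumptions for the example.

```python
import re
from pathlib import Path

HOME_ROOT = Path(r"\\fileserver\home")   # hypothetical home-directory share
QUOTA_BYTES = 10 * 1024**3               # assumed 10 GB per-user quota
CARD_PATTERN = re.compile(rb"\b(?:\d[ -]?){13,16}\b")  # crude credit-card-like match

def scan_home(home: Path):
    """Return (total_bytes, flagged_files) for one user's home directory."""
    total, flagged = 0, []
    for path in home.rglob("*"):
        if not path.is_file():
            continue
        size = path.stat().st_size
        total += size
        # Governance check: only sniff small, text-like files to keep the sketch fast.
        if path.suffix.lower() in {".txt", ".csv", ".log"} and size < 1_000_000:
            if CARD_PATTERN.search(path.read_bytes()):
                flagged.append(path)
    return total, flagged

if __name__ == "__main__":
    for home in sorted(p for p in HOME_ROOT.iterdir() if p.is_dir()):
        used, flagged = scan_home(home)
        status = "OVER QUOTA" if used > QUOTA_BYTES else "ok"
        print(f"{home.name}: {used / 1024**3:.1f} GB [{status}], {len(flagged)} file(s) flagged")
```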

What about #2, storing high-performance VHDx files for use in a Hyper-V cluster? Well, now we are in a completely different place, eh? All of a sudden we have concerns about performance, availability, clustering support, metadata handling, snapshots, recoveries, and potential replication, and we need to arrange VSS backups of individual files. It's like we aren't talking about a file server anymore. We are finding more and more customers clamoring for the advanced, scale-to-fit, extremely efficient, and trusted VNX Family - Unified Storage Hardware and Software for Midrange - EMC.

#3 - Replicated and shared file spaces have been a challenge for IT professionals since the dawn of IT. Microsoft's Distributed File System Replication is a highly evolved solution for very specific use cases. Sharing replicated file spaces is a tricky task, and Windows Server 2012 R2 delivers a unique, highly optimized set of tools that let small sets of data be replicated among sites efficiently. There are complex setup steps, but virtualizing your Windows file servers can ease management headaches and reduce costs! Our paper on Microsoft private clouds is a great starting place for your journey: http://www.emc.com/collateral/whitepaper/h11228-management-integration-cloud-wp.pdf  Likewise, #4 specifically calls for DFS-R. EMC technologies, working WITH Microsoft technologies, will lower costs, reduce downtime, reduce support issues, and help you reach your business goals faster.

#5 -- Store a single department's files in a single location? This is one of those requirements that can go in so many directions precisely because it seems simple. The problem here is latent -- it doesn't present itself at first. At first blush, you might think, "Hey, no problem: stick a VM out at the remote office, back it up with Avamar - Backup and Recovery, Data Deduplication - EMC, and I'm done." Maybe you are, maybe you're not. What if the remote location has power issues or physical security issues, or the staff constantly deletes files and needs point-in-time recoveries routinely? The simple problem just consumed your week. EMC knows that these little "ankle biter" issues are derailers for IT shops. Handling remote file usage has become the bane of so many IT shops and IT managers. EMC sees that there is more to remote file access than placing a VM at a remote location; that's why we introduced leading technologies to help you and your users get what they want, when they want it. My ol' pal Paul Galjan has posted an article to get you thinking about the possibilities: http://flippingbits.typepad.com/blog/2014/05/mobilize-sharepoint-with-syncplicity.html

#6 -- Access-anywhere files. Oddly, sometimes Dropbox just isn't good enough. That's why EMC launched Syncplicity. Please take a moment to see all you can do for your growing population of remote and mobile users: Features

The summary of all this is that "file" doesn't mean "file server" anymore. It means storage. Every storage scenario is different, and that's why EMC has a proud portfolio of offerings. Not just services, not just products, not just software. EMC has become the answer to an ever-increasing number of questions and scenarios. Thanks for reading.

Provisioning EMC Storage for Windows 4X Faster with ESI

The EMC Storage Integrator for Windows was made to simplify and automate many of the mundane tasks associated with provisioning storage in a Windows environment. It is a free download and comes complete with a simple MMC interface as well as PowerShell cmdlets, SharePoint provisioning wizards, and System Center plugins for Operations Manager and Orchestrator. Good stuff here.


SQL Server Performance: Flash is applied at various points in the IO chain, not just at the DB!! Part 1

This is part one… part two next week!

SQL Server: Where is Flash most effective?  Everywhere!

 

I am what is commonly referred to as a "reluctant blogger". I'm the guy who drives my Marketing department crazy. They keep telling me, "Jim! You know stuff! You need to share it!" It's not that I disagree; it's that I'm fully aware that the Internet is loaded with content created by people who know lots of stuff. This time, however, I have information that I know won't be somewhere else on the Internet. This time, I can offer content that will make a difference. I'll begin by telling you a bunch of stuff that you already know, in an effort to establish a baseline, a starting place.

SQL Server is a complex platform, but not so complex that we can’t talk about big concepts.  I’ll attempt to take the significant areas of SQL Server and explain the details that matter most to this discussion.  In this post, I’ll explain three things:

  1. Three common types of SQL Server workloads,
  2. How SQL Server processes data, uses cache, the log stream, and the database files themselves, and
  3. How EMC Xtrem technology ties into SQL Server and into each of these workloads.

Three typical SQL Server workloads:

  1. OLTP – this is your run-of-the-mill online transaction processing database, the kind that stores transactions like the orders you place at Amazon or Zappos or Geeks.com or even Home Depot. The transaction contains an order number, the items you purchased, and how you paid – maybe even your credit card number. A typical OLTP database reads 80% of the time and writes 20% of the time, with 64KB IOs to the database files and 8KB writes to the log file (a small sketch after this list shows what that mix looks like).
  2. DSS/BI/Reporting – most SQL admins will resent the fact that I just globbed all three of those "clearly different" workloads into one category, but stick with me for a minute: all three have generally similar workload patterns. All three rely on heavy use of indexes to increase query speeds, and all three generally receive updates in a dribble or data-stream manner. Occasionally there will be a nightly load of data, but the newly imported dataset is generally small compared to the overall size of the existing data. The most important element these three share is the use of TEMPDB. They all make extensive use of outer joins to answer queries, and this use of joins dictates the use of TEMPDB. As we know, there is only one TEMPDB per SQL instance, so it's very difficult (read "impossible") to dedicate TEMPDB resources to an individual database.
  3. OLAP – these are SQL Server applications that build cubes on a nightly basis. The cubes are a collection of pre-analyzed data that allows business units to make decisions based on "what the data shows". These cubes are not ad-hoc queries – they are a predefined "view" of the data with multiple dimensions. They allow a business-unit analyst to quickly answer questions such as "How many white plastic forks did we sell in Dayton after we ran the flyer in the Sunday newspaper where we advertised a sale on picnic blankets?" Cube builds are hard. Depending on how they are executed, they can read every bit of data in an entire database more than once. A cleverly engineered cube build will run in twelve minutes; a complex set of data can cause a SQL Server to work on the problem for several hours.
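To picture what those profiles look like on the wire, here is a small Python sketch that generates a synthetic IO mix per workload. The OLTP numbers (80/20 read/write, 64KB database reads, 8KB log writes) come straight from the list above; the DSS and OLAP figures are rough placeholders I picked for illustration, not measured values.

```python
import random
from collections import Counter

# (read fraction, read size KB, write size KB) per workload.
# OLTP figures are from the list above; the other two rows are assumed for illustration.
PROFILES = {
    "OLTP":       (0.80, 64, 8),
    "DSS/BI/Rpt": (0.60, 64, 64),
    "OLAP cube":  (0.90, 64, 64),
}

def synth_io(read_pct, read_kb, write_kb, n=10_000, seed=42):
    """Generate n synthetic IOs and summarize the read/write mix and bytes moved."""
    rng = random.Random(seed)
    ops, moved = Counter(), 0
    for _ in range(n):
        if rng.random() < read_pct:
            ops["read"] += 1
            moved += read_kb * 1024
        else:
            ops["write"] += 1
            moved += write_kb * 1024
    return ops, moved

for name, profile in PROFILES.items():
    ops, moved = synth_io(*profile)
    print(f"{name:11s} reads={ops['read']:5d} writes={ops['write']:5d} "
          f"~{moved / 1024**2:.0f} MB per 10k IOs")
```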
Now let's talk about how SQL Server processes data and why these various workloads create distinct issues for each SQL Server they run on. SQL Server has three basic parts that make it work:
  1. The SQL Cache – this is the space in SQL Server memory set aside to hold actual database data that SQL Server is using to fulfill queries, update rows/fields, delete data elements, and generally go about its business. SQL cache is finite – ultimately limited by the RAM available to the SQL Server OS. (This will change in SQL Server 2014… teaser…) The important concept is that everything a SQL Server does goes through the cache, and the cache is limited.
ACID.png
  2. The Log Writer – every SQL Server instance has one (and only one) log writer. Regardless of the number of databases and log files within an instance, each SQL Server instance has and uses a single log writer. What does this mean? Log writes are not parallelized; they are serialized. They happen one at a time. The Log Writer is arguably the essence of data integrity and database consistency. ACID (Atomicity, Consistency, Isolation, and Durability) depends on the order, the finite nature, and the timing of log writes to fulfill its mission. In order for SQL Server to "continue processing", each data element that denotes a change to "the database" must result in a posting of the operation to the log file. Everything goes into the log file. As the operation and the data hit the log file, the data/metadata is considered "hardened". As data is replicated between servers, it is the write-order consistency that allows ACID to hold. Without write-order integrity, ACID breaks.
  3. The database itself – comprised of .mdf and .ndf files. As the DBA adds files for the storage of database pages, the pages (in eight-page extents) are stored in a round-robin fashion among the database files. The DBA, or even the storage admin, is free to place these database files on a single volume or separate them across a multitude of volumes. Each volume can then be placed on a single storage device or on many separate storage devices: one disk, one RAID group, one storage array, or many of those – even across separate datacenters. As long as the database server can access the volume (even over SMB!), the file can be placed there. Pages are fetched from the database files when the database server fails to find the pages it needs in SQL cache. If a page is not in cache, a disk request is issued to the file that is supposed to have the page, and the page is returned (along with the other seven pages in its extent). Data is written into the files when SQL Server has "spare cycles": the Lazy Writer grabs processor slices as they come available and "commits" the data that was hardened in the log file into the database files. The very interesting part of this process is that SQL Server aggregates the writes before it makes the disk requests. That means SQL Server is very clever at assembling, sorting, and consolidating the data that it "flushes out" to disk. Detailed analysis shows that SQL Server's writer optimizes data-to-disk operations by as much as 3.5X, which means the raw IO that SQL Server "wants" to write to disk as it comes into the database is actually reduced by roughly 70%! The evidence of this is seen in black and white when you look at the relationship between a primary SQL Server's write IOs and the write IOs of its Database Mirroring partner. The secondary server will display between 2 and 3.5 times as many IOPS to the disk volumes as observed on the primary server. The reason is that the Mirror, the secondary, cannot aggregate database writes as the primary does. It is so fixated on getting the data into the database that the aggregation code has been disabled! (A toy sketch after this list illustrates the coalescing effect.)
  4. TEMPDB – now, I said there were only three parts of a SQL Server – and there are – but number 4 is added here for clarification. TEMPDB exists just once for every SQL Server instance. Keep in mind that any given server OS installation can contain a number of SQL instances – this is simply the number of times the SQL engine is running on a given server OS. Why would you want more than one? Clustering is one reason – you may want to allow SQL databases to run on all nodes of your OS cluster. Another reason is TEMPDB. As was just mentioned, a single TEMPDB operates for any single instance of SQL Server. TEMPDB can be distributed among multiple database files, just like any database. Logging occurs just like any other database, but not exactly the same way. For example, you cannot back up TEMPDB – there is no reason to – and you cannot restore it either. TEMPDB exists for SQL Server to store temporary data while it's working on the final answer to queries that require outer joins. The data from these table joins lands in TEMPDB and is then used again as the source for the question that was actually asked. After the query is completed, TEMPDB ditches all the data that was fetched into it. It is not uncommon for ad-hoc queries to make extensive use of TEMPDB. It is also not uncommon to see a 40/60 read/write ratio, where more than half of the data that SQL Server fetches is discarded – only to be fetched again for another ad-hoc query.
RoleOfTEMPDB.png
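Here's the toy model mentioned in item 3. If the same hot pages get dirtied over and over between flushes, the primary writes each dirty page once, while a mirror that applies every change one at a time cannot coalesce. This is my own simplification, not SQL Server internals; the update count and hot-set size are invented so the math lands near the 2-3.5X range quoted above.

```python
import random

def simulate_flush(n_updates=100_000, hot_pages=30_000, seed=1):
    """Model dirty-page coalescing: many logical updates, far fewer physical page writes."""
    rng = random.Random(seed)
    dirty = set()          # pages the primary will flush once, no matter how often they changed
    mirror_writes = 0      # a mirror applying changes one-by-one cannot coalesce
    for _ in range(n_updates):
        page_id = rng.randrange(hot_pages)   # updates land on a "hot" working set of pages
        dirty.add(page_id)
        mirror_writes += 1
    primary_writes = len(dirty)              # lazy writer flushes each dirty page a single time
    print(f"logical updates     : {n_updates}")
    print(f"primary page writes : {primary_writes}")
    print(f"mirror page writes  : {mirror_writes}")
    print(f"aggregation factor  : {mirror_writes / primary_writes:.1f}x")

simulate_flush()
```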

Ok… Now, I realize that many DBAs reading this are thinking, "Dude, that is the most simplified, stupid, made-for-idiots explanation ever." And perhaps it is. But as simple and basic as the explanation is, it's a solid and factual representation of what happens inside a SQL Server.

With this information, we can start to understand how accelerating each of these areas can help a SQL Server go faster. The key idea is that performance is never free. Performance can be achieved in a number of ways – beginning with cleverly assembled queries, indexes, and insert operations. Once we've addressed ways to make "good" code run faster, though, we can begin to apply various hardware technologies to the SQL Server architecture to improve its speed.

There are three significant Xtrem technologies at this time (more coming in 2014 and 2015!).  The three are:

  1. XtremSF (Server Flash) – a hardware device that fits directly into a server. The device is a multi-lane PCIe card that brings ultra-fast storage to a single server. By itself, it is high-speed storage. It is available in a multitude of capacities and at least two NAND cell types.
  2. XtremSW Cache (Software) – a device driver and management interface that allows a single host to store previously fetched data on a local storage device (it does not need to be XtremSF! But there are management benefits if it is…). The caching software allows a SQL Server (or any application: Exchange, SharePoint, Dynamics, etc.) to fetch a recently used and discarded data page without going all the way out to the rotating disk. The intention is that a relatively large dataset can reside on a local flash device (200ns!) but be protected by enterprise-class storage (3-6ms). It's a matter of bringing milliseconds down to nanoseconds (see the sketch after this list).
  3. XtremIO (Storage Device) – this is an array built from the ground up around the operating characteristics of high-speed solid-state drives. Everything about the XtremIO device is based on the operating aspects of SSDs – but it is not tied specifically to SSDs. If you can look into a crystal ball and see "the next thing" that's faster than an SSD, XtremIO will use it. It can absorb data like nobody's business. And the more you pump at it, the faster it seems to go.
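To make the XtremSW Cache idea concrete, here's a minimal read-through cache sketch in Python. The two latency figures are the round numbers quoted above (200ns local flash, 3-6ms array); the access pattern, cache size, and hot-set skew are assumptions I made for illustration, not measurements of the product.

```python
import random
from collections import OrderedDict

FLASH_LATENCY_US = 0.2      # 200ns local flash hit, per the figure quoted above
ARRAY_LATENCY_US = 4500.0   # mid-point of the 3-6ms back-end array read quoted above

class ReadThroughCache:
    """Tiny LRU read cache: serve hits from 'flash', fetch misses from the 'array'."""
    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.pages = OrderedDict()                  # page_id -> True, ordered by recency

    def read(self, page_id):
        if page_id in self.pages:                   # cache hit
            self.pages.move_to_end(page_id)
            return FLASH_LATENCY_US
        if len(self.pages) >= self.capacity:        # miss: evict least-recently-used page
            self.pages.popitem(last=False)
        self.pages[page_id] = True
        return ARRAY_LATENCY_US

rng = random.Random(7)
cache = ReadThroughCache(capacity_pages=20_000)
total_us, N = 0.0, 100_000
for _ in range(N):
    # Assumed skew: 90% of reads land on a 10k-page hot set, the rest are scattered.
    page = rng.randrange(10_000) if rng.random() < 0.9 else rng.randrange(10_000, 500_000)
    total_us += cache.read(page)
print(f"average read latency with cache: {total_us / N:.0f} us "
      f"(vs {ARRAY_LATENCY_US:.0f} us with no cache at all)")
```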

Based on these ten elements, let's begin to create linkages. Just for review: we have three types of workloads, three SQL Server components (with a special mention for a fourth, TEMPDB), and three Xtrem technologies. Where does each of these technologies fit each of these workloads and/or SQL components? And why?

Combo number 1) OLTP makes heavy use of a small number of data elements as it processes orders/transactions. These data elements could be a customer list, a parts list, a table of ZIP codes, or shipping costs based on the number of shipping zones a package needs to traverse on its way to a customer's location. The idea is that the SQL Server needs to have these lists handy at all times. The traditional approach to keeping these data elements "close" is to increase server RAM. This is a great plan, but it can have two potential downsides:

1) RAM is ultimately finite – if I virtualize, I am taking, and potentially reserving, pooled RAM for a single server, and I might actually exhaust all of the RAM that a physical server has available. There are many Intel-based servers that "max out" at 96GB or even 192GB of physical RAM. If I have a large number of VMs "fighting" for that RAM, I may run out. And,

2) assigning RAM to a VM is not all upside. As I add RAM, I also need to provide paging-file capacity on physical disk. The more RAM I add, the more physical hard disk capacity I burn. Now, disk is "cheap", so this is not a huge concern, but it needs to be addressed as I add server RAM. The quick arithmetic sketch below shows how fast this adds up.
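Here is a back-of-the-envelope sketch of both downsides, using the 192GB host figure mentioned above. The hypervisor overhead, the per-VM reservation, and the 1x-RAM page-file rule are assumptions for the example; real page-file guidance varies by workload and OS version.

```python
HOST_RAM_GB = 192            # the "max out" figure quoted above
HYPERVISOR_RESERVE_GB = 8    # assumed reserve for the parent partition / hypervisor
VM_RAM_GB = 16               # assumed RAM reservation per SQL Server VM

usable = HOST_RAM_GB - HYPERVISOR_RESERVE_GB
vm_count = usable // VM_RAM_GB
print(f"VMs that fit if each reserves {VM_RAM_GB} GB: {vm_count}")

# Downside 2: more RAM per guest also means more page-file capacity on disk.
# Assume a simple 1x-RAM page file per guest (real guidance varies).
pagefile_total_gb = vm_count * VM_RAM_GB
print(f"Disk consumed by guest page files alone: ~{pagefile_total_gb} GB")
```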

 

 

 

Part 2 next week!

Windows Azure offers DCs in the cloud

Not often discussed, but very interesting… Microsoft Windows Azure offers, among all sorts of other things, the ability to create a DC and join it to your corporate domain! You can imagine the possibilities.

Install a replica domain controller in Windows Azure

You can imagine the benefits… The question becomes, "Is Azure the best place for me to keep one of my Active Directory Domain Controllers?" There are a multitude of local and national EMC-powered Infrastructure-as-a-Service providers who can offer not only DCs in the cloud, but all sorts of other customizable options as well. Take a quick look at EMC Cloud Service Providers - Americas.

Managing Fibre Channel in VMM with SMI-S or How I Got in the Zone

Greetings from the Microsoft Technology Center in Silicon Valley (MTCSV) in Mountain View, CA. I have been putting in a lot of time lately on the new System Center 2012 R2 Virtual Machine Manager infrastructure that hosts all the operational compute and storage for the MTC. There are numerous blade chassis and rack-mount servers from various vendors, as well as multiple storage devices, including two EMC VMAX arrays and a new second-generation VNX 5400. We have been using the SMI-S provider from EMC to provision storage to Windows hosts for a while now. There is a lot of material available on the EMC SMI-S provider and VMM, so I am not going to write about that today. I want to focus on something new in the 2012 R2 release of VMM – integration with SMI-S for Fibre Channel fabrics.

 

There are many advantages to provisioning storage to Windows hosts and virtual machines over Fibre Channel networks, or fabrics. Most enterprise customers have expressed interest in continuing to utilize their existing investments in Fibre Channel and would like to see better tools and integration for management. Microsoft has been supporting integration with many types of hardware devices through VMM and other System Center tools to enable centralized data center management. The Storage Management Initiative Specification (SMI-S) has been a tremendously useful architecture for bringing together devices from different vendors into a unified management framework. This article is focused on SMI-S providers for Fibre Channel fabrics.

 

If you right-click the Fibre Channel Fabrics item under Storage in the Fabric view and select the Add Storage Devices option, you will bring up a wizard.

FC Fabric menu.PNG.png

The first screen of the wizard shows the new option for 2012 R2 highlighted below.

Add SMIS FC.PNG.png

We are using the Brocade SMI-S provider for Fibre Channel fabrics. The provider ships with the Brocade Network Advisor (BNA) fabric management tools; we are using BNA version 12.0.3 in the MTCSV environment. The wizard will ask you for the FQDN or IP address of the SMI-S provider that you wish to connect to. It will also ask for credentials. We are doing a non-SSL implementation, and we left the provider listening on the default port of 5988. That is all there is to the discovery wizard. The VMM server will bring back the current configuration data from the Fibre Channel fabric(s) that the SMI-S provider knows about. In our case we have fully redundant A/B networks with four switches per fabric. Here is what the VMM UI shows after discovery is complete.

Discovered Fabrics.PNG.png
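VMM handles all of this discovery for you, but if you're curious about what an SMI-S provider actually exposes, you can poke at any CIM-XML endpoint with the open-source pywbem library. This is just an exploratory sketch, separate from the VMM workflow; the host name, credentials, and the root/brocade1 namespace are assumptions you would replace with your own values (and it mirrors our non-SSL, port 5988 setup, which you may not want in production).

```python
# pip install pywbem
import pywbem

# Assumed connection details for illustration; substitute your own BNA SMI-S provider.
conn = pywbem.WBEMConnection(
    "http://bna-server.example.com:5988",   # non-SSL on the default port, as described above
    ("smis_user", "smis_password"),         # credentials the wizard would also prompt for
    default_namespace="root/brocade1",      # commonly used BNA namespace; verify on your install
)

# Enumerate the computer systems (switches/fabrics) the provider knows about.
for system in conn.EnumerateInstances("CIM_ComputerSystem"):
    print(system["Name"])                   # CIMInstance supports dict-style property access
```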

Once we have discovered the fabrics, we can go to the properties of a server that has FC adapters connected to one or more of our managed switches. The first highlight below shows that VMM now knows which fabric each adapter is connected to. This allows VMM to intelligently select which storage devices and ports can be accessed by this server adapter when creating new zones. That's right: with VMM 2012 R2 and the appropriate SMI-S providers for your storage and FC fabric, you can do zoning right from within the VMM environment. This is huge!

BL-460 properties.PNG.png

The second highlight above shows the Hyper-V virtual SAN that we created in VMM for each of the adapters. The virtual SAN feature set was released with Windows Server 2012 Hyper-V. It is the technology that allows direct access to Fibre Channel LUNs from a virtual machine and can replace pass-through disks in most cases. That is also a really big topic, so I'm going to write more about it in the context of VMM and Fibre Channel fabrics in a later article. For today I want to focus on the use of VMM for provisioning Fibre Channel storage to Hyper-V parent clusters. Now let's take a look at the zoning operations in VMM.

 

The next figure shows the Storage properties for a server that is part of a 5-node cluster. The properties show which storage arrays are available through Fibre Channel zoning. You can also see the zones (active and inactive) that map this server to storage arrays.

storage properties.PNG.png

Lastly, I want to show you how to zone this server to another storage array. The place to start is the storage properties window shown above. Click the Add | Add storage array icons to get to this screen.

create new zone.PNG.png

As you can see from the window title, this is the correct place to create a new zone. The process is the same regardless of whether this is the first or, as in this case, the third array you are zoning to the selected server. I highlighted the Show aliases check box that I selected while making the above selections. In order for the friendly-name zoning aliases to be available, they must be created in the BNA zoning UI after the server has been connected to one of the switches in this fabric. You can also see the zone name that I entered; it will be important when I move to the final steps of this example.

Now that the zone has been created, let's take a look at the Fibre Channel Fabrics details.

FC Fabrics and Zones.png

I've highlighted the total zones defined in the Inactive and Active sets for the A fabric. This shows that new zones have been created but have not yet been moved into the Active zone set. If you open the properties of the Inactive zone set and sort the Zone Name column, you can see the zone that we created two steps above.

FAB_A Properties.png

In order to activate this zone, use the Activate Zoneset button on the ribbon. One important detail is that you can only activate all of the staged zones or none of them. There are two zones in the Inactive zoneset that will be activated if you push the button, so be sure to coordinate the staging and activation of zones if the tool is shared among multiple teams or users.

Activate Zoneset.png

The world of private cloud management is changing rapidly. VMM and the other System Center products have made huge advancements in the last two releases. The investments that Microsoft and storage product vendors have been making in SMI-S integration tools are starting to bring real value to private cloud management. Take a look; I think you'll be surprised.

 

Phil Hummel