EMC AppSync for VNX / Microsoft Environments

We had a great time launching EMC AppSync in Las Vegas a few weeks back!

Some of the highlights were an on-stage demo, an appearance on Chad’s World Live, four breakout sessions, and so much more. We got interviewed by industry analysts and taught our TCs what AppSync was all about.

We also launched a new ECN (EMC Community Network) space where I’ll be spending a lot of time in the future. The product becomes officially available later this year and now we’re handling all of the customer requests to join our beta program and learn more about the product.

If you want to find out more about the launch or ask a question, go ahead and ask one over here!

Application Protection: There’s Something Happening Here

There’s something happening here
What it is ain’t exactly clear
There’s a man with a gun over there
Telling me I got to beware

Yes, it’s blasphemy to change a classic like Buffalo Springfield’s “For What It’s Worth”, but I will anyway to prove my point.

There’s something happening here

If you haven’t noticed, IT is changing rapidly. Just search for IT transformation, IT as a Service, and converged infrastructure to see how far we’ve come in only the past few years.  This industry moves!

What it is ain’t exactly clear

We know a Cloud is built differently, operated differently, and consumed differently. So we know companies have begun re-architecting IT to offer more of a service and react faster to user needs. They know they must change their operational models and in many cases their organizational structure. They might also seek converged infrastructures to get moving faster. But… has protection changed to keep pace with this transformation?

There’s a man with a gun over there
Telling me I got to beware

It’s been said that in the song the gun is more of a metaphor for the tension between groups within the US before Vietnam. And in a much less violent analogy, the tension between the IT team and the application owners has never been stronger.

The application teams want to have great performance and protection of their application. But they’ve never been empowered by the IT department to protect themselves with storage-level tools. The storage team wants to let them, but fears the application teams might create too many copies of their data. Instead, the app owners went out and used tools for their own application, creating their own protection strategy which might not deliver the best protection they can get. To win back the hearts and minds of the application owners and DBAs, the IT department and the storage teams need to get better at protecting applications as a service.

On the Road to Application Protection as a Service

Many companies have attempted to do this in the past, with products that help you protect and restore your applications and critical virtual machines. They have tools that install on the server and can “freeze” and “thaw” the current transactions in the database, so that when a snapshot is taken, there is a clean copy that can be easily restored. The major benefit of these tools is SPEED: the copy process is incremental and the restore process is also lightning fast, restoring a 1 TB database in minutes.

It needs to get easier. Like any “enterprise” tool, many of these products designed for snapshots and replication require a significant learning curve. We need something simple that integrates with the tools we know and love.

We should provide self-service capabilities. Instead of IT spending hours and hours making sure application owners are getting the protection they need, the owners should be empowered to simply protect and restore their own data.

We are driven by service levels. IT departments and storage teams need to offer “protection service catalogs” with various (e.g. Platinum, Gold, Silver, Bronze) levels of protection varied by RPO – from very low data loss (synchronous replication) to more sporadic application-consistent snapshots – all from one interface. This makes it easy for the app team and people with the checkbooks to really understand the value placed on the different applications in your catalog.

There truly is something happening here
And what it is will be made clear at EMC World 2012!

Hope to see you there!
Brian

Podcast: Integrated Exchange Single Item Restore

Like finding a needle in a haystack, EMC’s ItemPoint helps you find and extract emails from disk-based copies made by EMC’s Replication Manager.

Please take a minute to listen to Neil Salamack and me hit the big points of why and how you’d do this. (Links to an MP3 file)

Trivia: That’s Neil on guitar during the opening music.

Background Database Maintenance in Exchange 2010

Database size is a critical aspect to consider in Exchange 2010 designs where DAG is in play. This is regardless of the disk backend you choose. JBOD, DAS RAID, and SAN designs all need planning where small passive databases are in play. Since Microsoft offers no configurability of BDM on passive databases, there is nothing you can do but be aware of and plan for the workload (aside from going with a standalone configuration).

Many Exchange administrators are used to having small databases (100-200GB) so that ESE maintenance tasks and restore SLAs are easier to address. I find that this can trip up otherwise solid storage architectures. BDM can lead to problems, primarily because admins aren’t necessarily aware that the BDM schedules they set via the Exchange Management Console (EMC) or PowerShell apply only to the active copy of the database.

Microsoft calls background database maintenance “online database scanning” or “database checksumming” or “background database maintenance”.  It can be associated with page zeroing.  It’s a googlicious challenge – and a tagging nightmare for bloggers. 

Let’s see what TechNet says about BDM (aka “checksumming”):

Background database maintenance I/O is sequential database file I/O associated with checksumming both active and passive database copies. Background database maintenance has the following characteristics:

  • On active databases, it can be configured to run either 24 × 7 or during the online maintenance window. Background database maintenance (Checksum) runs against passive database copies 24 × 7. For more information, see "Online Database Scanning" in the New Exchange Core Store Functionality topic.
  • Reads approximately 5 MB per second for each actively scanning database (both active and passive copies). The I/O is 100 percent sequential, so the storage subsystem can process the I/Os efficiently.
  • Stops scanning the database if the checksum pass completes in less than 24 hours.
  • Issues a warning event if the scan doesn't complete within three days (not configurable).

Let’s reiterate the interesting bit:

Background database maintenance (Checksum) runs against passive database copies 24 × 7.

Now let’s look at what another TechNet page says about this:

Exchange scans the database no more than once per day. This read I/O is 100 percent sequential (which makes it easy on the disk) and equates to a scanning rate of about 5 megabytes (MB)/sec on most systems.

The important thing to remember is that when you set a maintenance schedule for a database as described on this TechNet page, the setting applies only to the active copy of the database.

Now, how do you plan for this workload?  Frequently, you don’t have to worry about it at all.  Let’s say you have 5000 users with 2GB mailboxes on a server.  That’s 10 TB of mailbox data.  If you have 2 TB mailbox databases, you have about 5 databases, or about 25 MB/s in BDM workload.  Not a problem.

However, if you limit your databases to 100GB, that can present a problem. Those same 5000 users translate to 100 databases, each of which can launch a 5MB/s read workload (or more) at any given time. Aggregated over a single mailbox server, 500MB/s is nothing to sneeze at, especially if it’s mixed with user workload on the same disk.
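The arithmetic in the two scenarios above can be sketched in a few lines of Python. This is only a back-of-the-envelope planning aid: it assumes decimal units (1 TB = 1000 GB), every database scanning at the same time, and the roughly 5 MB/s per-database rate quoted from TechNet. The function name is made up for illustration.

```python
import math


def bdm_read_load(users, mailbox_gb, database_gb, scan_mb_per_s=5.0):
    """Return (database_count, aggregate_read_mb_per_s) for one mailbox server.

    Assumes every database is being scanned at the same time, each at
    scan_mb_per_s (the approximate figure from the TechNet excerpt).
    """
    total_gb = users * mailbox_gb
    database_count = math.ceil(total_gb / database_gb)
    return database_count, database_count * scan_mb_per_s


# The two cases from the text: 5000 users with 2 GB mailboxes,
# first on 2 TB databases, then on 100 GB databases.
for database_gb in (2000, 100):
    count, load = bdm_read_load(users=5000, mailbox_gb=2, database_gb=database_gb)
    print(f"{database_gb} GB databases: {count} databases, ~{load:.0f} MB/s of BDM reads")
```

With these inputs the sketch gives 5 databases and 25 MB/s for the 2 TB case, and 100 databases and 500 MB/s for the 100 GB case.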

What you can do to limit the impact of BDM:

  1. Use larger databases (1-2TB in size).  Although it will increase the amount of time required to maintain the databases, remember that with a DAG configuration, it’s often more expedient to reseed than it is to do ESE maintenance. 
  2. If you are concerned about restore times, use an array that can act as a VSS provider and store your first-tier backups in snapshots, clones, or CDP bookmarks.  In these scenarios, restore times are largely uniform whether you have a 1TB database or a 100GB database (log replay notwithstanding).
  3. If, after considering hardware-assisted VSS, you STILL can’t leverage larger databases, confer with your storage vendor to architect for what will turn out to be large block reads at unpredictable rates and unpredictable intervals.  To the best of your ability, isolate the passive databases from the active ones so that BDM doesn’t impact user mailboxes.
  4. Ask Microsoft why passive database maintenance is not configurable.  As it is today, a completely passive Exchange server can generate over 500MB/s in read bandwidth.  Let’s be crystal clear about this: 500+ MB/s is data warehouse territory, and it’s absurd to think that this can occur with no active users on it, and no backups running against it.  It makes little sense, especially in configurations where the silent corruption they’re looking for is detected by other techniques.

Now, I have a couple questions I haven’t been able to get an answer to:

  1. What happens to the schedule when the checksum pass completes in less than 24 hours?  Does it start again in 24 hours?  Is that from the start of the prior run, or from the end of the prior run?  An authoritative answer would be helpful for those who have to plan for this workload.
  2. What factors can make the scanning run faster or slower than “most” systems?  Does the system issue a 256KB read every XX ms?  Does that depend on the processor clock rate?  Can we thus expect the read rate to increase with clock rate?  The difference between 5 and 8 MB/s doesn’t really matter on a single database, but it really does matter when you’re talking about hundreds or thousands of databases on a single disk array.
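To show why these answers matter for planning, here is a rough Python sketch of how long one checksum pass takes and what the aggregate read load looks like at 5 versus 8 MB/s. It assumes a constant scan rate and decimal units (1 GB = 1000 MB). The function names are hypothetical, and the real scanner may well behave differently, which is exactly the open question.

```python
def scan_pass_hours(database_gb, scan_mb_per_s):
    """Hours for one sequential pass over a database at a constant read rate."""
    megabytes = database_gb * 1000  # decimal units: 1 GB = 1000 MB (assumption)
    return megabytes / scan_mb_per_s / 3600


def aggregate_read_mb_per_s(database_count, scan_mb_per_s):
    """Total read bandwidth if every database is being scanned at once."""
    return database_count * scan_mb_per_s


for rate in (5, 8):
    for size_gb in (100, 2000):
        hours = scan_pass_hours(size_gb, rate)
        print(f"{size_gb} GB at {rate} MB/s: {hours:.1f} hours per pass")
    for count in (1, 100, 1000):
        total = aggregate_read_mb_per_s(count, rate)
        print(f"{count} databases at {rate} MB/s: {total} MB/s aggregate")
```

Under these assumptions a 100 GB database finishes a pass in under six hours at 5 MB/s, while a 2 TB database needs more than four days at that rate. On a single database the 5 to 8 MB/s spread is only 3 MB/s, but across 1000 databases it is 3000 MB/s.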

EMC’s Exchange 2010 Storage Best Practices and Design Guidance

It has been a while since I’ve posted something new, so I am back and now want to catch up on a couple of new and cool things that we have been working on.

Recently, we released a great whitepaper that we’ve been working on as a result of all of our Solutions work against Exchange 2010. We’ve derived many best practices for sizing and design for deploying Exchange 2010 on EMC VNX and EMC VMAX platforms and we want to share that with you.

In this paper, we will discuss disk and RAID selection, LUN layout, and how to size when you might have already used the Exchange Role Calc and want to apply that detail to your EMC array design.

We also answer some key questions such as:

  • Should I consider thin LUN with Exchange 2010 on VNX? What about VMAX?
  • What are some of the most important design considerations for Exchange 2010 on VNX and VMAX?
  • Should I use FAST VP with Exchange 2010 on VNX or VMAX?
  • Should I consider FAST Cache in my Exchange design?
  • What about Storage Pools vs RAID groups on VNX?

Examples of Exchange storage building blocks are also discussed along with the methodology that we use in our own Solutions labs to derive a desired storage design. Our customers also want to know our recommendations when HA/DR is considered, such as: what options are there to apply additional protection to my Exchange DAG?

I encourage you to please check out this paper if you have not already seen it. You can find it on our new EMC Community Network “Everything Microsoft at EMC” site at: https://community.emc.com/community/connect/everything_microsoft

My colleague Adrian Simays, who runs the excellent Virtual Winfrastructure Blog, and his team have worked tirelessly to create this excellent community site, and I think you will find it to be a very informative resource on everything Microsoft at EMC.

As well, this paper has been posted on EMC.COM at: http://www.emc.com/collateral/hardware/white-papers/h8888-exch-2010-storage-best-pract-design-guid-emc-storage.pdf

Until next time,

Dustin