EMC Replication Enabler for Exchange and Microsoft Hyper-V

There has been a lot of buzz around EMC Replication Enabler for Exchange (REE), our free plug-in for the CLARiiON platform that provides EMC’s implementation of Microsoft’s Third Party Replication API for Exchange Server 2010. The plug-in manages the replication solution when used with MirrorView/S and RecoverPoint. Basically, the Third Party Replication API was built by the Exchange Product Group so that third-party vendors can ‘plug in’ to the Database Availability Group (DAG) framework and, instead of using the native Continuous Replication within the DAG, use other forms of data replication such as array-based replication.

See below for how this works:

As part of the upcoming Microsoft Exchange Tested Solutions (ETS) program, EMC, Microsoft, Brocade, and Dell have partnered to release the 2nd whitepaper in the program. The solution was validated jointly by all four companies and built out at the Microsoft Enterprise Engineering Center (EEC) labs in Redmond, WA. The solution is titled “Zero Data Loss Disaster Recovery for Exchange 2010: Enabled by EMC Unified Storage, EMC Replication Enabler for Exchange 2010, Brocade, Dell and Microsoft Hyper-V”. Yes, the title is a mouthful!

 Highlights of the Solution:

  • Exchange Server 2010 on Microsoft Windows Server 2008 R2 Hyper-V, 10,000 users per site (20k total), 500 MB mailboxes, 150 messages per user per day
  • Design with 2 Exchange Database Availability Groups (DAGs), using the third-party replication API enabled by EMC REE
  • EMC Replication Enabler for Exchange with MirrorView/S for synchronous replication between 2 sites
  • EMC CX4-480 storage arrays (2 per site)
  • EMC Replication Manager with EMC SnapView for application-consistent replicas
  • Brocade ServerIron ADX application load balancers (1 per site)
  • Brocade 300 SAN switches and dual-port 8 Gb FC HBAs
  • Brocade FastIron Ethernet switches (1 per site)
  • Brocade MLX8 core router (1 per site)
  • Dell PowerEdge R910 servers (4 Hyper-V hosts): quad eight-core Intel Xeon 7560, 192 GB RAM, 4 x 1 Gb Ethernet ports
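
To put the user profile in the highlights into numbers, here is a quick back-of-the-envelope sketch. It only multiplies out the figures from the list (users, mailbox quota, message rate); the function names are mine, and the result is raw mailbox quota, not a sizing recommendation. It ignores database overhead, logs, and replicas, all of which the whitepaper covers properly.

```python
# Back-of-the-envelope arithmetic using only the figures from the highlights list.
# Function names are illustrative; this is not the whitepaper's sizing method.

USERS_PER_SITE = 10_000
SITES = 2
MAILBOX_MB = 500
MESSAGES_PER_USER_PER_DAY = 150


def total_users(users_per_site: int, sites: int) -> int:
    """Total mailboxes across all sites."""
    return users_per_site * sites


def mailbox_quota_tb(users: int, mailbox_mb: int) -> float:
    """Raw mailbox quota in decimal terabytes (1 TB = 1,000,000 MB)."""
    return users * mailbox_mb / 1_000_000


def daily_messages(users: int, per_user: int) -> int:
    """Messages per day across the given user population."""
    return users * per_user


if __name__ == "__main__":
    users = total_users(USERS_PER_SITE, SITES)
    print("Total users:", users)
    print("Quota per site (TB):", mailbox_quota_tb(USERS_PER_SITE, MAILBOX_MB))
    print("Quota total (TB):", mailbox_quota_tb(users, MAILBOX_MB))
    print("Messages per day:", daily_messages(users, MESSAGES_PER_USER_PER_DAY))
```

So the profile works out to 5 TB of raw mailbox quota per site and 3 million messages a day across both sites, before any overhead.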

 The physical diagram of the solution looks like this, with 4 Hyper-V hosts across two data centers hosting a total of 20k users:

In addition to showcasing the storage and replication aspects with EMC REE and the Microsoft Third Party Replication API, we were also able to show some best practices for guest building blocks and the placement of Exchange roles across Hyper-V hosts. As seen below, we have 2 DAGs across the Hyper-V hosts, with the CAS/Hub roles and the DC/GC servers virtualized as well:

The solutions teams also performed some extensive performance validation in terms of both storage performance with JetStress and overall validation with LoadGen in both local failure and site failure scenarios. Remember, a solution not only needs to ‘work’ but it also needs to perform under load.

In short, if you are looking at an Exchange 2010 design and considering HA/DR and virtualizing the infrastructure, you won’t want to miss this very detailed whitepaper. Get it here: http://www.emc.com/collateral/software/white-papers/h7410-zero-data-loss-exchange-wp.pdf



Windows Geoclusters, Stretch-Clusters, and RecoverPoint/CE Failover

Taking a page out of Chief EMC Blogger Chuck Hollis’ playbook, I’m attaching the graphics from the PPT file that I thought would be important to highlight for this blog and its readers.  Some of the graphics didn’t fit the page as well as I thought they would (I need to shrink them further). So if you like what you see, you can download the whole PPT right here: RecoverPointCE-MSfailoverclusterPPT

In a nutshell, EMC’s RecoverPoint/Cluster Enabler extends a Microsoft cluster across two sites.  A Microsoft cluster normally provides local site “HA” or high availability of server nodes, and RecoverPoint/CE adds “DR” or disaster recovery (AFTER) by stretching the second node to anywhere outside of your primary datacenter.  This presentation walks you through the basics behind that simple idea and provides some additional background.   Slide building credit goes to Gary Archer, a great guy who is always keeping me sharp on RecoverPoint’s latest features.

Recovery Time Objective: Targeted amount of time to restart a business service after a disaster event

Recovery Point Objective: Targeted maximum amount of data lost in a failure, measured as the span of time before the disaster event for which data cannot be recovered
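
If the two definitions blur together, a tiny sketch may help. It measures what was actually achieved in a (made-up) disaster and compares it with the objectives. The timestamps, targets, and function names are all hypothetical and purely for illustration.

```python
from datetime import datetime, timedelta


def achieved_rpo(last_replicated_write: datetime, disaster: datetime) -> timedelta:
    """Data-loss window: time between the last write that reached the
    remote site and the disaster event."""
    return disaster - last_replicated_write


def achieved_rto(disaster: datetime, service_restored: datetime) -> timedelta:
    """Downtime: time between the disaster event and the service restart."""
    return service_restored - disaster


def meets_objective(achieved: timedelta, objective: timedelta) -> bool:
    """True when the achieved value is within the objective."""
    return achieved <= objective


if __name__ == "__main__":
    # Hypothetical timeline, not taken from any real event.
    last_write = datetime(2010, 10, 23, 11, 55)
    disaster = datetime(2010, 10, 23, 12, 0)
    restored = datetime(2010, 10, 23, 12, 30)

    rpo = achieved_rpo(last_write, disaster)   # 5 minutes of data lost
    rto = achieved_rto(disaster, restored)     # 30 minutes of downtime

    print("RPO met:", meets_objective(rpo, timedelta(minutes=15)))
    print("RTO met:", meets_objective(rto, timedelta(hours=1)))
```

RPO looks backward from the disaster (how much data is gone), while RTO looks forward from it (how long until you are back in business).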

Various approaches for DR and their RTO rankings

Microsoft Failover Clusters (formerly MSCS, or Wolfpack if you go back really far) provide local HA, not DR across sites.  For that, you need to S-T-R-E-T-C-H your cluster. EMC’s Cluster Enabler is one way to do it, and using RecoverPoint with it would be like having your iPhone on Verizon.  Not the best analogy, but you get my point I hope!

Basic requirements: use SYNCHRONOUS or ASYNCHRONOUS replication. Distance is not the issue; latency is. The limit is 400 ms of latency for ASYNC and 4 ms for SYNC.
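
The latency rule above is easy to express as a check. This is a minimal sketch that assumes the two figures from the slide are inclusive upper bounds; the slide does not say how the latency is measured, and the function name is my own rather than anything from RecoverPoint/CE.

```python
# Latency limits quoted on the slide, treated here as inclusive upper bounds
# (that interpretation is an assumption).
SYNC_MAX_LATENCY_MS = 4
ASYNC_MAX_LATENCY_MS = 400


def supported_modes(latency_ms: float) -> list[str]:
    """Return the replication modes whose latency limit the link satisfies."""
    modes = []
    if latency_ms <= SYNC_MAX_LATENCY_MS:
        modes.append("synchronous")
    if latency_ms <= ASYNC_MAX_LATENCY_MS:
        modes.append("asynchronous")
    return modes


if __name__ == "__main__":
    for latency in (2, 50, 500):
        print(latency, "ms ->", supported_modes(latency) or "not supported")
```

A metro link at 2 ms qualifies for either mode, a 50 ms cross-country link is asynchronous only, and a 500 ms link falls outside both limits.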

Leverages majority node set clustering.  If you have 2 nodes/servers on Site A and 2 nodes/servers on Site B, you will need a “tiebreaker” to decide which side remains online after a failure. The most common tiebreaker is a File Share Witness.  Many articles can give you additional background on majority node set clustering, and it’s a good thing to know. I will point you to the blog of an old friend of mine, John Toner, who writes about geographically dispersed clusters.
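
Why the tiebreaker matters comes down to simple vote counting. Here is a deliberately simplified sketch of majority voting, assuming one vote per node and one for the witness. It is a model of the idea only, not how the Windows cluster service is implemented.

```python
def votes_needed(total_votes: int) -> int:
    """Strict majority of all configured votes."""
    return total_votes // 2 + 1


def has_quorum(surviving_votes: int, total_votes: int) -> bool:
    """True when the surviving voters still form a majority."""
    return surviving_votes >= votes_needed(total_votes)


if __name__ == "__main__":
    # 2 nodes per site, no witness: 4 votes, 3 needed.
    # Losing a whole site leaves 2 votes, so the survivors cannot stay online.
    print("No witness, site lost:", has_quorum(2, 4))

    # Same cluster plus a File Share Witness: 5 votes, 3 needed.
    # The surviving site (2 votes) plus the witness (1 vote) is a majority.
    print("With witness, site lost:", has_quorum(2 + 1, 5))
```

With an even number of nodes split across two sites, neither side can ever reach a majority on its own, which is exactly the gap the witness fills.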

The architecture. 

What each piece does:  CE is a filter driver that “catches” Microsoft Cluster failure events and lets the RecoverPoint-managed disk systems know to fail over as appropriate.  Very sophisticated logic is built in to prevent cluster split-brain: scenarios where the link is down and the application (such as a SQL Server database) doesn’t know which node is the correct owner of the disk resources.

See if you see what is happening above – AUTOMATIC FAILOVER.

Integrates with and supports Hyper-V

Works with the latest features like Live Migration, so you can Live Migrate workloads locally for HA and fail over remotely for DR.  You can control whether to fail over locally before failing over across sites.

Self explanatory – the failover steps in detail.

More detail of Live Migration support – note synchronous requirement.

Multi-array support.  We can create consistency groups with storage devices from multiple arrays in the same group.  This allows for a lot of interesting failover implementations (fail over locally first rather than remotely, for example) and lets you keep components grouped together… like an entire SharePoint farm.

Hey, it works with Oracle on Windows too.

Recap of the benefits – hopefully it makes sense and it’s the reason that customers love this integration – with RecoverPoint/CE you get more control, less bandwidth required (3-12x savings on bandwidth as reported by RP customers), and it’s integrated with Microsoft Clusters to enable seamless failover.

Now that is a cool product.

SQL Velocity 2010-10-23 01:27:48

Scalable Shared Data Base Part 2: Setting up the Scalable Shared Data Base

Start setting up Scalable Shared Data Base (SSDB) by answering a question: is my database compatible with Scalable Shared Data Base? The computer science problem that a Scalable Shared Data Base solves is to distribute data reporting…

Locked in a Room for a Few Days

This week I was back at EMC Headquarters in Hopkinton, MA working with the EMC Proven Solutions team.  I’ve mentioned this team before as they are responsible for testing and documenting how to deploy EMC technologies in common customer scenarios. I’ve highlighted some of their outstanding work in previous blog posts including:

For a few days this past week we essentially locked ourselves in a big conference room to hammer out the details of the next round of Proven Solutions, which will likely start to see the light of day in early 2011 and continue throughout the first half of next year.  A lot of work goes into designing, developing, testing, tracking and documenting these Solutions, which is why this team is comprised of a bunch of technical Rock Stars.

We had some of our top Microsoft resources in the room including several who have their own blog sites including Brian Henderson, Dustin Smith, Eyal Sharon, as well as many other resources who are behind the scenes but spend countless hours working in the labs to build these solutions and work through technical problems to make sure customers can run a similar configuration error-free. 

What is great is that I didn’t have to roll up my sleeves and put up a fight to demand more Hyper-V solutions.  I didn’t have to throw white board markers like little daggers or jump up and down on my chair to get the team’s attention (which is a bummer because I was really looking forward to this part).  As you can imagine, there are a lot of different products (both EMC and non-EMC) to test so the roadmap debate can become pretty heated. But everyone came to the table with a laundry list of ideas of what we should test and Hyper-V was included in many of these suggestions.  This includes Hyper-V and Microsoft virtualization technologies used with many different applications in different scenarios.  These scenarios really represent what customers are asking us for and what they are interested in. 

I wish I could tell you what these look like, but it is a little early for that. I guarantee you’ll hear plenty about them as they become available.  The point is that EMC is listening to our field and to our customers, and we have some exciting things on the roadmap as a result.  Do you have an idea for a Microsoft virtualization-based solution?  Post it on here or email me directly at Adrian.Simays@emc.com if you’re shy.  If the Solutions team doesn't listen, at least it will give me a reason to jump around on some furniture.

Virtual Provisioning for Exchange 2010 and Buying Groceries

Virtual Provisioning for Exchange 2010 makes sense for the same reasons that people don’t buy a year’s supply of groceries when they go to the market.

You get groceries as you need them (daily, weekly, bi-weekly), and you save on space in the fridge or freezer and on the cost of the power and cooling to keep the groceries cold or frozen.

The same logic applies for your storage environments.

Had a few customer briefings yesterday and this picture came in handy to explain my point…  and although the pic above is Exchange specific, the benefits aren’t just limited to Exchange.  One other area where it makes perfect sense is for SharePoint content databases.  They start small and then grow like a balloon over time – but you don’t need to allocate all of that expected space up front.

Just remember to use Quick Format when formatting in Windows Server 2008, and for SQL Server databases use the instant file initialization feature for database files. That way you don’t write zeros across the whole disk, which would destroy any benefit you might get from these features.
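
To see why the zeroing matters, here is a toy model of a thin LUN that consumes pool space only for blocks that have been written. Everything in it is an illustrative assumption: the class, the block counts, and the idea that a quick format touches just a handful of metadata blocks. Real arrays allocate in larger extents and behave in more nuanced ways.

```python
class ThinLun:
    """Toy thin LUN: pool space is consumed only for blocks that get written."""

    def __init__(self, size_blocks: int) -> None:
        self.size_blocks = size_blocks
        self._written: set[int] = set()

    def write(self, block: int) -> None:
        """Record a write; the first write to a block allocates it from the pool."""
        if not 0 <= block < self.size_blocks:
            raise IndexError("block outside LUN")
        self._written.add(block)

    @property
    def consumed_blocks(self) -> int:
        """Blocks actually allocated from the pool so far."""
        return len(self._written)


def quick_format(lun: ThinLun, metadata_blocks: int = 10) -> None:
    """Write only file-system metadata (the block count is an arbitrary assumption)."""
    for block in range(min(metadata_blocks, lun.size_blocks)):
        lun.write(block)


def full_format(lun: ThinLun) -> None:
    """Write zeros to every block, which allocates the whole LUN."""
    for block in range(lun.size_blocks):
        lun.write(block)


if __name__ == "__main__":
    quick, full = ThinLun(1000), ThinLun(1000)
    quick_format(quick)
    full_format(full)
    print("Quick format consumed:", quick.consumed_blocks, "of", quick.size_blocks)
    print("Full format consumed:", full.consumed_blocks, "of", full.size_blocks)
```

In this model the full format turns a thin LUN into a fully allocated one before a single mailbox or database page has been stored, which is the year's supply of groceries all over again.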

And to clarify: while thin provisioning is the industry-standard term for just-in-time allocation of storage versus “thick” or full allocation of storage, Virtual Provisioning is an EMC term that describes our management construct for delivering pooled storage. It is how we eliminate the need for complicated RAID Group build-outs and have shifted towards virtually provisioned Pools of storage (thick or thin).

Pools are good – and they are not just for the rich!