So much to understand about your Cloud Options for Microsoft Applications …The path you take matters! The Radicati Group Perspective

In the world of IT, things are moving very quickly.  In my role at EMC, I have the opportunity, and quite frankly the privilege, to talk to many customers, partners, press, and analysts, as well as EMC’s many experts on Microsoft.  Consistently I hear from all of them about the abundance of information out there on cloud computing and virtualization, and how confusing it can be to cut through all of the noise and truly educate yourself.

There are a number of EMC resources and sponsored opportunities for IT professionals to do just that: educate themselves on the technical considerations for deploying Microsoft applications such as Exchange, SharePoint, SQL Server, and Windows Server in a cloud environment.

EMC recently worked with industry-leading analysts and thought leaders to discuss what customers should think about as they evaluate which path to take for their Microsoft applications and cloud computing.  I am going to be posting a weekly blog on our efforts to help our customers, partners, and the broader Microsoft community understand the role that infrastructure can play in helping you transform your Microsoft environments to the cloud.

As always, a key place to start your education process is here on this blog and also on our Everything Microsoft Community on ECN.

Radicati Group Whitepaper- Benefits of Consolidating and Virtualizing Microsoft Exchange and SharePoint in a Private Cloud Environment – Understanding the Role of Infrastructure to Ensure Success

EMC experts partnered with Sara Radicati and her team at the Radicati Group to create a whitepaper and webcast that discuss the benefits of consolidating and virtualizing Microsoft Exchange and SharePoint in a private cloud environment, and the role that infrastructure, such as that offered by EMC and our channel partners, plays in ensuring success.

http://www.emc.com/collateral/white-papers/radicati-emc-wp.pdf

Microsoft Exchange and SharePoint Consolidation and Virtualization in a Private Cloud Environment Webcast Replay – A Customer perspective.

EMC also had the opportunity to continue our work with Sara and her team, joined by Phillip Reynolds, Associate Director of Technology at Williams & Fudge, an EMC customer.  The web conference, moderated by Sara Radicati, brings together guest speakers from EMC and financial firm Williams & Fudge to discuss how customers can best leverage virtualization technologies and the other infrastructure components that are critical to a successful private cloud deployment.  It looks at the benefits of virtual private cloud deployment, and at how EMC’s private cloud virtualization technologies and solutions can support secure private cloud deployments of Microsoft Exchange Server and Microsoft SharePoint.

See the webcast on replay – http://www.radicati.com/files/webconferences/2013/2-Mar-Virtualization/Microsoft_Exchange_and_SharePoint_Consolidation_and_Virtualization_in_a_Private_Cloud_Environment_3-28-13.wmv 

And the Premiere Opportunity to Learn More about Cloud –  EMCWORLD 2013

EMCWorld – so many sessions and demos in so little time

Adrian’s Virtual Winfrastructure blog post, Painting a Picture of Microsoft Solutions at EMCworld, does a great job of providing an overview of all of the EMC solutions and demos focused on Microsoft environments.

His blog is here – http://windowtotheprivatecloud.com/author/virtual-winfrastructure/

Hear the Customer Perspective on Why EMC VNX is a Great Option for Microsoft Environments

Customers of all sizes – across all industries – are seeing tangible benefits of using VNX as their storage platform of choice for Microsoft environments.   VNX provides a multitude of benefits for Microsoft environments by providing automated and economical unified storage with pace-setting performance, optimized for virtual applications.  It’s easy for EMC to promote the benefits of our technology…we work with it all the time.  But it’s even more interesting to hear from actual customers who have deployed VNX and our management solutions about how EMC has helped them achieve their IT and business objectives.

Watch European and American companies discuss the increased performance and management simplicity of running their Microsoft applications on EMC VNX unified storage.

http://www.emc.com/collateral/demos/microsites/mediaplayer-video/emc-customers-microsoft-applications-emc-vnx.htm

What’s in your SLA?

People have been considering and comparing public (hosted) and private (on-premises) cloud solutions for some time in the messaging world, and at increasing rates for database and other application workloads.  I’m often surprised at how many people either don’t know the contents and implications of their service provider’s service level agreement (SLA), or fail to adjust the architecture of their private cloud solution to match before directly comparing cost.

Here are my five lessons for evaluating SaaS, PaaS, and IaaS provider SLAs:

Lesson 1: Make sure that what’s important to you is covered in the SLA

Lesson 2: Make sure that the availability guarantee is what you require of the service

Lesson 3: Evaluate the gap between a service outage’s cost to business and the financial relief from the provider

Lesson 4: Architect public and private clouds to the same levels of availability for cost estimate purposes

Lesson 5: Layer availability features onto private clouds for business requirement purposes

I’ll use the Office 365 SLA to explore this topic – not because I want to pick on Microsoft,  but because it’s a very typical SLA, and one of the services it offers (email) is so universal that it’s easy to translate the SLA’s components into the business value that you’re purchasing from them.

Defining availability

The math is simple.  It’s a 99.9% uptime guarantee with a periodicity of one month:

image

If that number falls below 99.9, then they have not met their guarantee.  For what it’s worth, during a 30 day month, the limit will be about 43 minutes of downtime before they enter the penalty, or about 8.8 hours per year.
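To make that arithmetic concrete, here is a minimal Python sketch of the downtime budget implied by an availability target. The function name and the 30-day month and 365-day year are my own assumptions for illustration; plug in the figures from your own SLA.

```python
def downtime_budget_minutes(availability_pct: float, period_days: float) -> float:
    """Minutes of downtime allowed in a period before the guarantee is missed."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (1 - availability_pct / 100)


# Example: a 99.9% guarantee measured over a 30-day month and over a year.
monthly_minutes = downtime_budget_minutes(99.9, 30)
yearly_hours = downtime_budget_minutes(99.9, 365) / 60
print(f"99.9% over 30 days:  {monthly_minutes:.1f} minutes")
print(f"99.9% over 365 days: {yearly_hours:.2f} hours")
```

Running the same function with 99.99 or 99.999 shows how quickly the budget shrinks as you add nines.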

But what does “Downtime” mean?  Well, it’s stated clearly for each service.  This is the definition of downtime for Exchange Online:

“Any period of time when end users are unable to send or receive email with Outlook Web Access.”

Here’s what’s missing:

  • Data:  The mailbox can be completely empty of email the user has previously sent and received.  In fact, the email can disappear as soon as they receive it.  As long as they can log in via OWA, the service is considered to be “up”.
  • Clients:  Fat Outlook, BlackBerry, and Exchange ActiveSync (iPhone/iPad/Winmopho, and most Android) clients are not covered in any way under the SLA.

Lesson 1: Make sure that what’s important to you is covered in the SLA

Lesson 2: Make sure that the availability guarantee is what you require of the service

Balancing SLA penalties with business impact

My Internet service is important to me.  When it’s down, I lose more productivity than the $1/day or so I spend on it.  Likewise, email services are probably worth more than the $8/month/user or so that you might pay your provider for them.  That doesn’t mean that you should spend more than you need for email services.  But it does mean that if you do suffer an extended or widespread outage, there will likely be a large gap between the productivity cost of the downtime and the financial relief you’ll see from the provider in the form of free services.

image

Callahan Auto Parts also offers a guarantee

I’ll put this in real numbers.  Let’s say I have a 200 person organization.  At $8/user/month, I might pay $1600/month for email services from a provider.  If my email is down for a day during the month, my organization experiences roughly 97% uptime for that month, and as a result, my organization is entitled to a service credit of half a month of email from the provider, worth about $800.

image
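As a rough sketch of how an outage turns into a service credit, the Python below computes the monthly uptime percentage and looks up a credit tier. The tier table is an assumption for illustration (it mirrors the kind of stepped schedule providers publish, but check your own SLA for the real thresholds), and the function names are mine.

```python
# Assumed credit schedule: (uptime threshold %, credit as a fraction of the monthly fee).
# Illustrative only; check your provider's SLA for the real tiers.
ASSUMED_CREDIT_TIERS = [(95.0, 1.00), (99.0, 0.50), (99.9, 0.25)]


def monthly_uptime_pct(outage_hours: float, days_in_month: int = 30) -> float:
    """Uptime percentage for a month with the given hours of downtime."""
    total_hours = days_in_month * 24
    return 100 * (total_hours - outage_hours) / total_hours


def service_credit(monthly_fee: float, uptime_pct: float) -> float:
    """Credit owed under the assumed tiers; 0 if the guarantee was met."""
    # Tiers are ordered from worst uptime to best, so the first threshold
    # the uptime falls below determines the credit.
    for threshold, fraction in ASSUMED_CREDIT_TIERS:
        if uptime_pct < threshold:
            return monthly_fee * fraction
    return 0.0


uptime = monthly_uptime_pct(outage_hours=24)
print(f"uptime: {uptime:.1f}%  credit: ${service_credit(1600, uptime):,.0f}")
```

With a full day of downtime and a $1600 monthly bill, the assumed tiers land on the $800 figure used in this example.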

The actual cost of my downtime will very likely exceed $800.  To calculate that cost we need the number of employees, the loaded cost per hour for the average employee, the productivity lost while email is down, and the length of the outage.  For our example of 200 employees, let’s imagine a $50/hour average loaded cost to business, a 25% loss of productivity when email is down, and an outage that spans an 8 hour business day:

200 employees x $50 cost per hour x .25 productivity loss x 8 hour outage = $20,000 of lost productivity

Subtract the $800 in free services the organization will receive the next month, and the organization’s liability is $19,200 for that outage.
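Here is the same kind of estimate as a small Python sketch, so you can plug in your own headcount and rates. The function and parameter names are mine; `productivity_loss` is the fraction of productivity lost during the outage (0.25 for a 25% loss), and the $800 credit is the figure from the example above.

```python
def lost_productivity(employees: int, loaded_cost_per_hour: float,
                      productivity_loss: float, outage_hours: float) -> float:
    """Dollar value of the productivity lost during an outage."""
    return employees * loaded_cost_per_hour * productivity_loss * outage_hours


def uncovered_cost(outage_cost: float, provider_credit: float) -> float:
    """The gap between what the outage cost and what the provider gives back."""
    return outage_cost - provider_credit


cost = lost_productivity(employees=200, loaded_cost_per_hour=50,
                         productivity_loss=0.25, outage_hours=8)
gap = uncovered_cost(cost, provider_credit=800)
print(f"lost productivity: ${cost:,.0f}  uncovered: ${gap:,.0f}")
```

Even with conservative inputs, the credit covers only a small fraction of the loss, which is the point of Lesson 3.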

Now how do you fill that gap?  I’m not entirely sure.  It could be just the risk of doing business – after all, the business would just absorb that cost if they were hosting email internally and suffered an outage.  If the risk and impact were large enough, I would probably seek to hedge against it – exploring options to bring services in house quickly, or even looking to an insurance company to defray the cost of outages – if Merv Hughes can insure his mustache for $370,000, then surely you can insure the availability of your IT services.  Regardless, it’s wise not to confuse a “financially backed guarantee” with actual insurance or assurance against outage.

File Photo:  What a $370k mustache may look like.  Strong.

Lesson 3: Evaluate the gap between a service outage’s cost to business and the financial relief from the provider

Comparing Apples to Oranges

image

See what I did there?

Doing a cost comparison between public cloud designed to deliver 99.9% availability and a private cloud designed to provide 99.99% or 99.999% availability makes little sense, but I see people do it very frequently.  Usually it’s because the internal IT group’s mandate is to “make it as highly available as possible within the budget”.  So I’ll see a private cloud solution with redundancy at every level, capabilities to quickly recover from logical corruption, and automated failover between sites in the event of a regional failure, compared to a public cloud solution that provides nothing but a slim guarantee of 99.9% availability.  In this instance, it’s obvious why the public cloud provider is less expensive, even without factoring in efficiencies of scale.

To illustrate this, I usually refer to Maslow’s handy-dandy Hierarchy of Needs, customized for IT high availability.

image image

Single Site and Multi-site Hierarchies of Need

If I want to make an accurate comparison between a public cloud provider’s service and pricing and what I can do internally, I often have to strip out a lot of the services that are normally delivered internally.  Here are the steps:

  1. Architect for equivalence.  If I have a public cloud provider just offering 3 9’s and no option for site to site failover for my database services, I might just do a standalone database server.  Maybe I’d add a cheap rapid recovery solution (like snapshots or clones) to hedge against complete storage failure and cluster at the hypervisor layer to provide some level of hardware redundancy.  If my cloud provider offers disaster recovery, I’d figure out their target RPO/RTO and insert some solution that matches that capability.
  2. Do a baseline price comparison.  Once I’ve got similar solutions to compare, I can compare price.  We’ll call this the price of entry.
  3. Add capabilities to the private cloud solution after the baseline.  I only start layering features that add availability and flexibility to the solution after I’ve obtained my baseline price.  Only then can I illustrate the true cost of those features, and compare them to the business benefits.

Lesson 4: Architect public and private clouds to the same levels of availability for cost estimate purposes

Lesson 5: Layer availability features onto private clouds for business requirement purposes

Is it time to say goodbye to Jetstress?

The short answer?  “Yes”  The long answer?  “Yyyyyesssss”

But first let me get this out of the way:  If you want to run Jetstress against any storage configuration I come up with, feel free.  I wouldn’t put it forward if I weren’t confident it could handle the workload. 

Prior to 2007, Jetstress REALLY mattered.  You had 500MB mailboxes that could easily generate 2-5 IO/s per mailbox.  Cached clients were rare – so storage latency was the primary driver of customer complaints.  Over the last ten years, Microsoft has put a lot of effort into making Exchange a much more storage-friendly application, and they’ve succeeded.  Today you have 0.1 IO/s per mailbox, and it’s spread over 2-5 GB.  Exchange is now Just Another Workload.  So why are we spending all this time and money (not to mention implementation delays) using an unwieldy purpose-built testing tool for something that’s Just Another Workload?

With Exchange 2010 and its very modest IO profile, I question the value of Jetstress as opposed to other testing tools.  The level of effort and sheer amount of time required to create the databases, replicate them, and then run the test are significant.  It can run to weeks for a reasonably sized deployment.  Yes, you get assurance that your storage rig is operating properly, but you can get that assurance from tools like Iometer, which can take minutes to set up, and mere hours to complete.

For all the effort and time involved in a Jetstress run, I just expect more.  I’d expect that my entire infrastructure would be validated.  I’d expect assurance that I have enough RAM and CPU in my virtual machines, that my network is up to snuff, access to my domain controllers and global catalog servers is sufficient… but I don’t get any of that with Jetstress.

If I’m going to put the kind of time and effort into my testing that Jetstress requires, I’m going to fire up an entire infrastructure and use Loadgen to verify my entire configuration – not just my storage.  On the other hand, if I’m going to test my storage independently from my server and network, I would:

Roll my own Exchange IO test with Iometer in 30 minutes

  • Set up your storage on your production mailbox server
  • Determine the file sizes for your database and logs.  You can find that on the LUN Requirements tab of the Exchange Mailbox Storage Calculator

image

  • Using fsutil (a built-in Windows command line tool), create files called iobw.tst sized according to the DB Size + Overhead and Log Size + Overhead.  For our example, we’re looking at 1595 GB database files and 34 GB log files.  This part is not strictly necessary, but I like it.  Creating a file called iobw.tst in the root directory of the target will prevent Iometer from creating thick files that occupy the entire LUN.  Note that fsutil takes the file length in bytes.
    • fsutil file createnew e:\iobw.tst 1712618209280 <-- simulated 1.5 TB database file
    • fsutil file createnew f:\iobw.tst 36507222016 <-- simulated 34 GB log file
  • Download Iometer and the Exchange 2010 .icf file I’ve created.  Launch Iometer and open the .icf file.
    • If you’re using mount points instead of drive letters, download the latest Iometer Release candidate for mount point support
  • Determine the target IO throughput for the databases and logs.  This can be determined from the Role requirements tab of the Exchange 2010 Storage calculator

image 
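Since `fsutil file createnew` expects the length in bytes, it is easy to slip a few orders of magnitude when converting from the calculator’s GB figures. The sketch below builds the commands for you. It assumes the calculator’s “GB” means binary gigabytes (1024³ bytes), and the helper name is my own.

```python
GIB = 1024 ** 3  # assumption: the calculator's "GB" is a binary gigabyte


def fsutil_createnew(path: str, size_gb: int) -> str:
    """Build an fsutil command that creates a file of size_gb gigabytes."""
    # fsutil file createnew <filename> <length>, with the length in bytes
    return f"fsutil file createnew {path} {size_gb * GIB}"


# Sizes from the example: 1595 GB database file, 34 GB log file.
print(fsutil_createnew(r"e:\iobw.tst", 1595))
print(fsutil_createnew(r"f:\iobw.tst", 34))
```

Paste the printed commands into an elevated command prompt on the mailbox server.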

  • Modify the transfer delay in “Exchange 2010 DB Workload” Global Access specification so it will generate the desired number of IOs. The math is: 1000 ÷ target IO/s. Our example requires 30 IO/s per database, and 1000 ÷ 30=33.3.  So we’ll set it to 33.  The original in the icf file is 25, which would generate 40 IO/s.

image

  • Modify the transfer delay in the “Exchange 2010 Log Workload” Global Access Specification so it will generate the desired number of IOs.  Our example requires 7 IO/s per log LUN, and 1000 ÷ 7=142.9, so we’ll set it to 143.  The original in the .icf file is 100, which would generate 10 IO/s.

image
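The transfer delay arithmetic from the two steps above is simple enough to script if you have many databases with different targets. This is a minimal sketch with function names of my own choosing. It assumes each worker issues one IO at a time and that the time to service each IO is small compared to the delay, so the delay alone sets the rate.

```python
def transfer_delay_ms(target_iops: float) -> int:
    """Transfer delay in milliseconds, rounded to the nearest whole number."""
    return round(1000 / target_iops)


def resulting_iops(delay_ms: float) -> float:
    """IO rate you should expect from a given transfer delay."""
    return 1000 / delay_ms


for label, target in [("database", 30), ("log", 7)]:
    delay = transfer_delay_ms(target)
    print(f"{label}: target {target} IO/s -> delay {delay} ms "
          f"-> about {resulting_iops(delay):.1f} IO/s")
```

Because the delay is rounded to a whole number of milliseconds, the achieved rate will be slightly off the target, so compare it to what Iometer actually reports during the sanity check.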

  • Assign the DB Worker and BDM Worker to the database LUNs
  • Assign the Log Worker to the Log LUNs
  • Click the Green Flag and start.  Let it run for 5-10 minutes for a quick sanity check, and stop it.  Make sure it’s driving the IO you want at the latencies you expect, and you’re not gated by CPU or anything like that.
  • Start a perfmon data collection (perfcollect is good for this).
  • Modify the run time on the test tab to however long you’d like.  I recommend at least a few hours.

image

  • Take a nap
  • Go for a run
  • Eat some food
  • Watch some TV
  • When the test completes, open up your perfmon file, look at your disk latencies, make sure they’re steady, there were no spikes, and there were no aberrations in number of IO/s
    • If you’re an EMC customer and use perfcollect, zip up the perfcollect data collection and send it to your TC, or reseller TC, and ask for a WPA (miTrend) report on the server(s).  You’ll get a nicely formatted report with graphs and tables and twenty-seven 8×10 color glossy pictures with circles and arrows and a paragraph on the back of each one
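If you would rather do a first pass on the latencies yourself, here is a rough Python sketch. It assumes you have exported the perfmon log to CSV and that the column you pick holds a seconds-per-IO counter such as Avg. Disk sec/Read. The function names, the column name in the usage note, and the 20 ms average and 100 ms spike limits are my own placeholders, not official thresholds.

```python
import csv
import io
import statistics


def read_latencies_ms(csv_text: str, column: str) -> list[float]:
    """Pull one counter column (seconds per IO) out of CSV text, in milliseconds."""
    values = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            values.append(float(row[column]) * 1000)
        except (TypeError, ValueError):
            continue  # skip blank or missing samples
    return values


def latency_report(samples_ms, avg_limit_ms=20.0, spike_limit_ms=100.0):
    """Summarise latency samples against assumed average and spike limits."""
    avg = statistics.mean(samples_ms)
    spikes = [s for s in samples_ms if s > spike_limit_ms]
    return {
        "avg_ms": avg,
        "max_ms": max(samples_ms),
        "spikes": len(spikes),
        "ok": avg <= avg_limit_ms and not spikes,
    }
```

A typical use would be `latency_report(read_latencies_ms(open("perf.csv").read(), "db_read_latency"))`, where both the file name and the column name are hypothetical and should be replaced with whatever your export actually contains.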

Using this method, you can easily get in and out of testing mode within 36 hours total, and your own time will be less than an hour of setup and analysis.  That translates into weeks of time that your users can spend enjoying your cool new messaging infrastructure.

Pimp My Exchange: The Microsoft Exchange Calculator with EMC Extensions

Challenge

People designing Exchange storage layouts often use the excellent Microsoft Exchange storage calculator.  This is a great first step, but the tool does not account for things like background database maintenance (BDM), which can sometimes cause a disk IO testing tool like Jetstress to fail, and it does not provide a visual view of the Exchange layout.

EMC Solution

EMC’s extensions add in some of the IOPS details (like BDM) that the base calculator might miss, and we’ve also designed a tool called the DAG Instant Visualization Application (DIVA) that helps to visualize the environment in a more legible way.

Watch this great video interview with Jim Cordes (creator of these tools) for more details!

To get the calculator with EMC extensions and DIVA, go to the Everything Microsoft site.

The direct link to the pimped-out calculator is here.