If you are in the SharePoint IT business, you probably already know that while STSADM backup is pretty powerful, scalability is not its forte. Yes, it can do a lot, but in practice it is very limited: even in a highly optimized farm setup, you won't exceed a throughput of about 15MB/sec. Mileage may vary, but bear in mind that the overall throughput of the farm degrades during a streaming backup.
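To put that ceiling in perspective, here is a back-of-the-envelope estimate (a minimal sketch; the 1TB content size is a hypothetical example, and 1GB is treated as 1024MB):

```python
# Back-of-the-envelope STSADM backup-window estimate.
# 15 MB/s is the practical ceiling quoted above; the 1 TB content size
# is a hypothetical example, and 1 GB is treated as 1024 MB.
STSADM_THROUGHPUT_MBPS = 15
content_gb = 1024  # hypothetical 1 TB farm

seconds = content_gb * 1024 / STSADM_THROUGHPUT_MBPS
print(f"Estimated backup window: {seconds / 3600:.1f} hours")  # ~19.4 hours
```

Nearly a full day of degraded farm performance for a single terabyte, and that is before you account for contention on a busy farm.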
What I find quite often is that customers rely on SQL backups for SharePoint, believing they have a good backup plan. While SQL backup is crucial for content recovery, it is just not enough to recover SharePoint as a system. Another use case is migrating a farm, or even refreshing a staging farm. What about the configuration database? Web parts? Customizations? The list can be very long depending on your environment.
EMC has some very compelling (I love that word) tools to rapidly back up SharePoint by leveraging storage array replication in concert with SharePoint VSS for application awareness. Since this technology is extremely scalable, you should expect similar backup/replication durations for a small multi-GB farm and a large multi-TB one (yeah… change rate is an important factor, but you get the point).
The fabulous team in our Shanghai Proven Solutions Center (Frances, you rock!) has recently tested EMC NetWorker Module for Microsoft Applications (NMM) with Data Domain for backup deduplication. While the whitepaper covering both Exchange and SharePoint is still in the works, I wanted to share some interesting results from the SharePoint backup deduplication test. Note that this setup did not use the DD Boost capability for faster backup streaming (we will revisit the test once DD Boost for NetWorker is available):
Seven-VM farm (vSphere ESX 4):
- VM1: SQL Server 2008 (4 vCPUs, 8GB RAM)
- VM2-VM5: Web front ends (4 vCPUs, 4GB RAM each)
- VM6: Index server (2 vCPUs, 4GB RAM)
- VM7: Application server with Central Administration (2 vCPUs, 4GB RAM)
- Data Domain DD690 and CLARiiON CX4-480
- 1TB of SharePoint content: 10 content databases of 100GB each. The total backup set exceeded 1100GB because it also included the SQL search databases, configuration database, search indexes, SQL system databases, etc.
- SQL and content index servers leveraged hardware-based VSS (EMC storage); the WFEs used Microsoft software-based VSS
- SnapView clones were used for rapid SharePoint backup
- All backup LUNs were on RAID 5 SATA disks
Initial SharePoint farm full backup, using a LAN-free topology with a proxy client:
- It took 3:41 hours to complete a full backup of 1150GB to Data Domain.
- Deduplication ratio was 35.6%, with 740GB of post-compression data.
- During backup, SQL Server CPU utilization averaged 61%. Network utilization was only 0.3%.
- Write throughput to Data Domain was 152.1MB/s on average.
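A quick sanity check on those figures (a minimal sketch; it assumes 1GB = 1024MB, and the small gap versus the quoted 35.6% is presumably rounding in the reported sizes):

```python
# Sanity-check the first full-backup numbers quoted above.
backup_gb = 1150                  # total backup set
post_compression_gb = 740         # data landed on Data Domain
duration_s = 3 * 3600 + 41 * 60   # 3:41 hours

dedup_ratio = 1 - post_compression_gb / backup_gb
end_to_end_mbps = backup_gb * 1024 / duration_s

print(f"Deduplication ratio: {dedup_ratio:.1%}")             # ~35.7%
print(f"End-to-end throughput: {end_to_end_mbps:.1f} MB/s")  # ~88.8 MB/s
```

Note that the end-to-end figure is well below the 152.1MB/s average write rate; the difference presumably sits in the snapshot preparation, import, and deport phases rather than in the streaming itself.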
The following table lists the detailed durations for a full backup of SQL Server:

| Prepare and import snapshots to proxy client (minutes) | Backup streams to Data Domain (minutes) | Deport the snapshot (minutes) | Backup index and bootstrap (minutes) |
|---|---|---|---|
Second SharePoint farm full backup, after loading data for 8 hours (change rate: 16GB/day; total 1166GB). We ran another full backup during the night:
- It took 3:39 hours to backup 1166GB of data to Data Domain.
- Deduplication ratio was 90%, with 110GB of post-compression data. The cumulative deduplication ratio across the two full farm backups was 63.5%.
- During backup, SQL Server CPU utilization was around 56%. Network utilization was only 0.3%.
- Write throughput to Data Domain averaged 164.5MB/s, with a peak of 240MB/s!
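The cumulative figure checks out as well (same assumptions; the small gap versus the quoted 63.5% is presumably rounding in the reported sizes):

```python
# Cumulative deduplication across the two full farm backups.
first_gb, first_post_gb = 1150, 740
second_gb, second_post_gb = 1166, 110

cumulative = 1 - (first_post_gb + second_post_gb) / (first_gb + second_gb)
print(f"Cumulative deduplication ratio: {cumulative:.1%}")  # ~63.3%
```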
Granular full backup, using NMM 2.2 SP1 with a LAN-based topology.
Because granular backup has to go over the LAN, backup performance is slower than with the LAN-free topology. We backed up three site collections simultaneously from three web front ends:
- It took 8 hours to complete a full backup of three site collections with 291GB data in total.
- Total compression ratio was 92.8%, with 20.9GB of post-compression data (yeah, we need to investigate that; very impressive behaviour).
- The three site collection backups ran through the three WFEs. WFE CPU utilization stayed around 20%, and network utilization peaked at 11.12%.
- Write throughput to Data Domain was 17.1MB/s on average.
- A 30GB incremental backup took 1 hour to complete.
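The granular numbers line up too (a minimal check against the sizes quoted above):

```python
# Check the granular-backup compression figure quoted above.
total_gb = 291              # three site collections
post_compression_gb = 20.9  # data landed on Data Domain

ratio = 1 - post_compression_gb / total_gb
print(f"Compression ratio: {ratio:.1%}")  # ~92.8%
```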
What about restores, you ask?
I'll address that in my next post…