I would like my BLOB on the side, thanks!

Hey,

Just wanted to give you all a quick update on some recent work our solutions teams did on SharePoint storage tiering.

SharePoint, as you know (or not?), stores all the content it manages in SQL Server tables. While there is some merit in doing so, it’s mostly a disadvantage in larger deployments that use SharePoint as a true framework for content collaboration. Those unstructured binary objects, once stored in SQL, are called BLOBs (Binary Large Objects). As the content databases keep growing, the main contributor is the BLOB data, which grows significantly faster than any associated metadata; BLOB data would usually comprise ~95% of the content database size.

In order to support the “ECM for the masses” message, Microsoft introduced a couple of APIs for moving BLOBs out of SQL Server and into external storage. The first is EBS (External BLOB Storage), which has been available since MOSS 2007 SP1; the more recent one is RBS (Remote BLOB Storage), which is available in several flavors for SharePoint 2010.
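If the idea of externalization is new to you, here is a toy sketch in Python. It is not the EBS or RBS API and not how StoragePoint works internally; the class and method names (ExternalBlobStore, ContentDatabase, save_document, open_document) are made up for illustration. It only shows the principle: the bytes go to an external store, and the database row keeps the metadata plus a small reference.

```python
import uuid


class ExternalBlobStore:
    """Hypothetical external store. A dict here; a file share or cloud store in practice."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        # Hand back an opaque ID; the caller stores the ID instead of the bytes.
        blob_id = uuid.uuid4().hex
        self._objects[blob_id] = data
        return blob_id

    def get(self, blob_id: str) -> bytes:
        return self._objects[blob_id]


class ContentDatabase:
    """Hypothetical content database: keeps metadata and a BLOB reference only."""

    def __init__(self, store: ExternalBlobStore):
        self._store = store
        self.rows = {}

    def save_document(self, name: str, data: bytes) -> None:
        blob_id = self._store.put(data)
        # The row stays small no matter how large the document is.
        self.rows[name] = {"size": len(data), "blob_id": blob_id}

    def open_document(self, name: str) -> bytes:
        return self._store.get(self.rows[name]["blob_id"])


store = ExternalBlobStore()
db = ContentDatabase(store)
db.save_document("spec.docx", b"x" * 1000)
print(db.rows["spec.docx"]["size"], len(db.open_document("spec.docx")))
```

The point of the sketch is that the database size now tracks the number of documents rather than their total size, which is what makes the cheaper storage tiers interesting.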

In this post I’m going to highlight the recent integration work we have accomplished with a MetaLogix product called StoragePoint.

The solutions team in Santa Clara worked on a neat Cloud storage solution for SharePoint BLOBs based on EMC Atmos.

The 3TB of SharePoint farm content was externalized to EMC Atmos, which dramatically decreased the size of the content databases and demonstrated that MetaLogix’s StoragePoint can effectively manage the externalization of SharePoint BLOBs to EMC Atmos through its Atmos connector. Oh, and BTW, that farm was 100% Hyper-V.

Atmos on-premise testing shows that performance is nearly identical to the traditional setup of SharePoint with SQL, as indicated in the following table. The results indicate that relocating BLOBs to an external BLOB store (EBS) has no noticeable impact on the overall user experience across the three simulated user profiles:

 

The Unified Midrange Storage Group (UMSG) in RTP, NC conducted similar tests on an EMC Celerra NS-120 in a VMware vSphere virtualized farm.

Various BLOB store flavors were tested in that case: an EBS BLOB store provisioned on FC drives with and w/o deduplication, as well as on SATA drives with and w/o deduplication, all through a CIFS share.

The following figure shows the disk layout of the storage design:

Some highlights from that test:

  • SQL disk usage reports after BLOB externalization showed an 88% reduction in size of the content databases.
  • After deduplication was completed, 18% of the file system space was saved.
  • While externalizing BLOBs improves overall performance when retrieving objects, it may add some latency to search and modification activities, which in most deployments represent a smaller percentage of the workload than browsing content. (Another factor is the size of the externalized BLOBs: the larger the object, the more efficient EBS/RBS is.)
  • Once content is externalized, SharePoint indexing gets a boost. A full crawl finished in less than 1/3 of the time it took in the original configuration. I believe it has to do with the sequential nature of indexing.
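To get a feel for what those percentages mean for sizing, here is a back-of-the-envelope sketch in Python. The 88% and 18% figures come from the highlights above; the 1000 GB starting size is a hypothetical input and not a number from the test, and applying the deduplication saving to everything that left the database is my simplifying assumption.

```python
def estimate_after_externalization(db_size_gb, db_reduction=0.88, dedup_saving=0.18):
    """Rough sizing estimate; defaults are the percentages reported in the test."""
    # What remains in SQL Server after the BLOBs are moved out.
    db_after = db_size_gb * (1 - db_reduction)
    # Everything that left the database is assumed to land on the file system.
    blobs_moved = db_size_gb - db_after
    # Assumption: the deduplication saving applies to all externalized data.
    blobs_on_disk = blobs_moved * (1 - dedup_saving)
    return db_after, blobs_moved, blobs_on_disk


# Hypothetical 1000 GB content database, not a figure from the test.
db_after, blobs_moved, blobs_on_disk = estimate_after_externalization(1000)
print(f"content DB: {db_after:.0f} GB, externalized: {blobs_moved:.0f} GB, "
      f"on disk after dedup: {blobs_on_disk:.0f} GB")
```

For the hypothetical 1000 GB database, that works out to roughly 120 GB left in SQL Server and a little over 720 GB on the file share after deduplication. Your mileage will vary with the content mix.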

These two solutions are based on SharePoint 2007. We plan to re-validate them soon on SharePoint 2010, but don’t expect any magic there; I suspect the results will be similar.

I believe that BLOB externalization is the catalyst for SharePoint adoption in larger organizations that want to leverage SharePoint ECM capabilities in the multi-TB club. EMC has a wide range of offerings in this area, and these two solutions demonstrate only part of it.