Dear EMC XtremIO, why do I get a flashback of DCT PureDisk when I see you?


Once upon a time, there was a small data reduction software company called Data Center Technologies (DCT). Born in Belgium, DCT caught the eye of the storage leader of that time, Veritas Software. Veritas was looking for cheaper options to bring disk-based data reduction technologies into its tape-dominant backup portfolio. Data Domain, then the leader in data deduplication, was considered too expensive.

| | DCT PureDisk | XtremIO |
| --- | --- | --- |
| What is it? | Disk-based backup with data reduction | All-flash storage with data reduction |
| Who bought it? | Veritas | EMC |
| Why? | Veritas needed to bring disk-based backup technologies to its primarily tape-based portfolio | EMC needed to bring all-flash storage technologies to its primarily disk-based portfolio |
| What was the rationale? | DCT was a lesser-known startup with a lower price tag than the disk backup leader, Data Domain | XtremIO was a cost-effective alternative, as the dominant all-flash upstart was not available for sale |
| Who ended (ends) up dealing with it? | Symantec | Dell |
| Market reaction – complexity | Scale-out architecture was way ahead of its time. Too complex to deploy and maintain. | Scale-out architecture based on rack-based servers, storage and a mess of cables. Complex to deploy and maintain. |
| Market reaction – data reduction | Fixed block deduplication. Great for files and folders. Not good for structured data. | Fixed block deduplication. Great in hero benchmarks. Mileage varied for real-world workloads. |
| Remediation | Lead with the trusted brand, NetBackup, and tuck PureDisk in behind its media servers until NetBackup itself is ready to handle deduplication natively. The final remnant of PureDisk was EOL'ed in December 2015 (5030 appliance). | Lead with the trusted brand VMAX (now all flash) for high-performance workloads, position XtremIO for the next tier. The rest of the story is still developing. |

Veritas acquired DCT and brought its product, PureDisk, to market as yet another option in its backup portfolio. PureDisk featured a scale-out architecture with fixed block deduplication. Each node (a content router with a metadata database engine) stored the share of data segments whose hashes (fingerprints) fell within a specific range. The product struggled to make a good impression among Veritas' customers, mainly for three reasons.

  • PureDisk's positioning confused customers, as Veritas already had two successful backup brands. NetBackup was the king of enterprise backup; Backup Exec was quite successful among small and mid-sized businesses. PureDisk had to be positioned for use cases where customers wanted to eliminate tape drives entirely.
  • PureDisk's scale-out architecture was too complex for its time. It took considerable effort to put a storage pool together (repeated OS and product installation on multiple nodes, complex cabling, hash distribution strategy and so on). Competitors used to mock the product installation as a rocket science project.
  • PureDisk had fixed block deduplication, a fairly primitive form of data reduction. While it provided decent data reduction for files and folders, it wasn't a great choice for structured data.
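To make the last two points a little more concrete, here is a minimal Python sketch of fixed block deduplication with hash-range routing. It is not PureDisk's actual code: the 8-byte block size, the SHA-256 fingerprint, the four-node pool and every function name are assumptions chosen purely for illustration. It shows how fingerprints could be spread across nodes by range, and why a single byte inserted at the front of a stream defeats fixed block deduplication.

```python
import hashlib

BLOCK_SIZE = 8   # hypothetical and tiny, so the effect is easy to see
NODE_COUNT = 4   # hypothetical number of content-router nodes


def fixed_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut the stream at fixed offsets, regardless of its content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(segment: bytes) -> str:
    """Hex fingerprint of a segment (SHA-256 is an assumption here)."""
    return hashlib.sha256(segment).hexdigest()


def route(fp: str, node_count: int = NODE_COUNT) -> int:
    """Pick the node that owns this fingerprint's hash range.

    The first byte of the fingerprint (0..255) is divided into
    node_count equal ranges.
    """
    first_byte = int(fp[:2], 16)
    return first_byte * node_count // 256


def new_segments(old: bytes, new: bytes) -> int:
    """Count segments of `new` whose fingerprints were not seen in `old`."""
    known = {fingerprint(b) for b in fixed_blocks(old)}
    return sum(1 for b in fixed_blocks(new) if fingerprint(b) not in known)


original = b"AAAAAAAABBBBBBBBCCCCCCCCDDDDDDDD"
appended = original + b"EEEEEEEE"   # file grows at the end
shifted = b"x" + original           # one byte inserted at the front

print(new_segments(original, appended))  # only the appended block is new
print(new_segments(original, shifted))   # every block boundary has moved
print([route(fingerprint(b)) for b in fixed_blocks(original)])
```

Appending data leaves the existing blocks untouched, which is roughly the files-and-folders case. An insertion shifts every later block boundary, so nothing matches any more, which is closer to what happens with structured data streams.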

While the product team was trying to figure things out, something unexpected happened. Veritas decided to merge with Symantec. The team still had to plough through its data reduction strategy while the ship's direction was yet to be set. The team felt that it was better to put well-respected product brands (especially NetBackup) in front and tuck PureDisk's data reduction technology in behind them.

It was not as easy as hoped. PureDisk's fixed block dedupe could not be used as is to support all the types of applications that NetBackup supported. The engineering team innovated with a hybrid approach in which the backup client can 'see the data stream' and divide it exactly at the logical boundaries. The result is then fed into the deduplication engine so that individual objects are identified and fingerprinted even if they come from a structured data source. This hybrid approach (later branded as 'intelligent deduplication') proved to be a decent makeover. Although this effort started within a few years (2007, NetBackup 6.5), it took a number of years to perfect the intelligent deduplication recipe (2012, NetBackup 7.5) for the most common applications.
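The idea of dividing the stream at logical boundaries can be sketched in a few lines of Python. This is only an illustration of the principle, not how NetBackup implements it: the newline-delimited records, the SHA-256 fingerprint and the function names are all assumptions made up for the example. A real backup client would rely on its knowledge of the application's stream format rather than a simple delimiter.

```python
import hashlib


def fingerprint(segment: bytes) -> str:
    """Hex fingerprint of a segment (SHA-256 is an assumption here)."""
    return hashlib.sha256(segment).hexdigest()


def split_at_boundaries(stream: bytes, delimiter: bytes = b"\n") -> list[bytes]:
    """Cut the stream at logical record boundaries instead of fixed offsets.

    Each segment keeps its trailing delimiter so that joining the
    segments reproduces the original stream exactly.
    """
    parts = stream.split(delimiter)
    segments = [part + delimiter for part in parts[:-1]]
    if parts[-1]:  # trailing data without a closing delimiter
        segments.append(parts[-1])
    return segments


def new_segments(old: bytes, new: bytes) -> int:
    """Count segments of `new` whose fingerprints were not seen in `old`."""
    known = {fingerprint(s) for s in split_at_boundaries(old)}
    return sum(1 for s in split_at_boundaries(new) if fingerprint(s) not in known)


# Hypothetical structured data: one record per line.
backup_1 = b"row-1:alice\nrow-2:bob\nrow-3:carol\n"
backup_2 = b"row-0:zed\n" + backup_1   # a new record lands at the front

print(new_segments(backup_1, backup_2))  # only the inserted record is new
```

Because each record is fingerprinted as its own object, an insertion at the front no longer disturbs the records that follow it, which is exactly where fixed offsets fall apart.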

Another problem was PureDisk's scale-out architecture. While it was great on paper, it turned out to be a nightmare to install and maintain in customers' environments. That architecture needed to be dumped and a new one built from the ground up. Symantec did this with a two-pronged approach. In the first prong, it simplified deployment to a limited extent by introducing target deduplication appliances (branded as the NetBackup 5000 series, in 2010) that could sit behind NetBackup or Backup Exec media servers. The second prong involved re-engineering the deduplication engine by abandoning all the complexities of PureDisk's scale-out design and tucking it into the media server. As the fixed dedupe engine is memory intensive, it took several years to polish. Finally, when the media-server-embedded deduplication pool crossed the capacity threshold, Symantec declared EOL for the older scale-out dedupe engine. This two-pronged approach played out over 8 years! It started in 2007 (NetBackup 6.5 supported writing to a PureDisk pool through an OST plugin) and ended in December 2015 (EOL for the last appliance in the 5000 series, the 5030).

When I watch the story of XtremIO, I get a flashback of DCT PureDisk. EMC was looking for a cheaper way to bring an all-flash array (AFA) solution to the market. It sets its eyes on Israel-based XtremIO. EMC brings in XtremIO as a storage product in its already established portfolio of VMAX, VNX and so on. XtremIO faces similar challenges with product positioning. It also has fixed block deduplication for data reduction, which is not ideal for enterprise applications. It also features scale-out, which proves to be complex to install and maintain. To make matters worse, Dell acquires EMC, making customers worry about which product(s) will survive in the merged company. Now the product team decides that VMAX is the safer brand and pushes XtremIO down in positioning. VMAX was not built for flash (just as NetBackup was not built for disk when PureDisk arrived), so EMC will eventually need to bring some of XtremIO's flash-specific artifacts into VMAX.

When PureDisk left a bad taste among customers, Symantec stuck with the NetBackup brand and built deduplication from the ground up with limited artifacts from PureDisk. The salesforce didn't even use the word 'PureDisk' in conversations. EMC is dealing with a similar situation, where the VMAX brand needs to keep its loyal customers while the flash artifacts are being integrated natively.

I am not saying that XtremIO's future is going to be similar to that of Symantec PureDisk. But its story until now is quite similar to that of PureDisk. Time will tell.

Disclosure: I previously worked for Veritas/Symantec. However, the information in this story is based on publicly available knowledge. I currently work for Pure Storage (no relationship to the PureDisk product in the story). The opinions here do not reflect those of my employer.
