Client Direct: NetBackup vs. NetWorker

NetBackup introduced the Client Direct capability a few years back with the NetBackup 7.0 release. This is a breakthrough innovation in backup infrastructure architecture. Traditionally, backup is a process where data is read from the production client, transmitted over the wire in its entirety to a backup server and then written to storage. The emergence of target dedupe appliances behind the backup server meant that backup data now takes two hops through the network: first from the client to the backup server, then from the backup server to the deduplication appliance. NetBackup changed this game. The NetBackup client can dedupe the backup stream at the source and send deduplicated data directly to NetBackup’s deduplication pool, for example a NetBackup 5020 deduplication appliance, as illustrated below.

NetBackup Client Direct
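To make the difference in hops concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch: the function names, the 100 GB backup size and the 10% unique-data figure are assumptions made up for illustration, not measurements from any product.

```python
def traditional_traffic(backup_gb):
    """Client -> backup server -> target dedupe appliance.

    The full stream crosses the wire on both hops, because deduplication
    only happens once the data reaches the appliance.
    """
    return {"client->server": backup_gb, "server->appliance": backup_gb}


def client_direct_traffic(backup_gb, unique_percent):
    """Client -> deduplication pool, with deduplication done at the source.

    Only the unique share of the stream crosses the wire, in a single hop.
    """
    return {"client->pool": backup_gb * unique_percent / 100}


# Hypothetical numbers: a 100 GB backup where 10% of the data is new.
old_path = traditional_traffic(100)
new_path = client_direct_traffic(100, 10)
print("traditional:", sum(old_path.values()), "GB over", len(old_path), "hops")
print("client direct:", sum(new_path.values()), "GB over", len(new_path), "hop")
```

The model ignores fingerprint and metadata traffic, so treat the numbers as an upper bound on the savings rather than a prediction.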

This architecture is possible in NetBackup because it has several innovations that reduce the impact of running deduplication at the production client.

  1. NetBackup Accelerator: This technology features a platform independent track log that intelligently detects changed files without the need for enumerating the entire file system. Then it optimally synthesizes a full backup image at the storage. The result: Full backups can be run using the resources needed to run an incremental backup.
  2. NetBackup Client Side Deduplication Cache: This enables the production client to run deduplication by comparing the fingerprints generated for the chunks in the changed files (detected intelligently as explained in 1) against the previous backup set, without shipping the fingerprints to storage for comparison. The result: Superior federated deduplication without the slow chatter across the network.
  3. Intelligent Hybrid Chunking that is not CPU bound: Deduplication chunking is typically done using either a variable-block method or a fixed-block method. The former is CPU intensive and the latter is less efficient in data reduction rate. NetBackup gets the best of both worlds by using intelligent hybrid chunking. As the deduplication fingerprinting logic is built into the client, it can start the chunking exactly at the object boundaries it has identified. Thus you get the advantage of not being CPU bound while also not suffering from a low deduplication rate.
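Points 2 and 3 can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not NetBackup's implementation: `chunk_object`, `backup`, the in-memory `set` used as the fingerprint cache, the SHA-256 fingerprints and the 4-byte chunk size are all hypothetical choices for the demo.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical; tiny so the demo is easy to follow


def chunk_object(data, chunk_size=CHUNK_SIZE):
    """Fixed-size chunks that always restart at the object's first byte."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def backup(objects, fingerprint_cache):
    """Deduplicate `objects` (name -> bytes) against a local fingerprint cache.

    Returns (chunks_to_send, recipe): only chunks whose fingerprint is not
    in the cache are sent; the recipe lists each object's fingerprints.
    """
    to_send, recipe = {}, {}
    for name, data in objects.items():
        fps = []
        for chunk in chunk_object(data):
            fp = hashlib.sha256(chunk).hexdigest()
            fps.append(fp)
            if fp not in fingerprint_cache:  # local lookup, no network trip
                fingerprint_cache.add(fp)
                to_send[fp] = chunk
        recipe[name] = fps
    return to_send, recipe


cache = set()
# First backup: everything is new, but the repeated "BBBB" is sent only once.
first, _ = backup({"a.txt": b"AAAABBBBCCCC", "b.txt": b"BBBBDDDD"}, cache)
# Second backup: only the changed file is processed; one chunk is new.
second, _ = backup({"a.txt": b"AAAABBBBEEEE"}, cache)
print(len(first), len(second))  # prints "4 1"
```

Because chunking restarts at each object boundary, adding or removing one file does not shift the chunk boundaries of the files around it, which is what lets cheap fixed-size chunking keep a reasonable deduplication rate in this sketch.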

Reducing the impact on the production client’s resources, reducing the impact on the production network, reducing the number of hops and reducing the impact on the backup server (translation: increased scalability) make NetBackup Client Direct a unique feature. The popularity of this feature has made ‘Client Direct’ a common name that appears in RFPs for backup solutions.

The pressure is causing other backup vendors to come up with their own ‘Client Direct’. EMC announced last week that NetWorker 8.0 will have this capability and even named it ‘Client Direct’ so that the checkboxes in RFPs can be ticked. A closer look reveals that NetWorker Client Direct is good enough to tick a checkbox in an RFP, but is not really ready for prime time as is.

  1. The NetWorker client has no intelligent detection of changed files. NetWorker also does not have any sort of optimized synthetics. The result: Running full backups with NetWorker Client Direct will use a significant amount of processing power on production clients.
  2. The NetWorker client does not federate deduplication itself; that is done by DD Boost. As these two are essentially unaware of each other’s format, there is no way to cache fingerprints of the chunks from previous backups. That means excessive chitchat with the target Data Domain device during backups.
  3. DD Boost is the process of offloading some of the Data Domain deduplication processing to other systems. In this case, the production clients are taking that load. As clearly documented for the Data Domain SISL architecture, “SISL takes the pressure off of disk accesses as a bottleneck so that the system relies on the speed of the CPU to deliver inline deduplication performance”. Translation: CPU bound chunking. When this is offloaded to production clients, it can severely affect the performance of production systems with large backup workloads.
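To show why variable-block chunking tends to be CPU bound, here is a generic content-defined chunker in Python next to a plain fixed-size one. This is a textbook-style sketch and an assumption on my part, not Data Domain's or DD Boost's actual algorithm; the `GEAR` table, the `MASK` value and the function names are hypothetical.

```python
import random

_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]  # random value per byte
MASK = 0x3F  # cut when the low 6 bits are zero: roughly 64-byte chunks


def variable_chunks(data):
    """Content-defined chunking: one rolling-hash update per input byte."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # The shift pushes old bytes out of the 32-bit hash, so a cut
        # point depends only on the most recent 32 bytes of content.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        if (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def fixed_chunks(data, size=64):
    """Fixed-size chunking: plain slicing, no per-byte hashing."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def shared(a, b):
    """Number of distinct chunks that two chunk lists have in common."""
    return len(set(a) & set(b))


data = bytes(_rng.randrange(256) for _ in range(4000))
shifted = b"X" + data  # the same data with one byte inserted in front
print("fixed:", shared(fixed_chunks(data), fixed_chunks(shifted)))
print("variable:", shared(variable_chunks(data), variable_chunks(shifted)))
```

The variable chunker recovers most chunks after the one-byte shift, but it pays for that with a hash update on every byte of the stream. That per-byte work is the cost that lands on the production client when chunking is offloaded to it.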

Even though EMC can mark the checkboxes in RFPs, their specialists are less likely to encourage POCs with NetWorker Client Direct. In a neck-and-neck battle, it appears that NetWorker has a long road ahead to match NetBackup Client Direct.

What do NetApp ONTAP and Symantec NetBackup have in common?

A friend of mine forwarded this link to the interview SearchStorage.com recently did with Dave Hitz, one of the founders of NetApp. It is an interesting read, and the major topic is the new clustering capabilities in ONTAP 8. When he was asked about EMC’s Isilon, I thought his response hit a home run.

“If you look at features EMC can support, you end up with a complete list. If you break apart their architectures and look at the same feature list by architecture, you end up finding the main feature Isilon has is clustering, which is great. Unfortunately, it’s not in combination with the full suite of rich data management capabilities. That’s the No. 1 difference Ontap has: it’s the same Ontap that has all this cool stuff in it,” said Dave Hitz.

The context here is the fact that the foundational technology powering all storage systems from NetApp is ONTAP (with E-series being an outlier), and customers get the choice of footprint and features to match their workloads. EMC’s storage division, on the other hand, provides different products, such as VNX, VMAX and Isilon, for overlapping sets of workloads.

If you think about it, this response applies to other business units at EMC as well. My favorite is EMC’s Backup and Recovery Services (BRS) division. They have four different products (Avamar, Data Domain, NetWorker and HomeBase) pretty much serving the same market. If I were to fit Dave’s quote into the context of backup and recovery and use Symantec’s NetBackup as the competitor for EMC Backup, it would go something like this.

If you look at features EMC can support as a vendor for backup and recovery, you end up with a near-complete list. If you break apart their architectures and look at the same feature list by architecture, you end up finding that the main value Data Domain has is storage reduction at the target, with federation capabilities for limited application workloads. Avamar has full management capabilities but only for smaller workloads. NetWorker has decent long-term retention capabilities and a track record but has been on life support. HomeBase provides Bare Metal Recovery. Unfortunately, none of these products comes with a full suite of rich data management capabilities for end-to-end protection that can bring down capital and operational expenses in managing recovery points. That’s the No. 1 difference NetBackup has: it’s the same NetBackup that has all that cool stuff in one platform, plus a lot more innovations like managing snapshots, replicas, virtualized applications, backup acceleration etc.

As always, the standard disclaimer applies here. This is just my opinion. Although I work for Symantec, the above statement should not be considered as the view of my employer.

 

Will EMC BRS kill Avamar or NetWorker?

EMC World 2012 has come and gone. Those watching the Backup and Recovery Services (BRS) division will have noticed a drastic shift in strategy since last year. Are Avamar’s days numbered?

Surprised? Let me explain. Remember the “Tape sucks! Move on!” campaign sung by BRS last year? They even mocked Google for recovering from tapes. They wanted the world to look at Avamar and Data Domain, the two products with spinning disks, as the houses of backups. The other child, NetWorker, was mostly ignored and was on life support, kept around just to get through the era of tapes.

BRS seems to have come to grips with reality to some extent. The incremental updates to Avamar and the revelation of NetWorker 8 features tend to indicate that BRS is taking a 180-degree turn.

No real updates for Avamar Data Store: All the newly announced support for business-critical applications in Avamar is for both Data Domain Boost and the Avamar native client. Hyper-V, which is popular among SMB workloads, is now available through Boost to a Data Domain target. Last year, BRS’ announcement was that DD is for specific workloads and Avamar Data Store is for everything else. Now Boost is getting more attention and the Avamar engine by itself pretty much stays the same. The blackout windows in Avamar Data Store already annoy customers. Is the Data Domain deduplication engine now preferred for target dedupe, with DD Boost eventually replacing source-side deduplication? Inspired by Symantec’s Dedupe Everywhere strategy?

Note: Thanks to Ian’s comment for clarifying that the newer application support is available for Avamar as well, not just for Data Domain through DD Boost.

Emergence of Media Access Node: BRS realized that customers with longer retention requirements would not buy into the ‘keep it on disk’ message. Tape provides economies of scale. Modern tape technologies are superior in performance and reliability. Now BRS ships a NetWorker node under the covers as the Media Access Node in Avamar, to copy rehydrated data to tape in NetWorker tape format.

NetWorker 8.0 getting a facelift: Although NetWorker was ignored in keynotes, BRS made a deliberate attempt this year to show what is happening to NetWorker. It was headed for the morgue but has now been pulled back and is getting revved up. There is a long road ahead to convince customers, but BRS says it is putting as many resources on NetWorker as it did on Avamar. Not to mention the newfound love, Spectralogic, to compete with IBM and Oracle.

If you pay closer attention, all that Avamar got serves to make things better for Data Domain (Boost expansion, multi-stream support…) and NetWorker (data stored in NetWorker tape format). In a nutshell, BRS wants everyone to keep backup data in either Data Domain dedupe format or NetWorker tape format. Once the combination of NetWorker and Data Domain Boost can support backups over the WAN, Avamar may not have anything left to offer. From an operating margin perspective, could Avamar as a product become a dog in the BCG growth-share matrix? Is the one eventually going to the morgue the Avamar dedupe engine?
