Client Direct: NetBackup vs. NetWorker

NetBackup introduced the Client Direct capability a few years back with the NetBackup 7.0 release. This is a breakthrough innovation in backup infrastructure architecture. Traditionally, backup is a process where data is read from the production client, transmitted over the wire in its entirety to a backup server, and then written to storage. The emergence of target dedupe appliances behind the backup server meant that a backup could now take two hops through the network: first from client to backup server, then from backup server to deduplication appliance. NetBackup changed this game. The NetBackup client can dedupe the backup stream at the source and send deduplicated data directly to NetBackup’s deduplication pool, for example a NetBackup 5020 deduplication appliance, as illustrated below.

NetBackup Client Direct

This architecture is possible in NetBackup because it has several innovations that reduce the impact of running deduplication at the production client.

  1. NetBackup Accelerator: This technology features a platform-independent track log that intelligently detects changed files without the need to enumerate the entire file system. It then optimally synthesizes a full backup image at the storage. The result: Full backups can be run using the resources needed to run an incremental backup.
  2. NetBackup Client Side Deduplication Cache: This enables the production client to run deduplication by comparing the fingerprints generated for the chunks in the changed files (detected intelligently as explained in 1) against the previous backup set, without shipping the fingerprints to storage for comparison. The result: Superior federated deduplication without the slow chatter across the network.
  3. Intelligent Hybrid Chunking that is not CPU bound: Deduplication chunking is typically done using either a variable-block method or a fixed-block method. The first is CPU intensive, and the second is less efficient in data reduction rate. NetBackup gets the best of both worlds with intelligent hybrid chunking. Because the deduplication fingerprinting logic is built into the client, it can start chunking exactly at the object boundaries it identifies. Thus you get the advantage of not being CPU bound while also not suffering from a low deduplication rate.
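
To make the three points above concrete, here is a minimal Python sketch of the general idea: only changed files are processed, chunking restarts at each file (object) boundary, and fingerprints are checked against a cache held on the client. This is an illustration under stated assumptions, not NetBackup's actual implementation; the names `CHUNK_SIZE`, `chunk_object`, and `backup`, the SHA-256 fingerprint, and the 128 KiB segment size are all hypothetical.

```python
import hashlib

CHUNK_SIZE = 128 * 1024  # hypothetical fixed segment size, chosen for illustration


def fingerprint(chunk: bytes) -> str:
    """Fingerprint a chunk; SHA-256 is an assumption, not the product's hash."""
    return hashlib.sha256(chunk).hexdigest()


def chunk_object(data: bytes, size: int = CHUNK_SIZE):
    """Cut fixed-size chunks, restarting at offset 0 of every object (file)."""
    for offset in range(0, len(data), size):
        yield data[offset:offset + size]


def backup(changed_files: dict, cache: set):
    """Deduplicate changed files against a fingerprint cache kept on the client.

    changed_files: path -> file contents (already narrowed down to changed
                   files, e.g. by a change-tracking log).
    cache:         fingerprints known from previous backups; updated in place.
    Returns (to_send, recipe): the chunks that must cross the network, and the
    ordered fingerprint list needed to rebuild each file.
    """
    to_send = {}
    recipe = {}
    for path, data in changed_files.items():
        fps = []
        for chunk in chunk_object(data):
            fp = fingerprint(chunk)
            fps.append(fp)
            if fp not in cache:  # local lookup, no round trip to storage
                to_send[fp] = chunk
                cache.add(fp)
        recipe[path] = fps
    return to_send, recipe
```

In this sketch, a second backup of a file that only grew at the end sends just the new trailing chunk, because every earlier chunk starts at the same offset from the file boundary and its fingerprint is already in the local cache.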

Reducing the impact on the production client’s resources, reducing the impact on the production network, reducing the number of hops, and reducing the impact on the backup server (translation: increased scalability) make NetBackup Client Direct a unique feature. The popularity of this feature has made ‘Client Direct’ a common term that appears in RFPs for backup solutions.

The pressure is causing other backup vendors to come up with their own ‘Client Direct’. EMC announced last week that NetWorker 8.0 will have this capability, and even named it ‘Client Direct’ so that the checkboxes in RFPs can be ticked. A closer look reveals that NetWorker Client Direct is suitable for ticking a checkbox in an RFP, but is really not ready for prime time as is.

  1. The NetWorker client has no intelligent detection of changed files. NetWorker also does not have any sort of optimized synthetics. The result: Running full backups with NetWorker Client Direct will use a significant amount of processing power on production clients.
  2. The NetWorker client does not federate deduplication; it is done by DD Boost. As these two are essentially unaware of each other’s format, there is no way to cache fingerprints of the chunks from previous backups. That means excessive chitchat with the target Data Domain device during backups.
  3. DD Boost is the process of offloading some of the Data Domain deduplication processing to other systems. In this case, the production clients are taking that load. As clearly documented in the Data Domain SISL architecture, “SISL takes the pressure off of disk accesses as a bottleneck so that the system relies on the speed of the CPU to deliver inline deduplication performance”. Translation: CPU bound chunking. When this is offloaded to production clients, it can severely affect the performance of production systems with large backup workloads.
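
To show why variable-block chunking is considered CPU bound, here is a generic content-defined chunking sketch in Python. It is not Data Domain's, DD Boost's, or NetBackup's actual algorithm; the Gear-style rolling hash, the `GEAR` table, and the `avg_bits`, `min_size`, and `max_size` parameters are assumptions picked for illustration. The point to notice is that every single byte of the backup stream passes through the hash loop, whereas the fixed-block version only slices at known offsets.

```python
import hashlib

# Hypothetical 256-entry table of pseudo-random 64-bit values, one per byte value.
GEAR = [int.from_bytes(hashlib.sha256(bytes([i])).digest()[:8], "big")
        for i in range(256)]
MASK64 = (1 << 64) - 1


def variable_chunks(data: bytes, avg_bits: int = 12,
                    min_size: int = 1024, max_size: int = 16384):
    """Cut chunks where a rolling hash of the content hits a bit pattern."""
    boundary_mask = (1 << avg_bits) - 1
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):  # per-byte work: this is the CPU cost
        h = ((h << 1) + GEAR[byte]) & MASK64
        length = i - start + 1
        at_boundary = length >= min_size and (h & boundary_mask) == 0
        if at_boundary or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def fixed_chunks(data: bytes, size: int = 4096):
    """Fixed-block chunking: no per-byte hashing, just slicing at offsets."""
    return [data[i:i + size] for i in range(0, len(data), size)]
```

The trade-off matches the one described earlier in this post: fixed blocks are cheap to compute but lose alignment when data shifts inside a stream, while content-defined boundaries tend to survive such shifts at the price of hashing every byte on whichever machine does the chunking.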

Even though EMC can mark the checkboxes in RFPs, their specialists are less likely to encourage POCs with NetWorker Client Direct. In a neck-and-neck battle, it appears that NetWorker has a long road ahead to match NetBackup Client Direct.