NetBackup media servers and vSphere ESXi hosts: The real workhorses

4. OpenStorage for secondary storage, now VMware is onto the same thing for VM storage

If vCenter is the control and command center in vSphere environment, the ESXi hosts are the workhorses really doing most of the heavy lugging. ESXi hosts house VMs. They provide CPU, memory, storage and other resources for virtual machines to function. Along the same line, NetBackup media servers make backups, restores and replications happen in a NetBackup domain under the control of NetBackup master server. Media servers are the ones really ‘running’ various jobs.

ESXi hosts have storage connected to them for housing virtual machines. This storage allocated to ESXi hosts is called data store. More than one ESXi host can share the same data store. In such configurations, we refer to the set of ESXi hosts as an ESXi cluster.

NetBackup media servers also have storage connected to them for storing backups. More than one media server can share the same storage. NetBackup decouples storage from media server in its architecture to a higher degree than vSphere ESXi hosts. An ESXi host does not treat storage as intelligent. Although most enterprise grade storage systems have more intelligence built-in, you still have to allocate LUNs from the storage for ESXi hosts. VMware understands that the old school method of storage (which had been used in the industry over many decades) does not scale well and does not take advantage of a number of features and functions the intelligent storage systems can manage on their own. If you were in VMworld 2011, you may already know that VMware is taking steps to move away from LUN based storage model. See Nick Allen’s blog for more info. NetBackup took the lead for secondary storage half a decade back!

NetBackup is already there! Symantec announced OpenStorage program along the same time NetBackup 6.5 was released which revolutionized the way backups are stored on disks. All backup vendors treat disk in the LUN model. You allocate a LUN to the backup server and create a file system on top of it. Or you present a file system to the backup server via NFS/CIFS share. To make the matter worse, some storage systems presented disk as tape using VTL interfaces. The problem with these old school methods can be categorized into two.

First of all, the backup application is simply treating the intelligent storage system as a dumping ground for backup images. There is really no direct interaction between the backup system and storage system. Thus if your storage system has the capability to selectively replicate objects to another system, the backup server does not know about the additional copy that was made. If your storage system is capable of deduplicating data, the backup server does not know about it. Thus the backup server cannot intelligently manage storage capacity. For example, free space reported at the file system layer may be 10Gb, but it may be able to handle a 50Gb backup as the storage features deduplication. Similarly expiring a backup image with size 100Gb may not really free up that much space, but the backup server has no way of knowing this.

Secondly, the general-purpose file systems like NTFS, UFS, ext3, CIFS, NFS etc are optimized for random access. This is a good thing for production applications. But it comes with its own additional overhead. Backups and restores generally follow sequential I/O pattern with large chunks of writes and reads. For example, presenting a high performance deduplication system like NetBackup 5000 series appliances, Data Domain, Quantum DXi, Exgrid etc as NFS share would imply unnecessary overhead as NFS protocol is really for random access.

Symantec OpenStorage addresses this problem by asking storage vendors to provide OpenStorage disk pools and disk volumes for backups. This is just like what VMware wants Capacity pools and VM volumes to do for VM data stores in future. OpenStorage is a framework where NetBackup media servers simply provide a framework using which it can query, write read from intelligent storage systems. The API and SDK is made available to storage vendors so that they could develop plug-ins. When this plug-in is installed on media server, now the media server gains the intelligence to see the storage system and speak its language. Now the media server can simply stream backups to the storage device (without depending on overloaded protocols) and the intelligent storage system can store it in its native format. The result is 3 to 5x faster performance and the ability to tap into other features in storage system like replication.

In NetBackup terms, now the media server is simply a data mover. It moves data from client to storage. Since the storage system is intelligent and media server can communicate with it, it is referred as a storage server. Multiple media servers can share a storage server. When backups (or other jobs like restores, duplication etc) need to be started, NetBackup master server determines which media server has the least load. Then the selected media server loads the plug-in and preps the storage server to start receiving backups. You can compare this to the way VMware DRS and HA works where vCenter server picks the least loaded ESXi hosts for starting a VM from common data store.

Okay, so we talked about intelligent storage servers. How about the dump storage (JBOD) and tape drives? NetBackup media servers support those as well. Even in the case of a JBOD, which can be attached to the media server, NetBackup media servers make them intelligent! That story is next.

Next: Coming Soon!

Back to NetBackup 101 for VMware Professionals main page