What are my top 10 reasons for hopping onto Pure Storage?

Photo credit: Pure Storage Inc
Top 10 reasons for hopping into Pure Storage ride

If you had asked me about Pure Storage a year ago, I would have said that it was one of the leading flash storage array vendors in the market. Although I was not wrong, I must confess that my view then was a bit myopic. Once I got to know this company better, I couldn’t resist joining its team as one of the Puritans. Here are my top 10 reasons.

  1. Reliving VERITAS (Software): Don’t you remember your first crush? Do you hope to build a time machine to meet her again? VERITAS and Sun Microsystems were my crushes as I got into technology. Pure Storage reminds me of the VERITAS of the early 2000s, when she was the hottest girl in the bar. Everyone wanted to partner with her because she stole the show from aging big irons.
  2. It is all about fostering the team, not building MVPs: The company hires people with the mindset to work towards a bigger mission. Irrespective of title and role, employees know how to talk about both the business value and the technical merits of what is being built to solve unmet customer needs. There is a sense of accomplishment about where Puritans are today; at the same time there is a strong ambition for where they want to be in the future.
  3. Open workspaces: When you walk into a Pure Storage facility, you notice the open collaborative workspaces. You cannot distinguish between spots that are used by university hires and VPs. When a Puritan needs your attention, he/she may just come to your desk, yell across the desk, send an instant message or shoot Nerf darts. I am sending my letter to Santa for a good Nerf gun.
  4. Insider view matters: We all know the value of someone recommending us for a role in an organization. I am fortunate to be around people who believe in what I could bring to the table. Two Puritans who had been with VERITAS in their previous lives helped me understand Pure Storage from an insider’s perspective. These individuals are thought leaders who weren’t shy about taking some bold risks to embrace what Pure had promised.
  5. It is not all about work – work – work: Puritans know how to have fun. Tons of pictures on social media speak for themselves. #PaintItOrange
  6. The innovation starts with software: The all-flash array (AFA) is the tangible product from Pure, but the innovation started with the core Purity Operating System that powers those arrays. Enterprise-grade reliability and performance coupled with consumer-level simplicity and efficiency could make you see why Pure is considered the Apple of data centers.
  7. Harnessing the power of cloud: Many big vendors in enterprise IT talk about cloud as a way to stay relevant as disruption is imminent. How many times have you seen the same legacy technology repackaged as a ‘cloud offering’? Pure Storage used cloud as an opportunity to redefine and merge the lines between management and support services. Pure1 is just the beginning of this innovation.
  8. Storage virtualization meets data virtualization: Storage virtualization is a way to consolidate and manage storage media to improve availability, efficiency and performance. Data virtualization levels up the same paradigm so that the application/data owner can create and manage copies without needing to understand where and how they are stored physically. Pure Storage’s data reduction methods and space-efficient copy creation blur the line between storage and data virtualization.
  9. Nip the CDM problem in the bud: Thanks to the early work from analyst firms like IDC and players like Actifio and Delphix, organizations are starting to understand the storage waste created by copy data. Legacy storage vendors have no motivation to solve the problem, as it would cannibalize high-margin revenue from spinning disks. Purity’s approach of virtualizing storage and data while letting application owners manage copies from their familiar tools is powerful enough to kill copy-data sprawl at the source.
  10. Mission and drive to lead the market: Unlike many storage startups that were designed for sale to incumbents, Pure Storage is on a mission to become a mainstream player. While I understand that Pure Storage’s board of directors has the fiduciary duty to act on behalf of shareholders, the visionary management team, energetic employees and ecstatic customers are likely to give enough reasons to let Pure grow on its own.

Disclaimer: I am an employee of Pure Storage, Inc. My statements and opinions on this site are my own and do not necessarily represent those of Pure Storage

Note: This post originally appeared in my LinkedIn Pulse page


Did Rubrik make Veeam’s Modern Data Protection a bit antiquated?

Veeam Antiquated?

Modern Data Protection™ got a trademark from Veeam. No, I am not joking. It is true! Veeam started with a focused strategy: it would do nothing but VMware VM backups. Thankfully, VMware had done most of the heavy lifting with vStorage APIs for Data Protection (VADP), so developing a VM-only backup solution was as simple as creating a software plugin for those APIs and developing a storage platform for keeping the VM copies. With a good marketing engine, Veeam won the hearts of virtual machine administrators, and it paid off.

As the opportunity to reap the benefits as a niche VM-only backup started to erode (intense competition, low barrier to entry on account of VADP), Veeam is attempting to re-invent its image by exploring broader use cases like physical systems protection, availability etc. Some of these efforts make it look like its investors are hoping for Microsoft to buy Veeam. The earlier wish to sell itself to VMware shattered when VMware adopted EMC Avamar’s storage to build its data protection solution.

Now Rubrik is coming to market and attacking the very heart of Veeam’s little playground while making Veeam’s modern data protection a thing of the past. Rubrik’s market entry is also through VMware backups using vStorage APIs, but with a better storage backend that can scale out.

Both Veeam and Rubrik have two high level tiers. The frontend tier connects to vSphere through VMware APIs. It discovers and streams virtual machine data. Then there is a backend storage tier where virtual machine data is stored.

For Veeam, the frontend is a standalone backup server and its optional backup proxies. The proxies (thanks to VMware hot-add) enable a limited level of scale-out for the frontend, but this approach leeches resources from production and increases complexity. The backend is one or more backup repositories. There is nothing special about the repository; it is a plain file system. Although Veeam claims to have deduplication built in, it is perhaps the most primitive in the industry and works only across virtual machines from the same backup job.

Rubrik is a scale-out solution where the frontend and backend are fused together from the user’s perspective. You buy Rubrik bricks, where each brick consists of four nodes. These are the compute and storage components that serve both the frontend, which streams virtual machines from vSphere via NBD or SAN transport (kudos to Rubrik for ditching hot-add!), and the backend, which is a cluster file system that spans nodes and bricks. Rubrik claims to have global deduplication across its entire cluster file system namespace.
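To make the difference between per-job and global deduplication scope concrete, here is a minimal sketch. It is a toy model and not how Veeam or Rubrik actually implement deduplication; the function name `stored_chunks`, the job names and the chunk contents are all hypothetical, chosen only for illustration.

```python
import hashlib


def stored_chunks(jobs, global_scope):
    """Count the unique chunks that end up on backup storage.

    jobs: dict mapping a job name to a list of VM images,
          where each image is a list of byte chunks.
    global_scope: True uses one fingerprint index for all jobs,
                  False uses a separate index per backup job.
    """
    if global_scope:
        seen = set()
        for vms in jobs.values():
            for vm in vms:
                for chunk in vm:
                    seen.add(hashlib.sha256(chunk).hexdigest())
        return len(seen)

    total = 0
    for vms in jobs.values():
        seen = set()  # index is reset for every job
        for vm in vms:
            for chunk in vm:
                seen.add(hashlib.sha256(chunk).hexdigest())
        total += len(seen)
    return total


# Hypothetical data: three VMs that share the same OS chunks,
# split across two backup jobs.
os_chunks = [b"os-block-1", b"os-block-2", b"os-block-3"]
jobs = {
    "job_a": [os_chunks + [b"app-a"], os_chunks + [b"app-b"]],
    "job_b": [os_chunks + [b"app-c"]],
}

per_job = stored_chunks(jobs, global_scope=False)
global_ = stored_chunks(jobs, global_scope=True)
print(per_job, global_)
```

In this toy data set the shared OS chunks are stored once per job under per-job scope, but only once overall under global scope, which is why the second count is smaller.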

Historically, the real innovation from Veeam was the commercial success of powering on virtual machines directly from the backup storage. Veeam may list several other innovations under its belt (e.g. it may claim that it ‘invented’ agentless backups, but that was actually done by VMware in its APIs), but exporting VMs directly from backup is something every other vendor followed afterwards, and hence kudos go to Veeam on that one. However, this innovation may backfire and help Veeam customers transition to Rubrik seamlessly.

Veeam customers are easy targets for Rubrik for a few reasons.

  • One of the cornerstones of Veeam’s foundation is its dependency on vStorage APIs from VMware; it is not a differentiator because all VMware partners have access to those APIs. Unlike other backup vendors, Veeam didn’t focus on building application awareness and granular quiescence until late in the game.
  • Veeam is popular in smaller IT shops and shadow projects within large IT environments. It is a handy backup tool, but it is not perceived as a critical piece in meeting regulatory specs and compliance needs. It has been marketed towards virtual machine administrators; hence higher-level buying centers do not have much visibility. That adversely affects Veeam’s ‘stickiness’ in an account.
  • Switching from one backup application to another has historically been a major undertaking. But that is not the case if customers want to switch from Veeam to something else. In earlier days, IT shops needed to stand up both solutions until all the backup images from the old solution hit their expiration dates. Otherwise you had to develop strategies to migrate old backups into the new system, a costly affair. When the source is Veeam with 14 recovery points per VM by default, you could build workflows that spin up each VM backup in a sandbox and let the new solution back it up as if it were a production copy. (Rubrik may want to work on building a small migration tool for this.)
  • Unlike Veeam, which started stitching in support for other hypervisors and physical systems afterwards, Rubrik has architected its platform to accommodate future needs. That design may intrigue VMware customers who are looking to diversify into other hypervisors and containers.

The fine print is that Rubrik is yet to be proven. If the actual product delivers on the promises, it may have antiquated Veeam. The latter may become a good case study for business schools on not building a product that depends too much on someone else’s technology.

Thanks to #VFD5 TechFieldDay for sharing Rubrik’s story. You can watch it here: Rubrik Technology Deep Dive

Disclaimer: I work for Veritas/Symantec, opinions here are my own.

Getting to know the Network Block Device Transport in VMware vStorage APIs for Data Protection

When you back up a VMware vSphere virtual machine using vStorage APIs for Data Protection (VADP), one of the common ways to transmit data from the VMware datastore to the backup server is through Network Block Device (NBD) transport. NBD is a Linux-like module that attaches to VMkernel and makes the snapshot of the virtual machine visible to the backup server as if the snapshot were a block device on the network. While NBD is quite popular and easy to implement, it is also one of the least understood transport mechanisms in VADP-based backups.

NBD is based on VMware’s Network File Copy (NFC) protocol. NFC uses a VMkernel port for network traffic. As you already know, VMkernel ports may also be used by other services like host management, vMotion, Fault Tolerance logging, vSphere Replication, NFS, iSCSI and so on. It is recommended to create specific VMkernel ports that attach to dedicated network adapters if you are using a bandwidth-intensive service. For example, it is highly recommended to dedicate an adapter for Fault Tolerance logging.

Naturally, the first logical solution to drive high throughput from NBD backups would be to dedicate a bigger pipe for VADP NBD transport. Many vendors put this as the best practice but that alone won’t give you performance and scale.

Let me explain this using an example. Let us assume that you have a backup server streaming six virtual machines from an ESXi host using NBD transport sessions. The host and backup server are equipped with 10Gb adapters. In general, a single 10Gb pipe can deliver around 600 MB/sec. So you would expect each virtual machine to be backed up at around 100 MB/sec (600 MB/sec divided across six streams, one per virtual machine), right? However, in reality each stream gets a much lower share of the bandwidth because VMkernel automatically caps each session for stability. Let me show you the actual results from a benchmark that we conducted, where we measured performance as we increased the number of streams.

NBD Transport and number of backup streams

As you can see, by the time the number of streams has reached 4 (in other words, four virtual machines were simultaneously getting backed up), each stream is able to deliver just 55 MB/sec and the overall throughput is 220 MB/sec. This is nowhere near the available bandwidth of 600 MB/sec.
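The arithmetic behind these numbers can be written down as a short sketch. It uses only the figures quoted in this post (a usable 600 MB/sec on the 10Gb pipe and 55 MB/sec per stream at four streams); the variable and function names are just for illustration.

```python
LINK_MB_S = 600  # usable throughput of the 10Gb pipe, as assumed in this post


def naive_per_stream(streams):
    """Per-stream rate if the pipe were simply divided evenly."""
    return LINK_MB_S / streams


# Naive expectation for the six-VM example.
expected_six = naive_per_stream(6)

# Measured result quoted above for four concurrent streams.
measured_streams = 4
measured_per_stream = 55  # MB/sec per stream
aggregate = measured_streams * measured_per_stream
utilization = aggregate / LINK_MB_S

print(f"expected per stream with 6 VMs: {expected_six:.0f} MB/sec")
print(f"measured aggregate with 4 VMs: {aggregate} MB/sec")
print(f"share of the pipe actually used: {utilization:.0%}")
```

In other words, four capped streams from a single host use only a little over a third of the pipe.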

The reasoning behind this type of bandwidth throttling is straightforward. You don’t want VMkernel to be strained by serving this type of copy operation while it has better things to do. VMkernel’s primary function is to orchestrate VM processes. VMware engineering confirmed this behavior as normal. (VMware was also a partner in this benchmark; we submitted the full story as a paper for VMworld 2012.)

This naturally makes NBD a second-class citizen in the backup transport world, doesn’t it? The good news is that there is a way to solve this problem! Instead of backing up too many virtual machines from the same host, configure your backup policy/job to distribute the load over multiple hosts. Unfortunately, in environments with 100s of hosts and 1000s of virtual machines, it may be difficult to do this manually. Veritas NetBackup provides VMware Resource Limits as part of its Intelligent Policies for VMware backup, where you can limit the number of jobs at VMware vSphere object levels, which is quite handy in this type of situation. For example, I ask customers to limit the number of jobs per ESXi host to 4 or less using such intelligent policies and the resource limit setting. Thus NetBackup can scale out its throughput by tapping NBD connections from multiple hosts to keep its available pipe fully utilized while limiting the impact of NBD backups on production ESXi hosts.
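The idea of capping jobs per host while spreading streams across hosts can be sketched as follows. This is a hypothetical illustration of the scheduling principle only; it is not NetBackup’s implementation or API, and the function name `plan_waves`, the host names and the VM names are made up for the example.

```python
from collections import defaultdict


def plan_waves(vm_to_host, max_per_host=4):
    """Group VMs into backup waves with at most
    max_per_host concurrent VMs per ESXi host in each wave."""
    by_host = defaultdict(list)
    for vm, host in vm_to_host.items():
        by_host[host].append(vm)

    waves = []
    index = 0
    while True:
        wave = []
        for vms in by_host.values():
            # Take the next slice of VMs from every host.
            wave.extend(vms[index:index + max_per_host])
        if not wave:
            break
        waves.append(wave)
        index += max_per_host
    return waves


# Hypothetical inventory: three ESXi hosts with six VMs each.
vm_to_host = {f"vm{h}-{i}": f"esx{h}" for h in range(3) for i in range(6)}

waves = plan_waves(vm_to_host, max_per_host=4)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {len(wave)} concurrent streams")
```

With this inventory the first wave runs twelve streams at once, four from each host, so the backup server pulls from all three hosts in parallel while no single host serves more than four NBD sessions.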

Thus Veritas NetBackup moves NBD to first-class status in protecting large environments even when the backend storage isn’t on a Fibre Channel SAN. For example, NetBackup’s NBD has proven its scale in NetApp FlexPod, VCE VBLOCK, Nutanix and VMware EVO (VSAN). Customers can enjoy the simplicity of NBD and the scale-out performance of NetBackup on these converged platforms.


Taking VMware vSphere Storage APIs for Data Protection to the Limit: Pushing the Backup Performance Envelope; Rasheed, Winter et al. VMworld 2012

Full presentation on Pushing the Backup Performance Envelope

Checkmate Amazon! Google Nearline may be the Gmail of cold storage

April Fools’ Day 2004: Google announced Gmail, a free search-based e-mail service with a storage capacity of 1 gigabyte per user [1]. The capacity was unbelievably high when compared to other free Internet e-mail providers of that time. Hotmail and Yahoo! were giving 2-4MB per user. The days when inbox management used to be a daily chore are no more. The initial press release from the search giant differentiated its offering from others on three S’s: Search, Storage and Speed.

Google Nearline may be the Gmail of cold storage

I wish Google waited a couple more weeks to announce Google Cloud Storage Nearline. It would have been fun to see it announced on April Fools’ Day. Nearline to a business today is how Gmail was to a consumer a decade ago.

Search: Google doesn’t talk about search in the context of Nearline. But nuts don’t fall that far away from the tree. Google wants your business to dump all your cold data in its cloud. It has the resources to adopt a loss leader strategy to help you keep data at lower cost in its cloud. Later you may be offered data mining and analytics as a service where Google would really shine and make money. The economies of scale will benefit both Google and you. Does anyone remember the search experience in Hotmail a decade ago?

Storage: Sorry, you aren’t getting the storage for free but it is cheap. It is a penny per month per gigabyte for data at rest. Instead of declaring a price war with Amazon’s Glacier, Google decided to match its pricing while differentiating itself from Glacier radically with simplicity and access. Unlike Amazon, the cold and standard storage from Google uses the same method of access thereby eliminating operational overhead or programming needs.

Speed: Amazon went old school with Glacier. It is designed to look and feel like tape. It takes several hours for you to retrieve data, analogous to getting tapes shipped to you from an offsite location. This is where Google directly poked Amazon. Google is offering an average 3-second response time for data requests! Do you recall how Gmail’s JavaScript-based design made Hotmail look like a turtle, reloading entire web pages for each action?

Let’s come back to April Fools’ Day again. It happens to be the day after World Backup Day. Cold storage today is backup for most businesses. One of the strategic partnerships that Google made for the Nearline launch is impeccable. According to Veritas/Symantec, NetBackup manages half of the world’s enterprise data. It is not surprising that Google wanted Veritas to be on the Nearline bandwagon [2]. The best data pump for business data is NetBackup, and that relationship is a strategic win for Google right off the bat.

  1. Google Gets the Message, Launches Gmail
  2. Access, Agility, Availability: NetBackup and Google Cloud Storage Nearline

Dear Competitor “C”, all that snaps are not snapshots!

Benchmarking for truth

Common sense tells us that the creation of recovery points for applications from storage snapshots should be faster than the traditional methods of backing up the entire dataset. The storage solutions in the market have matured to provide space efficient recovery points through snapshots. A backup and recovery solution can make use of storage snapshots to create recovery points and provide additional values like information life cycle management and content indexing.

The faster you create a recovery point, the better the possibility of achieving aggressive recovery point objectives (RPOs). For example, if it takes 10 minutes to create a recovery point, the best possible RPO is also 10 minutes. Storage snapshots are great candidates for achieving such aggressive recovery points. This is the reason industry analysts vouch for storage snapshot integration in backup and recovery solutions.

However, a competitor to Symantec NetBackup (let us call this vendor Competitor ‘C’) had been fooling industry analysts for a few years. Competitor ‘C’ positions itself as a ‘leader’ in storage snapshot integration. It received some brownie points for ticking the checkboxes in supporting multiple storage vendors. Symantec commissioned an independent third-party benchmarking company to validate the truth of this vendor’s capability. The results were shocking.

Check out my official Symantec blog for the gory details.

Disclaimer: The blogs in MrVray.com are reflections of my own opinions.

VMware EVO: The KFC of SDDC

EVO is the KFC of SDDC

VMware EVO is bringing to software-defined data centers the same type of business model that Kentucky Fried Chicken had brought to restaurants decades ago. VMware is hungry to grow and is expanding its business to new territories. Colonel Sanders’s revolutionary vision to sell his chicken recipe and brand through franchise model is now coming to IT infrastructure as ready-to-eat value meals.

Most of the press reports and analyst blogs are focused on VMware’s arrival in the converged infrastructure market. Of course, vendors like Nutanix and SimpliVity will certainly lose sleep as the 800-pound gorilla has set its eyes on the converged infrastructure market. However, VMware’s strategy goes much deeper than taking over the converged infrastructure market from upstarts. It is a bold attempt to disrupt the business model of selling IT infrastructure stacks while keeping public cloud providers away from enterprise IT shops.

Bargaining power of supplier: Have you noticed the commanding power of VMware in the EVO specifications? Partners like Dell and EMC are simply the franchisees of VMware’s infrastructure recipe and brand. It is no secret that traditional servers and storage are on the brink of disruption because buyers won’t pay a premium for brand names much longer. It is time for them to let go of individuality and become a delivery model for a prescriptive architecture (franchise model) from a stronger supplier in the value chain.

Software is now the king, no more OEM: In the old world where hardware vendors owned brand power and distribution chains, software vendors had to make OEM deals to get their solutions to the market in those hardware vehicles. Now the power is shifting to software. The software vendor prescribes (a softened term that actually stands for ‘dictates’) how infrastructure stacks should be built.

Short-term strategy, milk the converged infrastructure market: This is the most obvious hint VMware has given; reporters, bloggers and analysts have picked up this obvious message. As more and more CIOs are looking to reduce capital and operational costs, the demand for converged systems is growing rapidly. Even the primitive assembled-to-order type solutions from VCE and NetApp-Cisco are milking the current demand for simplified IT infrastructure stacks. Nutanix leads the pack in relatively newer and better hyper-convergence wave. VMware’s entry into this market validates that convergence is a key trend in modern IT.

Long-term strategy, own data center infrastructure end-to-end while competing with public clouds: Two of the three key pillars of VMware’s strategy are enabling software-defined data centers and delivering hybrid clouds. Although SDDC and hybrid cloud may look like two separate missions, the combination is what is needed to keep Amazon and other public cloud providers from taking over workloads from IT shops. The core of VMware’s business is selling infrastructure solutions for on-prem data centers. Although VMware positions itself as the enabler of service providers, it understands that the bargaining power of customers would continue to stay low if organizations stick to on-prem solutions. This is where the SDDC strategy fits. By commoditizing infrastructure components (compute, storage and networking) and shifting the differentiation to infrastructure management and service delivery, VMware wants to become the commander in control of SDDCs (just as Intel processors dictated the direction of PCs over the last two decades). EVO happens to be the SDDC recipe it wants to franchise to partners so that customers can taste the same SDDC no matter who their current preferred hardware vendors are. Thus EVO is the KFC of SDDC. It is not there merely as a Nutanix killer; VMware also wants to take share from Cisco (Cisco UCS is almost #1 in the server market, Cisco is #1 in networking infrastructure), EMC Storage (let us keep the money in the family, the old man’s hardware identity is counting its days) and other traditional infrastructure players. At the same time, VMware wants to transform vCloud Air (the rebranded vCloud Hybrid Service) into the app store for EVO-based SDDCs to host data services in the cloud. It is a clever plan to keep selling to enterprises and hide them away from the likes of Amazon. Well played, VMware!

So what will the competitive action from Amazon and other public cloud providers be? Amazon has the resources to build a ready-to-eat private Fire Cloud for enterprises that can act as the gateway to AWS. All this time, Amazon has focused mainly on on-prem storage solutions that extend to AWS. We can certainly expect the king of public clouds to do something more. It is not a question of ‘if’; rather it is a question of ‘when’.

EMC’s Hardware Defined Control Center vs. VMware’s Software Defined Data Center

EMC trying to put the clock back from software-defined storage movement

EMC’s storage division appears to be in Old Yeller mode. It knows that customers will eventually stop paying a premium for branded storage. The bullets to put branded storage out of its misery are coming from the software-defined storage movement led by its own stepchild, VMware. But the old man is still clever and is pretending to hang out with the cool kids to stay relevant while trying to survive as long as there are CIOs willing to pay a premium for storage with a label.

Software-defined storage is all about building storage and data services on top of commodity hardware. No more vendor locked storage platforms on proprietary hardware. This movement offers high performance at lower cost by bringing storage closer to compute. Capacity and performance are two independent vectors in software-defined storage.

TwinStrata follows that simplicity model and has helped customers extend the life of existing investments with true software solutions. The data service layer offers storage tiering where the last tier could be a public cloud. EMC wants the market to believe that its acquisition of TwinStrata is an attempt to embrace the software-defined storage movement. But the current execution plan is a little backward. EMC’s plan is a bolted-on type of integration of TwinStrata IP on top of the legacy VMAX storage platform. That means EMC wants to keep the ‘software-defined’ IP close to its proprietary array. The goal is, of course, to prolong the life of VMAX in the software-defined world. While it defeats the rationale behind the software-defined storage movement, it may be the last straw to pull the clock back a little.

Hopefully there is another project where EMC will seriously consider building a true software-defined storage solution from the acquired IP without the deadweight of legacy platforms. Perhaps transform ViPR from vaporware to something that really rides the wave of software-defined movement?

Is the perfect storm headed toward purpose-built storage systems?

Is the era of storage systems (arrays) facing disruption? Do the expensive monolithic chassis sellers need to find new ways to make money? Do the investors betting on newer storage array startups need to cash in now? Although it may feel unlikely in the near term, the perfect storm may not be that far away.

Is the perfect storm headed toward purpose-built storage systems?

Let us think about how storage arrays came to solve problems for IT. There were two distinct transformations in this industry:

More information in Symantec Connect’s Storage and Availability blog

The Big Hole in EMC Big Data backup story

It is one of the crucial roles of the marketing team in any organization to communicate the value of its products and services. It is not uncommon (pardon the double negative) for organizations to show the best side of their story while deliberately hiding the weaker aspects in the fine print. The left side of the picture below is a snapshot of the breakfast cereal (General Mills’ Total) that came with my breakfast order at a Sheraton while travelling on business.

EMC appears to have a Big Hole in its Big Data Backup

Note that General Mills claimed 100% of the daily value of 11 vitamins and minerals, but with an asterisk. The claim is true only if I consume a 53g serving, but the box has only 33g!

Although I may have felt a bit taken aback as a consumer, I enjoyed giving my General Mills friends a bit of a hard time and I moved on. This is a small transaction.

What if you were responsible for a transaction worth tens of thousands of dollars and were pitched a glass half-full story like this? It does happen. That General Mills cereal box is what came to my mind when I saw this blog from EMC on protecting Big Data (Teradata) workloads using EMC ‘Big Data backup solution’.

General Mills had the courtesy to put in the fine print that part of the vitamins and minerals is missing from its box. EMC’s blog didn’t really call out what was missing from its ‘box’, aka the Data Domain device, to protect Teradata workloads using Teradata Data Stream Architecture. In fact it is missing the real brain of the solution: NetBackup!

First a little bit of history and some naked truth. Teradata had been working with NetBackup for over a decade to provide data protection for its workloads. In fact, Teradata sells the NetBackup Agent for Teradata for its customers. This agent pushes the data stream to NetBackup media servers. This is where the real workload aware intelligence (the real brain for this Big Data backup) is built. Once NetBackup media server receives the data stream it can store it on any supported storage: NetBackup Deduplication Pool, NetBackup Advanced Disk Pool, NetBackup OpenStorage Pool or even on a tape storage unit! When it comes to NetBackup OpenStorage Pool, it does not matter who the OpenStorage partner is; it can be EMC Data Domain, Quantum DXi,… The naked truth is that the backend devices are dumb storage devices from the view of NetBackup Agent for Teradata (the Teradata BAR component depicted in the blog).

EMC’s blog appears to have been designed to mislead the reader. It tends to imply that there is some sort of special sauce built natively into Data Domain (or Data Domain Boost) for Teradata BAR stream. The blog is trying to attach EMC to Big Data type workloads through marketing. May I say that the hole is quite big in EMC’s Big Data backup story!

I am speculating that EMC had been telling this story for a while in private engagements with clients. Note that the blog is simply displaying some of EMC’s slides that are marked ‘confidential’. The author forgot to remove that marking before publishing. In closed meetings with joint customers of Teradata and NetBackup, a slide like this would create the illusion that Data Domain has something special for Teradata backup. Now the truth just leaked!

NetBackup Accelerator vs. Simpana DASH Full

I want to start this blog with a note.

I mean no disrespect to CommVault as a company or its engineers innovating its products. Being an engineer myself by trade, I do understand that innovations are triggered by market demands and there is always room for improvements in any product. This blog is entirely my own opinions.

As most of you guys reading this blog know, I also write for official Symantec blogs. I recently got an opportunity to take readers of Symantec Connect on a deep dive into one of the major features in NetBackup 7.6 for VMware vSphere and vCloud environments. It is primarily targeted at users of NetBackup who know its nuts and bolts. A couple of employees from CommVault read the blog. It is natural in the competitive intelligence world to look for weak spots or things that can be selectively pointed out to show parity. It is part of their job and I respect it. However, it appeared that they wanted to claim parity for Simpana with NetBackup Accelerator for VMware based on two statements (tweets, to be precise!). When I asked them to elaborate, the discussion went down a rat hole with statements that were out of context and downright unprofessional. Hence here I go with an attempt to compare Simpana 10 with NetBackup 7.6 on the very topic discussed in the official blog.

Claims made to assert parity with NetBackup Accelerator for VMware

  1. (Not explicitly stated) Simpana supports CBT
  2. Simpana had ‘block detection’ for over a year
  3. Simpana does synthetics

The attempt here is to check all the boxes to claim parity while missing the big picture! At times they were equating apples to oranges. Hence I am going to attempt to clarify this as much as possible using Simpana language for the benefit of those two employees.

Simpana supports CBT: Of course, every major vendor supports it. It is an innovation from VMware. The willingness to support a feature from vStorage APIs is important to protect VMware virtual machines.

What sets NetBackup 7.6 apart from Simpana 10 in this case is that Simpana’s implementation of CBT is limited to recovering an entire VM or individual files from the VM. If you have enterprise applications (e.g. Microsoft Exchange, Microsoft SQL Server etc.), you must stream data through an agent inside the guest to protect those applications and perform granular recovery. The value of CBT is to minimize the data processing and movement load on production VMs while performing backups. A virtual machine’s operating system binaries and related files are typically static, so CBT won’t add much value there. The real value comes from the daily changes that applications make to disk blocks! That means Simpana’s implementation of vSphere CBT adds ZERO value when it comes to protecting enterprise applications.
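To make the idea of changed block tracking concrete, here is a minimal toy sketch in Python. It is not the vStorage API and not either vendor's code; `ToyDisk`, `query_changed_blocks` and `backup` are hypothetical names made up for this illustration. It only shows why a backup that asks the disk "what changed?" reads far less data once an application is the main source of change.

```python
BLOCK_SIZE = 4  # bytes per block; tiny on purpose for the illustration


class ToyDisk:
    """Hypothetical virtual disk that remembers which blocks changed."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks
        # Before the first backup, every block counts as changed.
        self.changed = set(range(num_blocks))

    def write(self, index, data):
        # Pad or trim the data to exactly one block and mark it changed.
        self.blocks[index] = data.ljust(BLOCK_SIZE, b"\0")[:BLOCK_SIZE]
        self.changed.add(index)

    def query_changed_blocks(self):
        """Return the changed block list and reset it (stand-in for a CBT query)."""
        changed, self.changed = sorted(self.changed), set()
        return changed


def backup(disk):
    """Read only the blocks reported as changed; return {block index: data}."""
    return {i: disk.blocks[i] for i in disk.query_changed_blocks()}


disk = ToyDisk(num_blocks=8)
full = backup(disk)          # first backup has to read all 8 blocks
disk.write(2, b"mail")       # application activity, e.g. a mailbox database write
disk.write(5, b"logs")
incremental = backup(disk)   # second backup reads only blocks 2 and 5
print(len(full), sorted(incremental))
```

In this toy model the static operating system blocks are never read again after the first backup; only the blocks the application touched are.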

Simpana had block detection for over a year, Simpana does synthetics: The employee is trying to add a check box for Simpana next to NetBackup’s capability to make use of Symantec V-Ray to detect deleted blocks. Nice try!

First and foremost, the block optimization technique described in my blog has been present in NetBackup since 2007, starting with version 6.5.1, when Symantec announced support for VMware Virtual Infrastructure 3. Congratulations on trying to claim that Simpana had this capability after half a decade! But wait… We are talking about apples and oranges here.

This technique has been available for both full and incremental backup schedules. It works no matter where backups are going: disk, deduplicated disk, tape or cloud. NetBackup’s block optimization happens closer to the data source. It detects deleted blocks at the backup host, so the deleted blocks never appear in SAN or LAN traffic to the backup storage. That is optimization for processing power, interconnect bandwidth and storage!
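A rough way to picture deleted block detection at the source is as a filter applied before anything goes on the wire. The sketch below is a toy model only, not how V-Ray is implemented; `blocks_to_send` and the allocation set are hypothetical, and I am assuming the backup host can learn which blocks the guest file system still uses.

```python
def blocks_to_send(changed_blocks, allocated_blocks):
    """Keep only the changed blocks that the guest file system still uses."""
    return sorted(set(changed_blocks) & set(allocated_blocks))


changed = [3, 4, 7, 9]      # blocks reported as changed since the last backup
allocated = {0, 1, 3, 9}    # blocks still owned by files; 4 and 7 held deleted files

to_send = blocks_to_send(changed, allocated)
skipped = sorted(set(changed) - set(to_send))
print(to_send, skipped)
```

Blocks 4 and 7 changed, but they belong to deleted files, so in this model they are dropped at the backup host and never cost bandwidth or backup storage, whatever the destination is.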

The CommVault employee was in a hurry to equate this to something Simpana caught up on recently. This is what I believe he is referring to (I am asking him to tweet back if there is anything else). It is quoted from the Simpana 10 online documentation.

DASH Full is a read optimized Synthetic Full operation which does not require traditional full backups to be performed. Once the first full backup is completed, changed blocks are protected during incremental or differential backups. A DASH Full will run in place of traditional full or synthetic full. This operation does not require movement of data. It will simply update indexing information and the deduplication database signifying that a full backup has been completed. This will significantly reduce the time it takes to perform full backups.

There are so many things I want to say about this, but I am trying to be concise here with bullet points.

  • What Simpana has here is an equivalent of NetBackup OpenStorage Optimized Synthetics, which was introduced in NetBackup 6.5.4 (in 2009). While NetBackup still supports this capability, Symantec has taken it to the next level with NetBackup Accelerator. For the record, NetBackup Accelerator is also backed by Optimized Synthetics, and hence the so-called ‘block detection’ has been in NetBackup since 2009.
  • The optimization I was talking about was the capability to detect deleted blocks from the CBT data stream, while CommVault is touting data movement within backup storage!
  • DASH Full requires incremental backups and separate schedules for synthetic backups. NetBackup Accelerator eliminates this operational inefficiency by synthesizing a full image inline using the resources needed for an incremental backup.
  • If you are curious about how NetBackup Accelerator in general is different from Optimized Synthetics (or DASH Full), this blog would help.
  • Last but not least, did I say that NetBackup Accelerator for VMware works with enterprise applications as well? Thus both CBT and deleted block detection (both relevant to the applications that do the real work inside a VM) add real value for NetBackup Accelerator.
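For readers who like to see the difference rather than read about it, here is a toy Python model of the two workflows contrasted in the bullets above. It is a sketch under my own assumptions, not either product's implementation: a backup image is reduced to an index that maps block numbers to signatures in a deduplicated store, and the function names are hypothetical.

```python
def synthetic_full(previous_full, incrementals):
    """Separate, later job: merge indexes by reference; no block data moves."""
    new_full = dict(previous_full)
    for inc in incrementals:        # apply incrementals oldest to newest
        new_full.update(inc)
    return new_full


def inline_full(previous_full, changed_blocks):
    """Single job: send only the changed blocks, yet emit a complete full index."""
    new_full = dict(previous_full)
    new_full.update(changed_blocks)
    return new_full


full_0 = {0: "sigA", 1: "sigB", 2: "sigC"}   # index of the last full backup
changes = {1: "sigD"}                        # today's changed block

# Workflow 1: run an incremental job, then a separately scheduled synthetic full.
incremental_image = dict(changes)
full_a = synthetic_full(full_0, [incremental_image])

# Workflow 2: one job produces the full image while moving incremental-sized data.
full_b = inline_full(full_0, changes)

print(full_a == full_b, full_b)
```

Both workflows end with the same full image in this model. The difference is operational: the first needs two scheduled jobs to get there, while the second gets there in one.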