Software-defined storage does not have to be a rollercoaster ride

Thanks to VMware’s vision of the software-defined data center (SDDC), software-defined anything is one of the leading buzzwords today. Software-defined storage (SDS) is no exception. SDS feels like a big shift in the storage world, but the good news is that the transition is much smoother than it sounds. Let us take a closer look.

What is SDS? There are many vendor-specific definitions and interpretations, and industry analysts have their own versions too. Hence let us focus on what matters most: the characteristics and benefits generally expected from SDS. These are the four pillars.

  1. Abstraction: The data plane and control plane are separated. In other words, storage management is decoupled from the storage itself. Customer benefit: flexibility to solve storage and management needs independently.
  2. Backend heterogeneity: Storage can be served by any kind of storage from any vendor, including commodity storage. Customer benefits: freedom of choice in storage platforms, avoiding lock-in.
  3. Frontend heterogeneity: Storage can be served to any kind of consumer (operating systems, hypervisors, file services, etc.). Customer benefit: freedom of choice in computing platforms, avoiding lock-in.
  4. Broker for storage services: SDS brokers storage services, no matter where data is placed and how it is stored, through software that translates those capabilities into storage services meeting a defined policy or SLA. Customer benefits: simplified management, storage virtualization, and value-added data services through vendor or customer innovations.

Three of the four pillars are needed to qualify as a software-defined storage solution. Pillars 1 and 4 are must-haves; once you have those two, you need either 2 or 3. The reality is that the SDS movement started a long time ago. Let us use some examples to understand SDS implementations.
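
To make the pillars concrete, here is a minimal sketch in Python of how the pieces relate. Every class and name here is illustrative, not any vendor’s actual API.

```python
# A minimal sketch of the four SDS pillars: an abstract backend
# (pillar 2), an abstract consumer (pillar 3), and a control plane
# that brokers storage services per policy (pillars 1 and 4).

from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Pillar 2: any backend, from branded arrays to commodity disk."""

    @abstractmethod
    def provision(self, size_gb: int) -> str:
        """Return an identifier for a newly provisioned volume."""


class CommodityDiskBackend(StorageBackend):
    def provision(self, size_gb: int) -> str:
        return f"commodity-vol-{size_gb}gb"


class VendorArrayBackend(StorageBackend):
    def provision(self, size_gb: int) -> str:
        return f"array-lun-{size_gb}gb"


class ControlPlane:
    """Pillars 1 and 4: management decoupled from the storage itself,
    brokering services against a policy/SLA rather than a device."""

    def __init__(self) -> None:
        self.backends: dict[str, StorageBackend] = {}

    def register(self, policy: str, backend: StorageBackend) -> None:
        self.backends[policy] = backend

    def provision_for(self, consumer: str, size_gb: int, policy: str) -> str:
        # Pillar 3: the consumer can be an OS, a hypervisor, a file
        # service, etc.; the broker only cares about the policy.
        vol = self.backends[policy].provision(size_gb)
        print(f"{consumer}: {vol} provisioned under policy '{policy}'")
        return vol


if __name__ == "__main__":
    sds = ControlPlane()
    sds.register("gold", VendorArrayBackend())
    sds.register("bronze", CommodityDiskBackend())
    sds.provision_for("vSphere host", 100, "gold")
    sds.provision_for("Linux file server", 500, "bronze")
```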

Oracle Automatic Storage Management (ASM): Although Oracle seldom markets ASM as an SDS solution, it happens to be a great example of SDS. It is purpose-built for Oracle databases, and storage is entirely managed by the application owner (the Oracle DBA). Pillar 3 is the questionable one here: the solution runs on multiple OS platforms, but it serves just one type of workload, so it does not truly deliver frontend heterogeneity.

Veritas InfoScale: Formerly known as Veritas Storage Foundation, Veritas InfoScale is perhaps the most successful heterogeneous, general-purpose SDS solution. While it is still widely in use, it is a host-based SDS solution (the pillars are built on top of the operating system) and hence not a good fit for the virtualized world.

VMware Virtual Volumes (VVOLs): VVOLs are purpose-built for VMware vSphere, hence they lack pillar 3. They shine on the other three pillars, and a virtual infrastructure admin can manage everything from a single console.

Now that we have covered the characteristics of SDS, let us look at the bigger picture as an IT architect. The great thing about SDS solutions is the interoperability to build the right solution for workloads and solve constantly changing storage needs. You can be quite creative (and, of course, even go crazy!) with the things you can build from SDS Lego blocks.

You can deploy Oracle ASM on top of Veritas InfoScale so that DBAs benefit from both: ASM enables Oracle DBAs to manage their own storage, while InfoScale brings centralized management for storage administrators.

How about the virtual server environment where Veritas InfoScale falls short? Bring storage LUNs directly into vSphere hosts for a VMFS experience with the benefits provided by VMware. Are virtual machine infrastructure administrators getting ready to manage storage on their own? Give them the array plugin for the VMware vSphere Web Client, or prepare them for the storage vendor’s VASA provider to get ready for VVOLs!

The main takeaway is simply this: SDS is a blessing for IT architects to solve storage puzzles elegantly. It has been here for a long time, and it is constantly evolving with market-inspired innovations. Transitioning to SDS is relatively smooth.

Disclaimer: The opinions here are my own. They do not reflect those of my current or previous employers.

VMware EVO: The KFC of SDDC

VMware EVO brings to software-defined data centers the same business model that Kentucky Fried Chicken brought to restaurants decades ago. VMware is hungry to grow and is expanding its business into new territories. Colonel Sanders’s revolutionary vision of selling his chicken recipe and brand through a franchise model is now coming to IT infrastructure as ready-to-eat value meals.

Most of the press reports and analyst blogs are focused on VMware’s arrival into the converged infrastructure market. Of course, vendors like Nutanix and SimpliVity will certainly lose sleep as the 800-pound gorilla sets its eyes on their market. However, VMware’s strategy goes much deeper than taking the converged infrastructure market from upstarts; it is a bold attempt to disrupt the business model of selling IT infrastructure stacks while keeping public cloud providers away from enterprise IT shops.

Bargaining power of the supplier: Have you noticed VMware’s commanding power in the EVO specifications? Partners like Dell and EMC are simply franchisees of VMware’s infrastructure recipe and brand. It is no secret that traditional servers and storage are on the brink of disruption, because buyers won’t pay a premium for brand names much longer. It is time for them to let go of individuality and become the delivery model for a prescriptive architecture (the franchise model) from a stronger supplier in the value chain.

Software is now the king, no more OEM: In the old world, where hardware vendors owned brand power and distribution chains, software vendors had to make OEM deals to get their solutions to market in those hardware vehicles. Now the power is shifting to software. The software vendor prescribes (a softened term that actually stands for ‘dictates’) how infrastructure stacks should be built.

Short-term strategy, milk the converged infrastructure market: This is the most obvious hint VMware has given, and reporters, bloggers, and analysts have picked up on it. As more and more CIOs look to reduce capital and operational costs, the demand for converged systems is growing rapidly. Even the primitive assembled-to-order solutions from VCE and NetApp-Cisco are milking the current demand for simplified IT infrastructure stacks, and Nutanix leads the pack in the relatively newer and better hyper-convergence wave. VMware’s entry into this market validates that convergence is a key trend in modern IT.

Long-term strategy, own data center infrastructure end-to-end while competing with public clouds: Two of the three key pillars of VMware’s strategy are enabling software-defined data centers and delivering hybrid clouds. Although SDDC and hybrid cloud may look like two separate missions, the combination is what is needed to keep Amazon and other public cloud providers from taking workloads away from IT shops. The core of VMware’s business is selling infrastructure solutions for on-prem data centers. Although VMware positions itself as an enabler of service providers, it understands that the bargaining power of customers will continue to stay low if organizations stick to on-prem solutions. This is where the SDDC strategy fits.

By commoditizing infrastructure components (compute, storage, and networking) and shifting the differentiation to infrastructure management and service delivery, VMware wants to become the commander in control of SDDCs (just as Intel processors dictated the direction of PCs over the last two decades). EVO happens to be the SDDC recipe it wants to franchise to partners, so that customers can taste the same SDDC no matter who their current preferred hardware vendor is. Thus EVO is the KFC of SDDC. It is not there just as a Nutanix killer; VMware also wants to take share from Cisco (Cisco UCS is almost #1 in the server market, and Cisco is #1 in networking infrastructure), EMC storage (let us keep the money in the family; the old man’s hardware identity is counting its days), and other traditional infrastructure players. At the same time, VMware wants to transform vCloud Air (the rebranded vCloud Hybrid Service) into the app store for EVO-based SDDCs to host data services in the cloud. It is a clever plan to keep selling to enterprises while hiding them away from the likes of Amazon. Well played, VMware!

So what will be the competitive response from Amazon and other public cloud providers? Amazon has the resources to build a ready-to-eat private ‘Fire Cloud’ for enterprises that can act as the gateway to AWS. All this time, Amazon has focused mainly on on-prem storage solutions that extend to AWS, but we can certainly expect the king of public clouds to do something more. It is not a question of ‘if’; it is a question of ‘when’.

EMC’s Hardware Defined Control Center vs. VMware’s Software Defined Data Center

EMC trying to put the clock back on the software-defined storage movement

EMC’s storage division appears to be in Old Yeller mode. It knows that customers will eventually stop paying a premium for branded storage, and the bullets to put branded storage out of its misery are coming from the software-defined storage movement led by its own stepchild, VMware. But the old man is still clever, pretending to hang out with the cool kids to stay relevant while trying to survive as long as there are CIOs willing to pay a premium for storage with a label.

Software-defined storage is all about building storage and data services on top of commodity hardware; no more vendor-locked storage platforms on proprietary hardware. This movement offers high performance at lower cost by bringing storage closer to compute, and capacity and performance become two independent vectors.

TwinStrata follows that simplicity model and has helped customers extend the life of existing investments with true software solutions. Its data service layer offers storage tiering where the last tier can be a public cloud. EMC wants the market to believe that its acquisition of TwinStrata is an attempt to embrace the software-defined storage movement, but the current execution plan is a little backward. EMC’s plan is a bolted-on integration of the TwinStrata IP on top of the legacy VMAX storage platform. That means EMC wants to keep the ‘software-defined’ IP close to its proprietary array itself. The goal, of course, is to prolong the life of VMAX in the software-defined world. While this defeats the rationale behind the software-defined storage movement, it may be the last straw for pulling the clock back a little.
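
For readers unfamiliar with this style of data service, here is a minimal sketch of policy-based tiering where the coldest tier is a public cloud. The tier names, thresholds, and demotion logic are illustrative assumptions, not TwinStrata’s or EMC’s actual implementation.

```python
# A toy model of tiering: extents start on the hottest tier, are
# promoted on access, and are demoted one tier at a time once idle.
# The last tier stands in for a public cloud target.

import time

TIERS = ["flash", "disk", "cloud"]      # hottest to coldest
DEMOTE_AFTER_SECONDS = [60, 3600]       # flash->disk, disk->cloud


class Extent:
    def __init__(self, extent_id: str) -> None:
        self.extent_id = extent_id
        self.tier = 0                   # start on the hottest tier
        self.last_access = time.time()

    def touch(self) -> None:
        self.last_access = time.time()
        self.tier = 0                   # promote on access


def demote_cold_extents(extents: list[Extent]) -> None:
    """Move extents down one tier once they have been idle long enough."""
    now = time.time()
    for e in extents:
        if e.tier < len(TIERS) - 1:
            idle = now - e.last_access
            if idle > DEMOTE_AFTER_SECONDS[e.tier]:
                e.tier += 1
                print(f"{e.extent_id}: demoted to {TIERS[e.tier]}")
```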

Hopefully there is another project where EMC will seriously consider building a true software-defined storage solution from the acquired IP without the deadweight of legacy platforms. Perhaps transform ViPR from vaporware into something that really rides the wave of the software-defined movement?

Is the perfect storm headed toward purpose-built storage systems?

Is the era of storage systems (arrays) facing disruption? Do the sellers of expensive monolithic chassis need to find new ways to make money? Do the investors betting on newer storage array startups need to cash in now? Although it may feel unlikely in the near term, the perfect storm may not be that far away.

Let us think about how storage arrays came to solve problems for IT. There were two distinct transformations in this industry:

More information is available in Symantec Connect’s Storage and Availability blog.

The Big Hole in EMC Big Data backup story

Communicating the value of products and services is one of the crucial roles of the marketing team in any organization. It is not uncommon (pardon the double negative) for organizations to show the best side of their story while deliberately hiding the weaker aspects in fine print. The left side of the picture below is a snapshot of the breakfast cereal (General Mills’ Total) that came with my breakfast order at a Sheraton while traveling on business.

EMC appears to have a Big Hole in its Big Data backup

Note that General Mills claims 100% of the daily value of 11 vitamins and minerals, but with an asterisk. The claim is true only if I consume a 53g serving, yet the box holds only 33g! In other words, the box delivers roughly 33/53, or about 62%, of those advertised daily values.

Although I may have felt a bit taken aback as a consumer, I enjoyed giving my General Mills friends a bit of a hard time, and I moved on. This is a small transaction.

What if you were responsible for a transaction worth tens of thousands of dollars and were pitched a glass-half-full story like this? It does happen. That General Mills cereal box is what came to mind when I saw this blog from EMC on protecting Big Data (Teradata) workloads using EMC’s ‘Big Data backup solution’.

General Mills at least had the courtesy to state in fine print that part of the vitamins and minerals are missing from its box. EMC’s blog didn’t really call out what is missing from its ‘box’, aka the Data Domain device, for protecting Teradata workloads using the Teradata Data Stream Architecture. In fact, it is missing the real brain of the solution: NetBackup!

First, a little bit of history and some naked truth. Teradata has been working with NetBackup for over a decade to provide data protection for its workloads; in fact, Teradata sells the NetBackup Agent for Teradata to its customers. This agent pushes the data stream to NetBackup media servers, and this is where the real workload-aware intelligence (the real brain of this Big Data backup) is built. Once a NetBackup media server receives the data stream, it can store it on any supported storage: a NetBackup Deduplication Pool, NetBackup Advanced Disk Pool, NetBackup OpenStorage Pool, or even a tape storage unit! When it comes to the NetBackup OpenStorage Pool, it does not matter who the OpenStorage partner is; it can be EMC Data Domain, Quantum DXi,… The naked truth is that the backend devices are dumb storage devices from the point of view of the NetBackup Agent for Teradata (the Teradata BAR component depicted in the blog).
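
To illustrate that division of labor, here is a minimal sketch, assuming nothing about the real NetBackup code: a workload-aware agent streams data to a media server, and the media server writes to whichever storage pool is configured. The pool types are interchangeable; the intelligence lives upstream, not in the backend device.

```python
# The backend is a dumb, pluggable sink; the agent/media server owns
# the workload awareness. All classes here are illustrative stand-ins.

from abc import ABC, abstractmethod


class StoragePool(ABC):
    """Any supported backend: dedupe pool, disk pool, OST device, tape."""

    @abstractmethod
    def store(self, chunk: bytes) -> None: ...


class OpenStoragePool(StoragePool):
    """Stands in for any OST partner device (Data Domain, DXi, ...)."""

    def store(self, chunk: bytes) -> None:
        print(f"OST device stored {len(chunk)} bytes (no workload smarts)")


class MediaServer:
    def __init__(self, pool: StoragePool) -> None:
        self.pool = pool

    def receive_stream(self, stream) -> None:
        for chunk in stream:            # the workload-aware agent feeds this
            self.pool.store(chunk)


def teradata_agent_stream():
    """Stands in for the NetBackup Agent for Teradata pushing data."""
    yield b"table-partition-1"
    yield b"table-partition-2"


if __name__ == "__main__":
    MediaServer(OpenStoragePool()).receive_stream(teradata_agent_stream())
```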

EMC’s blog appears to have been designed to mislead the reader. It implies that there is some special sauce built natively into Data Domain (or Data Domain Boost) for the Teradata BAR stream. The blog is trying to attach EMC to Big Data workloads through marketing. May I say that the hole is quite big in EMC’s Big Data backup story!

I speculate that EMC has been telling this story for a while in private engagements with clients. Note that the blog simply displays some EMC slides that are marked ‘confidential’; the author forgot to remove the marking before publishing. In closed meetings with joint customers of Teradata and NetBackup, a slide like this will create the illusion that Data Domain has something special for Teradata backup. Now the truth has leaked!

NetBackup Accelerator vs. Simpana DASH Full

I want to start this blog with a note.

I mean no disrespect to CommVault as a company or to the engineers innovating its products. Being an engineer by trade myself, I understand that innovations are triggered by market demands and that there is always room for improvement in any product. This blog is entirely my own opinion.

As most of you reading this blog know, I also write for official Symantec blogs. I recently got an opportunity to take readers of Symantec Connect on a deep dive into one of the major features in NetBackup 7.6 for VMware vSphere and vCloud environments. It is primarily targeted at users of NetBackup who know its nuts and bolts. A couple of employees from CommVault read the blog. It is natural in the competitive intelligence world to look for weak spots or things that can be selectively pointed out to show parity; it is part of their job, and I respect it. However, it appeared that they wanted to claim parity between Simpana and NetBackup Accelerator for VMware based on two statements (tweets, to be precise!). When asked to elaborate, the discussion went down a rat hole of statements made out of context and downright unprofessional remarks. Hence here is my attempt to compare Simpana 10 with NetBackup 7.6 on the very topic discussed in the official blog.

Claims made to establish parity with NetBackup Accelerator for VMware

  1. (Not explicitly stated) Simpana supports CBT
  2. Simpana had ‘block detection’ for over a year
  3. Simpana does synthetics

The attempt here is to check all the boxes to claim parity, while at times missing the big picture and equating apples to oranges. Hence I am going to clarify this as much as possible, using Simpana language, for the benefit of those two employees.

Simpana supports CBT: Of course; every major vendor supports it. It is an innovation from VMware, and the willingness to support a feature from the vStorage APIs is important for protecting VMware virtual machines.

What sets NetBackup 7.6 apart from Simpana 10 in this case is that Simpana’s implementation of CBT is limited to recovering an entire VM or individual files from the VM. If you have enterprise applications (e.g., Microsoft Exchange, Microsoft SQL Server), you must stream data through an agent inside the guest to protect those applications and perform granular recovery. The value of CBT is to minimize the data processing and movement load on production VMs during backups. A virtual machine’s operating system binaries and related files are typically static, so CBT won’t add much value there; the real value comes from the daily changes to disk blocks made by applications! That means ZERO value from Simpana’s implementation of vSphere CBT for protecting enterprise applications.
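
To show why CBT matters for backup load, here is a minimal sketch. The `ChangeTracker` class below is a toy stand-in for vSphere’s changed block tracking (exposed in the real vSphere API via QueryChangedDiskAreas), not VMware’s or any backup vendor’s code.

```python
# Model a virtual disk as numbered blocks and a tracker that answers
# "which blocks changed since the last backup?" Only those blocks are
# read and moved, which is the entire point of CBT.

BLOCK_SIZE = 4096


class ChangeTracker:
    """Toy stand-in for vSphere CBT: records dirty block numbers."""

    def __init__(self) -> None:
        self.changed: set[int] = set()

    def write(self, block_no: int) -> None:
        self.changed.add(block_no)

    def changed_since_last_backup(self) -> set[int]:
        dirty = self.changed
        self.changed = set()            # reset for the next cycle
        return dirty


def incremental_backup(disk: dict[int, bytes],
                       cbt: ChangeTracker) -> dict[int, bytes]:
    """Read only blocks CBT reports as dirty, not the whole disk."""
    return {n: disk[n] for n in cbt.changed_since_last_backup()}


if __name__ == "__main__":
    disk = {n: bytes(BLOCK_SIZE) for n in range(1000)}  # 1000-block disk
    cbt = ChangeTracker()
    cbt.write(7)        # the application dirtied two blocks today
    cbt.write(42)
    delta = incremental_backup(disk, cbt)
    print(f"moved {len(delta)} of {len(disk)} blocks")  # moved 2 of 1000
```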

Simpana had block detection for over a year, Simpana does synthetics: The employee is trying to add a checkbox for Simpana next to NetBackup’s capability to use Symantec V-Ray to detect deleted blocks. Nice try!

First and foremost, the block optimization technique described in my blog has been present in NetBackup since 2007, starting with version 6.5.1, when Symantec announced support for VMware Virtual Infrastructure 3. Congratulations on claiming that Simpana gained this capability half a decade later! But wait… we are talking about apples and oranges here.

This technique has been available for both full and incremental backup schedules, and it works no matter where backups are going: disk, deduplicated disk, tape, or cloud. NetBackup’s block optimization happens close to the data source; it detects deleted blocks at the backup host so that those blocks never appear in SAN or LAN traffic to the backup storage. That is optimization of processing power, interconnect bandwidth, and storage!
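
Conceptually, the deleted-block optimization looks something like the following sketch. This is my illustration of the idea, not Symantec’s actual V-Ray implementation: CBT reports blocks that changed, but some of those belong to files the guest has deleted, and checking the guest filesystem’s allocation map at the backup host lets those blocks be dropped before they ever hit the wire.

```python
# Filter CBT-reported blocks against the guest filesystem's allocation
# map, so blocks of deleted files are never sent to backup storage.

def filter_deleted_blocks(changed_blocks: set[int],
                          allocated_blocks: set[int]) -> set[int]:
    """Keep only changed blocks still allocated to live files."""
    return changed_blocks & allocated_blocks


if __name__ == "__main__":
    changed = {10, 11, 12, 99, 100}     # reported by CBT
    allocated = {10, 11, 12}            # per the guest filesystem map
    to_send = filter_deleted_blocks(changed, allocated)
    print(f"sending {len(to_send)} of {len(changed)} changed blocks")
```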

The CommVault employee was in a hurry to equate this to something Simpana caught up with recently. This is what I believe he is referring to (I am asking him to tweet back if there is anything else), quoted from the Simpana 10 online documentation:

DASH Full is a read optimized Synthetic Full operation which does not require traditional full backups to be performed. Once the first full backup is completed, changed blocks are protected during incremental or differential backups. A DASH Full will run in place of traditional full or synthetic full. This operation does not require movement of data. It will simply update indexing information and the deduplication database signifying that a full backup has been completed. This will significantly reduce the time it takes to perform full backups.

There are so many things I want to say about this, but I am trying to be concise here with bullet points.

  • What Simpana has here is an equivalent of NetBackup OpenStorage Optimized Synthetics, which was introduced in NetBackup 6.5.4 (in 2009). While NetBackup still supports this capability, Symantec has taken it to the next level with NetBackup Accelerator. For the record, NetBackup Accelerator is also backed by Optimized Synthetics, hence the so-called ‘block detection’ has been in NetBackup since 2009.
  • The optimization I was talking about is the capability to detect deleted blocks in the CBT data stream, while CommVault is touting data movement within the backup storage!
  • A DASH Full requires incremental backups plus separate schedules for synthetic backups. NetBackup Accelerator eliminates this operational inefficiency by synthesizing the full image inline, using only the resources needed for an incremental backup (see the sketch after this list).
  • If you are curious how NetBackup Accelerator in general differs from Optimized Synthetics (or DASH Full), this blog would help.
  • Last but not least, did I mention that NetBackup Accelerator for VMware works with enterprise applications as well? Thus both CBT and deleted-block detection (both relevant to the applications doing the real work inside the VM) add real value for NetBackup Accelerator.
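
Here is the promised sketch of the synthesis idea. The data structures are illustrative, not either vendor’s on-disk format: a post-process synthetic full (DASH-Full-style) runs as a separate step after the incremental, whereas an inline synthesis (Accelerator-style) produces the new full image during the incremental itself by merging the changed blocks with the previous full’s block index.

```python
# Synthesize a new full backup image while the incremental runs:
# unchanged blocks are referenced from the prior full, changed blocks
# come from the incremental stream. No second pass over the data.

def inline_synthetic_full(previous_full: dict[int, str],
                          changed_blocks: dict[int, str]) -> dict[int, str]:
    new_full = dict(previous_full)   # block references, not data movement
    new_full.update(changed_blocks)  # only the delta is actually read
    return new_full


if __name__ == "__main__":
    full_v1 = {0: "f1-blk0", 1: "f1-blk1", 2: "f1-blk2"}
    delta = {1: "inc-blk1"}          # one block changed since full_v1
    full_v2 = inline_synthetic_full(full_v1, delta)
    print(full_v2)  # {0: 'f1-blk0', 1: 'inc-blk1', 2: 'f1-blk2'}
```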

Software Defined Storage: Next Big Thing? Or is it already here?

Walk into a technology trade show with a bottle of tequila and a shot glass, take a shot each time you hear the phrase ‘software defined’, and you will need a cab to get back home. The new buzzword is ‘software defined’, and storage vendors are finding ways to fit it in. It doesn’t matter whether it is one of the established players or an upstart: if it isn’t software defined, it is not cool.

What is SDS? There are many vendor-specific definitions and interpretations, and industry analysts like Gartner and IDC have their own versions too. Hence I will simply describe the attributes and values generally expected from SDS.

Abstraction: The data plane and control plane are separated. In other words, storage management is decoupled from the storage itself.

Backend heterogeneity: Storage can be served by any kind of storage from any vendor, including commodity storage.

Frontend heterogeneity: Storage can be served to any kind of consumer (operating systems, hypervisors, file services, etc.).

SDS is a broker: This was Gartner’s statement. SDS brokers storage services, independent of where data is placed and how it is stored, through software that translates those capabilities into storage services meeting a defined policy or SLA.

Logical volume managers in operating systems have provided attributes 1 and 2 since the 80s. The cross-platform volume manager from Veritas (acquired by Symantec) brought attribute 3 into the mix with the introduction of portable data containers (also known as cross-platform data sharing disks) in 2004. Other notable examples fitting attributes 1, 2, and 3 are IBM’s SAN Volume Controller (SVC) and NetApp’s V-Series.

Now let us look at the final attribute needed to qualify as SDS: the broker role. Surprisingly, Veritas Volume Manager (VxVM) meets that requirement as well. Storage services like file services, deduplication, and storage tiering are provided by VxVM independent of where and how data is stored.

Let me elaborate to prove the point. VxVM abstracts storage from any array or commodity storage; this is old news. What is new in the latest version of VxVM (part of Symantec Storage Foundation 6.1, powered by Veritas) is that it can now provide a single namespace (data service) across multiple nodes without requiring disks to be shared via a storage area network. That completes the final attribute required for SDS.
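
As a thought experiment, here is a minimal sketch of what a single namespace across nodes without a shared SAN means conceptually: each node contributes local disks, and a thin mapping layer presents one logical volume whose blocks resolve to a node and a local block. This illustrates the idea only; it is not how VxVM is implemented.

```python
# One logical volume striped across every node's local disks; any node
# can resolve any logical block without a SAN in the middle.

class NodeDisk:
    """Local storage contributed by one cluster node."""

    def __init__(self, node: str) -> None:
        self.node = node
        self.blocks: dict[int, bytes] = {}


class ClusterVolume:
    """One namespace spanning every node's local disks."""

    def __init__(self, disks: list[NodeDisk]) -> None:
        self.disks = disks

    def _locate(self, logical_block: int) -> tuple[NodeDisk, int]:
        # Round-robin striping across the contributed disks.
        disk = self.disks[logical_block % len(self.disks)]
        return disk, logical_block // len(self.disks)

    def write(self, logical_block: int, data: bytes) -> None:
        disk, local = self._locate(logical_block)
        disk.blocks[local] = data   # would be a network write if remote

    def read(self, logical_block: int) -> bytes:
        disk, local = self._locate(logical_block)
        return disk.blocks[local]


if __name__ == "__main__":
    vol = ClusterVolume([NodeDisk("node-a"), NodeDisk("node-b")])
    vol.write(5, b"hello")
    print(vol.read(5))  # b'hello' regardless of which node stores it
```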

The point of this blog is not to underestimate the significance of SDS, but to conclude that SDS may already be here, depending on your definition and interpretation! Let us think beyond vendor presentations. How is EMC’s ViPR different from what the NetApp V-Series has been offering for years? Should flash be in a hybrid array (e.g., Nimble Storage), all-flash (e.g., Pure Storage), or close to the system (e.g., Nutanix)? Or should you adopt something that gives you the flexibility to choose (Symantec Storage Foundation 6.1, pardon my pitch)? It truly depends on your business requirements, but I would say that ‘flash anywhere’ flexibility might serve you well while the storage industry looks for a winner.