Can we have Backup-Less Systems?

By: T. M. Ravi In: Data Management On: Mar 03, 2011

In his blog last week, David Chapa of Enterprise Strategy Group (ESG) posed a question: “Is DR the new Backup?” This understandably got a lot of backup people excited and generated a lot of thoughtful discussion. David upped the ante this week with “Stop Backing up Data.” The main point he makes is about the need to focus on business protection and only backup data that is critical for protecting the business. However, the provocative title of his later blog had me thinking about a point of view in the industry that claims that backup as a separate process may not be necessary at all.

Like David, I too spent time at Cheyenne Software, and subsequently earned my livelihood from backup at CA, Mimosa and now Iron Mountain. The very mention of this topic is going to give my friends an aneurysm and get them talking about naiveté, the invincibility of youth and so on.

I don’t know (didn’t get invited to Japan!) the context of the #HDSday notion tweeted by Andrew Reichman (@reichmanIT) of Forrester that “backup has to die because it’s such a high operational cost.” I think many will agree that Cloud backup can substantially reduce both the operational and infrastructure costs associated with on-premise backup. But let me take the point at face value and explore whether systems can be developed that eliminate backup.

Eliminating Backup for Exchange 2010

The foremost example of a system that is designed not to require backups is Microsoft Exchange 2010. For a business-critical system, Exchange 2010 interestingly went against conventional wisdom by advocating two bold ideas:

  1. Use of direct-attached Serial Attached SCSI (SAS) or even SATA storage instead of requiring expensive SANs and disk arrays
  2. Use of native mechanisms to protect Exchange, eliminating the need for third-party backup (or even Microsoft DPM) and the accompanying backup hardware and operational hassles

In hindsight, designing highly scalable storage systems using commodity DAS was similar to what Google, Facebook and others were doing and hence not as controversial, but we will leave that discussion for another time.

As a proof point, Microsoft IT has rolled out Exchange 2010 to support 515 office locations in 102 countries with more than 180,000 users. Their system eliminates backups completely (“Exchange Server 2010 Design and Architecture at Microsoft”). The key capabilities that support high availability in Exchange 2010 are:

  • Database Availability Groups (DAGs) with replication of databases (at least 3 copies across multiple locations) using log shipping (shameless plug – Iron Mountain NearPoint also uses log shipping for capture of data for backup and archiving)
  • Lagged Database Copy (remote copy that lags behind by some specified period before a log is applied to that database)
  • Single Item Recovery
  • Deleted Item Retention Policies

While a majority of Exchange 2010 customers are backing up their systems using third-party solutions, there are some customers who seem to be comfortable adopting this backup-less approach. The feature that enables these customers to eliminate separate backup of their Exchange 2010 is the lagged database copy, which prevents corruption from being propagated to a copy if it is detected within the lag period. That, plus a significant investment in an operational and support team.
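To make the lagged-copy idea concrete, here is a minimal Python sketch of the general mechanism: log records are shipped continuously but replayed only after a lag window, so an operator who spots corruption in time can halt replay before it reaches the copy. This is an illustration only, not how Exchange implements DAGs; the class and method names are invented.

```python
import time
from collections import deque

class LaggedReplica:
    """Toy model of a lagged database copy: shipped log records are
    queued immediately but replayed only after a fixed lag window."""

    def __init__(self, lag_seconds):
        self.lag = lag_seconds
        self.pending = deque()  # (arrival_time, log_record) pairs
        self.applied = []       # records replayed into this copy
        self.halted = False

    def ship_log(self, record):
        # Log shipping: the primary sends every record as it is written.
        self.pending.append((time.time(), record))

    def replay_due_logs(self):
        # Called periodically; applies only records older than the lag.
        now = time.time()
        while (self.pending and not self.halted
               and now - self.pending[0][0] >= self.lag):
            self.applied.append(self.pending.popleft()[1])

    def halt_replay(self):
        # Invoked when corruption is detected on the primary within the
        # lag window; the queued (corrupt) logs are never applied here.
        self.halted = True

# With a one-day lag, corruption noticed within 24 hours can be
# stopped before it ever reaches this copy.
replica = LaggedReplica(lag_seconds=24 * 3600)
replica.ship_log("write to mailbox database")
replica.halt_replay()  # corruption found; the bad log is never applied
```

Note that the recovery step still depends on a human (or monitoring system) noticing the corruption inside the lag window, which is exactly why the operational investment matters.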

Are there applications that are better suited to being Backup-less?

What if I don’t need to get back to an earlier point in time? Will replication suffice?

The challenge with replication is that it replicates errors and corruption just as efficiently as it replicates data. Organizations whose applications have built-in asynchronous or time-delayed mechanisms for making copies across sites may be tempted to forgo backup, but this is fraught with risk, and a lot will depend on the quality of the processes and people running operations.
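A small, hypothetical Python illustration of the point: synchronous replication delivers a corrupted write to every replica just as faithfully as a good one, while an independent point-in-time backup still holds the pre-corruption state. The names here are invented for the example.

```python
import copy

primary = {"mailbox.db": "good data"}
replica = copy.deepcopy(primary)   # kept in sync with the primary
backup = copy.deepcopy(primary)    # independent point-in-time copy

def replicate(write):
    # Synchronous replication: every write, good or bad, lands on
    # both the primary and the replica.
    primary.update(write)
    replica.update(write)

replicate({"mailbox.db": "CORRUPTED"})  # e.g. a storage-software bug

assert replica["mailbox.db"] == "CORRUPTED"  # replication didn't help
assert backup["mailbox.db"] == "good data"   # the backup still can
```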

The “belts and suspenders” approach (if the belt breaks, the suspenders will keep the pants on) of a separate backup process for protecting systems is a best practice that has provided confidence in system recoverability for decades. While this approach introduces an additional process that can be operationally expensive, having a separate and independent backup process improves recoverability. Redundant systems successfully address hardware failures, but a software bug can bring down every system if the systems run identical software. Having systems with different software implementations or different models will protect you from such bugs.

A lot of things can go wrong – administrator mistakes, storage software bugs, propagation of corruption through replication, etc. An investment in an independent process for backup or disaster recovery gives you peace of mind for when that Black Swan event happens. 

For applications with Write-Once-Read-Many (WORM) data characteristics and separate processes for disposition, one could imagine a system with replication that is backup-less. The expectation would be that a separate storage system can rigorously enforce the write-once capability. However, a bug in the software enforcing the write-once storage, or in the disposition process, can still make data unrecoverable.
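As a sketch of what write-once enforcement might look like (hypothetical code, not any particular product), note that the guarantee is only as strong as the enforcing software itself:

```python
class WormStore:
    """Minimal sketch of Write-Once-Read-Many (WORM) enforcement."""

    def __init__(self):
        self._records = {}

    def write(self, key, value):
        # Write-once: a key that already exists can never be rewritten.
        if key in self._records:
            raise PermissionError(f"{key!r} is write-once and already set")
        self._records[key] = value

    def read(self, key):
        return self._records[key]

    def dispose(self, key, retention_expired):
        # Disposition is a separate process; a bug that treats retention
        # as expired too early makes the data unrecoverable.
        if retention_expired:
            del self._records[key]
```

A bug in write() or dispose() silently defeats the whole guarantee, which is exactly the risk described above.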

What about Tape?

The recent Gmail failure illustrates how replication can’t protect against storage software that has bugs and runs amok corrupting all the replicas. Disks from the same batch (see “JBOD versus RAID”) can also share defects and fail at the same time. Google ultimately had to go to offline tapes, a different medium altogether (“Gmail back soon for everyone”), to recover user email.

Tape is well suited for long-term retention of data. If your need is to keep data for a long time for governance, compliance or discovery reasons, tape will be a cost-effective part of your recovery system. Tape is extremely durable. You can play Frisbee with tapes – and still hope to recover data from them!

Can backup or disaster recovery be built into a system? I would say yes. Would it require a different process, technology and people? Yes to that one too. What do you think?

As one commenter on David’s blog said, “This decade will be interesting.”


7 Comments

  1. hrushi
    March 3, 2011 at 6:41 pm

    Backup, replication, high availability and disaster recovery – yes, they can be part of one suite, but one cannot replace another. They are each unique in their usage and importance.


    • T. M. Ravi
      March 7, 2011 at 1:45 pm

      Hrushi – Thanks for your comment. Replication, I think, is more of an enabling technology. To your question: can backup, DR, high availability and, I will add, archiving be addressed by the same solution? There is a lot of overlap between these functions, and no one wants four agents capturing data on their system. Many of these functions do add significant cost and operational overhead. That is why people are exploring whether they can come together.

      Ravi


  2. Gerard Nicol
    March 3, 2011 at 6:47 pm

    When many of us were first exposed to backup, we had to back up every hour to removable disk packs just to take checkpoints so we could recover from disk or program failure.

    Back then there was no high availability, just backup and DR.

    Now we have HA, backup and DR, and they are completely different things. Most people haven’t reconciled the difference, and scarily, many of those people are the ones making the decisions.

    I know Iron Mountain are trying to push people down the online DR road, but the reality is that most of the world’s DR is done to tape and will remain on tape for the foreseeable future.

    High availability will move off conventional disk to solid state and backup will remain where it has been for the past 20 years on disk and tape.


    • T. M. Ravi
      March 7, 2011 at 1:24 pm

      Gerard – Thank you for your comments. I do agree that DR goes beyond backup of data and has to deal with issues like application recovery, system recovery, re-enabling end-user access to data, etc.
       
      I do believe that the next wave of new backup technology adoption will be online (and potentially continuous) backup to the cloud. The backup media I do agree will be disk and/or tape.
       
      Definite yes to “High availability will move off conventional disk to solid state and backup will remain where it has been for the past 20 years on disk and tape.”
       
      Ravi


  3. Hemant Joshi
    March 4, 2011 at 2:14 am

    If applications like Exchange have inbuilt backup functionality which can be controlled with a certain set of policies (such as the number of replicas, the location of replicas, the time lag allowed for different sets of replicas, the medium to be used for different sets of replicas, etc.), it will be very good from the Application Administrator’s point of view. The question will be whether these application vendors can provide the rich backup functionality offered by the third-party backup vendors.


    • T. M. Ravi
      March 7, 2011 at 1:31 pm

      Great to hear from you, Hemant. A number of folks argue that a lot of the “rich backup functionality” is really complexity introduced by the current way of doing backup. Also, in a number of scenarios, you may never have to go back in time.

      However, with all the technologies that you describe, if you still need human intervention (such as a decision to stop application of the delayed logs), then it will not be a foolproof process.

      Ravi


  4. Mike Johnson
    April 15, 2011 at 2:58 pm

    Whether you choose to store and replicate your data on local DAS, NAS, LAN/WAN or in a private or public cloud, a true ‘backup-less’ solution must include a removable, long-term archival storage component such as Blu-ray optical disc storage.



About the author

TM Ravi

T. M. Ravi is chief marketing officer for Iron Mountain, responsible for marketing and strategy for the company’s cloud, on-premises and hybrid information management solutions that span data protection, archiving, eDiscovery and compliance. Ravi joined Iron Mountain through the acquisition of Mimosa Systems, the leader in enterprise content archiving, where he was founder, president and chief executive officer. Before Mimosa, Ravi was founder and CEO of Peakstone Corporation, which provided performance management solutions for Fortune 500 companies. Prior to his role at Peakstone, Ravi was vice president of marketing at Computer Associates (CA), where he was responsible for the core line of CA enterprise management products, including CA Unicenter, as well as the areas of application, systems and network management, software distribution, help desk, security, and storage management. Ravi joined CA through the $1.2 billion acquisition of Cheyenne Software, the market leader in storage management and antivirus solutions. At Cheyenne Software, he was the vice president responsible for the company's successful Windows NT business, with products such as ARCserve backup and InocuLAN antivirus. Earlier in his career, Ravi worked in Hewlett-Packard's Information Architecture Group, where he did product planning for client/server and storage solutions. Ravi earned an MS and a PhD from UCLA and a Bachelor of Technology from the Indian Institute of Technology (IIT), Kanpur, India.