It Is The Data That Matters: Children, Pets and Cattle

By: In: Data Management On: Apr 03, 2013

I recently ran across a presentation titled “Data Centre Evolution,” which CERN (European Organization for Nuclear Research) delivered at the Swiss Distributed Computing Day in November 2012. The slides discussed CERN’s IT infrastructure and its data center challenges. While CERN is unique, the difficulties it faces are not. I was particularly intrigued by its characterization of today’s computing architectures.

CERN asks the question, “Do you use a pet or cattle computing service model?” and provides insights into the two alternatives. This slide (#17 from the deck linked above) summarizes their position:
 

 


CERN makes a compelling case about how IT should think about computing. The organization is implementing a combination of virtualization technology and OpenStack to deliver a private cloud, or “cattle,” solution. However, as the data center moves to the “disposable” computing model, what does it mean for our data?

Data is the lifeblood of the data center. We purchase hundreds of thousands of dollars of computing and storage hardware not because we admire the pretty lights and fancy GUIs, but because we care about our data. The challenge we face is not just data availability but also accessibility and speed, so it is no surprise that many companies exist solely to accelerate data access, often using SSD (solid state disk) or similar technologies. In my view, the whole premise of the “cattle” model comes down to enhanced data access. Once you implement a dynamic computing architecture that is independent of the underlying hardware, data availability, accessibility and performance should improve, which in turn enhances the applications and service levels that IT delivers.

So, bringing the conversation back to pets and cattle, these two viewpoints illustrate contrasting computing philosophies. We love our pets. We give them cute names like “Lucky” or “Puffy” (actual names of my wife’s cats growing up, but I digress) and nurse them back to health when they are ill. This equates to the application servers that are vital to the business and must be running at all times. In contrast, we are not attached to our “cattle” servers. They are commodities that are readily replaced with new ones when they fail. This model is commonly seen in cloud environments, which often leverage large numbers of low-cost servers. But what about our data?

In my view, corporate data is analogous to children. Simply put, I love my children; they are irreplaceable and there is nothing I would not do to take care of them. If something really bad happened to them, it would be devastating. For a corporation, data is similarly irreplaceable, and its loss would be devastating. Think about it: what would happen to your favorite online retailer if it lost all its data? It would likely go out of business. We do everything we can to protect our children from potential issues and should do the same with our data. The concepts of pets and cattle represent different ways to access and manage our most precious asset: data.

There is good news for the data center. There are established practices for protecting our data with traditional backup, recovery and DR solutions, which insulate us from outages. While no process is perfect, these strategies go a long way toward ensuring that corporate information is available even in cases of extreme outage or disaster. As a data center manager, you must ensure that you follow these practices consistently and reliably, and Iron Mountain can help.

In conclusion, when looking at your computing infrastructure, you must consider whether to follow the pet or cattle philosophy of computing, or perhaps a combination. Regardless of the choice, remember that just like your children, your data is everything.


4 Comments

  1. Patrick Osborne
    April 3, 2013 at 2:28 pm

    Good blog Jay! It will be very interesting when we move from cattle to some other autonomous entity that manages itself, but is too complex/numerous to label. Like bees, chickens or farm-raised salmon…


  2. Jay Livens
    April 3, 2013 at 4:11 pm

    Patrick,

    Thank you for the comment. There is no doubt that we will see a continued proliferation of servers. Things like Hadoop can drive increasing usage, as can technologies like HP’s Moonshot project, which will massively grow server density. You bring up a great point that these new technologies will likely drive a further paradigm shift in server management.

    JL


  3. Tim Bell
    April 9, 2013 at 1:40 pm

    Data can follow similar patterns to cloud servers. My pension fund data must be preserved no matter what. The pension fund should also be backed up with clear history and point in time recovery, ideally off-site. My Squid cache can disappear without causing any concern.

    Even with the modern world of object store replications, our precious data needs to be protected against software problems, h/w corruption or human error.

    Thus, at CERN, we have SQL databases as reliable stores with recovery logs, AFS file servers with daily backups and tape to look after the physics data.


    • Jay Livens
      April 10, 2013 at 12:11 am

      Tim,

      Thank you for the comment. You make a great point and I concur that there are additional nuances to the value of information. Some data, like your Squid cache, has limited value, while other data, like your pension fund details, is indeed critical. I see customers using different approaches to protect the different data types, including tape-based solutions, snapshots, disk-based backup and replication.

      It sounds like SQL is critical for your environment and it is good to hear that you have a robust protection strategy. In general, offsiting is also a critical component of protection and I imagine that you are sending your SQL data to a remote location as well.



About the author

Jay Livens

Jay Livens has over 10 years of experience working in the data protection industry and is responsible for product and solutions marketing for Iron Mountain’s Data Backup and Recovery products and services. Prior to joining Iron Mountain, Jay was responsible for HP’s disk-based backup portfolio, enterprise tape and StoreOnce technology in the Americas. Jay previously worked at SEPATON, a provider of high-performance virtual tape libraries and deduplication technology. Mr. Livens also holds an MBA from MIT’s Sloan School of Management.