The Petabyte Challenge…

June 16, 2009 at 8:53 pm 1 comment

Managing large amounts of data is a very challenging technical endeavor, but for many cloud storage providers the volume of data they are tasked with protecting and maintaining is so large that it presents unique challenges.

One of those particularly daunting issues is “silent data corruption.” This topic is discussed by Zetta CTO Jeff Whitehead in a recent blog entry. Whitehead’s excellent description of the problem and how to analyze it includes a calculator to help estimate the probability of random disk failures – this should be required reading for any system administrator or architect of a cloud storage solution. If you’ve built one and this is news to you – you (and your customers) are in trouble…

We included a large excerpt below (Jeff: let us know if you would prefer we take it down):

IT professionals are well aware of many challenges related to scaling storage: capital required to house data, manage backups, data center space, power and cooling. One area many IT professionals haven’t had time to look at, however, is how increasing data footprints translate into increased risk of data loss or data corruption. To put this in context, IDC recently reported that data volumes will increase by a “factor of almost five,” while “total IT budgets worldwide will only grow by a factor of 1.2 and IT staff by a factor of 1.1.” In this context of constraints, being asked to do more with less, without special attention to data risk management, risk inevitably increases.

I believe that many IT professionals and CIOs will be very surprised to see that while Data Loss (i.e., simultaneous drive failures) may not be very probable, Data Corruption (the data on disk is no longer what was originally written out by the application) is shockingly likely, and has caused outages for even some of the most technologically advanced high-end environments.

The objective of this blog is to introduce or reintroduce the concept of “Mean Time To Data Loss (MTTDL),” whereby IT professionals, CIOs, and risk managers can create a probabilistic model for evaluating the reliability and probability of data loss for your current environment, and also compare and contrast with how Zetta is advancing the state of the art for cost effective data protection.

MTTDL is a tool, and to be effective one must understand its limitations. The inputs to the model are as follows:

  • The number of hard drives (data set size/system performance)
  • The reliability of each hard drive
  • The probability of reading a given hard drive correctly without error (see prior blog about silent data corruption)
  • The redundancy encoding of the system
  • The rebuild rate

Mean Time to Data Loss is in many respects a best case scenario, because it ignores risks to data integrity such as fire, natural disaster, human error, and other common causes of storage failures. It also ignores autocorrelation, or drives failing at the same time due to similar workload, similar manufacturing batches, firmware issues, or the like. Despite these limitations, MTTDL is still one of the better tools for evaluating the data protection features of a storage system.
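To make the inputs listed above concrete, here is a minimal Python sketch of one common textbook approximation of MTTDL for a single-parity (RAID 5 style) group. It is not Whitehead's calculator, and every name and parameter in it (`p_read_error`, `mttdl_single_parity_hours`, the per-bit error rate, the rebuild window in hours) is an assumption chosen for illustration. It assumes independent drive failures at a constant rate, so it shares the best-case limitations described above.

```python
import math


def p_read_error(bytes_read: float, bit_error_rate: float) -> float:
    """Probability of at least one read error while reading bytes_read bytes.

    Assumes each bit fails independently with probability bit_error_rate.
    Computes 1 - (1 - ber)^bits in a numerically stable way.
    """
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-bit_error_rate))


def mttdl_single_parity_hours(
    n_drives: int,
    drive_mttf_hours: float,
    rebuild_hours: float,
    drive_bytes: float,
    bit_error_rate: float,
) -> float:
    """Approximate MTTDL (hours) for one single-parity group (illustrative only).

    Data is assumed lost when, after a first drive failure, either a second
    drive fails during the rebuild window or a surviving drive cannot be
    read correctly during the rebuild.
    """
    # Rate at which any one of the n drives fails first.
    first_failure_rate = n_drives / drive_mttf_hours
    # Chance that one of the remaining drives fails before the rebuild ends.
    p_second_failure = (n_drives - 1) * rebuild_hours / drive_mttf_hours
    # Chance of a read error while reading every surviving drive in full.
    p_bad_read = p_read_error((n_drives - 1) * drive_bytes, bit_error_rate)
    p_rebuild_fails = min(1.0, p_second_failure + p_bad_read)
    return 1.0 / (first_failure_rate * p_rebuild_fails)


if __name__ == "__main__":
    # Hypothetical inputs: 8 drives of 1 TB, 1,000,000 h MTTF, 24 h rebuild.
    hours = mttdl_single_parity_hours(8, 1_000_000, 24, 1e12, 1e-14)
    print(f"Approximate MTTDL: {hours / (24 * 365):.1f} years")
```

With the read-error term set to zero, the function reduces to the familiar MTTF² / (N · (N − 1) · MTTR) expression. With a non-zero error rate, the read-error term can dominate for large drives, which is one way to see why read errors matter so much at scale.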



1 Comment

  • 1. Jeff Whitehead  |  June 17, 2009 at 1:13 am

    Thanks for the write up! I’ve been following your blog for a while, was happy to see the trackback.



