Blog: David Loshin

Double Your Pleasure, Double Your Fun

There is an oft-quoted statistic about the growth rate of data volumes that I wanted to use in some context, so I started searching for a source. I googled "data volumes" +"double every" to see what I could find. To my surprise there were lots of hits, but it is difficult to pin down the exact parameters. Lots of folks are using the statistic:

"Data doubles every year"
"The amount of stored data from corporations nearly doubles every year"
"...the amount of data stored by businesses doubles every year to 18 months."
"In his book “Simplicity,” business management expert and author Bill Jensen indicates that the most conservative estimates show business information doubling every three years, while some estimates say data doubles every year. "
"Unstructured data doubles every three months"

I am still following links from the first page of results, and we are doubling our data every 3 to 18 months.

"Reed's Law states that the volume of data doubles every 12 months. "

OK, so there is actually a law about it. Hold on a second: according to Wikipedia, Reed's Law is about the utility of (social) networks, so perhaps the law doesn't apply in all jurisdictions.

Anyway, these may all be references to a UC Berkeley study on the growth of data, which said that the amount of information stored on media such as hard disk drives doubled between 2000 and 2003.
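
To get a feel for how far apart these claims really are, here is a quick back-of-the-envelope sketch in Python. It is only an illustration: it assumes steady exponential growth (which none of the quoted sources actually promise), and the function name and the labels are made up for this example. It converts each quoted doubling period into the factor by which data would grow in one year.

```python
def annual_growth_factor(doubling_months):
    """Factor by which volume grows in 12 months, assuming steady
    exponential growth with the given doubling period (an assumption)."""
    return 2 ** (12 / doubling_months)

# Doubling periods taken from the quotes above, expressed in months.
claims = {
    "every 3 months": 3,
    "every year": 12,
    "every 18 months": 18,
    "every 3 years": 36,
}

for label, months in claims.items():
    factor = annual_growth_factor(months)
    print(f"doubles {label}: x{factor:.2f} per year")
```

Under that assumption, doubling every three years works out to roughly 26% growth per year, while doubling every three months means a sixteen-fold increase per year. Those are not variations on one statistic; they are very different claims.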

So let's look at this a little more carefully: we have a scientific study that looks not at the creation of data, but rather at the use of storage media to hold what is out there. And out there is a lot of stuff needing a lot of storage, like images, music, videos, etc. These are things that contain information, yet extracting data from them is still a challenge. Also, consider that for each thing out there, there are likely to be a lot of copies! I am sure that a scan of all the TiVos in the country would demonstrate that lots of people are still catching up on older episodes of 24 and American Idol.

I need to refine my question a little bit, then, but I am afraid it will be difficult to track down defensible sources for it. I am more interested in knowing about the growth rate for data that can be integrated into an actionable information environment. I may not care about the bits comprising that specific episode of 24 that is sitting on millions of DVRs, but as an advertiser, I might be interested in profiling which households have watched which episodes and at what kind of time shift.

Anyone have any ideas?

Posted by David Loshin on January 23, 2008 10:48 AM