
Blog: Richard Hackathorn

Richard Hackathorn

Welcome to my blog stream. I am focusing on the business value of low latency data, real-time business intelligence (BI), data warehouse (DW) appliances, use of virtual world technology, ethics of business intelligence and globalization of business intelligence. However, my blog entries may range widely depending on current industry events and personal life changes. So, readers beware!

Please comment on my blogs and share your opinions with the BI/DW community.

About the author

Dr. Richard Hackathorn is founder and president of Bolder Technology, Inc. He has more than thirty years of experience in the information technology industry as a well-known industry analyst, technology innovator and international educator. He has pioneered many innovations in database management, decision support, client-server computing, database connectivity, associative link analysis, data warehousing, and web farming. Focus areas are: business value of timely data, real-time business intelligence (BI), data warehouse appliances, ethics of business intelligence and globalization of BI.

Richard has published numerous articles in trade and academic publications, presented regularly at leading industry conferences and conducted professional seminars in eighteen countries. He writes regularly for BeyeNETWORK.com and has a channel for his blog, articles and research studies. He has been a member of the IBM Gold Consultants since its inception, and is also a member of the Boulder BI Brain Trust and the Independent Analyst Platform.

Dr. Hackathorn has written three professional texts, entitled Enterprise Database Connectivity, Using the Data Warehouse (with William H. Inmon), and Web Farming for the Data Warehouse.


Over the last year, vendors have been mentioning (alluding to, to be more precise) customers who are building (and maybe even operating) data warehouses containing more than a petabyte of data. An analyst whom I follow, Curt Monash, remarked in a recent article that the "petabyte barrier is crumbling" and that "the 100-terabyte mark is almost old hat".

I sensed the same. However, I have a concern with this trend toward bigger...and hence better.

Do we really need all that data? Can we really manage all that data? Does all that data really result in meaningful business value? Can we, as finite human beings, really comprehend all that data?

I am not concerned about the technical side of the equation. Over time, the storage devices, I/O data channels, MPP grids, parallel query optimizers, and so forth will evolve to handle petabytes and more.

I am concerned about the soft side of the equation. It is a 'simple' matter of information complexity!

As an illustration, if we add more and more rows to a customer table, do we add more value that the business could potentially realize? Likewise, if we add more and more columns to a customer table, do we add more value that the business could potentially realize? If we add more and more tables to the data warehouse, do we add more value that the business could potentially realize? By the phrase 'more and more', I mean several orders of magnitude more!

As IT professionals, we need to take a step (or many steps) backward and look at the entire end-to-end process. More information is only better information when the organization can assimilate the information and change its behavior to improve business performance. If the organization cannot assimilate the information, then more information is not better. If the organization does not change its behavior, then more information is irrelevant. If the organization does not improve its performance, then more information has no value or even a negative value.

Therefore, I conclude that petabyte data warehouses will be a liability for most organizations. It will be like giving a hot sports car to a teenage driver, who cannot handle the extreme capability. At best, it will be a huge waste of resources. At worst, it may result in the demise of those involved. There will be exceptions. I would like to talk with those who survive their petabyte experience.

Posted August 25, 2008 10:31 AM