
Blog: Barry Devlin

Barry Devlin

As one of the founders of data warehousing back in the mid-1980s, a question I increasingly ask myself over 25 years later is: Are our prior architectural and design decisions still relevant in the light of today's business needs and technological advances? I'll pose this and related questions in this blog as I see industry announcements and changes in the way businesses make decisions. I'd love to hear your answers and, indeed, questions in the same vein.

About the author >

Dr. Barry Devlin is among the foremost authorities in the world on business insight and data warehousing. He was responsible for the definition of IBM's data warehouse architecture in the mid '80s and authored the first paper on the topic in the IBM Systems Journal in 1988. He is a widely respected consultant and lecturer on this and related topics, and author of the comprehensive book Data Warehouse: From Architecture to Implementation.

Barry's interest today covers the wider field of the fully integrated business, spanning informational, operational and collaborative environments and, in particular, how to present the end user with a holistic experience of the business through IT. These aims, and a growing conviction that the original data warehouse architecture struggles to meet modern business needs for near real-time business intelligence (BI) and support for big data, drove Barry's latest book, Business unIntelligence: Insight and Innovation Beyond Analytics, now available in print and eBook editions.

Barry has worked in the IT industry for more than 30 years, mainly as a Distinguished Engineer for IBM in Dublin, Ireland. He is now founder and principal of 9sight Consulting, specializing in the human, organizational and IT implications and design of deep business insight solutions.

Editor's Note: Find more articles and resources in Barry's BeyeNETWORK Expert Channel and blog. Be sure to visit today!

"Seven Faces of Data - Rethinking data's basic characteristics" - new White Paper by Dr. Barry Devlin.

We live in a time when data volumes are growing faster than Moore's Law and the variety of structures and sources has expanded far beyond those that IT has experience of managing.  It is simultaneously an era when our businesses and our daily lives have become intimately dependent on such data being trustworthy, consistent, timely and correct.  And yet, our thinking about and tools for managing data quality in the broadest sense of the word remain rooted in a traditional understanding of what data is and how it works.  It is surely time for some new thinking.

A fascinating discussion with Dan Graham of Teradata over a couple of beers last February at Strata in Santa Clara ended up as a picture of something called a "Data Equalizer" drawn on a napkin.  As often happens after a few beers, one thing led to another...

The napkin picture led me to take a look at the characteristics of data in the light of the rapid, ongoing change in the volumes, varieties and velocity we're seeing in the context of Big Data.  A survey of data-centric sources of information revealed almost thirty data characteristics considered interesting by different experts.  Such a list is too cumbersome to use and I narrowed it down based on two criteria.  First was the practical usefulness of the characteristic: how does the trait help IT make decisions on how to store, manage and use such data?  What can users expect of this data based on its traits?  Second, can the trait actually be measured?

The outcome was seven fundamental traits of data structure, composition and use that enable IT professionals to examine existing and new data sources and respond to the opportunities and challenges posed by new business demands and novel technological advances.  These traits can help answer fundamental questions about how and where data should be stored and how it should be protected.  And they suggest how it can be securely made available to business users in a timely manner.

So what is the "Data Equalizer"?  It's a tool that graphically portrays the overall tone and character of a dataset, so that IT professionals can quickly evaluate the data management needs of a specific set of data.  More generally, it clarifies how technologies such as relational databases and Hadoop, for example, can be positioned relative to one another and how the data warehouse is likely to evolve as the central integrating hub in a heterogeneous, distributed and expanding data environment.
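To make the idea concrete, here is a minimal sketch of what an equalizer-style profile of a dataset might look like in code. The trait names and scores below are purely illustrative assumptions for this example; they are not the seven characteristics defined in the white paper.

```python
# Hypothetical "Data Equalizer" sketch: score a dataset on a set of
# measurable traits and render the profile as equalizer-style bars.
# Trait names and scores are illustrative assumptions, not the seven
# traits from the white paper.

TRAITS = ["structure", "volume", "velocity", "consistency",
          "timeliness", "trust", "usage"]

def equalizer(scores, width=10):
    """Render each trait's 0-10 score as a horizontal bar."""
    lines = []
    for trait in TRAITS:
        level = scores.get(trait, 0)
        bar = "#" * level + "." * (width - level)
        lines.append(f"{trait:>12} |{bar}|")
    return "\n".join(lines)

# An invented profile for, say, a web-log dataset: high volume and
# velocity, weakly structured, moderately trusted.
web_logs = {"structure": 3, "volume": 9, "velocity": 8,
            "consistency": 4, "timeliness": 7, "trust": 5, "usage": 6}
print(equalizer(web_logs))
```

Plotting two such profiles side by side is one way to see at a glance why, for instance, a weakly structured, high-volume dataset might suit Hadoop while a highly consistent, heavily used one suits a relational store.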

Understanding the fundamental characteristics of data today is becoming an essential first step in defining a data architecture and building an appropriate data store.  The emerging architecture for data is almost certainly heterogeneous and distributed.  There is simply too large a volume and too wide a variety to insist that it all must be copied into a single format or store.  The long-standing default decision--a relational database--may not always be appropriate for every application or decision-support need in the face of these surging data volumes and growing variety of data sources.  The challenge for the evolving data warehouse will be to ensure that we retain a core set of information to ensure homogeneous and integrated business usage.  For this core business information, the relational model will remain central and likely mandatory; it is the only approach that has the theoretical and practical schema needed to link such core data to other stores.

"Seven Faces of Data - Rethinking data's basic characteristics" - new White Paper by Dr. Barry Devlin (sponsored by Teradata)

Posted November 17, 2011 6:07 AM
