Blog: Barry Devlin

As one of the founders of data warehousing back in the mid-1980s, a question I increasingly ask myself over 25 years later is: Are our prior architectural and design decisions still relevant in the light of today's business needs and technological advances? I'll pose this and related questions in this blog as I see industry announcements and changes in the way businesses make decisions. I'd love to hear your answers and, indeed, questions in the same vein.

Dr. Barry Devlin is among the foremost authorities in the world on business insight and data warehousing. He was responsible for the definition of IBM's data warehouse architecture in the mid '80s and authored the first paper on the topic in the IBM Systems Journal in 1988. He is a widely respected consultant and lecturer on this and related topics, and author of the comprehensive book Data Warehouse: From Architecture to Implementation.

Barry's interest today covers the wider field of a fully integrated business, spanning informational, operational and collaborative environments and, in particular, how to present the end user with a holistic experience of the business through IT. These aims, and a growing conviction that the original data warehouse architecture struggles to meet modern business needs for near real-time business intelligence (BI) and support for big data, drove Barry’s latest book, Business unIntelligence: Insight and Innovation Beyond Analytics, now available in print and eBook editions.

Barry has worked in the IT industry for more than 30 years, mainly as a Distinguished Engineer for IBM in Dublin, Ireland. He is now founder and principal of 9sight Consulting, specializing in the human, organizational and IT implications and design of deep business insight solutions.

Editor's Note: Find more articles and resources in Barry's BeyeNETWORK Expert Channel and blog. Be sure to visit today!

In the Information part of Information Technology, Big Data is the Big Hit of 2011.  It's also a wonderful phrase to play with: take the "big", place it in front of a few other words and suddenly you have a strapline... or a blog title!  So, is it a big change for IT, or is it just big hype?

There's no doubt in my mind that big data describes a real and novel phenomenon; unfortunately, there are also many existing and well-understood phenomena in the world of business intelligence and data warehousing that are getting sucked into marketing stories and, indeed, even into respectable articles about big data.


The recent McKinsey Quarterly article "Are you ready for the era of 'big data'?" (registration required) opens with the following example: "The top marketing executive at a sizable US retailer recently [discovered that a] major competitor was steadily gaining market share across a range of profitable segments...  [This] competitor had made massive investments in its ability to collect, integrate, and analyze data from each store and every sales unit and had used this ability to run myriad real-world experiments.  At the same time, it had linked this information to suppliers' databases, making it possible to adjust prices in real time, to reorder hot-selling items automatically, and to shift items from store to store easily.  By constantly testing, bundling, synthesizing, and making information instantly available across the organization... the rival company had become a different, far nimbler type of business.  What this executive team had witnessed first hand was the game-changing effects of big data" [my emphasis].

With all due respect to the authors, I believe that anybody who has been involved in business intelligence over the past ten years will be underwhelmed by this story.  It is almost entirely a scenario, and a common one at that, describing a pervasive data warehousing implementation and operational BI excellence.  I suspect that this example was tagged as big data because of the reference to running myriad real-world experiments.  This is a behavior often associated with big data; however, on its own, it is generally not a sufficient characteristic.

The remainder of the article provides many interesting examples and possible consequences, both beneficial and cautionary, of using big data.  For the business executive, it clearly whets the appetite.  But, from an IT perspective, it misses a key aspect: a viable definition of what big data really is.  This is hardly surprising; big data has reached the point on the hype curve where definitions are considered unnecessary.  We all seem to have an assumed definition that neatly meets our needs, be it selling a product or initiating a project.  Hear me clearly, though.  Despite the hype, there is something real going on here.  And it's fundamentally about the underlying characteristics of the information involved, characteristics that differ significantly from the data we in IT have stored and used over the years.

I contend that there are four types of information that together make up big data:
1. Machine-generated data, such as RFID data, physical measurements and geolocation data, from monitoring devices
2. Computer log data, such as clickstreams
3. Textual social media information from sources such as Twitter and Facebook
4. Multimedia social and other information from the likes of Flickr and YouTube

They are as different from traditional transactional data (the mainstay of BI) as they are from one another.  They have little in common, beyond their volume.  How business extracts value from them and how IT processes them vary widely.

While closely related to traditional BI and data warehousing, big data projects require additional and often very different skills in business and IT.  Their value is first to drive innovative change in business processes; only afterwards can their use become ongoing and operational.  These are topics I'll return to in the coming months.  But, in the meantime, join me for my webinar "Big Data Drives Tomorrow's Business Intelligence" on 25th October for further insights in this rapidly evolving area.



Posted October 21, 2011 11:03 AM

1 Comment

Barry, I fully agree with your thoughts. Big Data is not just an extension of an existing BI deployment; it is far more than that. Unstructured data is a large contributor to this growth, and traditional BI tools simply do not manage or analyse unstructured data. So yes, a different approach, different skill sets and different software solutions are typically required.

How we ingest, manage and analyse this data, gain insight from it and present the results is critical to the success of working with Big Data.
