

Blog: Barry Devlin

Barry Devlin

As one of the founders of data warehousing back in the mid-1980s, a question I increasingly ask myself more than 25 years later is: Are our prior architectural and design decisions still relevant in the light of today's business needs and technological advances? I'll pose this and related questions in this blog as I see industry announcements and changes in the way businesses make decisions. I'd love to hear your answers and, indeed, questions in the same vein.

About the author

Dr. Barry Devlin is among the foremost authorities in the world on business insight and data warehousing. He was responsible for the definition of IBM's data warehouse architecture in the mid '80s and authored the first paper on the topic in the IBM Systems Journal in 1988. He is a widely respected consultant and lecturer on this and related topics, and author of the comprehensive book Data Warehouse: From Architecture to Implementation.

Barry's interest today covers the wider field of a fully integrated business, spanning informational, operational and collaborative environments and, in particular, how to present the end user with an holistic experience of the business through IT. These aims, and a growing conviction that the original data warehouse architecture struggles to meet modern business needs for near real-time business intelligence (BI) and support for big data, drove Barry's latest book, Business unIntelligence: Insight and Innovation Beyond Analytics, now available in print and eBook editions.

Barry has worked in the IT industry for more than 30 years, mainly as a Distinguished Engineer for IBM in Dublin, Ireland. He is now founder and principal of 9sight Consulting, specializing in the human, organizational and IT implications and design of deep business insight solutions.

Editor's Note: Find more articles and resources in Barry's BeyeNETWORK Expert Channel and blog. Be sure to visit today!

September 2013 Archives

The fifth in a series of posts introducing the concepts and messages of my forthcoming book, "Business unIntelligence--Insight and Innovation Beyond Analytics and Big Data", now available for order!

[Image: Pillars.jpg]

BI designers and vendors are very comfortable with layers of data. Data warehouse architectures dating back to the early 1990s almost all follow the same model: a layer of operational systems, an EDW layer for data reconciliation and a layer of data marts that support the users. Sometimes, there are additional layers. On slides, the picture is often turned on its side, but they're still layers. Irrespective of their orientation, layers pass data serially from one to the next, each performing its particular role before passing the data on. Architecturally, the thinking is sound. Layering enables logical separation of function and responsibility. It drives progressive refinement of data for specialized purposes. Historically, layering in the data warehouse was physically implemented in response to the performance limitations of relational databases. Different workloads demanded different data structures and processing approaches, leading to multiple data copies.
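To make that serial flow concrete, here is a minimal Python sketch of the idea; the layer functions, field names and sample record are purely illustrative and not taken from any product or from the book.

```python
# Hypothetical sketch of the traditional layered flow: each layer finishes
# its role before handing data on, adding latency and another physical copy.

def extract_from_operational():
    # Layer 1: operational systems -- raw transactional records.
    return [{"cust_id": 42, "amount": "19.99", "ts": "2013-09-25T09:00:00"}]

def reconcile_into_edw(rows):
    # Layer 2: EDW -- reconcile, cleanse and standardise the data.
    return [{**r, "amount": float(r["amount"])} for r in rows]

def load_data_mart(rows):
    # Layer 3: data mart -- aggregate for a specific user community.
    return {"total_sales": sum(r["amount"] for r in rows)}

# Strictly serial: the mart only sees data after the EDW has finished with it.
mart = load_data_mart(reconcile_into_edw(extract_from_operational()))
print(mart)
```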

However, layering of data also has serious drawbacks. It introduces delays in getting data from its initial source to its final destination. In a world where business and customers are increasingly demanding real-time reaction, such delays are unacceptable. Layering adds complexity to development and maintenance. Where market uncertainty demands ever more flexible systems, such complexity is a killer. Layering of data, it seems, has had its day.

As we move from a world of traditional data to modern, unbounded information, the questions about layering become even more pressing. Does it make sense to push social media or sensor data through the layers of the data warehouse? Clearly, the answer from big data advocates is a resounding NO! The speed, scalability and affordability of solutions like Hadoop and NoSQL stores appear compelling for these new types of information. For anybody who understands the importance of data quality and consistency and has seen the disastrous results when they are ignored, the simplistic lure of "Hadumping" is deeply worrisome. But, if layering is no longer a viable approach to ensuring information quality, what is the alternative?

Although it may sound simplistic, the answer is to move from layers to pillars. The main illustration shows the three key pillars of the REAL logical architecture of Business unIntelligence. The pillars are fed in parallel, rather than sequentially. The central, process-mediated data pillar contains traditional operational and core informational data, fed from legally binding transactions (both OLTP and contractual). It is centrally placed because it contains core business information, combining traditional operational data and the informational EDW and data marts in a single logical component. Machine-generated data and human-sourced information are placed as pillars on either side. The leftmost pillar focuses on real-time and well-structured data, while the one on the right emphasizes less-structured and, at times, less timely information. In this architecture, metadata--more correctly labeled context-setting information--is explicitly included as a part of the information resource and spans the three pillars. While the pillars are independent, their content is also coordinated and made consistent, as far as necessary, based on the core information in the process-mediated pillar, the context-setting information stored in all three pillars, and the assimilation function shown in the center.

From an implementation point of view, these three pillars require quite different types of storage and processing. While the actual platforms required may change as technology evolves, the distinctly different characteristics of the domains suggest that there will be three (or perhaps more) different optimal performance points on the spectrum of available technology at any time.
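For readers who think in code, here is a rough, hypothetical Python sketch of the pillar idea: three domains fed in parallel, with a simple assimilation step coordinating shared context. The pillar names follow the post; the functions, sample data and the shape of the assimilation step are my own illustration, not a specification of the REAL architecture.

```python
# Illustrative only: three pillars fed in parallel rather than serially,
# coordinated through shared context-setting information.
from concurrent.futures import ThreadPoolExecutor

def feed_process_mediated(batch):
    # Core business information from legally binding transactions.
    return {"pillar": "process-mediated", "rows": batch}

def feed_machine_generated(events):
    # Real-time, well-structured sensor and log events.
    return {"pillar": "machine-generated", "rows": events}

def feed_human_sourced(messages):
    # Less structured, sometimes less timely, human-sourced information.
    return {"pillar": "human-sourced", "rows": messages}

def assimilate(pillars, context):
    # Coordinate the pillars only as far as necessary, via shared
    # context-setting information rather than physical consolidation.
    for p in pillars:
        p["context_version"] = context["version"]
    return pillars

with ThreadPoolExecutor() as pool:  # the pillars are loaded in parallel
    futures = [
        pool.submit(feed_process_mediated, [{"order": 1, "amount": 19.99}]),
        pool.submit(feed_machine_generated, [{"sensor": "t1", "value": 21.5}]),
        pool.submit(feed_human_sourced, [{"tweet": "great service!"}]),
    ]
    pillars = [f.result() for f in futures]

print(assimilate(pillars, context={"version": "2013-09-25"}))
```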

In my next blog, I'll explore those interesting clipped boxes at the bottom of the picture...

I will be further exploring the themes and messages of "Business unIntelligence: Insight and Innovation Beyond Analytics and Big Data" over the coming weeks.  You can order the book at the link above; it will be available in early October. For starters, check out my presentation at the BrightTALK Business Intelligence and Big Data Analytics Summit, recorded on September 11th. And my upcoming webinar: "Beyond BI is... Business unIntelligence" on Thursday, Sep 26, 2013 1pm BST.

Posted September 25, 2013 4:07 AM
Permalink | No Comments |
The fourth in a series of posts introducing the concepts and messages of my forthcoming book, "Business unIntelligence--Insight and Innovation Beyond Analytics and Big Data", available in mid-October.

[Image: m3.jpg]

In my third article in this series, I defined information as "the recorded and stored symbols and signs we use to describe the world and our thoughts about it, and to communicate with each other. Information is mostly digital but also includes paper, books and analogue recordings". I also pointed out that data is best thought of as a rather specialized form of information, optimized for numerical/logical processing by computers. These observations have interesting implications for data modeling and knowledge management. They emerge from the modern meaning model, m3, shown above, which I offer as a successor to Ackoff's DIKW pyramid.

Let me start with a perhaps contentious statement. Knowledge exists only in the human mind and body. If you accept that, you'll need to reconsider most of what you've read about knowledge management. People mostly resist having the insides of their heads managed. Much of what is proposed in knowledge management programs is actually information management. The real value of knowledge management is that it causes us to consider the relationship between recorded information and its interior, mental representation. The distinction between explicit and tacit knowledge is key. We can see that explicit knowledge and (mostly soft) information can be interconverted rather easily through the traditional methods of learning and documentation. Tacit knowledge, a more insightful and personal experience of the world, has only recently, and still imperfectly, been captured directly as information, for example on video. More often, we must try to articulate it first as explicit knowledge and then document it.

Modeling, as traditionally defined and done, resides in the bottom layer of the model. This is rather disappointing, because a real understanding of how the business works requires us to look at explicit knowledge, at least, and often the tacit knowledge that business users carry about how things really work. The m3 model thus demands a rethinking of the boundaries of, and approaches to, modeling. We need to move from data modeling to information modeling and even to knowledge modeling if we are to make real progress in reinventing business in the emerging, technology-based world.

The top layer of this model, meaning, offers even greater challenges to the typical IT thinker. Meaning is a wholly subjective world. Rationality has a role, but a much narrower one than Western thought usually allows. Furthermore, modern research has discovered that meaning is highly influenced by our interpersonal interactions; we are, after all, social primates. And, of even more importance for any new model of decision-making support, meaning--the stories we tell ourselves--is a significant contributor to the decisions we make. Herein lies the reason why BI has often under-delivered on improved decision making: rational "intelligence" and hard information play a much smaller role than suggested by the theories of decision making used in management schools and business for more than fifty years now. Time for a change, I think?

I will be further exploring the themes and messages of "Business unIntelligence: Insight and Innovation Beyond Analytics and Big Data" over the coming weeks. You can pre-order the book at the site above; it will be available in mid-October. For starters, check out my presentation at the BrightTALK Business Intelligence and Big Data Analytics Summit, recorded on September 11th. And my upcoming webinar: "Beyond BI is... Business unIntelligence" on Thursday, Sep 26, 2013 1pm BST.

Posted September 17, 2013 2:40 AM
Permalink | No Comments |
[Image: orthrus.jpg]

Much has happened while I've been heads down over the past few months finishing my book. Well, "Business unIntelligence - Insight and Innovation Beyond Analytics and Big Data" went to the printer last weekend and should be in the stores by mid-October. And, I can rejoin the land of the living. One of the interesting news stories in the meantime was Cisco's acquisition of Composite Software, which closed at the end of July. Mike Flannagan, Senior Director & General Manager, IBT Group at Cisco, and Bob Eve, Data Virtualization Marketing Director and long-time advocate of virtualization at Composite, turned up at the BBBT in mid-August to brief an eclectic bunch of independent analysts, including myself.

The link-up of Cisco and Composite is, I believe, going to offer some very interesting technological opportunities in the market, especially in BI and big data. 

BI has been slow to adopt data virtualization. In fact, I was one of the first to promote the approach with IBM Information Integrator (now part of InfoSphere), some ten years ago when I was still with IBM. The challenge was always that virtualization seems to fly in the face of traditional EDW consolidation and reconciliation via ETL tools. I say "seems" because the two approaches are more complementary than competitive. Way back in the early 2000s, it was already clear to me that there were three obvious use cases: (i) real-time access to operational data, (ii) access to non-relational data stores, and (iii) rapid prototyping. The advent of big data and the excitement of operational analytics have confirmed my early enthusiasm. No argument - data virtualization and ETL are both mandatory components of any new BI architecture or implementation.
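To illustrate the first two of those use cases, here is a small, self-contained Python sketch of a "virtual view" that answers a query by federating an operational relational table and a document-style store at request time, with no ETL copy. All names and data are invented for the example; no particular virtualization product's API is implied.

```python
# Minimal data virtualization sketch: join two live sources on demand
# instead of copying them into a warehouse first.
import sqlite3

# "Operational" relational source, stood up in memory for the example.
op = sqlite3.connect(":memory:")
op.execute("CREATE TABLE orders (cust_id INTEGER, amount REAL)")
op.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 120.0), (2, 75.5)])

# "Non-relational" source: a document-style dict standing in for a NoSQL store.
profiles = {1: {"name": "Acme Ltd", "segment": "enterprise"},
            2: {"name": "Bloggs & Co", "segment": "smb"}}

def virtual_customer_orders():
    # The virtual view: federate both sources at query time.
    for cust_id, amount in op.execute("SELECT cust_id, amount FROM orders"):
        doc = profiles.get(cust_id, {})
        yield {"customer": doc.get("name"),
               "segment": doc.get("segment"),
               "amount": amount}

for row in virtual_customer_orders():
    print(row)
```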

So, what does Cisco add to the mix with Composite? One of the biggest challenges for virtualization is to understand and optimize the interaction between databases and the underlying network. When data from two or more distributed databases must be joined in a real-time query, the query optimizer needs to know, among other things, where the data resides, the volumes in each location, the available processing power of each database, and the network considerations for moving the data between locations. Data virtualization tools typically focus on the first three database concerns, probably as a result of their histories. However, the last concern, the network, increasingly holds the key to excellent optimization. There are two reasons. First, processor power continues to grow, so database performance has proportionately less impact. Second, cloud and big data together mean that distribution of data is becoming much more prevalent. And growth in network speed, while impressive, is not in the same ballpark as that of processing, making for a tighter bottleneck. And who better to know about the network, and even tweak its performance profile to favor a large virtualization transfer, than a big networking vendor like Cisco? The fit seems just right.
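As a back-of-the-envelope illustration (my own, not Composite's optimizer), the Python sketch below costs a two-site join by adding network shipping time to local processing time; the site names, data volumes, bandwidth and throughput figures are all assumed purely for the example.

```python
# Toy cost model: decide where to execute a two-site join once network
# transfer time is counted alongside local processing speed.

SITES = {
    "dc_east": {"rows": 50_000_000, "row_bytes": 200, "cpu_speed": 1.0},
    "dc_west": {"rows": 2_000_000,  "row_bytes": 500, "cpu_speed": 0.8},
}
NETWORK_BYTES_PER_SEC = 1.25e8   # roughly 1 Gbit/s between sites (assumed)
CPU_ROWS_PER_SEC = 5e6           # nominal join throughput at speed 1.0 (assumed)

def cost_of_joining_at(target):
    # Ship the other site's data to 'target', then join everything locally.
    other = next(s for s in SITES if s != target)
    ship_bytes = SITES[other]["rows"] * SITES[other]["row_bytes"]
    network_secs = ship_bytes / NETWORK_BYTES_PER_SEC
    total_rows = sum(s["rows"] for s in SITES.values())
    cpu_secs = total_rows / (CPU_ROWS_PER_SEC * SITES[target]["cpu_speed"])
    return network_secs + cpu_secs

for site in SITES:
    print(f"join at {site}: ~{cost_of_joining_at(site):.0f}s")
print("cheapest plan:", min(SITES, key=cost_of_joining_at))
```

With these assumed numbers the data movement, not the processing, dominates the plan choice, which is exactly why knowledge of (and influence over) the network matters so much to the optimizer.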

For this technical vision to work, the real challenge will be organizational, as is always the case with acquisitions.  Done well, acquisitions can be successful.  Witness IBM's integration of Lotus and Netezza, to name but two. Of course, strategic management and cultural fit always count. But, the main question usually is: Does the buyer really understand what the acquired company brings and is the buyer willing to change their own plans to accommodate that new value? It's probably too early to answer that question. The logistics are still being worked through and the initial focus is on ensuring that current plans and revenue targets are at least maintained. But, if I may offer some advice on the strategy... 

The Cisco network must recognize that the query optimizer in Composite will, in some sense, become another boss. The value for the combined company comes from the knowledge that resides in the virtualization query optimizer about what data types and volumes need to be accommodated on the network. This becomes the basis of how to route the data and how to tweak the network to carry it. In terms of company size, this may be the tail wagging the dog. But, in terms of knowledge, it's more like the dog with two heads. The Greek mythological figure of Kyon Orthros, "the dog of morning twilight" and of the burning heat of mid-summer, is perhaps an appropriate image. An opportunity to set the network ablaze.

Posted September 7, 2013 12:21 AM
Permalink | No Comments |