Blog: Barry Devlin

Barry Devlin

As one of the founders of data warehousing back in the mid-1980s, a question I increasingly ask myself over 25 years later is: Are our prior architectural and design decisions still relevant in the light of today's business needs and technological advances? I'll pose this and related questions in this blog as I see industry announcements and changes in the way businesses make decisions. I'd love to hear your answers and, indeed, questions in the same vein.

About the author

Dr. Barry Devlin is among the foremost authorities in the world on business insight and data warehousing. He was responsible for the definition of IBM's data warehouse architecture in the mid '80s and authored the first paper on the topic in the IBM Systems Journal in 1988. He is a widely respected consultant and lecturer on this and related topics, and author of the comprehensive book Data Warehouse: From Architecture to Implementation.

Barry's interest today extends to the wider field of a fully integrated business, covering informational, operational and collaborative environments and, in particular, how to present the end user with a holistic experience of the business through IT. These aims, and a growing conviction that the original data warehouse architecture struggles to meet modern business needs for near real-time business intelligence (BI) and support for big data, drove Barry's latest book, Business unIntelligence: Insight and Innovation Beyond Analytics, now available in print and eBook editions.

Barry has worked in the IT industry for more than 30 years, mainly as a Distinguished Engineer for IBM in Dublin, Ireland. He is now founder and principal of 9sight Consulting, specializing in the human, organizational and IT implications and design of deep business insight solutions.

Editor's Note: Find more articles and resources in Barry's BeyeNETWORK Expert Channel and blog.

September 2008 Archives

As the worldwide banking crisis continues to escalate, one has to wonder: where was the Business Intelligence in all of this? What happened to Data Quality and Data Management?

First, we had the interesting revelation that the individual banks and lending institutions all seemed to be blissfully unaware of the extent to which they were exposed by lending in the sub-prime mortgage market. It's difficult to imagine how the information available to decision makers in these companies could have been so scarce or so uninformative. Most, if not all, financial institutions have had extensive and expensive data warehouses in place for many years now. Business Intelligence should easily have warned of the dangers. Was the increasing level of risk unmeasured, overlooked or simply ignored?

More recently, we've had the spectacle of banks being unwilling to extend short-term lending facilities to one another for fear that the borrowing institutions could go belly-up in the next few days! Could the lenders not know? Unfortunately, in this case, the answer is probably that they couldn't. Despite the fact that the worldwide financial market is tightly and instantly interconnected at a transaction level, the truth is that the underlying data remains disconnected and dispersed. Data Management and Data Quality have simply not been considered. Proper business governance in the financial markets as a whole is impossible without a well-defined and credible data foundation.

So, assuming that we can survive the crisis without a meltdown, what has been happening should be a clarion call to Data Management professionals, particularly in the financial industry but also beyond. We need to recognize the interconnected and increasingly fragile web of data dependencies that holds the business world together. It's time to get out there and apply the principles we already know and preach. And we had better get moving quickly.


Posted September 28, 2008 7:52 PM

Enterprise BI shops and data quality departments regard spreadsheets largely as the work of the devil. Against all the rules of information quality, data in spreadsheets is manipulated by users at will and in private. Then the resulting data and functions are distributed, shared and further played around with, until it's anybody's guess whether the results presented at the end bear any relationship to the truth. Data that was pure and clean as it came out of the data warehouse, data mart or approved BI report is now potentially as contaminated as nuclear waste.

And yet, check in with the users. Indeed, check in with yourself. Why is Excel so popular? Because it makes it easy to play with the data, check out hypotheses, get answers otherwise unavailable, and so on. And once you've gotten the answer through the spreadsheet, chances are you won't get the time or the resources to recreate the process in a more auditable, quality-conscious way. It's a real and spreading problem. But, what to do?

This week I had the opportunity to preview a new product called Lyza that's due to launch on Sept. 22. In fact, you can download it and play with it already. Scott Davis, the CEO of Lyzasoft Inc., explained that they had spent a lot of time investigating how business analysts, the power users of spreadsheets, actually work. This is usually a good idea, because you find out what the users really need, and which of your assumptions are right or wrong. It will probably come as no surprise that most analysts approach their work in a highly unstructured and iterative way, pulling bits of relevant data into Excel from a variety of known sources - both official data marts and reports as well as unofficial files, spreadsheets, etc. that they happen to have created before or borrowed from trusted colleagues. And they do it in Excel, because that's the only way they can.

What Lyza does is provide an easier, more intuitive way of pulling data together from diverse sources, combining and manipulating it, and creating results and reports for distribution to the business. Well, that's all fine and dandy for the business analysts, you may say, but how does it help the BI and data quality departments address the data contagion? The answer is that Lyza tracks and saves an audit trail of every action and every step of the analysis process that the user is building, as well as enabling snapshots of the results to be cached and preserved for posterity. Now the data quality folks are beginning to smile. And the BI department? Well, they're less sure: they like the added traceability, but this is still outside their comfortable data mart zone.

However, we could look at it in a different way. We could imagine that Lyza provides a new type of data mart - a "playmart" - a sandbox where power analysts can experiment with data and perform all sorts of analyses in a safe, well-managed environment. Now, if only we could evaluate the analysts' logic and productionalize those analyses and reports that are going to be reused and built upon in the future.

Scott's initial answer was that you can certainly do all this within Lyza itself. But a bit of further probing convinced me that the metadata that Lyza stores to describe the analysis processes is probably sufficient to enable the creation of ETL scripts for your ETL engine of choice. This would certainly require further investigation and automation, but it seems like the bones of the idea are there. In this case, the playmart could address a set of business analysts' needs that have been long ignored by the BI departments and by BI vendors as well.
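
To make that more concrete, here's a rough, hypothetical sketch in Python of what I have in mind. To be clear, this is not Lyza's actual metadata format or API; the step records, field names and generator are entirely invented. The point is simply that a recorded sequence of analysis steps carries enough information to generate the skeleton of an SQL or ETL script.

```python
# A thought experiment only: NOT Lyza's metadata format or API.
# It sketches how recorded analysis steps might be replayed as skeleton SQL
# that an ETL engine of your choice could then run and schedule.
analysis_steps = [
    {"op": "extract", "source": "sales_mart.orders", "columns": ["region", "amount"]},
    {"op": "filter", "condition": "amount > 0"},
    {"op": "aggregate", "group_by": ["region"], "measure": "SUM(amount) AS total_amount"},
]

def to_sql(steps):
    """Translate the recorded analysis steps into a single SQL statement."""
    extract = next(s for s in steps if s["op"] == "extract")
    filters = [s["condition"] for s in steps if s["op"] == "filter"]
    agg = next((s for s in steps if s["op"] == "aggregate"), None)

    select = agg["group_by"] + [agg["measure"]] if agg else extract["columns"]
    sql = f"SELECT {', '.join(select)} FROM {extract['source']}"
    if filters:
        sql += " WHERE " + " AND ".join(filters)
    if agg:
        sql += " GROUP BY " + ", ".join(agg["group_by"])
    return sql

print(to_sql(analysis_steps))
# SELECT region, SUM(amount) AS total_amount FROM sales_mart.orders
#   WHERE amount > 0 GROUP BY region
```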

The only real fly in the ointment is whether Lyza will be able to convince the spreadsheet jockeys to get off their current Excel rocking horse and jump on the bright new Lyza pony in the playmart! (And that sentence would work so much better if only Lyza had chosen a mustang for their logo rather than a gecko.)


Posted September 19, 2008 4:54 PM

In my last post, I shared some thoughts inspired by the Decision Intelligence article written by Claudia Imhoff and Colin White. There, I suggested we really need to begin to consider all information as a single resource for the whole business. This entails stepping beyond our traditional IT-bounded view of our systems and looking at them with a renewed business vision. If we do this, it will also quickly become clear that our view of process needs reworking too.

Claudia and Colin have drawn a box on the left of their architecture picture that arises directly from the insight that operational BI really is a different beast from the traditional BI we've all known and loved over the past 20 years or so. When you deeply consider the implications of building an operational BI system, as Claudia and Colin clearly have, it becomes obvious that operational BI has many of the characteristics of traditional operational or transaction-processing systems. Therefore, from a systems architecture point of view, you put them in the same box, in this case called "Business process intelligence".

There are also some differences, of course. The most important is how the business users interact with these two related types of system. The value proposition of operational BI is that human decision-making skills can improve operational processes. How? Well, there are two very distinct threads here.

One is the proposition that we can apply advanced analytics technology automatically to parts of the operational process. Fraud detection is a good example. Applying advanced analytics on the fly to credit card transactions gives better detection of fraudulent transactions. Note that this type of operational BI is almost completely invisible to the business users: they see the results of more fraud detected or fewer false positives, but how that happened is both unknown and uninteresting.
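
Just to illustrate how "invisible" this first thread is, here's a toy Python sketch. The features, weights and threshold are invented for the example and bear no relation to any real fraud model; what matters is that the scoring sits inside the transaction flow, and only the outcome ever reaches the business.

```python
# Illustrative only: a toy fraud score applied inline to each card transaction.
# The features, weights and threshold are hypothetical, not a real model.
from dataclasses import dataclass

@dataclass
class Transaction:
    amount: float
    country: str
    home_country: str
    txns_last_hour: int

def fraud_score(txn: Transaction) -> float:
    score = 0.0
    if txn.amount > 1000:
        score += 0.4                      # unusually large amount
    if txn.country != txn.home_country:
        score += 0.3                      # card used abroad
    if txn.txns_last_hour > 5:
        score += 0.3                      # burst of activity
    return score

def process(txn: Transaction) -> str:
    # The business user only ever sees the outcome, never the scoring logic.
    return "flagged for review" if fraud_score(txn) >= 0.7 else "approved"

print(process(Transaction(amount=1500.0, country="RU", home_country="IE", txns_last_hour=7)))
```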

The second thread brings users very directly into the loop. Here, the operational BI technology is made part of the users' visible process. Business users are presented with decision support technology that displays trends or exceptions in near real-time data, so that they can potentially choose a different course of action to that embedded in the normal flow. In effect, business users get to change the business process on the fly, rather than doing little more than data input as was previously the case.

Now, keeping this in mind, here's the million dollar question. What's the difference between an operational system and an informational system; how do you distinguish between an operational process and an informational process? In the good old days, it was easy! The operational side was nearly or actually real-time, dealt with individual transactions or data elements according to a predefined process where the users had minimal freedom to act intelligently. Informational systems, in contrast, were centered around users who were expected to make intelligent decisions based on historical data without any clear process to turn those decisions into action.

So, what is the answer today? When we in BI start building operational BI and the operational world starts implementing adaptive SOA-based systems, the distinction between operational and informational more or less disappears. This puts operational BI and operational systems together in one box of the architecture. But the deeper and probably longer-term implications of this bold step have not been explicitly called out. In fact, these implications are obscured by the naming of the new architecture as "Decision intelligence", because the top level of this architecture is no longer confined to the world that was formerly BI; it actually becomes the single, common process or interface through which all business users will interact with the underlying IT systems.

Is that scary? Absolutely! But it is a clear and logical consequence of the paths that BI and operational systems are currently on. It means that we in BI are no longer in total control of our destiny. But the same is true of the operational systems. And, although I've not covered it here, collaborative systems (e-mail, office support, etc.) are also being drawn inexorably into the same converged path.

It's time we all started to talk to one another! And that does imply that decision intelligence may be too narrow a term for us all to agree on. May I propose again the "Highly Evolved Business"?


Posted September 14, 2008 12:26 PM

Claudia Imhoff and Colin White have a lengthy history of insightful and provocative contributions to the development of Business Intelligence. Their recent article, Decision Intelligence, is no exception. Their thesis is that the IT support needed for decision-making, now known as "Business Intelligence", today extends far beyond the traditional domain of data warehousing and is in need of a new architecture and a new name - Decision Intelligence.

I fully agree. I've been using the terms "Highly Evolved Business" and "Business Insight" over the past year or so to express exactly the same thought. Indeed, Claudia, Colin and I have discussed this whole idea already at length and are very much on the same page. But I hadn't seen their architecture picture before, and it gives me the opportunity to discuss the whole topic from a higher perspective in this and the next post.

Under Decision intelligence, the architecture shows three vertical blocks called "Business process intelligence", "Business data intelligence" and "Business content intelligence". The meanings of these blocks are fairly obvious, but take a look at the linked article for a full explanation. My thought is that they are almost too obvious: they closely reflect our current arrangement of systems building blocks in the IT world.

Let's first examine the data and content blocks. Today, if you look at typical enterprise implementations, you will certainly see databases and separate content stores. You'll also notice independent systems built upon these separate stores. But, if you step back from the storage and processing issues, it's pretty difficult to distinguish between the two categories. Try explaining the difference to a business user!

Take the example of a clinician who's trying to make a treatment decision. She's looking at a chest x-ray - content in our terms. And she's also looking at the "structured data" that goes with it: this x-ray is of a 45-year-old male, a smoker of 20 cigarettes a day for the past 30 years, who has been admitted with shortness of breath. Does she see unstructured content and structured data that must somehow be combined in her decision making? I'd argue not. She simply sees a set of information she's using.

And some of the old barriers between the storage of structured data and unstructured content are breaking down. Where is the EXIF data (structured metadata) of a photo stored? Yes, in the JPG file along with the unstructured content. Where do e-mail systems store the structured metadata about sender, subject, date sent, etc? Sure, in the database with the unstructured e-mail body content.
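
For the curious, here's a minimal Python sketch (using the Pillow library and a hypothetical file called photo.jpg) showing just how directly that structured EXIF metadata can be read out of the same JPG file that holds the unstructured image content.

```python
# A minimal sketch: structured metadata (EXIF) lives inside the same JPG
# file as the unstructured image content. Assumes the Pillow library and
# a hypothetical file named photo.jpg.
from PIL import Image
from PIL.ExifTags import TAGS

with Image.open("photo.jpg") as img:
    exif = img.getexif()                 # tag-id -> value mapping embedded in the JPG
    for tag_id, value in exif.items():
        name = TAGS.get(tag_id, tag_id)  # translate numeric tag ids to readable names
        print(f"{name}: {value}")
```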

I could make a similar argument about the lack of distinction between real-time (or operational) data and historical (data warehouse) data.

My point is that if we want to create a new vision for the future, we need to start seeing the world through non-IT eyes. It's all information. It's a single concept; a single category of "stuff". And we in IT need to start creating the tools and methods that allow us to create, manage and make available all information in a coherent and consistent way. At a conceptual level, that has to be the goal and that should be our first pictorial representation.

Keep that thought in mind. I'll come back to it next time when I look at the process side of the picture.


Posted September 1, 2008 10:51 AM