
Blog: Barry Devlin

Barry Devlin

As one of the founders of data warehousing back in the mid-1980s, a question I increasingly ask myself over 25 years later is: Are our prior architectural and design decisions still relevant in the light of today's business needs and technological advances? I'll pose this and related questions in this blog as I see industry announcements and changes in the way businesses make decisions. I'd love to hear your answers and, indeed, questions in the same vein.

About the author

Dr. Barry Devlin is among the foremost authorities in the world on business insight and data warehousing. He was responsible for the definition of IBM's data warehouse architecture in the mid '80s and authored the first paper on the topic in the IBM Systems Journal in 1988. He is a widely respected consultant and lecturer on this and related topics, and author of the comprehensive book Data Warehouse: From Architecture to Implementation.

Barry's interest today covers the wider field of a fully integrated business, covering informational, operational and collaborative environments and, in particular, how to present the end user with an holistic experience of the business through IT. These aims, and a growing conviction that the original data warehouse architecture struggles to meet modern business needs for near real-time business intelligence (BI) and support for big data, drove Barry’s latest book, Business unIntelligence: Insight and Innovation Beyond Analytics, now available in print and eBook editions.

Barry has worked in the IT industry for more than 30 years, mainly as a Distinguished Engineer for IBM in Dublin, Ireland. He is now founder and principal of 9sight Consulting, specializing in the human, organizational and IT implications and design of deep business insight solutions.

Editor's Note: Find more articles and resources in Barry's BeyeNETWORK Expert Channel and blog. Be sure to visit today!

March 2011 Archives

Following on from a previous discussion on the need for integrated information in modern business, and the focus on a consolidated process in my last post, I'd like to complete the picture here with a look at the role of people in the Business Intelligence of the future.

Traditional BI typically sees people in two roles.  First, they are the largely passive and individual receivers of information via reports or even dashboards.  Once that information is delivered, the classic BI tool bows out, even though the real value of the process comes only when the decision is made and the required action initiated.  Furthermore, the traditional BI process fails to link the action taken to results that can be measured in the real world, overlooking the vital concept of sense and respond described in my last post.  These two glaring omissions, among others, in the current BI world lead directly to the relatively poor proven return on investment of many BI projects.

Second, BI treats people as largely disconnected providers of "information requirements" to the BI development process, often leading to failed or, at best, disappointing BI solutions for real user needs.  Agile BI approaches do address this problem to some extent.  However, the real issue is that the process of innovative decision making is largely iterative with final information needs often differing radically from initial ideas on what the requirements may be.  The path from such innovative analyses to ongoing production reporting using the discovered metrics is also poorly understood and seldom delivered by most current BI tools.

The good news is that many of these issues are central to the techniques and tools of Web 2.0, social networking and related developments.  However, simply adding a chat facility or document sharing to an existing BI tool is insufficient.  To truly deliver BI for the People, we require a significant rethinking of the framework in which BI is developed and delivered.  This framework includes (1) a well-managed and bounded social environment where users can safely experiment with information and collaborate on their analyses, (2) support for peer review and promotion to production of analyses of wider or longer-term value, and (3) an adaptive, closed-loop environment where changes in the real-world can be directly linked to actions taken and thus to the value of the analyses performed.

Today's users are of a different generation from those BI has previously supported.  Generation Y (born 1980-2000, or thereabouts, depending on which social scientist you follow) is the first generation to have grown up with pervasive electronic connectivity and collaboration in their personal lives.  They bring these expectations, as well as some very different social norms, into the business world, and are now beginning to assume positions of decision-making responsibility in their organizations.  They are set to demand radical changes in the way we make and support decisions in business.

I'll be discussing these issues at three seminars I'm presenting on the transformation of BI into Enterprise IT Integration in Europe: a half day each in Copenhagen and Helsinki (4 and 5 April) and a full two-day deep dive in Rome (11-12 April).  

Posted March 29, 2011 3:56 AM
Permalink | 3 Comments |

In my last post, I mentioned the Business Information Resource (BIR) as the single, logically integrated store of all information--from soft to hard, internal and external, enterprise and personal--used by the business. The BIR is the foundation of the Business Integrated Insight architecture I propose as a new way of addressing the needs of modern decision makers; needs that extend far beyond the data typically stored in a traditional data warehouse environment. However, enterprise IT integration does not stop at the data, of course. It extends to the processes of the business, as well as to the people who comprise the enterprise as a whole.

As a result, the BI2 architecture comprises three components, arranged as layers from the top down: people, process and information. Each layer builds upon the one below it and is, in turn, influenced by the needs of the layer above. And just as the information layer challenges the current assumptions about how information is organized and used today, the other two layers raise similar challenges. So, let's think a little about the process layer, which I call the Business Function Assembly (BFA).

While BI departments and practitioners invariably focus on the information of the enterprise, my experience is that business people pay little attention to information in itself. They focus almost entirely on the activities they need to fulfill their responsibilities and the sequence and linkage between these tasks. In particular, today, one of their most pressing concerns is how to be highly adaptive to unanticipated changes in the market. I use Stephan Haeckel's Sense and Respond concepts to think about this need and have come to two game-changing conclusions about how we need to deal with process going forward.

First, applying sense and respond to the traditional division of IT into operational, informational and collaborative environments makes it clear that these divisions must be dismantled. Every activity, when considered from an adaptive enterprise viewpoint, is described in exactly the same terms, has the same steps and has the same interactions with both people and information. The tripartite operational / informational / collaborative division reflects an old way of delivering IT solutions, from a time when computing was much less powerful and business much slower and less complex. Modern business needs and, especially, their integration can only be satisfied if we deal explicitly with their commonality.

Second, delivering highly flexible business processes requires that business people have direct input to and control over the processes they use. Up to now, IT has had a separate and distinct set of activities (often called application development processes) designed to build IT systems support for business activities and processes. This division of work is defunct! All processes are really business processes, and increasingly business people can and want to do them themselves. In BI, we've seen this first in spreadsheets, where users develop their own BI reports and analyses. On the internet, people increasingly "mash up" their own applications; this approach is also becoming prevalent in the business arena.

The bottom line is that business process implementation and automation is undergoing radical change and IT departments and, in particular, BI practitioners need to urgently address how this change affects their systems and roles and what they must do about it. I'll be discussing these issues at three seminars I'm presenting on the transformation of BI into Enterprise IT Integration in Europe: a half day each in Copenhagen and Helsinki (4 and 5 April) and a full two-day deep dive in Rome (11-12 April).

Posted March 23, 2011 3:01 AM
Permalink | No Comments |
Autumn has arrived here in Cape Town and with it a slight chill in the air.  In a couple of weeks, I head up to Europe where spring is warming the land.  It's a time of transitions.  And so it is too for the IT industry and BI in particular.

I'm presenting three seminars on the transformation of BI into Enterprise IT Integration in Europe: a half day each in Copenhagen and Helsinki (4 and 5 April) and a full two-day deep dive in Rome (11-12 April).  I started teaching these classes in early 2010, and as I review and update the contents, I'm struck by two impressions.  First, a lot has changed in the BI / Data Warehousing market in the past year.  Second, the new architecture, BI2, that I've been developing since 2009 continues to closely map the emerging technology. That's hugely gratifying!

So, what are the big changes?  One is the exploding interest in "Big Data".  Now, as I've discussed elsewhere, the name is a total misnomer--the stuff is neither necessarily big nor data in the strict meaning of the words--but, the interesting thing is how the concepts and technologies around this area have energized thinking about BI.  Pre-defined reports have become even less interesting in the face of innovative data mining and predictive analytics in the big data environment.  And the information we can use to gain analytic value has expanded in variety and volumes beyond our wildest dreams.

Another important change has been the acquisitions of innovative start-ups, particularly in the analytic appliances field by both BI and other-than-BI giants--Netezza by IBM, Vertica by HP and Sybase by SAP to name a few. The acquisitions highlight the importance of these emerging technologies in a sea change in the way BI will be and increasingly is being delivered.

The third change is at the hardware level.  While much more is going on, two aspects are important to BI: the increasing number of cores in commodity processors and the move towards silicon-based data storage, whether on solid state disks or in memory.  Both of these trends suggest that internal database design approaches established since the 1970s are due for a big shake-up.

None of these trends is particularly new.  But their speed-up and convergence in the past year or so are rapidly pushing the BI environment to a tipping point--the architecture we've successfully used for over twenty years is rapidly being made irrelevant!  That architecture, dating from a 1988 paper that I co-authored, was based on business needs and technological solutions that were appropriate for their time.  Today's bus-tech ecosystem is so different that I have been arguing for a number of years that we need to look beyond our current silos of operational systems, BI and office/collaborative systems to a new architecture that crosses all of IT.  This architecture, which I call Business Integrated Insight, proposes that we treat the entire information asset of the business as a single, logical entity--the Business Information Resource (BIR).

Such a change of focus has widespread implications for the entire IT support infrastructure.  The challenges are big, but the technology as it is emerging is enabling this change.  My European seminars describe the new architecture in depth, but, more importantly, link what needs to be done to current technologies and provide practical guidance on how to begin the transformation.

Posted March 15, 2011 9:15 AM
Permalink | No Comments |
Wow!  Is this some sort of record?  No, I'm not talking about the acquisition just yet.  I'm actually referring to my posting two blog entries within a couple of days.

Given my time zone in Cape Town, I saw the announcement yesterday evening and have had a night to sleep on it, and let a few other analysts give their first impressions, before having to come up with a considered opinion.  Of course, that doesn't make it any more than an opinion!

I've been somewhat negative in the past about a number of acquisitions, particularly SAP's purchase of Sybase and HP's acquisition of Vertica, especially around the implications for innovation in the broad business intelligence space.  That opinion stems from my view of relative sizes, market positions, skill sets, cultures, and ambitions of the buyer and the bought.  In the above cases, particularly, I felt the acquisitions would have a negative impact on innovation, not only in the short term (which, I believe, is the case in almost all acquisitions), but also in the medium and longer term.  I'm considerably more sanguine regarding innovation and other matters when it comes to the Teradata / Aster Data deal.

In their analyst call yesterday, the two companies provided very little information on the technical integration or roadmaps of their products and focused on a very broad market positioning.  To me, this is unsurprising.  Having been through the acquisition process myself in a large company, I know that, for legal reasons, only a very limited number of people are privy to the discussions before the announcement.  This is unfortunate, because broader company wisdom and collaborative input on architectural and technical possibilities, organizational approaches, and more is generally excluded from the initial decision.  This is hardly good strategic decision-making practice; gut feel, high-level vision and personal contacts probably play too strongly.

That said, what I know about the two companies involved leads me to believe that a sensible and, indeed, exciting roadmap can be developed over the coming weeks before the deal closes.  Of course, I'm making some assumptions about the thinking involved.  If it's missing, I happily offer this post as strategic input :-)

Before going further, there is a general market problem around nomenclature that makes it difficult to discuss the technical information aspects of this deal.  I'm referring, of course, to the ill-defined and much abused phrases "big data", "structured data", "unstructured information", and so on.  I've been trying to introduce the terms "hard" and "soft information" for some time now but I'm beginning to feel that these may also prove inadequate.  Sigh!  Another piece of definitional work needs to be done...

With that, back to the business in hand.  I see this acquisition as being all about positioning the traditional world of enterprise data warehousing relative to the emerging world of so-called big data.  Big data is not necessarily big at all and, more to the point, includes a dog's dinner of information from different sources and with disparate characteristics.  For the moment, let's call it "nontraditional information" and characterize it as the information that people like to process with MapReduce, Hadoop and its associated menagerie of open source tools and techniques.  Teradata and Aster Data, like most database vendors, have a strong interest in the emerging MapReduce market.  Both have strong partnerships with Cloudera.  But, the most interesting point for Teradata, I suspect, is Aster Data's patent-pending integrated SQL-MapReduce function.  Porting that function into the Teradata database (assuming that's possible) would provide much-needed, seamless integration between the traditional EDW and nontraditional information.
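For readers less familiar with the programming model in question, the MapReduce pattern itself is easy to sketch.  The toy Python below illustrates only the generic map / shuffle / reduce phases, not Aster Data's SQL-MapReduce implementation; the record fields and function names are invented for the example.

```python
from collections import defaultdict

def map_phase(records, mapper):
    """Apply the user-supplied mapper to every input record,
    emitting (key, value) pairs."""
    for record in records:
        yield from mapper(record)

def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    """Apply the user-supplied reducer to each key's list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}

# Example: total page views per region from clickstream-style rows.
rows = [
    {"region": "EMEA", "views": 3},
    {"region": "AMER", "views": 5},
    {"region": "EMEA", "views": 2},
]
mapper = lambda row: [(row["region"], row["views"])]
reducer = lambda key, values: sum(values)

totals = reduce_phase(shuffle(map_phase(rows, mapper)), reducer)
print(totals)  # {'EMEA': 5, 'AMER': 5}
```

The appeal of embedding this pattern in SQL is that the mapper and reducer become functions invoked from within a query, so the same engine that runs the warehouse joins can also run the nontraditional analysis.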

Aster Data's other key selling point has been the concept of bringing the function to the data, arguing that creating multiple copies of data for analysis through different tools is an expensive and dangerous process.  Their answer has been in-database analytics.  The concept of minimizing copies of data is powerful, and is an approach that I have been preaching for some years.  It is also part of Teradata's philosophy.  Therefore, and again subject to technical feasibility, I imagine that Teradata will look to port in-database analytics into the Teradata DBMS.

None of this is to say that the Aster Data database will disappear soon.  But it would make sense that its longevity will depend on its ability to attract further customers whose requirements, technical or otherwise, could not be met by a Teradata DBMS upgraded with the functions mentioned above.

Beyond the practicalities of porting function from Aster Data into the Teradata platform--which, at the end of the day, is only code, after all--what I see in this acquisition is two companies coming together with a shared understanding of data warehousing, analytics, and the emerging "big data" environment, who have demonstrated ongoing innovation in their separate existences.  Provided they abandon the Treaty of Tordesillas kind of positioning described by Curt Monash, I believe this acquisition has a better than average chance of succeeding.

Posted March 4, 2011 8:09 AM
Permalink | 2 Comments |
"Quickly Watson, get your service revolver!"  Is Watson about to put business intelligence out of its misery?  Is the good doctor about to surpass Sherlock Holmes in his ability to solve life's enduring mysteries?  Or are we in jeopardy of falling into another artificial intelligence rabbit hole?

Yes, I know.  Although I haven't found a reference to prove it, I'm pretty sure that IBM Watson, the computer that recently won "Jeopardy!" is named after one of the founding fathers of IBM--Thomas J. Watson Sr. or Jr.--rather than Sherlock Holmes' sidekick.  But, the questions above remain highly relevant.

IBM Watson is, of course, an interesting beast.  The emphasis in the popular press has been on the physical technology specs--10 refrigerator-sized cabinets containing approximately 3,000 CPU cores, 15 TB of RAM and 500 GB of disk running at about 80 teraflops, and cooled by two industrial air-conditioning units.  But, in comparison to some of today's "big data" implementations, IBM Watson is pretty insignificant.  eBay, for example, is running up to 20 petabytes of storage.  As of 2010, Facebook's Hadoop cluster was running on 2300 servers with over 150,000 cores and 64 TB of memory between them.  The world's current (Chinese) supercomputer champion is running at 2.5 petaflops.

On the other hand, a perhaps more telling comparison is to the size and energy consumption of the human brain that Watson beat, but certainly did not outclass, in the quiz show!

However, what's really more interesting from a business intelligence viewpoint is the information stored, the architecture employed and the effort expended in optimizing the processing and population of the information store.  

We know from the type of knowledge needed in Jeopardy! and, indeed, from the possible future applications of the technology discussed by IBM that the raw information input to the system was largely unstructured, or soft information, as I prefer to call it.  During the game, Watson was disconnected from the Internet, so its entire knowledge base was only 500 GB in size.  This suggests the use of some very effective artificial intelligence and learning techniques to condense a much larger natural language information base to a much more compact and usable structure prior to the game.  Over a period of more than four years, IBM researchers developed DeepQA, a massively parallel, probabilistic, evidence-based architecture that enables Watson to extract and structure meaning from standard textbooks, encyclopedias and other documents.  When we recall that the natural language used in such documents contains implicit meaning, is highly contextual, and often ambiguous or imprecise, we can begin to appreciate the scale of the achievement.  A wide variety of AI techniques, such as temporal reasoning, statistical paraphrasing, and geospatial reasoning, were used extensively in this process.

Dr. David Ferrucci, leader of the research project, states that no database of questions and answers was used nor was a formal model of the world created in the project.  However, he does say that structured data and knowledge bases were used as background knowledge for the required natural language processing.  It makes sense to me that such knowledge, previously gathered from human experts, would be needed to contextualize and disambiguate the much larger natural language sources as Watson pre-processed them.  Watson's success in the game suggests to me that IBM have succeeded in using existing human expertise, probably gathered in previous AI tools, to seed a much larger automated knowledge mining process.  If so, we are on the cusp of an enormous leap in our ability to reliably extract meaning and context from soft information and to use it in ways long envisaged by proponents of artificial intelligence.

What this means for traditional business intelligence is a moot point.  Our focus and experience is directed mainly towards structured, or hard, data.  By definition, such data has already been processed to remove or minimize ambiguity in context or content by creating and maintaining a separate metadata store, as I've described elsewhere.  

However, there is no doubt that the major growth area for business intelligence over the coming years is soft information, which, according to IDC, is growing at over 60% compound annual growth rate, about three times as fast as hard information, and which already accounts for over 95% of the information stored in enterprises.  It is in this area, I believe, that Watson will make an enormous impact as the technology, already based on the open-source Apache UIMA (Unstructured Information Management Architecture), moves from research to full-fledged production.  There already exists a significant pent-up demand to gain business advantage by mining and analyzing such information.  Progress in releasing the value tied up in soft information has been slowed by a lack of appropriate technology.  That is something that Watson and its successors will certainly change.
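To see why that growth differential matters, a quick compounding calculation helps.  The sketch below is my own back-of-envelope arithmetic, assuming soft information starts at 95 units of every 100 stored (per the IDC share cited above) and grows at 60% a year, while hard information starts at 5 units and grows at 20% (one third of 60%, per "three times as fast"):

```python
def project(volume, cagr, years):
    """Compound a starting volume forward at a constant annual growth rate."""
    return volume * (1 + cagr) ** years

soft = project(95.0, 0.60, 5)  # soft information: 95 units, 60% CAGR
hard = project(5.0, 0.20, 5)   # hard data: 5 units, assumed 20% CAGR

print(round(soft), round(hard))                      # 996 12
print(round(soft / (soft + hard) * 100))             # 99
```

On those assumptions, five years out the soft store has grown more than tenfold and its share of the total has climbed from 95% to roughly 99%; whatever the exact rates, the imbalance only widens.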

While I have focused so far on the knowledge/information aspects of Watson--that being probably the most relevant aspect for BI experts--there is one other key feature of the technology that should be emphasized.  That is Watson's ability to parse and understand the sort of questions posed in everyday English, with all their implicit assumptions and inherent context.  Despite appearances to the contrary in the game show, Watson was not responding to the spoken questions from the quiz master; the computer had no audio input, so the exact same questions were passed to it as text as were heard by the human contestants.  In fact, speech recognition technology has also advanced significantly, to the stage where very high levels of accuracy can be achieved.  (As an aside, I use this technology myself extensively and successfully for all my writing...)  The opportunities that this affords in simplifying business users' communication with computers are immense.

It seems likely that over the next few years this combination of technologies will empower business users to ask the sort of questions that they've always dreamed of, and perhaps haven't even dreamed of yet.  They will gain access, albeit indirectly, to a store of information far in excess of what any human mind can hope to amass in a lifetime.  And they will receive answers based directly on the sum total of all that information, seeded by the expertise of renowned authorities in their respective fields and analyzed by highly structured and logic-based methods.

Of course, there is a danger: should a given answer happen to be incorrect, it is difficult to see how the business user would discover that error or be able to figure out why it had been generated.

And that, as Sherlock Holmes never said, is far from "Elementary, my dear Watson!"

Posted March 3, 2011 7:34 AM
Permalink | No Comments |