Blog: Barry Devlin

Barry Devlin

As one of the founders of data warehousing back in the mid-1980s, a question I increasingly ask myself over 25 years later is: Are our prior architectural and design decisions still relevant in the light of today's business needs and technological advances? I'll pose this and related questions in this blog as I see industry announcements and changes in the way businesses make decisions. I'd love to hear your answers and, indeed, questions in the same vein.

About the author

Dr. Barry Devlin is among the foremost authorities in the world on business insight and data warehousing. He was responsible for the definition of IBM's data warehouse architecture in the mid '80s and authored the first paper on the topic in the IBM Systems Journal in 1988. He is a widely respected consultant and lecturer on this and related topics, and author of the comprehensive book Data Warehouse: From Architecture to Implementation.

Barry's interest today extends to the wider field of a fully integrated business, covering informational, operational and collaborative environments and, in particular, how to present the end user with an holistic experience of the business through IT. These aims, and a growing conviction that the original data warehouse architecture struggles to meet modern business needs for near real-time business intelligence (BI) and support for big data, drove Barry's latest book, Business unIntelligence: Insight and Innovation Beyond Analytics, now available in print and eBook editions.

Barry has worked in the IT industry for more than 30 years, mainly as a Distinguished Engineer for IBM in Dublin, Ireland. He is now founder and principal of 9sight Consulting, specializing in the human, organizational and IT implications and design of deep business insight solutions.

Editor's Note: Find more articles and resources in Barry's BeyeNETWORK Expert Channel and blog. Be sure to visit today!

August 2010 Archives

Yesterday's Information Management webinar, "Information Overload? 3 ways real-time information is changing decision management," hosted by Eric Kavanagh, got me into a philosophical mood.  Robin Bloor of the Bloor Group and David Olson of Progress Software provided a fascinating overview of the development of the complex event processing (CEP) market and its increasing importance for business competitiveness and, perhaps, even survival.

Robin's message was twofold.  From a business viewpoint, decision making is moving from reactive to predictive, driven by competitive pressure to understand upcoming market opportunities right out to the edge of what can be foreseen.  In technology, the exponential growth in processor speed and, indeed, across most of the hardware infrastructure is driving or enabling application architectures from batch orientation, through transaction processing, into real-time event handling.  David provided some interesting examples of how CEP is being used by his customers in manufacturing, logistics and airlines to monitor business events in real time, to react earlier to changing circumstances and to drive process improvement.  Their bottom line: CEP is a major architectural transition that is rapidly becoming mainstream; if you're not on board, you risk severe competitive disadvantage.

From one point of view, the message makes sense.  It's yet another twist of the screw towards speedier decision making.  Operational BI promotes the use of near real-time data, either copied into the warehouse or accessed in transaction-processing systems via federated query.  CEP goes one step further and says let's access and analyze the data as it flows through the network; we need to make decisions before we land the data on a disk, if we even land it at all.  In the financial markets, with the data volumes and reaction speeds involved, the approach seemed like a no-brainer once the technology became available.  In financial systems, CEP enables high-value applications such as fraud detection in credit card transactions.
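To make that distinction concrete, here's a minimal sketch of the CEP idea in Python--entirely my own illustration, not any vendor's engine--in which each event is evaluated against a small in-memory sliding window and a decision fires before anything is written to disk.  The event shape, window length and spending threshold are all invented for the example.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timedelta

# Illustrative event shape; a real feed would carry far more attributes.
@dataclass
class CardTransaction:
    card_id: str
    amount: float
    timestamp: datetime

class FraudWindow:
    """Hold a short in-memory window per card and flag bursts of spending.
    Window length and threshold are arbitrary, purely for illustration."""

    def __init__(self, window_seconds=60, max_total=5000.0):
        self.window = timedelta(seconds=window_seconds)
        self.max_total = max_total
        self.recent = {}  # card_id -> deque of (timestamp, amount)

    def on_event(self, tx):
        q = self.recent.setdefault(tx.card_id, deque())
        q.append((tx.timestamp, tx.amount))
        # Evict transactions that have slid out of the window.
        while q and tx.timestamp - q[0][0] > self.window:
            q.popleft()
        # Decide on the event in flight; nothing has been landed on disk.
        if sum(amount for _, amount in q) > self.max_total:
            return f"ALERT: card {tx.card_id} spent over {self.max_total} within {self.window}"
        return None

# Feed events to the engine as they arrive from the network.
engine = FraudWindow()
now = datetime.now()
for i, amount in enumerate([1200.0, 2500.0, 1800.0]):
    alert = engine.on_event(CardTransaction("card-42", amount, now + timedelta(seconds=i)))
    if alert:
        print(alert)
```

The point is simply that the data is acted on as it flows past; whether it is ever persisted for later analysis is a separate decision.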

However, looking at some of the applications presented in the webinar, operational BI has also been used effectively to solve similar business issues.  The boundary between the more traditional operational BI approach and CEP depends on the required speed of decision making and the volumes of events involved.  CEP certainly extends the high end of pattern recognition and trend detection to higher speeds and volumes.  And in the middle range, it provides another set of implementation options alongside operational BI.

So, what were my philosophical musings?  Robin presented a very interesting scale of human decision-making timescales, from months and years at one extreme to one tenth of a second at the other.  That latter number is the fastest human reaction time and, by the way, slower than a cobra's strike!  CEP and, to some extent, operational BI operate in the range of decision speeds faster than one tenth of a second: that is, entirely beneath human radar.  While it is clear that some decisions--collision avoidance on the highway, for example--naturally fall in this timescale, my concern is the implications that arise from pushing more and more decisions into this realm and, by definition, beyond human oversight.  We've already seen the consequences of this approach in the financial markets, where computer-based trading has driven wild, unpredictable and potentially dangerous swings.  Decision-making algorithms are only as good as the assumptions that have been encoded in them, which depend, in turn, on the knowledge available and the business requirements--both explicit and implicit--when they were created.  Is it really sensible to design systems that unnecessarily exclude human wisdom?

The current business mindset that competitiveness is next to godliness is, in many cases, driving decision making into tighter and tighter circles, removing wisdom, insight, intuition and basic humanity from the loop.  Are we prepared to learn any lessons from the recent financial market fluctuations?  And is it wise to arbitrarily remove more and more important decision making from human oversight just because the technology is available?  Just asking...

Posted August 27, 2010 7:05 AM
Permalink | 4 Comments |
Pervasive Software presented at the Boulder BI Brain Trust (BBBT) last Friday, August 13.  What caught my attention was their DataRush product and technology, and particularly the technological driver behind it.  For a brief overview of the other aspects of Pervasive covered, check out Richard Hackathorn's blog on the day.

But, back to DataRush.  DataRush was originally conceived as a redesign of Pervasive's data integration tool, acquired from Data Junction in 2003.  However, it was soon recognized that the underlying function could be applied to other data-intensive tasks such as analytics.  Pervasive CTO Mike Hoskins described DataRush as a toolkit and engine that lets ordinary programmers create parallel-processing applications simply and easily, designing them with data flow techniques and without having to worry about the complexities of parallel-processing design, such as timing and synchronization between parallel tasks.
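To give a flavour of the data flow style Mike described--this is a rough, generic sketch in Python, not DataRush's actual API, and all the names are my own--the developer wires simple operators into a pipeline while a generic runner owns the threads, queues and synchronization:

```python
import queue
import threading

SENTINEL = object()  # marks the end of the stream

def stage(fn, inbox, outbox):
    """Generic operator runner: the 'engine' owns threading and hand-off;
    the developer only supplies fn."""
    for item in iter(inbox.get, SENTINEL):
        result = fn(item)
        if result is not None:
            outbox.put(result)
    outbox.put(SENTINEL)

def run_pipeline(data, *fns):
    """Wire the supplied functions into a linear data flow graph:
    one thread per operator, queues between them."""
    qs = [queue.Queue() for _ in range(len(fns) + 1)]
    threads = [threading.Thread(target=stage, args=(fn, qs[i], qs[i + 1]))
               for i, fn in enumerate(fns)]
    for t in threads:
        t.start()
    for item in data:
        qs[0].put(item)
    qs[0].put(SENTINEL)
    results = list(iter(qs[-1].get, SENTINEL))
    for t in threads:
        t.join()
    return results

# The developer's view: declare what each operator does, not how it is scheduled.
output = run_pipeline(
    ["42,EUR", "17,USD", "nonsense"],
    lambda line: tuple(line.split(",")) if "," in line else None,  # parse / filter
    lambda rec: (int(rec[0]) * 2, rec[1]),                         # transform
)
print(output)  # [(84, 'EUR'), (34, 'USD')]
```

The application code never touches a lock or a thread directly; the engine decides how the operators are scheduled across cores, which is exactly the complexity DataRush aims to hide.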

Now, of course, there's nothing new about parallel processing or the inherent difficulties it presents to programmers.  It's been at the heart of large-scale data warehousing, particularly through the use of MPP (massively parallel processing) systems, for a number of years.  Mike's point, however, was that parallel processing is about to go mainstream.  The technology shift enabling that has been underway for a few years now--the growing availability of multi-core processors and servers since the mid-2000s.  Four-core processors are already common on desktop machines, while processors with 32 cores and more are available for servers.  Multiply that by the number of sockets in a typical server, and you have massive parallelism in a single box--if you can use it.  The problem is that with existing applications designed for serial processing, the only benefit to be gained from such multi-core servers at present is in supporting multiple concurrent users or tasks, or in what's known as "embarrassingly parallel" applications where there are no inter-task dependencies.  DataRush's claim to fame is that it moves data-intensive parallel processing from high-end, expensive and complex MPP clusters and specialist programmers to commodity, inexpensive and simple SMP multi-core servers and ordinary developers.
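For readers new to the term, "embarrassingly parallel" simply means that each unit of work is independent of every other, so it can be fanned out across cores with no coordination at all.  A tiny, purely illustrative Python sketch (the scoring function and data are invented):

```python
from concurrent.futures import ProcessPoolExecutor

def score(row):
    # Each row is scored independently: no inter-task dependencies,
    # so the rows can be fanned out across cores with no coordination.
    return sum(x * x for x in row)

if __name__ == "__main__":
    rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    # The pool spreads the rows across worker processes, one per core by default.
    with ProcessPoolExecutor() as pool:
        print(list(pool.map(score, rows)))  # [14, 77, 194]
```

Typical BI workloads--joins, aggregations, sorts--do have inter-task dependencies, which is where the timing and synchronization problems mentioned above really bite.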

Of course, Pervasive is not alone in trying to tackle the issues involved in software development for parallel-processing environments.  But their approach, coming from the large-scale data integration environment, makes a lot of sense in BI.

However, the really significant implications only become apparent when we view this development in the context of other technological advances.  Solid-state disks (SSDs) are emerging, and the growing sizes and dropping costs of main memory remove or reduce the traditional disk I/O bottleneck.  The decades-old supremacy of traditional relational databases is being challenged by a variety of different structures, some broadly relational and others distinctly not.  Add to this the explosive growth of data volumes, especially soft or "unstructured" information.  Pervasive, along with other small and medium-sized software vendors, is pushing information processing to an entirely new level.


Posted August 18, 2010 7:58 AM
Permalink | No Comments |
Wow - what a saga!  I published the first article of the series "From Business Intelligence to Enterprise IT Architecture" in March; part 5 has just appeared, part 6 is under construction, and I can see at least another couple to come.  So, what's it all about?

Business Integrated Insight* (BI2 - BI to the power of 2) is an architectural effort I've undertaken to extend and update the long-serving data warehouse / business intelligence architecture.  Why?  It's very clear to me that the business requirements and expectations for business intelligence have expanded dramatically over the past ten years.  And that expansion has been into areas more usually thought of as operational applications and collaborative systems.  Both of these areas have also encroached into BI.  So, requirement boundaries have blurred significantly, and an architecture that was defined in the 1980s, based on the then-current definition of decision support, could clearly do with a substantial overhaul.

One approach could be to try to address the changed needs from within the boundaries of data warehousing, as Bill Inmon, for example, has done with DW 2.0™.  However, in my opinion, this effort has too narrow a focus.  Today's business users and decision-makers come from an entirely different generation from those who were the target audience of the original data warehouse efforts in the 1980s (Devlin & Murphy, 1988).  Modern business users are tech-savvy, internet-aware, social networking animals.  Technology boundaries such as operational / informational / collaborative are simply not their scene.  My belief is that any new architecture for business intelligence must, by definition, extend over the entire IT infrastructure for business.

Is this a tall order?  Well, it's certainly ambitious.  But so was data warehousing way back in the mid-1980s, when relational databases were still young.  However, the technology available today has advanced by leaps and bounds since then, especially in the last few years.  Take a look at columnar / parallel databases such as Netezza, Vertica, ParAccel, Aster Data, Infobright and more, not to mention Oracle's Exadata V2.  While the hype is about query speed and large data volumes, the potential is surely for a paradigm shift (there! I've said it) in data storage architectures.  The same drive is evident in the mushrooming interest in unstructured information, which I prefer to call soft information.  I wrote recently about Attivio, who are building a bridge between the worlds of unstructured and structured information.  From beyond the database world, SOA-like and Web 2.0 technologies are also fundamentally changing the way the old operational and collaborative environments are structured.

These technologies and tools, and more, will enable the leap to BI2 over the coming few years.  And if you know of particularly innovative solutions that are coming to market, I'd love to hear about them!

*For the record, I coined the term Business Integrated Insight and drew the first architectural diagram in an August 2009 white paper sponsored by Teradata.


Posted August 5, 2010 12:16 PM
Permalink | No Comments |