Blog: Barry Devlin

Barry Devlin

As one of the founders of data warehousing back in the mid-1980s, a question I increasingly ask myself over 25 years later is: Are our prior architectural and design decisions still relevant in the light of today's business needs and technological advances? I'll pose this and related questions in this blog as I see industry announcements and changes in the way businesses make decisions. I'd love to hear your answers and, indeed, questions in the same vein.

About the author

Dr. Barry Devlin is among the foremost authorities in the world on business insight and data warehousing. He was responsible for the definition of IBM's data warehouse architecture in the mid '80s and authored the first paper on the topic in the IBM Systems Journal in 1988. He is a widely respected consultant and lecturer on this and related topics, and author of the comprehensive book Data Warehouse: From Architecture to Implementation.

Barry's interest today extends to the wider field of a fully integrated business, spanning informational, operational and collaborative environments and, in particular, how to present the end user with a holistic experience of the business through IT. These aims, and a growing conviction that the original data warehouse architecture struggles to meet modern business needs for near real-time business intelligence (BI) and support for big data, drove Barry’s latest book, Business unIntelligence: Insight and Innovation Beyond Analytics, now available in print and eBook editions.

Barry has worked in the IT industry for more than 30 years, mainly as a Distinguished Engineer for IBM in Dublin, Ireland. He is now founder and principal of 9sight Consulting, specializing in the human, organizational and IT implications and design of deep business insight solutions.

Editor's Note: Find more articles and resources in Barry's BeyeNETWORK Expert Channel and blog. Be sure to visit today!

Recently in EDW Category

Business users of information have increasingly high expectations these days.  Not only do they want relevant information, irrespective of source and consistent across multiple sources, but they also want it up to the minute.  Such demands require a new approach to Enterprise IT Architecture, and nothing less!

And yet, the question of how to create such a consistent, integrated information resource almost begs a simplistic answer.  If that's what you want, you must stop creating duplicates of existing information that have to be managed to consistency in ever shorter time windows, and you must eliminate--or, at the very least, substantially reduce--existing data duplication.

The original data warehouse architecture from 1988 showed the way. It proposed a logically single data store--the Business Data Warehouse--modeled at the enterprise level as the consistent and integrated source of all information for decision making. This simplicity was ultimately lost with the emergence of the layered architecture (with multiple data marts fed from an enterprise data warehouse), due to a combination of database performance and enterprise modeling issues.

Nonetheless, the approach remains valid for the current much-expanded needs for integration. First, model all the information according to an enterprise-level model and then implement as far as possible in alignment with that model. This is the approach proposed in a new architecture, Business Integrated Insight (BI2), which for the first time gathers all the information of the enterprise (hard and soft; operational, informational and collaborative) into a single component called the Business Information Resource...

Read more in my just-published article and, if you're in the vicinity, come to my two-day seminar in Rome on 15-16 April :-)

Posted April 1, 2010 11:16 AM

Columnar databases, especially those with an MPP approach, have been notching up impressive query performance figures, showing gains of 100X and more over the traditional players.  Such figures make great press releases, but they do place the emphasis on using these databases as data marts rather than as the enterprise data warehouse (EDW) component of the data warehouse architecture.  Focusing on data marts and very specific business intelligence applications makes a lot of sense for new market entrants and smaller players in the DW space, allowing quick wins and easily understood sales messages.

But I have been convinced for some time now of the much greater potential such performance unleashes in the broader and more complex EDW environment.  The vendors have been fairly quiet about this part of the market so far, maybe preferring to leave these more technically and politically complex projects to the big guys.  So, it was good to see Vertica's 4.0 announcement last week beginning to address the EDW market with its emphasis on "enterprise ready" and a number of interesting new features and expansions of old functions.

Robust workload and resource management for mixed workloads is a prerequisite for an EDW.  Vertica's introduction of administrator-defined resource pools, with memory-usage, priority and concurrency settings, and the assignment of users to these pools is a big step in this direction.  A rework of the optimizer to underpin this and other features suggests that Vertica is serious about mixed-workload support.
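To make the idea of resource pools concrete, here is a minimal Python sketch of how memory, priority and concurrency limits might interact when users are assigned to pools. It is a toy model of the general technique and not Vertica's implementation; the pool names, user names, limits and the convention that a higher priority number is served first are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ResourcePool:
    """Toy model of an administrator-defined pool (not Vertica's implementation)."""
    name: str
    memory_mb: int        # memory budget shared by all queries in the pool
    priority: int         # assumption: a higher value is served first
    max_concurrency: int  # maximum number of queries running at once
    running: int = 0
    used_mb: int = 0

    def try_admit(self, query_mb: int) -> bool:
        """Admit a query only if both the concurrency and memory limits allow it."""
        if self.running >= self.max_concurrency:
            return False
        if self.used_mb + query_mb > self.memory_mb:
            return False
        self.running += 1
        self.used_mb += query_mb
        return True

    def release(self, query_mb: int) -> None:
        """Return a finished query's slot and memory to the pool."""
        self.running -= 1
        self.used_mb -= query_mb


# Hypothetical pools and user assignments, invented for this example.
pools = {
    "dashboards": ResourcePool("dashboards", memory_mb=2048, priority=10, max_concurrency=2),
    "batch_etl": ResourcePool("batch_etl", memory_mb=8192, priority=1, max_concurrency=1),
}
user_pool = {"alice": "dashboards", "etl_service": "batch_etl"}


def submit(user: str, query_mb: int) -> bool:
    """Route a user's query to the pool that user is assigned to."""
    return pools[user_pool[user]].try_admit(query_mb)


def pools_by_priority() -> list:
    """Order in which competing pools would be served in this toy model."""
    return sorted(pools.values(), key=lambda p: p.priority, reverse=True)
```

The point of the sketch is that a long-running load job and a set of interactive queries are throttled independently, which is what a mixed EDW workload requires.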

Also introduced in V4.0 is a newly optimized single-record lookup on primary keys.  While aimed at a particular financial analysis use case, this function shows that the database can do more than just crunch columns.  Taken together with the FlexStore feature introduced in V3.5, in which newly loaded data is kept in row format in memory for some period of time, I believe we're seeing the database's growing ability to handle the sort of record-level processing often needed in EDWs.  The new time-series support in V4.0 also plays directly to EDW needs.
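The combination of a row-format buffer for fresh data and a columnar layout for older data can be illustrated with a small Python sketch. This is a toy under stated assumptions, not a description of how Vertica or FlexStore works internally: the class name HybridStore, the explicit flush step and the in-memory key index are all hypothetical, and primary keys are assumed to be unique.

```python
class HybridStore:
    """Toy table: new rows land in a row-format buffer, then move to columns."""

    def __init__(self, columns, key):
        self.columns = list(columns)
        self.key = key
        self.row_buffer = {}                               # key -> row dict (recent data)
        self.column_data = {c: [] for c in self.columns}   # one list per column
        self.key_index = {}                                # key -> position in the column lists

    def insert(self, row):
        """Newly loaded data is kept whole, in row format."""
        self.row_buffer[row[self.key]] = dict(row)

    def flush(self):
        """Move buffered rows into the columnar layout and index their keys."""
        for k, row in self.row_buffer.items():
            self.key_index[k] = len(self.column_data[self.key])
            for c in self.columns:
                self.column_data[c].append(row[c])
        self.row_buffer.clear()

    def lookup(self, k):
        """Single-record lookup by primary key, checking the row buffer first."""
        if k in self.row_buffer:
            return dict(self.row_buffer[k])
        pos = self.key_index.get(k)
        if pos is None:
            return None
        return {c: self.column_data[c][pos] for c in self.columns}

    def column_sum(self, column):
        """Analytic scan over one column, covering flushed and buffered data."""
        buffered = sum(r[column] for r in self.row_buffer.values())
        return sum(self.column_data[column]) + buffered
```

In this sketch a record-level lookup touches either one buffered row or one position in each column, while an aggregate reads a single column list, which is the mix of access patterns an EDW tends to need.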

Time and customer experience will, of course, show whether I'm correct, but it seems to me that Vertica is beginning to test my assertion that columnar, MPP databases can be applied to EDWs and, further, that their performance characteristics offer the possibility of re-architecting the EDW / data mart divide.

Posted March 3, 2010 11:17 AM