Blog: Barry Devlin

Barry Devlin

As one of the founders of data warehousing back in the mid-1980s, a question I increasingly ask myself over 25 years later is: Are our prior architectural and design decisions still relevant in the light of today's business needs and technological advances? I'll pose this and related questions in this blog as I see industry announcements and changes in the way businesses make decisions. I'd love to hear your answers and, indeed, questions in the same vein.

About the author

Dr. Barry Devlin is among the foremost authorities in the world on business insight and data warehousing. He was responsible for the definition of IBM's data warehouse architecture in the mid '80s and authored the first paper on the topic in the IBM Systems Journal in 1988. He is a widely respected consultant and lecturer on this and related topics, and author of the comprehensive book Data Warehouse: From Architecture to Implementation.

Barry's interest today spans the wider field of a fully integrated business, covering informational, operational and collaborative environments and, in particular, how to present the end user with a holistic experience of the business through IT. These aims, and a growing conviction that the original data warehouse architecture struggles to meet modern business needs for near real-time business intelligence (BI) and support for big data, drove Barry's latest book, Business unIntelligence: Insight and Innovation Beyond Analytics, now available in print and eBook editions.

Barry has worked in the IT industry for more than 30 years, mainly as a Distinguished Engineer for IBM in Dublin, Ireland. He is now founder and principal of 9sight Consulting, specializing in the human, organizational and IT implications and design of deep business insight solutions.

May 2012 Archives

[Figure: Advanced Information Warehouse (AIW.png)]

In 1988, I published the first data warehouse architecture.  Its aim was to provide consistent, integrated data to business users in support of cross-enterprise decision making.  Quality and consistency were the key drivers; at that time the major issues were that operational / transactional systems were highly inconsistent and direct access to them was discouraged for reasons of performance and security.  Business users were happy to get whatever consistent view they could, and, in general, wanted to see a stable representation of the business on a monthly, weekly or occasionally daily basis.  This architecture has remained a foundation of business intelligence ever since.

21 years later, in 2009, I introduced Business Integrated Insight (BI2).  With emerging needs like near real-time decision making in operational BI and increasing use of non-traditional data coming from Web 2.0 and other sources, this new architecture had to address a far wider scope than the original data warehouse.  While consistency and integrity remain important considerations, today's business needs are far more about instant access to the ever-changing ebb and flow of trends in sales, manufacturing and more.  It was becoming clear that a new, over-arching architecture was required to cover all the information, processes and people of the business.

Now, three years later, it's clear that traditional BI is racing to keep up with developments in big data, data virtualization and the cloud, mobile computing as well as social networking and collaboration.  All these topics were incorporated in BI2 from the outset.  Now, as the technology moves to the mainstream, we can and must dive deeper into these specific areas.  Big data leads clearly to the impossibility of routing all information through an enterprise data warehouse (EDW).  But, how will that impact our need for consistency and integrity?  I envisage we will move from the old adage of "a single version of the truth" to multiple versions depending on users' needs, with one particular version that I call "core business information" being the source of truth for external reporting and financial governance needs.

Data virtualization has also become big news in recent years.  In many ways, it's a technology whose time has come.  With the explosion of data volumes and varieties, users need ways to combine data on the fly with confidence and performance.  Data virtualization addresses these needs and increasingly overlaps with functions we traditionally associate with ETL.  The result, data integration, as it's sometimes called, enables us to envisage a future where data is made available to users as they need it, whether real-time or integrated and historicized.

And, against the background of all this upheaval in data and infrastructure, we also see a new breed of technology-savvy business users moving into positions of power.  These so-called millennials are demanding seamless, mobile access to the information they need, as well as the ability to play with it as required.  The rule of IT over the data and application resources of the organization is coming to an end.  But, that's not to say that IT has no future role.  In fact, I see more of a fully symbiotic partnership between business and IT emerging, a partnership I call the "biz-tech ecosystem".

My 2012 BI2 Seminar in Rome on 11-12 June explores these new directions and provides guidance on their introduction in your existing data warehouse environment.  It also introduces the Advanced Information Warehouse, shown above, as the next step on your journey from a traditional data warehouse to comprehensive business integrated insight.

Posted May 28, 2012 6:20 AM

Business intelligence (BI) has long been associated with relational databases and the SQL language.  From the earliest days of data warehousing, the qualities of the relational model have been highly valued in the quest for data consistency and quality.  In addition, it was assumed that business users are comfortable with tables of information.  This has been proven true, especially by spreadsheets, much to IT's chagrin.  Tables are also the lingua franca of BI tools and simple Select / Where queries are familiar to many users.  But, whatever the rationale, the association of BI and SQL is deeply embedded in the minds of most practitioners.  So, the question arises--what about NoSQL; how does this relate to BI?  Can it be of use in data warehousing?

Good questions.  But first, you need to know what flavor of NoSQL you're speaking about.  For brevity, I'll focus on only one of the five or so varieties: document-oriented data stores.  (If you are interested in the others, the bigger picture--and a trip to Rome--I suggest my two-day seminar there on 11-12 June!)  As I discovered about a year ago in a fascinating conversation with Max Schireson, president of 10gen / MongoDB, in this context a document has nothing to do with e-mail contents or Word documents; it refers to a particular data structure where records consist of an arbitrary set of fields, each a name and value pair, structured in JSON (JavaScript Object Notation) or a similar format.  For more details, refer to my white paper.  So, let me release you from your suspense now.  Can this be of use in BI?  The short answer is yes.  But to fully grasp the extent, I'd like to introduce you to two MongoDB customers and how they are easing into BI using NoSQL.
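To make the document structure described above a little more concrete, here is a minimal sketch in Python.  It uses only the standard library rather than MongoDB itself, and every record, field name and helper function in it (find, fields_in_use) is invented for illustration.  The point is simply that records in one collection need not share the same fields, yet can still be queried by name and value.

```python
import json
from collections import Counter

# Hypothetical records: all names and values are invented for illustration.
# Note that the three documents do not share the same set of fields.
raw_documents = [
    '{"author": "alice", "channel": "blog", "title": "Why dashboards fail", "tags": ["bi", "ux"]}',
    '{"author": "bob", "channel": "twitter", "text": "Loving the new release", "retweets": 12}',
    '{"author": "alice", "channel": "twitter", "text": "New post is up", "retweets": 3}',
]

# Each JSON string becomes a plain dict of name/value pairs.
documents = [json.loads(raw) for raw in raw_documents]


def find(docs, **criteria):
    """Return the documents whose fields equal every given name/value pair."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


def fields_in_use(docs):
    """Count how often each field name occurs across the collection."""
    return Counter(name for d in docs for name in d)


tweets = find(documents, channel="twitter")
field_counts = fields_in_use(documents)
```

Here find(documents, channel="twitter") returns the two tweet records even though neither has a title, and fields_in_use shows which fields are common to all records and which appear only occasionally.  A real document store adds indexing, persistence and a much richer query language on top of this basic idea.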

I spoke to David Chancogne, CTO of Traackr, a web business measuring the influence of people who blog, tweet and otherwise contribute to the impression the general public forms of brands, products and more on the web.  The goal is to help marketers and advertising agencies track and target such influencers more effectively.  Traackr has built a MongoDB database of the contents of blogs, tweets, etc. and gives its customers reports and analyses of the top influencers in their areas of interest.  Is this BI?  In its broadest sense, yes.  The scope is very specific and the queries pre-defined, but this is still BI at its most basic.  Did Chancogne think of it as BI?  Actually not, it's simply his business to provide analytics to his customers.  Probing a little deeper, I discovered that Traackr is continually trying to optimize its algorithm to rate influence.  They do this by extracting data from their database and playing with it in--wait for it--Excel!  More BI, but like many a start-up business before them, the choice of Excel was driven more by familiarity and ease of use.  Generic BI tools that run against a JSON data store, such as Pentaho's NoSQL solution and Nucleon Software's BI Studio, are beginning to appear; they allow generic querying on the data without extracting it to Excel.

A conversation with Julian Browne led to further interesting insights.  Browne is the architect of Priority Moments (a location-aware customer loyalty program that offers discounts at affiliated retailers) at O2, the second-largest provider of mobile/cell phone services in the UK, with more than 20 million customers.  MongoDB was chosen as the platform for this service largely to deal with the complexity and variability of their product catalog.  The challenge is that a bewildering variety of product sets can be offered to different customers, and that variety changes constantly at the whim of marketing.  The absence of a predefined schema, a key characteristic of document-oriented data stores, was a compelling argument for the technology choice.  But, what of BI?  Customer loyalty programs are prime BI territory, of course, and in this case tracking of uptake of offers is vital.  As with Traackr, initial BI was provided through hand-crafted Java programming, although there is growing interest in using the emerging BI tools.  Of more interest, however, is the experimental use of a specific feature of the database that allows a query to be left open: as records arrive in the database, they automatically appear in the result, which can be routed to a live HTML5 graph(1), giving real-time feedback to monitor program activity.
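The "open query" idea above can be sketched, very loosely, in a few lines of Python.  This is an in-memory analogy only: it does not call any MongoDB API, and the function name open_query, the event records and the None end-of-stream marker are all assumptions made for the example.  It shows a query that stays open and hands over each matching record as it arrives, so that a running tally can be refreshed after every hit.

```python
import queue
from collections import Counter


def open_query(incoming, predicate):
    """Yield each arriving record that matches, until a None sentinel arrives."""
    while True:
        record = incoming.get()  # blocks until the next record arrives
        if record is None:       # hypothetical end-of-stream marker
            return
        if predicate(record):
            yield record


# Simulated arrivals; in a live system another thread or process would feed
# this queue. All records are invented for illustration.
arrivals = queue.Queue()
for record in [
    {"event": "redeemed", "offer": "coffee"},
    {"event": "viewed", "offer": "coffee"},
    {"event": "redeemed", "offer": "cinema"},
    {"event": "redeemed", "offer": "coffee"},
    None,
]:
    arrivals.put(record)

uptake = Counter()
snapshots = []
for hit in open_query(arrivals, lambda r: r["event"] == "redeemed"):
    uptake[hit["offer"]] += 1
    snapshots.append(dict(uptake))  # the state a live chart would redraw from
```

Each entry in snapshots is the tally of redeemed offers at the moment a new matching record arrived, which is the kind of feed a live graph would consume.  In production the arrivals would come from the database rather than a pre-filled queue, and the loop would run for as long as the monitoring page is open.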

How would we summarize the situation regarding BI for document-oriented NoSQL databases?  What we see is a fairly recent database technology with its query facilities being used for basic, predefined BI.  As might be expected, more generic tooling for building queries is appearing.  The type of BI supported is focused, application-specific querying and reporting--the type associated with data marts in traditional BI.  This is exactly as we saw in the emergence of BI against relational databases.  Note that some of the querying is being performed against the live operational sources.  Again, we see the similarity with early reporting approaches with similar concerns about performance impacts on operations.  MongoDB addresses this through the creation of eventually consistent replicas.  Nonetheless, the demand for real-time BI continues to grow and certain classes of operational analytics will need such real-time or near real-time access.

Where NoSQL does not play a role in BI is also important.  Enterprise data warehouses (EDW), with their focus on creating consistent, integrated, historical stores of core business information, are set to remain squarely in the relational database world.  But, where operational needs drive the choice of a NoSQL document-oriented data store, it is clear that BI can flourish in this environment too.  See my latest white paper, "Business Intelligence--NoSQL... No Problem", for further details.


(1)  For background on this approach, see hummingbird and data-driven documents.

Posted May 17, 2012 3:37 AM

Donald Farmer, now of Qliktech, offered to the Boulder BI Brain Trust (BBBT) last week that what we in "BI" do is better described as decision support rather than business intelligence.  The comment was greeted by a flurry of Tweets and Grunts of agreement.  It's an observation I've also made, and for similar reasons.  In essence, BI tools support decision making; to attribute intelligence--business or otherwise--to software seems somewhat presumptuous.  And yet, there is a further problem with the term business intelligence.  It implies a level of rationality in decision making that is beyond the reality most of us encounter.  This implication is carried even further as various analysts and vendors begin to talk about business analytics as if it will be the ultimate solution to all business decision-making needs.

There are facts, we are told.  And if we have all the facts and we apply comprehensive analytics, we will discover the past, understand the present and predict the future.  We are told this is the scientific method; the truth is in the numbers.  Is this a valid way of interpreting the way the world works?  I would argue that it is so far from reality that we are in danger of creating a fantasy world worthy of Tolkien.

History, it is said, is named thus because it is "his story".  History, they say, is written by the victors.  The implications are far reaching.  Yes, there are indeed facts, but it's the stories we weave around the facts that are what really matter.  To quote Liz Greene(1): "Mehmet the Conqueror invaded Constantinople in 1453.  That is an historical fact.  But depending on which history book we read, Mehmet was either a redeemer or a cruel tyrant, a warrior for the True Faith or a vile heretic."  In terms of the story we tell ourselves about this incident and its value as a guide for the future, which part of the quote is more relevant--the historical fact or its interpretation?  And if you haven't yet looked at the endnote for the source of this quote, do so now.  And be brutally honest with yourself.  Do your beliefs about astrology affect the weight you attach to the quote?  And when I tell you that Dr. Liz Greene is also a fully trained and qualified Jungian psychoanalyst, how does that cause you to re-evaluate your judgement?

Our current obsession with analytics is dangerous.  It's based on a number of simplifications, misconceptions and downright errors.  It is a simplification that business is an entirely rational, fact-driven process.  It is a misconception that given sufficient data you can predict the future.  It is a downright error to assume that in the future, business can be entirely (or even largely) driven by business analytics.

Does that mean we should abandon analytics?  Of course not.  There are facts to gather that have so far remained undetected.  These facts can influence our interpretations.  If they are indeed relevant to the story at hand.  And if we allow them to do so.  And if our business users can avoid statistical errors such as confusing correlation with causality.  There are many examples already of significant benefits gained by businesses that adopt analytics.

The questions of relevance and abuse of statistics are ones of good analytic practice and education of users.  I have no doubt that, as we move beyond the hype phase, these issues will be addressed.  The issue of interpretation is much more difficult to tackle.  Because it is at the heart of how we imagine our decision makers behave.  Our focus on intelligence--rational and logical--obscures two other key aspects of decision making:  intent and intuition.  Intent we ignore and intuition we dismiss.  All decision making includes the intent of the decision maker.  That intention drives everything from what data is gathered, through how it is evaluated, all the way to the final choice of action.  How many decisions are post-justified by careful data selection and evaluation?  If a decision maker is motivated by personal gain (and they do exist, you know), won't analytics be enlisted to support that goal?  And regarding intuition, it is evident that not all decision contexts are wholly driven by measurable or predictable metrics.  Low prices may be important, but so too are ambience, history, ethos and personal relationships when customers choose where to shop.  Data measures for the latter are hard to define and capture.  The intuition of an experienced manager is needed in such circumstances.  Target's decision to focus marketing of maternity products on women in the early stages of pregnancy was based on sound analytics according to a story in the New York Times, but the reaction of prospective customers was intuitively obvious.

The bottom line is that we focus exclusively on big data and analytics at our peril.  We need to move beyond traditional concepts of business intelligence and decision support.  I see our goal as supporting full-spectrum business insight.

(1) Greene, L., "Apollo's Chariot - The Meaning of the Astrological Sun", CPA Press, (2001)


Posted May 2, 2012 6:56 AM