Blog: Barry Devlin

Barry Devlin

As one of the founders of data warehousing back in the mid-1980s, a question I increasingly ask myself over 25 years later is: Are our prior architectural and design decisions still relevant in the light of today's business needs and technological advances? I'll pose this and related questions in this blog as I see industry announcements and changes in the way businesses make decisions. I'd love to hear your answers and, indeed, questions in the same vein.

About the author

Dr. Barry Devlin is among the foremost authorities in the world on business insight and data warehousing. He was responsible for the definition of IBM's data warehouse architecture in the mid '80s and authored the first paper on the topic in the IBM Systems Journal in 1988. He is a widely respected consultant and lecturer on this and related topics, and author of the comprehensive book Data Warehouse: From Architecture to Implementation published by Addison-Wesley in 1997.

Over the past few years, Barry has extended his interest to cover the wider field of a fully integrated business, covering informational, operational and collaborative environments and, in particular, how to present the end user with a holistic experience of the business through IT.

Barry has worked in the IT industry for more than 25 years, mainly as a Distinguished Engineer for IBM in Dublin, Ireland. He is now founder and principal of 9sight Consulting, specializing in the human, organizational and IT implications and design of deep business insight solutions.

Editor's Note: Find more articles and resources in Barry's BeyeNETWORK Expert Channel and blog. Be sure to visit today!

Recently in Business intelligence Category

It's that time of year when every analyst worth his or her salt is making predictions for the coming year.  Acquisitions.  Big data.  Mobile BI.  Cloud.  Social media.  Predictive analytics... hey! Wait a minute!

My question is: how many of these predictions about BI 2012 are based on the use of predictive analytics?  My hunch is... none.  Perhaps I'm being unfair?  Is it predictive analytics to use all those surveys of buying intentions as input?  What about using trend numbers for market share over the past few years? 

So, here is the Wikipedia definition: "Predictive analytics is an area of statistical analysis that deals with extracting information from data and using it to predict future trends and behavior patterns. The core of predictive analytics relies on capturing relationships between explanatory variables and the predicted variables from past occurrences, and exploiting it to predict future outcomes."  What do you think?  How is the fit?
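To make the definition concrete, here is a minimal sketch of the "trend numbers for market share" case: an explanatory variable (the year) is related to a predicted variable (market share) from past occurrences, and that relationship is extrapolated one year ahead. The figures are invented purely for illustration, and a straight-line least-squares fit is only the simplest possible stand-in for what a real predictive model would do.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical market-share history (percent); not real data.
years = [2008, 2009, 2010, 2011]
share = [10.0, 12.0, 14.0, 16.0]

# Capture the past relationship, then exploit it to predict 2012.
slope, intercept = fit_line(years, share)
forecast_2012 = slope * 2012 + intercept
print(f"Projected 2012 share: {forecast_2012:.1f}%")
```

Whether extrapolating a line through four points deserves the name "predictive analytics" is, of course, exactly the question being asked.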

I'm pushing this so hard because I have long observed that in the most important decisions--whether in personal life or in business--we often trust our "gut feel", our intuition, over measurable facts.  How many of us are purely rational in our decision making?  Even if we do make a list of pros and cons, because our weighting system is at best invented on the spot or at worst non-existent, we often end up at least as confused as when we started.  Or is this just my experience?  Well, yes, because I haven't carried out that survey...

Am I saying that BI is useless?  No, not at all.  Simply, that like every other tool, it's good for some things but not for others.  And if you can accept that, it follows that we need to expand the scope of what we do in decision making support.  The name "business intelligence" seems far too rational and limited for the type of support that most decision makers need when making decisions.  For me, one of the most important trends in the past year or two has been the focus on the social aspect of decision making, as pioneered by Lyza, but now promoted by pretty much every BI vendor.  However, social networking support means a lot more than allowing comments or discussions on reports, dashboards and various visualizations.  It actually means providing a comprehensive, coherent environment where all interactions around a particular decision are recorded and tracked.  And, the last time I checked, much of that interaction occurs before the data crunching starts, and another chunk long after the BI tool has been shut down.  Decision making support needs to go far wider than the majority of BI vendors even imagine.

So, am I predicting that 2012 will be the year when BI tools finally "get it" that they are not the center of the decision maker's universe?  When BI vendors stop adding social networking bells and whistles to their tools and instead figure out how to be part of a larger Enterprise 2.0 effort unifying all social interactions around conversations, documents, analyses and more within the organization?

No, I don't think I can predict the future.  What I can say is that it's up to you to make your future, and one area where technology, both hardware and software, is rapidly improving is social networking support.  It would therefore be valuable to give some serious thought and focus in 2012 to how you could support collaborative decision making in the coming few years.


Posted December 19, 2011 10:15 AM
Permalink | 2 Comments |
I wrote a White Paper for an interesting, UK-based start-up, NeutrinoBI, back in October on the topic of freeform search in BI.  So, a new paper by Marti A. Hearst, "'Natural' Search User Interfaces", in the November issue of "Communications of the ACM" caught my attention.  I was particularly interested because Hearst has been one of the main proponents of faceted search, an approach that is relatively unsuccessful in BI.  I wondered if I had missed some new developments in the field.

In the event, Hearst's paper is somewhat futuristic.  Nonetheless, she makes some fascinating observations about how users' interaction with computers is undergoing significant change.  The first trend is a move towards voice interaction, driven in particular by smart phones.  There have been dramatic improvements in voice recognition over the past few years, and the ability of software to recognize and interpret simple commands allows users to speak more naturally.  This is driving a move away from single keyword or key phrase searches, even when typed into Google, for example.  However, more work is needed to enable reliable identification of context and the sequential inquiry approach often favored by users.  Natural language-like query has been a topic of ongoing interest in BI, and NeutrinoBI's product supports this in its query interface.

Hearst also notes the growing importance of social search, in collaboration, crowdsourcing and basic social communication.  This is a topic I'm currently developing in a new White Paper with Lyzasoft and to which I'll return in a future post.

The final point that Hearst makes is on the emergence of video as a means of communication, replacing text-based approaches among some younger users of the Web.  This poses some significant challenges for both search and analytics in the future.  In recent years, text analytics has become an important tool for sentiment analysis and so on.  "Video analytics" is still in its infancy.

Returning to NeutrinoBI, let's take a look at the exciting concept of freeform search.  Freeform search, in one sentence, enables Google-like search of highly-structured information as is found in data warehouses and data marts.  The secret of moving from keyword search of content to freeform search lies in understanding and representing the structure of relational data in a way that supports intelligent parsing of free text searches.  This structure originates in data modeling and is carried into relational database design and associated metadata--which columns occur in which tables, identification of primary and secondary keys and foreign-key relationships between tables.  Structuring, whether into normalized or multidimensional schemata, also constrains allowable values in certain columns and relationships between them.

The result is a conceptual hierarchical structure that is hardwired in the database or application in the multidimensional approach.  For example, geographical location is often expressed in the form of regions, which are comprised of countries, which are further comprised of states / provinces, which break down to counties, each of which contains a known and limited set of values.  In the case of freeform search, a process known as hierarchical value decomposition is used to automatically analyze database structures and create the indexes and metadata--the information context, unique and specific to the structure and content of the underlying data--needed to extract meaning from the key words entered by the user at search time.  The result is a system that can be easily used by business people in their own terminology and can be constructed rapidly and efficiently by IT.  Further details can be found in my White Paper "Freedom from Facets: Discovering the data you really need".
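To give a feel for the general idea, here is a toy sketch in Python.  It is emphatically not NeutrinoBI's hierarchical value decomposition, whose internals are not described here; it simply assumes a small, made-up geography dimension, indexes every distinct value by the column it occurs in, and then matches the words of a free-text query against that index to recover structured filters.  All table contents and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical dimension rows: region > country > state (illustrative only).
ROWS = [
    {"region": "Europe", "country": "Ireland", "state": "Leinster"},
    {"region": "Europe", "country": "France", "state": "Normandy"},
    {"region": "North America", "country": "USA", "state": "New York"},
    {"region": "North America", "country": "Canada", "state": "Ontario"},
]


def build_value_index(rows):
    """Map each distinct value (lower-cased) to the columns it occurs in."""
    index = defaultdict(set)
    for row in rows:
        for column, value in row.items():
            index[value.lower()].add(column)
    return index


def parse_query(query, index, max_words=3):
    """Greedily match the longest known value at each position in the query.

    Returns (filters, leftovers): filters are (column, value) pairs found in
    the index; leftovers are words that matched no known value.
    """
    words = query.lower().split()
    filters, leftovers = [], []
    i = 0
    while i < len(words):
        # Try the longest phrase first so "new york" beats "new".
        for n in range(min(max_words, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in index:
                for column in sorted(index[phrase]):
                    filters.append((column, phrase))
                i += n
                break
        else:
            # No known value starts here; keep the word for other handling.
            leftovers.append(words[i])
            i += 1
    return filters, leftovers


index = build_value_index(ROWS)
filters, leftovers = parse_query("sales new york europe", index)
print(filters)    # [('state', 'new york'), ('region', 'europe')]
print(leftovers)  # ['sales']
```

In this sketch the leftover word "sales" would still need to be resolved against measure or column names, and a value that appears in more than one column would need disambiguating; a real product has to handle both, along with the hierarchy itself.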

On the Web, a search interface is already the norm.  As Hearst describes, it is evolving into something closer to the way humans interact.  What we see on the Web usually transfers into the enterprise environment.  The function that NeutrinoBI is already delivering shows how these developments can be applied to BI and, in the process, is beginning to provide business users with a powerful, yet simple, way to get the information they need.


Posted November 30, 2011 5:10 AM
Permalink | No Comments |
Sometimes, business intelligence stories can be complex and technical.  Sometimes, even the success stories of a million shaved off expenses here or a 3% improvement in profit margin there can be a bit repetitive.  However, it is the success stories that can prove the business value and quality that characterize the best BI projects.  As one of the judges of the annual South African BI Excellence and Innovation Awards (closing date for entries is 30 November), I'd like to share with you a novel approach to making your entry stand out.  For those of you not eligible to enter, the approach also works when trying to raise funding from the business for a new BI project.  However, you may need to be very brave, even foolhardy; sometimes, you need to tell the as-is horror story to prove just how much better the to-be situation will be.

This story that I'm about to summarize comes courtesy of Teradata's Bill Franks, who posted it about 10 days ago on the Smart Data Collective blog, where it's attracting lots of attention.  In a post entitled "The Dire Consequences of Analytics Gone Wrong: Ruining Kids' Futures", Bill recounts the story of a local school that invested for the first time in an analytics package designed (allegedly) to detect cheating in English essays.  The package was run for the first time against a set of essays submitted at the start of term by a class of highly motivated and high performing students.  The package promptly reported that each and every student was a cheater, with pervasive copying and plagiarism throughout the group.  The school failed all of the students on the assignment and was about to note the offense on the students' records, an action that would have had severe consequences for their future educational chances, until the parents stepped in...

I leave you to check the rest of the story on Bill's post.  But the bottom line was that the school and the teachers trusted the results of the package more than their own prior experience of the students.  A result of stunning implausibility from the software was accepted without question, without any resort to reason or common sense.  Apparently, the school backed down on noting the offense on the students' records; but it stood by the decision to fail every student in the class on the essay.

As BI practitioners, I trust you get the message.  Analytics in the hands of naive users can be more dangerous than a Kalashnikov.  Self-service BI may speed up delivery, but how reliable are the conclusions?  There's a good reason why assault rifles are not available on open supermarket shelves (in most countries!).  I'm not against self-service BI, but I do have a problem with willful and uneducated business users.  BI must complement common sense, not substitute for it.  It's up to IT to ensure the users are informed of the strengths and weaknesses of new tooling.

To return to your entries for the BI Awards, to be presented at the BI Summit in Johannesburg on 28 February next, I hope you don't have a horror story of such magnitude, and if you did, I imagine your CEO would be reluctant to have it told in public.  But do remember that the difference between the before and after pictures is a strong indication to the judges of the value of the project to the business.  BI Excellence, beyond technical architecture, also includes data governance and user education.  BI Innovation is about changing the way users make business decisions.  Good decisions based on reliable information.

Posted November 21, 2011 5:32 AM
Permalink | No Comments |
I've previously written about IBM Watson, its success in "Jeopardy!" and some of the future applications that its developers envisaged for it.  IBM has moved the technology towards the mainstream in a number of presentations at the Information on Demand (IOD) Conference in Las Vegas last week.  While Watson works well beyond the normal bounds of BI, analyzing and reasoning in soft (unstructured) information, the underlying computer hardware is very much the same (albeit faster and bigger) as we have used since the beginnings of the computer era.

But, I was intrigued by an announcement that IBM made in August last that I came across a few weeks ago:
"18 Aug 2011: Today, IBM researchers unveiled a new generation of experimental computer chips designed to emulate the brain's abilities for perception, action and cognition. The technology could yield many orders of magnitude less power consumption and space than used in today's computers.


"In a sharp departure from traditional concepts in designing and building computers, IBM's first neurosynaptic computing chips recreate the phenomena between spiking neurons and synapses in biological systems, such as the brain, through advanced algorithms and silicon circuitry. Its first two prototype chips have already been fabricated and are currently undergoing testing.

"Called cognitive computers, systems built with these chips won't be programmed the same way traditional computers are today. Rather, cognitive computers are expected to learn through experiences, find correlations, create hypotheses, and remember - and learn from - the outcomes, mimicking the brain's structural and synaptic plasticity... The goal of SyNAPSE is to create a system that not only analyzes complex information from multiple sensory modalities at once, but also dynamically rewires itself as it interacts with its environment - all while rivaling the brain's compact size and low power usage."


Please excuse the long quote, but, for once :-), the press release says it as well as I could!  For further details and links to some fascinating videos, see here.

What reminded me of this development was another blog post from Jim Lee in Resilience Economics entitled "Why The Future Of Work Will Make Us More Human". I really like the idea of this, but I'm struggling with it on two fronts.

Quoting David Autor, an economist at MIT, Jim argues that outsourcing and "othersourcing" of jobs to other countries and machines respectively are polarizing labor markets towards opposite ends of the skills spectrum: low-paying service-oriented jobs that require personal interaction and the manipulation of machinery in unpredictable environments at one end and well-paid jobs that require creativity, ambiguity, and high levels of personal training and judgment at the other.  The center-ground - a vast swathe of mundane, repetitive work that computers do much better than us - will disappear.  These are jobs involving middle-skilled cognitive and productive activities that follow clear and easily understood procedures and can reliably be transcribed into software instructions or subcontracted to overseas labor.  This will leave two types of work for humans: "The job opportunities of the future require either high cognitive skills, or well-developed personal skills and common sense," says Lee in summary.

My first concern is the either-or in the above approach; I believe that high cognitive skills are part and parcel of well-developed personal skills and common sense.  At which end of this polarization would you place teaching, for example?  Education (in the real meaning of the word - from the Latin "to draw out" - as opposed to hammering home) spans both ends of the spectrum.

From the point of view of technology, my second concern is that our understanding of where computing will take us, even in the next few years, has been blown wide open, first by Watson and now by neurosynaptic computing.  What we've seen in Watson is a move from Boolean logic and numerically focused computing to a way of understanding and using soft information that is much closer to the way humans deal with it.  Of course, it's still far from human.  But, with an attempt to "emulate the brain's abilities for perception, action and cognition", I suspect we'll be in for some interesting developments in the next few years.  Anyone else remember HAL from "2001: A Space Odyssey"?

Posted November 1, 2011 5:30 AM
Permalink | No Comments |
In the Information part of Information Technology, Big Data is the Big Hit of 2011.  It's also a wonderful phrase to play with: take the "big", place it in front of a few other words and suddenly you have a strapline... or a blog title!  So, is it a big change for IT, or is it just big hype?

There's no doubt in my mind that big data describes a real and novel phenomenon; unfortunately, there are also many existing and well-understood phenomena in the world of business intelligence and data warehousing that are getting sucked into marketing stories and, indeed, even into respectable articles about big data.


The recent McKinsey Quarterly article "Are you ready for the era of 'big data'?" (registration required) opens with the following example: "The top marketing executive at a sizable US retailer recently [discovered that a] major competitor was steadily gaining market share across a range of profitable segments...  [This] competitor had made massive investments in its ability to collect, integrate, and analyze data from each store and every sales unit and had used this ability to run myriad real-world experiments.  At the same time, it had linked this information to suppliers' databases, making it possible to adjust prices in real time, to reorder hot-selling items automatically, and to shift items from store to store easily.  By constantly testing, bundling, synthesizing, and making information instantly available across the organization... the rival company had become a different, far nimbler type of business.  What this executive team had witnessed first hand was the game-changing effects of big data" [my emphasis].

With all due respect to the authors, I believe that anybody who has been involved in business intelligence over the past ten years will be underwhelmed by this story.  It is almost entirely a scenario, and a common one, at that, describing a pervasive data warehousing implementation and operational BI excellence.  I suspect that this example was tagged as big data because of the reference to running myriad real-world experiments.  This is a behavior often associated with big data; however, on its own, it is generally not a sufficient characteristic.

The remainder of the article provides many interesting examples and possible consequences, both beneficial and cautionary, of using big data.  For the business executive, it clearly whets the appetite.  But, from an IT perspective, it misses a key aspect--a viable definition of what big data really is.  This is hardly surprising; big data has reached the point on the hype curve where definitions are considered unnecessary.  We all seem to have an assumed definition that neatly meets our needs, be it selling a product or initiating a project.  Hear me clearly, though.  Despite the hype, there is something real going on here.  And it's fundamentally about the underlying characteristics of the information involved; characteristics that differ significantly from the data we in IT have stored and used over the years.

I contend that there are four types of information that together make up big data:
1.    Machine-generated data, such as RFID data, physical measurements and geolocation data, from monitoring devices
2.    Computer log data, such as clickstreams
3.    Textual social media information from sources such as Twitter and Facebook
4.    Multimedia social and other information from the likes of Flickr and YouTube

They are as different from traditional transactional data (the mainstay of BI) as they are from one another.  They have little in common, beyond their volume.  How business extracts value from them and how IT processes them vary widely.

While closely related to traditional BI and data warehousing, big data projects require additional and often very different skills in business and IT.  Their value is first to drive innovative change in business processes; only afterwards can their use become ongoing and operational.  These are topics I'll return to in the coming months.  But, in the meantime, join me for my webinar "Big Data Drives Tomorrow's Business Intelligence" on 25th October for further insights in this rapidly evolving area.



Posted October 21, 2011 11:03 AM
Permalink | 1 Comment |