Blog: Barry Devlin

Barry Devlin

As one of the founders of data warehousing back in the mid-1980s, a question I increasingly ask myself over 25 years later is: Are our prior architectural and design decisions still relevant in the light of today's business needs and technological advances? I'll pose this and related questions in this blog as I see industry announcements and changes in the way businesses make decisions. I'd love to hear your answers and, indeed, questions in the same vein.

About the author

Dr. Barry Devlin is among the foremost authorities in the world on business insight and data warehousing. He was responsible for the definition of IBM's data warehouse architecture in the mid '80s and authored the first paper on the topic in the IBM Systems Journal in 1988. He is a widely respected consultant and lecturer on this and related topics, and author of the comprehensive book Data Warehouse: From Architecture to Implementation.

Barry's interest today covers the wider field of a fully integrated business, spanning informational, operational and collaborative environments and, in particular, how to present the end user with a holistic experience of the business through IT. These aims, and a growing conviction that the original data warehouse architecture struggles to meet modern business needs for near real-time business intelligence (BI) and support for big data, drove Barry’s latest book, Business unIntelligence: Insight and Innovation Beyond Analytics, now available in print and eBook editions.

Barry has worked in the IT industry for more than 30 years, mainly as a Distinguished Engineer for IBM in Dublin, Ireland. He is now founder and principal of 9sight Consulting, specializing in the human, organizational and IT implications and design of deep business insight solutions.

Editor's Note: Find more articles and resources in Barry's BeyeNETWORK Expert Channel and blog. Be sure to visit today!

"We are running through the United States with dynamite and rock saws so an algorithm can close the deal three microseconds faster," according to algorithm expert Kevin Slavin at last month's TEDGlobal conference.  Kevin was describing the fact that Spread Networks is building a fiber optic connection to shave three microseconds off the 825-mile trading journey between Chicago and New York.  The above comes courtesy of a thought-provoking article called "When algorithms control the world" by Jane Wakefield on the BBC website.

As someone who follows the data warehousing or big data scene, you're probably familiar with some of the decisions that algorithms are now making, based largely on the increasing volumes of information that we have been gathering, particularly over recent years.  We all know by now that online retailers and search engines use sophisticated algorithms to decide what we see when we go to a web page.  Mostly, we're pretty pleased that what comes up is well-matched to our expectations and that we don't have to plow through a list of irrelevant suggestions.  We're thankful for the reduction in information overload.

But, there are a few downsides to our growing reliance on algorithms and the data they graze upon...

The carving up of the US for fiber highlights the ecological destruction that comes from our ever-growing addiction to information and the speed of getting it.  We tend to sit back comfortably and think that IT is clean and green.  And it probably is greener than most, but it is certainly far from environmentally neutral.  We need to pay more attention to our impact: carbon emissions, cable-laying, and radio-frequency and microwave pollution are all part of IT's legacy.

But there's a much more subtle downside, and one that has been of concern to me for years.  Just because we can do some particular analysis, does it really make sense?  The classic case is in the use of BI and data mining in the insurance industry.  Large data sets and advanced algorithms allow insurers to discover subtle clues to risk in segments of the population and adjust premiums accordingly.  Now, of course, actuaries have been doing this since the 18th and 19th centuries.  But the principal driver in the past was to derive an equitable spread of risk in a relatively large population, such that the cost of a single event was effectively spread over a significantly larger number of people.  However, data mining allows ever more detailed segmentation of a population, and insurers have responded by identifying particularly high-risk groups and effectively denying them insurance or pricing premiums so high that such people cannot insure their risk.  While in some cases we can argue that this drives behavior changes that reduce overall risk (for example, safer driving practices among young males), in many other instances, no such change is possible (for example, for house owners living on flood plains).  I would argue that excessive use of data mining to segment risk in insurance eventually destroys the possibility of spreading risk equitably and thus undermines the entire industry.
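To make the arithmetic behind that argument concrete, here is a minimal sketch in Python.  Every number in it is invented purely for illustration (the group names, sizes, claim rates and claim cost are my assumptions, not real actuarial data), and it ignores insurer costs and profit margins.  It simply compares the break-even premium when everybody shares one pool with the break-even premium when each segment is priced on its own.

```python
# Illustrative sketch only: all figures below are hypothetical.

def pooled_premium(groups):
    """Break-even premium when all members share a single risk pool."""
    total_members = sum(g["members"] for g in groups)
    # Expected loss per group = members * (claim rate in %) / 100 * claim cost
    total_expected_loss = sum(
        g["members"] * g["claim_rate_percent"] * g["claim_cost"] for g in groups
    ) / 100
    return total_expected_loss / total_members


def segmented_premiums(groups):
    """Break-even premium for each segment when priced in isolation."""
    return {
        g["name"]: g["claim_rate_percent"] * g["claim_cost"] / 100 for g in groups
    }


# Hypothetical population: 900 low-risk homes and 100 homes on a flood plain.
groups = [
    {"name": "low_risk", "members": 900, "claim_rate_percent": 1, "claim_cost": 100_000},
    {"name": "flood_plain", "members": 100, "claim_rate_percent": 20, "claim_cost": 100_000},
]

pooled = pooled_premium(groups)
segmented = segmented_premiums(groups)

print(f"Pooled premium for everyone: {pooled:,.0f}")
for name, premium in segmented.items():
    print(f"Segmented premium for {name}: {premium:,.0f}")
```

With these made-up figures, the pooled premium sits between the two segmented premiums: low-risk owners pay somewhat more than their own expected loss, while flood-plain owners pay far less than theirs.  Once the segments are priced separately, the flood-plain premium rises to a level that many owners could not afford, which is exactly the loss of equitable risk spreading described above.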

In a similar manner, the widespread use of sophisticated algorithms and technology to speed trading seems to me to threaten the underlying purpose of futures and other financial markets, which, in my simplistic view, is to enable businesses to effectively fund future purchases or investments.  The fundamental goal of the algorithms and speedy decision making, however, seems to be to maximize profits for traders and investors, without any concern for the overall purpose of the market.  We've seen the results of this dysfunctional behavior over the past few years in the derivatives market, where all sense of proportion and real value was lost in the pursuit of illusory financial gain.

But, it gets worse!  The BBC article reveals how a British company, Epagogix, uses algorithms to predict what makes a hit movie.  Using metrics such as script, plot, stars, location, and the box office takings of similar films, it tries to predict the success of a proposed production.  The problem here, and note that the same applies to book suggestions on Amazon and all similar approaches, is that the algorithm depends on past consumer behavior and predicts future preferences based upon that.  The question is: how do new preferences emerge if the only thing being offered is designed solely to satisfy past preferences?
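A toy simulation shows how this lock-in can arise.  This is emphatically not how Epagogix or Amazon work (I have no knowledge of their internals); it is a deliberately crude sketch, assuming a recommender that only ever shows the best past sellers and customers who only buy what they are shown.  The item names and sales figures are hypothetical.

```python
# Toy model only: a recommender that ranks purely on past sales.

def recommend(sales, k):
    """Return the k items with the highest past sales (ties broken by name)."""
    return sorted(sales, key=lambda item: (-sales[item], item))[:k]


def simulate(initial_sales, rounds, k):
    """Repeatedly show the top-k items; assume each shown item sells once per round."""
    sales = dict(initial_sales)  # copy so the caller's data is not modified
    for _ in range(rounds):
        for item in recommend(sales, k):
            sales[item] += 1
    return sales


# Hypothetical catalog: three proven formulas and one genuinely new idea.
past_sales = {"sequel": 50, "remake": 40, "franchise": 30, "original_idea": 0}

final_sales = simulate(past_sales, rounds=10, k=3)
print(final_sales)
```

Under these assumptions the new idea is never shown, so it never sells, so it is never shown: after ten rounds its sales are still zero while the proven formulas have pulled further ahead.  Real systems are far more sophisticated, but any approach that scores options solely on past behavior faces some version of this loop.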

I would argue that successful business strategy requires a subtle blend of understanding past buyer behavior and offering new possibilities that enable and encourage new behaviors to emerge.  If all that a business offers is that which has been successful in the past, it will be rapidly superseded by new market entrants that are not locked into the past.  The danger of an over-reliance on data mining and algorithms is that innovation is stifled for business.  More importantly, for civilization, imagination is suffocated and strangled for lack of new ideas and thoughts.

Do you want to live in such an algorithm-controlled world?  As Wakefield puts it so well: "In reality, our electronic overlords are already taking control, and they are doing it in a far more subtle way than science fiction would have us believe. Their weapon of choice - the algorithm."

Posted August 25, 2011 6:19 AM

1 Comment

Some very interesting thoughts there Barry.

Another concern I have occurs when analysts are given free rein to establish complex segmentation models, but the models are developed without a feedback loop from the front line as to whether the insight they provide is actionable.

For example, assume the analytics team use a complex model to segment the customer base into groups A, B & C, and convince senior management of the strategic benefit of converting C's into B's. Too often, the next step is to make the conversion of C's into B's a front line KPI.

The problem here is the analysts are the ones with the detailed understanding of how the model works, but as far as they're concerned their job is done. Senior management have determined and communicated a strategy, pushing the onus for translating the insight into customer-facing behaviour onto the front line.

If that translation is done without a complete understanding of how the variables in the model interact, the outcome will probably be poor for everyone, yet I see this played out time and time again.
