Blog: Barry Devlin

Barry Devlin

As one of the founders of data warehousing back in the mid-1980s, a question I increasingly ask myself over 25 years later is: Are our prior architectural and design decisions still relevant in the light of today's business needs and technological advances? I'll pose this and related questions in this blog as I see industry announcements and changes in the way businesses make decisions. I'd love to hear your answers and, indeed, questions in the same vein.

About the author

Dr. Barry Devlin is among the foremost authorities in the world on business insight and data warehousing. He was responsible for the definition of IBM's data warehouse architecture in the mid '80s and authored the first paper on the topic in the IBM Systems Journal in 1988. He is a widely respected consultant and lecturer on this and related topics, and author of the comprehensive book Data Warehouse: From Architecture to Implementation.

Barry's interest today extends to the wider field of a fully integrated business, covering informational, operational and collaborative environments and, in particular, how to present the end user with an holistic experience of the business through IT. These aims, and a growing conviction that the original data warehouse architecture struggles to meet modern business needs for near real-time business intelligence (BI) and support for big data, drove Barry's latest book, Business unIntelligence: Insight and Innovation Beyond Analytics, now available in print and eBook editions.

Barry has worked in the IT industry for more than 30 years, mainly as a Distinguished Engineer for IBM in Dublin, Ireland. He is now founder and principal of 9sight Consulting, specializing in the human, organizational and IT implications and design of deep business insight solutions.

Editor's Note: Find more articles and resources in Barry's BeyeNETWORK Expert Channel and blog. Be sure to visit today!

Recently in Business intelligence Category

In an era of "big data this" and "Internet of Things that", it's refreshing to step back to some of the basic principles of defining, building and maintaining data stores that support the process of decision making... or data warehousing, as we old-fashioned folks call it. Kalido did an excellent job last Friday of reminding the BBBT just what is needed to automate the process of data warehouse management. But, before the denizens of the data lake swim away with a bored flick of their tails, let me point out that this matters for big data too--maybe even more so. I'll return to this towards the end of this post.

In the first flush of considering a BI or analytics opportunity in the business and conceiving a solution that delivers exactly the right data needed to address that pesky problem, it's easy to forget the often rocky road of design and development ahead. More often forgotten, or sometimes ignored, is the ongoing drama of maintenance. Kalido, with their origins as an internal IT team solving a real problem for the real business of Royal Dutch Shell in the late '90s, have kept these challenges front and center.

All IT projects begin with business requirements, but data warehouses have a second, equally important, starting point: existing data sources. These twin origins typically lead to two largely disconnected processes. First, there is the requirements activity often called data modeling, but more correctly seen as the elucidation of a business model, consisting of the function required by the business and the data needed to support it. Second, there is the ETL-centric process of finding and understanding the existing sources of this data, figuring out how to prepare and condition it, and designing the physical database elements needed to support the function required.

Most data warehouse practitioners recognize that the disconnect between these two development processes is the origin of much of the cost and time expended in delivering a data warehouse. And they figure out a way through it. Unfortunately, they often fail to recognize that each time a new set of data must be added or an existing set updated, they have to work around the problem yet again. So, not only is initial development impacted, but future maintenance remains an expensive and time-consuming task. An ideal approach is to create an integrated environment that automates the entire set of tasks from business requirements documentation, through the definition and execution of data preparation, all the way to database design and tuning. Kalido is one of a small number of vendors who have taken this all-inclusive approach. They report build effort reductions of 60-85% in data warehouse development.

Conceptually, we move from focusing on the detailed steps (ETL) of preparing data to managing the metadata that relates the business model to the physical database design. The repetitive and error-prone donkey-work of ETL, job management and administration is automated. The skills required in IT change from programming-like to modeling-like. This has none of the sexiness of predictive analytics or self-service BI. Rather, it's about real IT productivity. Arguably, good IT shops always create some or all of this process- and metadata-management infrastructure themselves around their chosen modeling, ETL and database tools. Kalido is "just" a rather complete administrative environment for these processes.
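
To make that shift concrete, here is a minimal sketch of metadata-driven automation in Python. Everything in it is an illustrative assumption, not Kalido's actual design: the point is that a single declared business model drives both the physical DDL and the load jobs, so adding or changing an entity is a metadata edit rather than a fresh round of hand-written ETL.

```python
# A minimal sketch (not any vendor's actual engine) of metadata-driven
# warehouse automation: the business model is declared once, and the
# repetitive DDL and load steps are generated from it, not hand-coded.

BUSINESS_MODEL = {
    "Customer": {
        "attributes": {"customer_id": "INTEGER", "name": "VARCHAR(100)", "segment": "VARCHAR(30)"},
        "source": {"system": "crm", "table": "cust_master"},  # hypothetical source mapping
    },
    "Sale": {
        "attributes": {"sale_id": "INTEGER", "customer_id": "INTEGER", "amount": "DECIMAL(12,2)"},
        "source": {"system": "pos", "table": "sales_txn"},
    },
}

def generate_ddl(model: dict) -> list[str]:
    """Derive physical CREATE TABLE statements from the business model."""
    statements = []
    for entity, spec in model.items():
        cols = ",\n  ".join(f"{c} {t}" for c, t in spec["attributes"].items())
        statements.append(f"CREATE TABLE dw_{entity.lower()} (\n  {cols}\n);")
    return statements

def generate_load_jobs(model: dict) -> list[str]:
    """Derive one (pseudo) load job per entity from the same metadata, so a
    new entity gets its table and its load step from one declaration."""
    return [
        f"LOAD dw_{e.lower()} FROM {s['source']['system']}.{s['source']['table']}"
        for e, s in model.items()
    ]

if __name__ == "__main__":
    for step in generate_ddl(BUSINESS_MODEL) + generate_load_jobs(BUSINESS_MODEL):
        print(step)
```

The design choice is the one described above: IT maintains the model at the top, and the donkey-work below it is regenerated, not rewritten.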

Which brings me finally back to the shores of the data lake. As described, the data lake consists of a Hadoop-based store of all the data a business could ever need, in its original structure and form, and into which any business user can dip a bucket and retrieve the data required without IT blocking the way. However, whether IT is involved or not, the process of understanding the business need and getting the data from the lake into a form that is useful and usable for a decision-making requirement is identical to the twin-origins process I described above. The same problems apply. Trust me, similar solutions will be required.

Image: http://inhabitat.com/vikas-pawar-skyscraper/



Posted March 17, 2014 5:33 AM
Much has happened while I've been heads down over the past few months finishing my book. Well, "Business unIntelligence--Insight and Innovation Beyond Analytics and Big Data" went to the printer last weekend and should be in the stores by mid-October. And, I can rejoin the land of the living. One of the interesting news stories in the meantime was Cisco's acquisition of Composite Software, which closed at the end of July. Mike Flannagan, Senior Director & General Manager, IBT Group at Cisco, and Bob Eve, Data Virtualization Marketing Director and long-time advocate of virtualization at Composite, turned up at the BBBT in mid-August to brief an eclectic bunch of independent analysts, including myself.

The link-up of Cisco and Composite is, I believe, going to offer some very interesting technological opportunities in the market, especially in BI and big data. 

BI has been slow to adopt data virtualization. In fact, I was one of the first to promote the approach, with IBM Information Integrator (now part of InfoSphere), some ten years ago when I was still with IBM. The challenge was always that virtualization seems to fly in the face of traditional EDW consolidation and reconciliation via ETL tools. I say "seems" because the two approaches are more complementary than competitive. Way back in the early 2000s, it was already clear to me that there were three obvious use cases: (i) real-time access to operational data, (ii) access to non-relational data stores, and (iii) rapid prototyping. The advent of big data and the excitement of operational analytics have confirmed my early enthusiasm. No argument: data virtualization and ETL are both mandatory components of any new BI architecture or implementation.
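
To make the complementarity tangible, here is a minimal sketch in Python of a "virtual view". All the sources, tables and names are invented for illustration: the query federates a live relational source with a non-relational store at query time, covering use cases (i) and (ii) with no ETL copy made, and a function this disposable is exactly what makes (iii), rapid prototyping, cheap.

```python
# An illustrative federation sketch: join live operational data with a
# non-relational store at query time, with no warehouse load in between.
import sqlite3

# Stand-in "operational" relational source (use case i: real-time access).
ops = sqlite3.connect(":memory:")
ops.execute("CREATE TABLE orders (order_id INTEGER, cust_id INTEGER, amount REAL)")
ops.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, 101, 250.0), (2, 102, 75.5), (3, 101, 13.9)])

# Stand-in non-relational source (use case ii): a key-value customer store.
customer_kv = {101: {"name": "Acme Corp"}, 102: {"name": "Globex"}}

def virtual_query(min_amount: float) -> list[dict]:
    """The 'virtual view': federate both sources when the query runs."""
    rows = ops.execute(
        "SELECT order_id, cust_id, amount FROM orders WHERE amount >= ?",
        (min_amount,),
    ).fetchall()
    # The join happens here, in the virtualization layer, not in an ETL job.
    return [
        {"order_id": oid, "customer": customer_kv[cid]["name"], "amount": amt}
        for oid, cid, amt in rows
    ]

print(virtual_query(50.0))  # fresh operational data, no copy maintained
```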

So, what does Cisco add to the mix with Composite? One of the biggest challenges for virtualization is to understand and optimize the interaction between databases and the underlying network. When data from two or more distributed databases must be joined in a real-time query, the query optimizer needs to know, among other things, where the data resides, the volumes in each location, the available processing power of each database, and the network considerations for moving the data between locations. Data virtualization tools typically focus on the first three, database-centric concerns, probably as a result of their histories. However, the last concern, the network, increasingly holds the key to excellent optimization. There are two reasons. First, processor power continues to grow, so database performance has proportionately less impact. Second, Cloud and big data together mean that distribution of data is becoming much more prevalent, and growth in network speed, while impressive, is not in the same ballpark as that of processing, making for a tighter bottleneck. And who better to know about the network, and even tweak its performance profile to favor a large virtualization transfer, than a big networking vendor like Cisco? The fit seems just right.
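
A back-of-the-envelope sketch shows why the network term can dominate. The tables, link speed and cost model below are assumptions invented for illustration (this is emphatically not Composite's optimizer), but they capture how the choice of which table to ship across the wire is settled almost entirely by transfer time once data volumes diverge.

```python
# A hedged, illustrative cost model for a two-site distributed join; the
# figures and formulas are assumptions, not any vendor's actual optimizer.

SITES = {
    "orders_db":    {"rows": 50_000_000, "row_bytes": 200, "scan_gb_per_s": 2.0},
    "customers_db": {"rows": 1_000_000,  "row_bytes": 150, "scan_gb_per_s": 1.5},
}
NETWORK_GB_PER_S = 0.125  # assumed ~1 Gbit/s link between the two sites

def table_gb(site: dict) -> float:
    return site["rows"] * site["row_bytes"] / 1e9

def transfer_seconds(site: dict) -> float:
    """Time to move one site's table across the network."""
    return table_gb(site) / NETWORK_GB_PER_S

def plan_cost(ship_site: str) -> float:
    """Crude cost: ship one table over the wire, then scan both tables at the
    destination. Real optimizers also weigh join selectivity, indexes, CPU."""
    scan_time = sum(table_gb(s) / s["scan_gb_per_s"] for s in SITES.values())
    return transfer_seconds(SITES[ship_site]) + scan_time

plans = {site: plan_cost(site) for site in SITES}
for site, cost in plans.items():
    print(f"ship {site} across the network: ~{cost:.1f}s")
print("chosen plan: ship", min(plans, key=plans.get))
# With these numbers, shipping the 10 GB orders table costs ~80s on the wire,
# versus ~1.2s for the small customers table: the network, not CPU, decides.
```

Knowing, and even reshaping, that NETWORK_GB_PER_S term is precisely the leverage a networking vendor brings to the optimizer.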

For this technical vision to work, the real challenge will be organizational, as is always the case with acquisitions.  Done well, acquisitions can be successful.  Witness IBM's integration of Lotus and Netezza, to name but two. Of course, strategic management and cultural fit always count. But, the main question usually is: Does the buyer really understand what the acquired company brings and is the buyer willing to change their own plans to accommodate that new value? It's probably too early to answer that question. The logistics are still being worked through and the initial focus is on ensuring that current plans and revenue targets are at least maintained. But, if I may offer some advice on the strategy... 

The Cisco network must recognize that the query optimizer in Composite will, in some sense, become another boss. The value for the combined company comes from the knowledge that resides in the virtualization query optimizer about what data types and volumes need to be accommodated on the network. This becomes the basis of how to route the data and how to tweak the network to carry it. In terms of company size, this may be the tail wagging the dog. But, in terms of knowledge, it's more like the dog with two heads. The Greek mythological image of Kyon Orthros, "the dog of morning twilight" and the burning heat of mid-summer, is perhaps an appropriate image.  An opportunity to set the network ablaze.

Posted September 7, 2013 12:21 AM
The second in a series of posts introducing the concepts and messages of my forthcoming book, "Business unIntelligence--Insight and Innovation Beyond Analytics and Big Data", available in mid-October.

What's wrong with business intelligence, you may ask? Why do we need to reopen an architectural can of worms and start debating a new structure instead of, or beyond, data warehousing? Would we want another Inmon vs. Kimball type of altercation? These are all questions I have asked in the past few years, as the architecture I first described in 1988 reached drinking age (well... in most of the US). The first and most important answer to all those questions comes simply from a comparison between business 1980s-style and business in the current decade. Chalk and cheese. Today's business is high-speed, deeply integrated and information-rich. It operates in a very different, highly unpredictable environment. Technology, and information technology in particular, is at the heart of almost every business. I call this new model the biz-tech ecosystem. It has been emerging over the past decade and is now rapidly maturing. The old concepts of business intelligence and the ancient architecture of data warehousing are no match for this Beauty, this new and terrible beauty that is born.

The characteristics of the biz-tech ecosystem are reintegration of organizations and technologies, interdependence between business and IT, skills crossover between them, cooperation within and without the enterprise and behaviors founded on trust. The business advantages of this new ecosystem are evident: simply look to today's leaders--many of them younger than BI--in almost all areas of business, and observe the central role played by technology.  We see this new ecosystem emerge across all industries, with examples such as integration of the retail supply chain and the potential re-creation of insurance in pay-as-you-drive, to name but two. However, we must also advert to the possibilities for abuse and misuse that are inherent in the technology--an area that receives far too little attention in the quest for bigger business and better technology.

If the business is Beauty, then technology must be the Beast! A review of the evolution of computing in business reveals why IT has been ostracized from the mainstream of corporate activity and has chosen to focus on the care and feeding of the Beast. It's time IT stepped out of the stable. It is also time to think carefully about whether IT outsourcing and Cloud can really deliver the innovation now central to business. The biz-tech ecosystem demands that business treat IT as an equal partner, and that IT step up to that mark. Businesses must look to deep organizational change to fully embrace and benefit from these new ideas.

The first message of Business unIntelligence, therefore, is:
The biz-tech ecosystem means business and IT reunited.  This symbiosis is displacing the long-time Business-IT Gap and unthinking outsourcing. It drives faster, integrated and more innovative behavior for leading edge businesses.

I will be exploring the themes and messages of the book over the coming weeks, beginning with a key concept: the biz-tech ecosystem.  You can pre-order the book on Amazon; it should be available in mid-October.  In the interim, check out Thomas Frisendal's in-depth review. I will be speaking on the topic at a number of webinars and conferences.  For starters, check out my presentation at the BrightTALK Business Intelligence and Big Data Analytics Summit, live on September 12th.

Picture: Beauty and the Beast, Anne Anderson, 1902, Wikimedia Commons.

Posted August 22, 2013 4:05 AM
The first in a series of posts introducing the concepts and messages of my forthcoming book, "Business unIntelligence--Insight and Innovation Beyond Analytics and Big Data", available in mid-October.

The term business intelligence has long struck me as something of an oxymoron. Much of the business behavior I've encountered has been far from intelligent. Perhaps I'm a little cynical, but business dysfunction often seemed a more appropriate term. Furthermore, intelligence is surely setting the bar far too low today in the second decade of the 21st century; BI, in practice, too often means little more than generating reports filled with backward-looking data. Can't business do better? What about intuition, insight or inspiration? Should we not also consider social aspects, given that business is largely a collective, collaborative venture?

I played with many positive-tinged terms in search of a title for my book. But none had the breadth of vision I desired; many had already been appropriated by marketing. I introduced the term Business Integrated Insight (BI2) way back in 2009 in a Teradata-sponsored white paper. It was a brave move for a company that is synonymous with the data warehouse, given that the main thrust of the paper was that data warehousing was in need of a major rethink as operational and informational business needs were converging.  The name BI2 didn't stick. But, the ideas did, and the emergence of big data since then has confirmed what I said earlier: we need to start thinking more holistically about the information resource of the enterprise.  

Since then, technologies have evolved--Hadoop, for example--and business priorities have shifted further. Operational analytics drives operational and informational systems ever closer. Collaborative working is finally taking hold in BI. Cloud and Mobile reiterate the importance of a service-oriented approach to process.

By late 2012, it was clear to me that the book must begin from what is wrong with how business is today and how to fix it. Simply put, business isn't making enough use of all the information potentially at its disposal. Decisions are poorly understood, never tracked and unrelated to any clear idea of process. Psychological and sociological underpinnings are ignored. Clearly, this is unintelligent behavior by business. Business unIntelligence.

On first encounter, this may sound like a negative phrase. But slowly, gradually, it also became clear that Business unIntelligence is precisely what business today needs in order to benefit from extensive information, adaptive process and the strengths of actual people. Rationality of thought and far beyond it. Logic of process, predefined and emergent. The confluence of reason and inspiration, emotion and intention, collaboration and competition--all that comprises the human and social milieu that is business. Not business intelligence. But Business unIntelligence. Insight and Innovation beyond Analytics and Big Data.

I will be exploring the themes and messages of the book over the coming weeks, beginning with a key concept: the biz-tech ecosystem.  You can pre-order the book on Amazon; it should be available in mid-October.  In the interim, check out Thomas Frisendal's in-depth review. I will be speaking on the topic at a number of webinars and conferences.  For starters, check out my presentation at the BrightTALK Business Intelligence and Big Data Analytics Summit, live on September 12th.

Posted August 13, 2013 6:27 AM
Operational analytics is making headlines in 2013. But why is it important? And why is it more likely to succeed now than in the mid-2000s, when it was called operational BI, or in the mid-1990s, when it surfaced as the operational data store (ODS)?
 
First, let's define the term. My definition, from two recent white papers (April 2012 and May 2013), is: "Operational analytics is the process of developing optimal or realistic recommendations for real-time, operational decisions based on insights derived through the application of statistical models and analysis against existing and/or simulated future data, and applying these recommendations in real-time interactions." While the language is clearly analytical in tone, the bottom line of the desired business impact is much the same as definitions we've seen in the past for the ODS and operational BI: real-time or near real-time decisions embedded into the operational processes of the business.

Anybody who heard me speak in the 1990s or early 2000s will know that I was not a big fan of the ODS. So, what has changed? In short, two things: (1) businesses are more advanced in their BI programs, and (2) technology has advanced to the stage where it can support the need for real-time operational-informational integration.

[Figure: The evolution of BI--business user behaviors and IT provider responses across three phases]
The evolution of BI can be traced on two fronts, shown in the accompanying figure: the behaviors driving business users and the responses required of IT providers. As this evolution proceeds apace, business demands increasing flexibility in what can be done with the data and increasing timeliness in its provision. In Phase I, largely fixed reports are generated, perhaps on a weekly schedule, from data that IT deems appropriate and furnishes in advance. Such reporting is entirely backward-looking, describing selected aspects of business performance. Today, few businesses remain in this phase because of its now limited return on investment; most have already moved to Phase II.

This second phase is characterized by an increasing awareness of the breadth of information available collectively across the wider business and an emerging ability to use information to predict future outcomes. In this phase, IT is highly focused on integrating data from the multiple sources of operational data throughout the company. This is the traditional BI environment, supported by a data warehouse infrastructure. The majority of businesses today are at Phase II in their journey and leaders are beginning to make the transition to Phase III. 

Phase III marks a major step change in decision-making support for most organizations. On the business side, the need moves from largely ad hoc, reactive and management-driven to a process view, allowing the outcome of predictive analysis to be applied directly, and often in real time, to business operations. This is the essence of the behavior called operational analytics. In this phase, IT must become highly adaptive in order to anticipate emerging business needs for information. Such a change requires a shift in thinking from separate operational and informational systems to a combined operational-informational environment. This is where the action is today. This is where return on investment for leading businesses is now to be found. And, simply put, this is why operational analytics is making headlines today--many businesses are ready for it; the leaders are already implementing it.

This leads us to the second contention: that technology has advanced sufficiently to support the need. There are many ways that recent advances in technology can be combined to do this. Of the white papers referenced above, one shows how two complementary technologies, IBM DB2 for z/OS and Netezza, can be integrated to meet the requirements. The other shows how the introduction of columnar technology and other performance improvements in DB2 Advanced Enterprise Edition can meet these same needs. Other vendors are improving their offerings in similar directions.

So, to paraphrase the "Six Million Dollar Man": we have the business waiting. We have the technology. We have the capability to build this... But, wait. There is one more hurdle. Most existing IT architectures strictly separate operational and informational systems, based on a data warehouse approach dating back to the mid-1980s. This split is a serious impediment to building the new environment, which demands a tight feedback loop between the two sides. Analyses in the informational environment must be transferred instantly into the operational environment to take immediate effect. Outcomes of actions in the operational systems must be copied directly to the informational systems to tune the models there. These requirements are difficult to satisfy in the current architecture; they demand a new approach. This is beginning to emerge, but is by no means widespread yet. I'll be discussing this topic further over the coming weeks.
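
In the meantime, here is a minimal sketch in Python of what that feedback loop means in practice. Every name and rule in it is hypothetical and deliberately simplified; the architectural point is that real-time scoring and model retuning share one loop, with outcomes flowing back the moment they occur, rather than across a batch boundary between separate operational and informational systems.

```python
# A hypothetical, deliberately simplified operational-informational loop:
# a model scored in the operational path, with each outcome fed straight
# back to retune it, instead of waiting for a nightly ETL cycle.

from collections import deque

class FeedbackLoop:
    def __init__(self, window: int = 1000):
        self.outcomes = deque(maxlen=window)  # recent (predicted, actual) pairs
        self.threshold = 0.5                  # current operational decision rule

    def score(self, risk_estimate: float) -> bool:
        """Operational side: apply the model's output in real time."""
        return risk_estimate >= self.threshold

    def record_outcome(self, predicted: bool, actual: bool) -> None:
        """Operational outcome copied directly back to the informational side."""
        self.outcomes.append((predicted, actual))
        self._retune()

    def _retune(self) -> None:
        """Informational side: nudge the decision rule as accuracy drifts."""
        if len(self.outcomes) < 100:
            return  # not enough evidence yet
        accuracy = sum(p == a for p, a in self.outcomes) / len(self.outcomes)
        if accuracy < 0.7:  # model drifting: tighten the rule slightly
            self.threshold = min(0.9, self.threshold + 0.01)

loop = FeedbackLoop()
decision = loop.score(0.62)          # real-time operational decision
loop.record_outcome(decision, True)  # outcome flows back immediately
```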

Posted May 13, 2013 8:41 AM