Blog: Claudia Imhoff

Claudia Imhoff

Welcome to my blog.

This is another means for me to communicate, educate and participate within the Business Intelligence industry. It is a perfect forum for airing opinions, thoughts, vendor and client updates, problems and questions. To maximize the blog's value, it must be a participative venue. This means I will look forward to hearing from you often, since your input is vital to the blog's success. All I ask is that you treat me, the blog, and everyone who uses it with respect.

So...check it out every week to see what is new and exciting in our ever-changing BI world.

About the author

A thought leader, visionary, and practitioner, Claudia Imhoff, Ph.D., is an internationally recognized expert on analytics, business intelligence, and the architectures to support these initiatives. Dr. Imhoff has co-authored five books on these subjects and writes articles (totaling more than 150) for technical and business magazines.

She is also the Founder of the Boulder BI Brain Trust, a consortium of independent analysts and consultants (www.BBBT.us). You can follow them on Twitter at #BBBT.


 

I have read with interest several blogs and articles, and listened to a number of presentations, about data integration technologies. It seems to me that we need to clear the air regarding the role of data integration and which projects use it in our enterprises...

Granted, data integration technologies have been around longer than data warehousing, but they first really got our attention with the advent of data warehouse and data mart construction. At that time, we called the technologies that helped integrate disparate pieces of data and create snapshots of that data "ETL," for Extraction, Transformation, and Load. Perhaps we should have called these technologies EIL -- Extract, Integrate, and Load -- but that's another story.

For good or bad, these projects were also the first ones to recognize a number of other, tangentially related data problems and initiatives -- improving the quality of the data being integrated, creating sets of integrated master or reference data (MDM), implementing repositories of current data for management and operational purposes (ODS), and so on. I think this is where the wheels began to fall off. Because data warehouse projects were the first to recognize the need for solutions to these other problems, implementers and vendors alike assumed that the solutions must belong to the data warehouse environment, and the data warehouse team took on responsibility for a much broader, more diverse set of projects.

My response to this expansion of the data warehouse team's responsibilities is "Back off!" None of these three initiatives has anything to do with strategic, tactical or operational BI implementations. Don't get me wrong -- they certainly make building a data warehouse or mart easier, but you should not confuse easier construction with total control. That is a no-win situation.

Data quality is a massive, enterprise-wide effort to bring quality processes and metrics into the entire IT environment -- not just the BI one. Its function should not be buried within the data warehouse function, or it will never garner the enterprise-level support that successful quality initiatives require.

Now let's look at MDM -- again, many people come at MDM from a data warehouse perspective. They believe it's just another data integration project using many of the same techniques and technologies used in building our BI environments. Therefore, the leap of logic is to put it under the domain of the data warehouse function. Wrongo! Just like the data quality initiative, MDM is a massive enterprise-wide initiative that affects all aspects of IT -- not just the data warehouse. It has its own set of policies, procedures, applications, and technologies specific to integrating reference data. Once again, the BI environment benefits greatly from having such an environment in place, but so does the business transaction environment.

Finally, there is the operational data store (ODS). No wonder people don't understand it or -- worse -- define it as a staging area for the data warehouse. It is a current repository of operational data. If you have a mature MDM environment in place, then the ODS could contain only integrated current transactional data (an operational transaction data store or OTDS) and use the current master data from the MDM environment.

None of these is a data warehouse project. They should stand on their own two feet as independent initiatives that just happen to make data warehouse construction and BI in general much easier to implement. Just don't make the mistake of assuming that if the project deals with data integration in some fashion, then it must be a data warehouse project. Data integration has expanded WAY beyond data warehousing!

Yours in BI success,

Claudia


Posted March 22, 2007 3:55 PM

4 Comments

Claudia you are so right.
While all of these activities are related to Enterprise Information Management, they should not be confused with BI/EDW activities. Many organisations make the mistake of expecting their BI/EDW staff to be equally expert in other areas of data integration and live to regret it.
Another technology that is involved in data integration but not necessarily with data warehousing is Enterprise Information Integration (EII), although a valid and effective use of EII is to collect data from a number of sources and provide it to a delivery mechanism, possibly in company with data from a data mart or data warehouse.

Hello Claudia,
I agree with you that we should not try to put these integration efforts or MDM under the Data Warehouse project hat.
In an enterprise-wide IT program, we are planning to build an EDW for a bank, with one of the main principles being to assure at all times that it reconciles with the General Ledger. We are working on a concept of "Single Source" where we load the GL and the EDW from the same integration platform. With DW ETL and accounting business rules in the same pipeline, we will send consistent (consolidated, integrated) data to both the EDW and the GL. As a DW project specialist, I don't want to be responsible for preparing the postings to the GL. The same goes for MDM: I don't want to be responsible for maintaining customer data for use in CRM activities.

Claudia, glad to read your comments differentiating data integration from ETL processes for DW purposes. Considering the large volumes of data or information held in non-relational data sources and even in unstructured content, truly understanding what data integration is becomes quite significant. Your reminder that data integration hits the heart of the enterprise and should be managed by IT is right on the money.

Claudia, what do you think of DW as an MDM initiative? Picture that you have a normalized EDW based on a common corporate data model with great data quality, already in place. Wouldn't it be a good idea to reuse this as an MDM for the enterprise, serving the operational world using some sort of service, making the EDW a so-called Active Data Warehouse (term by Stephen Brobst)?
