Blog: Colin White

Colin White

I like the various blogs associated with my many hobbies and even those to do with work. I find them very useful and I was excited when the Business Intelligence Network invited me to write my very own blog. At last I now have somewhere to park all the various tidbits that I know are useful, but I am not sure what to do with. I am interested in a wide range of information technologies and so you might find my thoughts will bounce around a bit. I hope these thoughts will provoke some interesting discussions.

About the author

Colin White is the founder of BI Research and president of DataBase Associates Inc. As an analyst, educator and writer, he is well known for his in-depth knowledge of data management, information integration, and business intelligence technologies and how they can be used for building the smart and agile business. With many years of IT experience, he has consulted for dozens of companies throughout the world and is a frequent speaker at leading IT events. Colin has written numerous articles and papers on deploying new and evolving information technologies for business benefit and is a regular contributor to several leading print- and web-based industry journals. For ten years he was the conference chair of the Shared Insights Portals, Content Management, and Collaboration conference. He was also the conference director of the DB/EXPO trade show and conference.


Enterprise data integration is a hot topic and covers a wide range of technologies, including enterprise application integration (EAI), enterprise information integration (EII), and extract, transform, and load (ETL). Master Data Management (MDM) is also a data integration technology and should be added to the list. Just to make life interesting, analyst organizations like Gartner are talking about Customer Data Integration (CDI), which to me is a subset of MDM. Sorting all this out and coming up with a data integration strategy is not easy.

Currently I am conducting a research project on data integration for The Data Warehousing Institute (TDWI). The results will be published in a TDWI report and Webcast in October.

The premise of the research report is that organizations are moving toward (or need to move toward) an enterprise-wide data integration strategy, of which data warehousing is one piece. It will look at different types of data integration (consolidation, federation, propagation) and data integration technology (EAI, EII, ETL, MDM). It will discuss why these technologies complement, rather than compete with, each other and suggest which technology should be used when. It will also look at requirements for data integration products and at the concept of a data integration center.

I am interested to hear if you agree with the premise of the paper. The topic of the data integration center is also very interesting. Several companies I have interviewed either have, or are planning to have, a data integration center. Both Ascential (IBM) and Informatica are pushing this concept. Informatica recently published a book on data integration centers, and it is worth reading even if you aren't an Informatica customer. You can find the book on Amazon.

All comments welcome. Colin.

Posted July 19, 2005 12:21 PM
4 Comments


I couldn’t agree more on the interest in this area, but I have a bit of a different take on what CDI is. I believe that CDI is pointing all the technologies you listed at a particular business problem: the customer. In any project that involves customer data integration (and Initiate Systems has been involved in over 60), there is a use for all those technologies, brought together with the right process. At a high level, here is how the technologies are used:

ETL – extraction from the multiple source systems for the initial creation of the customer hub

MDM – the central management of the data that is going to be stored

EAI – the synchronization mechanism back to source systems

EII – the federation mechanism to integrate the stored data with the data remaining in the source systems

There are other technologies involved as well, among them direct adapters to source systems and API integration to the customer hub at the points of service.

Our successful approach has been to put the most accurate, performant customer hub in the middle and then utilize these other technologies to their best advantage.
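As a rough illustration of the division of labor listed above, here is a minimal Python sketch. Everything in it is hypothetical: the `CustomerHub` class, the in-memory dictionaries standing in for source systems, and the rule of matching customers on email address are assumptions made for illustration, not a description of any vendor's product or API.

```python
from dataclasses import dataclass, field

# Hypothetical source systems: plain dicts keyed by each system's local customer id.
crm = {"c1": {"name": "Ann Lee", "email": "ann@example.com", "phone": None}}
billing = {"b9": {"name": "Ann Lee", "email": "ann@example.com", "balance": 120.0}}


@dataclass
class CustomerHub:
    """Toy customer hub. Matching records on email is an illustrative assumption."""
    masters: dict = field(default_factory=dict)  # email -> master record
    xref: dict = field(default_factory=dict)     # email -> [(system name, local id)]

    def load(self, system_name, source):
        """ETL role: extract from a source system to build the initial hub."""
        for local_id, rec in source.items():
            key = rec["email"].lower()
            self.masters.setdefault(key, {"name": rec["name"], "email": key})
            self.xref.setdefault(key, []).append((system_name, local_id))

    def update_name(self, email, new_name, systems):
        """MDM role: change the master record centrally.
        EAI role: synchronize the change back to the source systems."""
        key = email.lower()
        self.masters[key]["name"] = new_name
        for system_name, local_id in self.xref[key]:
            systems[system_name][local_id]["name"] = new_name

    def full_view(self, email, systems):
        """EII role: federate hub data with attributes that stay in the sources."""
        key = email.lower()
        view = dict(self.masters[key])
        for system_name, local_id in self.xref[key]:
            for attr, value in systems[system_name][local_id].items():
                view.setdefault(attr, value)  # hub values win over source values
        return view


systems = {"crm": crm, "billing": billing}
hub = CustomerHub()
for name, source in systems.items():
    hub.load(name, source)

hub.update_name("ann@example.com", "Ann Lee-Smith", systems)
view = hub.full_view("ann@example.com", systems)
```

In this sketch the hub stores only the shared identity attributes, while attributes such as `balance` remain in the billing system and are pulled in at read time. A real hub would need far more robust matching than a single email key.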

I've been trying to address some of these issues about where the lines cross among EAI, EII, CRM, CDI, DW, BI, ETL, MDM and any other TLA you can think of over at CDI Station. I'm coming at it from the CDI perspective, which is my field. I'm just getting started, but stop by, take a look, and please leave a comment.

I saw your Enterprise Data Integration blog; the topics of the paper and research project are of extreme interest. I did not see Oracle Data Hub type technologies listed, unless you are characterizing this product under master data management. There are some interesting issues around the Data Hub model, which is bi-directional, with the ability to synchronize source systems to the Hub. I question the validity of this model, which essentially allows a third-party application to change source system data. I doubt a source system such as SAP would allow or support Oracle's Data Hub changing its data. In addition, there is a compliance question: by changing the source system you cannot maintain data transparency, so this would violate Sarbanes-Oxley.

I agree with the premise of your research report. My comment concerns the data integration products used to deliver all of the capabilities mentioned. My recommendation, based on experience, is to go with a suite of tools already integrated by the vendor, because the real cost of integration lies in "do it yourself" integration of disparate tools. It is hard enough to deal with the integration and resolution of business data and processes without the overhead of infrastructure integration as well. It is much more cost-effective and efficient to let the vendor do it for you. Make sure your technology recommendations also consider the full spectrum of the problem space, including the analysis and maintenance phases, where data quality and profiling tools come in. With the proliferation of outsourcing, along with mergers and acquisitions and the attendant staffing changes and loss of tribal knowledge, the management of enterprise metadata is a critical success factor to be considered in the tool space.

Yes, a robust and integrated suite is a major investment, but one well worth it, even if you don't buy it all at once, provided you have thought through why it is needed and prioritize your purchases based on immediate need. Having a fully thought-out strategy and a well-integrated technology solution will help mitigate the risk of failing to meet Sarbanes-Oxley and other compliance initiatives, while positioning the business to meet both long- and short-term Information Management requirements. Data integration shouldn't be a technology problem but a business problem. An integrated suite of tools enables the focus to be on solving the business problem: a focus on information and process rather than on integrating the integration technology.

Also, differentiate between the technologies of Information Management and those of Data Management. Too often the database is sold as the ultimate solution. A data integration solution should be able to take advantage of any number of underlying data stores. Databases are primarily for data management (the storage and performance aspects of data provisioning), which is a small part of the technology capabilities required for managing information.
