Customer data integration (CDI) has recently attracted a lot of hype and publicity. Customer information is critical if an organization is to provide “quality” products and services to its customers.
The question I want to explore is, “What is meant by the term data integration?” The term “business rules” used to mean policies and constraints on business processes that translated into edit and validation rules of the data being created or updated.
However, the term “business rules” now seems to be used more in the context of ETL (extract, transform and load), as rules for “transforming” data from the form in which it exists in a source data store to a different or disparately defined form in order to propagate the data into a target data store. Almost all uses of the term “data integration” now seem to address the concepts and processes of transforming data from one form to another for propagation to another data store.
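To make the ETL sense of “business rules” concrete, here is a minimal sketch in Python. The field names, the gender code table and the date layout are all hypothetical, invented purely for illustration; they are not taken from any real system or tool.

```python
# Hypothetical source record, in the form used by a source data store.
source_record = {"cust_nm": "Ada Lovelace", "gender_cd": "F", "birth_dt": "18151210"}

# Hypothetical re-coding table: the target store uses numeric gender codes.
GENDER_CODES = {"M": 1, "F": 2}


def transform(record):
    """Return a new record in the (assumed) form of the target data store."""
    raw = record["birth_dt"]  # assumed layout: YYYYMMDD
    return {
        "customer_name": record["cust_nm"],           # relabel the field
        "gender": GENDER_CODES[record["gender_cd"]],  # re-code the value
        "birth_date": f"{raw[0:4]}-{raw[4:6]}-{raw[6:8]}",  # restructure the date
    }


# The "load" step would write this second, differently defined copy.
target_record = transform(source_record)
```

Note that the source record is left untouched: after the transform, the same facts exist in two differently defined forms.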
Some Existing Definitions of Data Integration
But are any of these definitions really “data integration”? If one has to restructure, relabel or extract and transform data, it cannot possibly be considered “integrated.” It can only be considered redundant proliferation of data that exists in two or more different forms.
Why These Are NOT Data Integration
But Mastro goes on to say his earlier cited definition is not data integration: “Data values (i.e., instances) are copied into the component systems for local use. As time passes, the values change independently of the original. There is no means for ensuring value consistency between the component systems. In addition, there is no means for ensuring that the original definition of the data is the same across the component systems.” Two copies of data that were equivalent at one point in time, but can then undergo uncontrolled changes, can never be considered integrated; knowledge workers looking at what should be the same record will see different data values.
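The drift described in that passage can be shown with a minimal Python sketch. The record, the two “component systems” and the address values are hypothetical; plain dictionaries stand in for real databases.

```python
import copy

# Hypothetical customer record held by the originating system.
master = {"customer_id": 17, "address": "12 Elm St"}

# Copying: each component system receives its own independent copy.
billing_copy = copy.deepcopy(master)
shipping_copy = copy.deepcopy(master)

# At load time the copies agree.
consistent_at_load = billing_copy == shipping_copy

# Later, one system records a change that the other never sees.
shipping_copy["address"] = "98 Oak Ave"
consistent_later = billing_copy == shipping_copy

# Sharing: both systems refer to the same single record,
# so a change made through one is visible through the other.
billing_view = master
shipping_view = master
shipping_view["address"] = "98 Oak Ave"
shared_consistent = billing_view == shipping_view
```

In this sketch the copies agree only at the moment of loading, whereas the shared record stays consistent by construction because there is only one of it.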
What Data Integration Should Mean
I propose the following definitions of “data integration” and a definition of what has been wrongly labeled data integration:
Every time an organization builds an interface to move and transform data from one operational database to another, it is adding cost, not value, to the enterprise. The reason? Information, as the only non-consumable resource of the enterprise, is never used up or destroyed in the processes that retrieve and apply it, so it can be shared from a single source rather than copied.
CEOs should be demanding a business case for any project seeking to build a redundant database. And the CIOs should never have to ask for a business case for projects that call for designing enterprise-strength information models and subject-oriented databases that contain all information about that subject or resource required by all knowledge workers.
What do you think? Let me hear at Larry.English@infoimpact.com.
Larry P. English, Cofounder of the IAIDQ, is President and Principal of INFORMATION IMPACT International Inc., and author of the widely acclaimed Improving Data Warehouse and Business Information Quality. His forthcoming book, Information Quality Applied: Best Practices for Business Information, Processes and Systems, will be available in early 2009. He is a speaker at the upcoming 2008 IQ Conference in San Antonio, Texas. He provides consulting and training to help information professionals increase their value to the enterprise and provides certification in his TIQM methodology. For details, email TIQMCert@infoimpact.com or visit www.infoimpact.com.