Business Intelligence Network business intelligence resources

Blog: William McKnight


MDM is not just another data mart

Calling master data management just another data mart, as some have done, is flawed on many levels. Probably most importantly, it implies that the mart is fed from the data warehouse. Since the data warehouse should contain master data, not mere duplicates of dirty operational data, and data marts should be cut for discrete query and application purposes, what purpose would a master data data mart serve? Master data is more about distribution to the operational systems that need it than about master data query.

Fix master data in the operational environment and feed it to the data warehouse. That way, all systems have access to master data. The data warehouse sits too far down the data lifecycle, and historically has been too poor a data distribution source, to be effective in that role. If data warehousing met MDM needs, there wouldn't be a clamor for a different way with more applicability to modern business.
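To make the direction of flow concrete, here is a minimal sketch of an operational master data hub that distributes each corrected record to every subscribed system, with the data warehouse as just one more consumer. All names here (`MasterDataHub`, `make_store_consumer`, the `CUST-001` key) are hypothetical and chosen for illustration; this is not any particular vendor's API.

```python
from typing import Callable, Dict, List

# A consumer is any system that wants master data: an operational
# application or the data warehouse. (Hypothetical interface.)
Consumer = Callable[[str, dict], None]


class MasterDataHub:
    """Illustrative hub that holds master records and pushes them out."""

    def __init__(self) -> None:
        self._records: Dict[str, dict] = {}
        self._subscribers: List[Consumer] = []

    def subscribe(self, consumer: Consumer) -> None:
        self._subscribers.append(consumer)

    def publish(self, business_key: str, attributes: dict) -> None:
        # Store the mastered record, then distribute a copy to every
        # subscriber so no system has to query the warehouse for it.
        self._records[business_key] = dict(attributes)
        for consumer in self._subscribers:
            consumer(business_key, dict(attributes))


def make_store_consumer(store: Dict[str, dict]) -> Consumer:
    """Build a consumer that simply saves records into a dict."""
    def consume(business_key: str, attributes: dict) -> None:
        store[business_key] = attributes
    return consume


# Two hypothetical consumers: an operational CRM and the warehouse.
crm_store: Dict[str, dict] = {}
warehouse_store: Dict[str, dict] = {}

hub = MasterDataHub()
hub.subscribe(make_store_consumer(crm_store))
hub.subscribe(make_store_consumer(warehouse_store))
hub.publish("CUST-001", {"name": "Acme Corp", "state": "TX"})
```

In this sketch the warehouse receives master data from the operational hub on the same footing as the CRM, rather than acting as the distribution source itself.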

Master data management in the operational environment can comprise a virtual management structure or a physical structure. And keep in mind that MDM is more than data storage and distribution. It's a holistic approach that includes data quality, data stewardship, third-party data, enterprise quality modeling, ROI AND data management and distribution. This is why you have so many vendors at the MDM party. Most contribute a piece, like maybe the appetizer or dessert, but few contribute full MDM. It's really a process more than a technology. Contact me to learn about the experience- and success-based approach to MDM I've developed with my clients for my firm.

For more, look for my feature article on master data management in DM Review next month (April) and my course on the subject at TDWI on May 18 in Chicago.

  Posted by William McKnight on March 4, 2006 12:41 PM

Comments

Hi William,

I respect your opinion, and at the same time I see value in trying to define or differentiate between a "compliant data warehouse" and a "non-compliant data warehouse." In my humble opinion, any non-compliant data warehouse can contain any number of aggregations and integrations that are desired by the business; in other words, data traceability is lost at the granular level. This means that "master data" can exist in the data warehouse as an integrated and cleansed (altered) source of information.

The problem I have is with compliance, and of course the definition of compliance is fuzzy at best. In the government systems I've built over the past 10 years or more, compliance meant turning the warehouse into an integrated store, with the lowest level of grain of data from the source system (integrated by business key), and the remainder of the data representing the exact data as it stood in the source system, for auditing and traceability's sake.

In this vision of the data warehouse, there was no "room" for a master data list, that is, one that is cleansed and quality checked. We did have a master parts list, but again, it was raw data. As part and parcel of our master data management strategy, we provided "Master Data Marts," which rolled the data up to the "flavor of the month," while the underlying data set in the warehouse remained consistent and auditable.

I guess one could say we implemented master data management across both the warehouse and the data marts, with "dirty" data existing in the warehouse because the source systems contained it.

Hope this helps clear things up.
Dan Linstedt

Hello b-eye neighbor.

I like to develop enterprise data warehouses for as many uses as possible, compliance- and non-compliance-related. In the build of THE EDW, I prefer to source BOTH a copy of the operational data and a cleansed version that has undergone data cleanup. They are discrete data items: the operational data represents what was entered or sourced operationally, and the clean data represents (usually) what it should have been.

The former would satisfy compliance requirements by providing the audit trail and the latter would be more suitable to analytics. Of course, this all occurs only after exhausting the rationale for disallowing dirty data into the environment at all in the first place. But that is not always in our control or we need to act before the quarters/years it will take to make operational change.
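As a rough illustration of carrying both versions side by side, the sketch below keeps the value exactly as sourced next to a cleansed value in the same warehouse row. The `WarehouseRow` type, the `cleanse_state` rule, and the sample key are assumptions made up for this example, not a description of any specific implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WarehouseRow:
    business_key: str
    source_value: str  # exactly as entered operationally (audit trail)
    clean_value: str   # after cleanup (better suited to analytics)


def cleanse_state(value: str) -> str:
    """Hypothetical cleanup rule: trim whitespace, upper-case a state code."""
    return value.strip().upper()


def load_row(business_key: str, source_value: str) -> WarehouseRow:
    # Keep the raw value untouched and store the cleansed value
    # alongside it as a separate, discrete data item.
    return WarehouseRow(business_key, source_value, cleanse_state(source_value))


row = load_row("CUST-001", " tx ")
```

An audit query would read `source_value`, while analytic queries would group and filter on `clean_value`.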

I agree that master data can exist in the data warehouse (and marts), but mostly it should be fed there from operational master data processes.
