Blog: Dan E. Linstedt

MDM and Consolidation of Data Sets

MDM data is often dispersed across the organization. This raises the question: how can MDM be a viable asset to the business? Is the Master Data reused throughout the organization the way it should be? Is it defined in the right context (Master Metadata)? Technically, Master Data should be consolidated into a single global data center. MDM is not a tool, not a toy, not a process - it's a way of doing business that includes tools, best practices, people, governance, metadata and single answers.

Remember: MASTER DATA is NOT, I repeat: NOT a single version of the facts; rather, it IS a single version of today's corporately accepted TRUTH. Also remember: what's true today was NOT necessarily true yesterday (last week, last year, 5 years ago...). Master Data IS auditable as a system of record, but it's one of three system-of-record definitions (see my blog on System of Record).

Ok, so where does that leave me?
If you've got an SOA effort within your business, then you should have a space in the plan for Master Data, Governance (at different levels), Master Metadata, and EII. It means your EDW is already set up as an Active Data Warehouse and is operating in Near Real Time. It means your operational systems have already placed their data exchanges under web-services (or are undergoing this conversion as a part of the SOA project).

To get to Master Data, I would strongly suggest that you have already defined enterprise-wide conformed dimensions, or if you've got a 3rd normal form warehouse, that you've already defined enterprise-wide accepted metadata definitions. In other words, I don't believe that you can get to Master Data Management successfully without FIRST going through a Master Metadata Management effort.

So what is Master Data Exactly?
As I've suggested in the past, Master Data IS a single consistent, consolidated version of today's truth, based on a SINGLE BUSINESS KEY, and includes the descriptive data of that key post-cleansing/aggregation. Keep in mind that Master Data means the data attached to that KEY is at the SAME semantic level of metadata definition. For instance, a Master Data Table is different from a Conformed Dimension in that Master Data Tables do NOT and should not contain hierarchies within the same table set. Hierarchical data is at a different semantic grain, and is usually keyed off a different business key - it therefore requires a separate Master Data Table.
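To make the "single business key" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the names `MasterRecord`, `cleanse` and `consolidate`, the trim-and-uppercase cleansing rule, and the "latest non-empty value wins" survivorship rule are hypothetical stand-ins for whatever your business actually agrees on.

```python
from dataclasses import dataclass, field


@dataclass
class MasterRecord:
    """One row of a Master Data Table: a business key plus its descriptive data."""
    business_key: str
    attributes: dict = field(default_factory=dict)


def cleanse(value):
    # Hypothetical cleansing rule: trim and upper-case strings, pass others through.
    return value.strip().upper() if isinstance(value, str) else value


def consolidate(source_rows):
    """Fold rows from many source systems into one record per business key.

    source_rows: iterable of (business_key, attributes, as_of) tuples.
    Assumed survivorship rule: the latest non-empty value wins.
    """
    masters = {}
    for key, attrs, _as_of in sorted(source_rows, key=lambda row: row[2]):
        clean_key = cleanse(key)
        record = masters.setdefault(clean_key, MasterRecord(clean_key))
        for name, value in attrs.items():
            if value is not None:
                record.attributes[name] = cleanse(value)
    return masters
```

Note that nothing hierarchical lives in the record: a dealer, a container or an owner would each get a Master Data Table of their own, keyed on their own business key.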

Can the Master Data Tables be linked together?
Yes - through many-to-many relationships. Master Data Tables should NOT be dependent on parents; each Master Data Table should be fully independent, as each stands alone in definitional nature. Master Data only requires the business key - it does NOT require the existence of the parent in order to be "created, used, and referenced." The BUSINESS may require the parent key when enforcing business rules for reporting purposes. Referential integrity should be done through secondary processing, or through EII, or through BI query sets to determine what today's business rule is, and whether the data is in error (according to context).
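Here is a rough sketch of that layout using Python's built-in sqlite3 module. The table names (`master_car`, `master_container`, `link_car_container`) and the sample rows are made up for illustration. The point is that the link table carries no foreign-key constraint, so either master row can exist without the other, and the integrity check is a separate query run after the fact.

```python
import sqlite3


def build_demo():
    """Create two independent master tables and a many-to-many link table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE master_car (vin TEXT PRIMARY KEY, color TEXT);
        CREATE TABLE master_container (container_id TEXT PRIMARY KEY, port TEXT);
        -- No FOREIGN KEY clauses: neither master depends on the other.
        CREATE TABLE link_car_container (
            vin TEXT,
            container_id TEXT,
            PRIMARY KEY (vin, container_id)
        );
    """)
    conn.execute("INSERT INTO master_car VALUES ('VIN001', 'red')")
    conn.execute("INSERT INTO master_car VALUES ('VIN002', 'blue')")
    conn.execute("INSERT INTO master_container VALUES ('C-1', 'Oakland')")
    conn.execute("INSERT INTO link_car_container VALUES ('VIN001', 'C-1')")
    conn.commit()
    return conn


def cars_missing_container(conn):
    """Secondary processing: find cars that break today's shipping rule."""
    rows = conn.execute("""
        SELECT c.vin
        FROM master_car AS c
        LEFT JOIN link_car_container AS l ON l.vin = c.vin
        WHERE l.vin IS NULL
        ORDER BY c.vin
    """).fetchall()
    return [row[0] for row in rows]
```

Whether a car with no container counts as an error is a business decision that can change tomorrow; keeping the check in a query rather than in a constraint means the master tables don't have to change when the rule does.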

Keep in mind that Master Data CHANGES CONTEXT depending on who's using it!!! That means that Master Metadata must be defined, and metadata definitions overridden at operational levels (as long as there exists a 100% dependency on the parent metadata chain), in order to determine the context of the Master Data Set.

For example: a car is a car is a car. It has a VIN - the VIN doesn't change even though the car's color changes, or the seats change, or the radio is swapped out. The CAR CAN EXIST WITHOUT A DRIVER / OWNER! The business rule for "shipping" of that Car cares about a parent Container, and a parent Ship for that Container. The Sales-floor cares about the "car" and the prospective buyer of that car, and the OWNER cares about the car itself. The context of the CAR, and how the master data of the CAR is viewed, changes depending on who's using the data.
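The car example can be sketched as a small context lookup. This is only an illustration: the mapping `CONTEXT_REQUIRED_PARENTS`, the function `missing_parents`, and the choice to read "cares about" as "requires a link to" are my assumptions here, not a prescribed design.

```python
# Hypothetical per-context rules: which parent links each consumer requires.
CONTEXT_REQUIRED_PARENTS = {
    "shipping": ("container", "ship"),
    "sales": ("prospective_buyer",),
    "owner": (),  # the owner only cares about the car itself
}


def missing_parents(vin, context, links):
    """Return the parent links this context expects but the car does not have.

    links: dict mapping a VIN to a dict of {parent_type: parent_key}.
    The car master record itself is valid in every context; only the
    business rule applied to it differs.
    """
    required = CONTEXT_REQUIRED_PARENTS[context]
    present = links.get(vin, {})
    return [parent for parent in required if parent not in present]
```

The same VIN can pass the owner's rule, fail the shipping rule, and fail the sales rule all at once, without the master record itself being "wrong".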

Master Data MUST be consolidated in a single data center. If it's to be utilized (for performance purposes) in other systems, it must flow downstream from the central synchronization point to a local copy. That local copy must be read-only in order to avoid stove-piping in the industry again. Master Metadata must also be centralized, AND the Master Metadata MUST be delivered along with the Master Data in order to make sense of it at run-time or access time.
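A toy version of that flow, in Python, might look like the following. The classes `MasterHub` and `LocalCopy`, the version counter, and the snapshot format are all hypothetical; a real implementation would sit on a database and a distribution layer, but the shape is the same: one writable center, read-only copies downstream, and the metadata shipped in the same package as the data.

```python
import copy
from types import MappingProxyType


class MasterHub:
    """The single central place where master data and metadata are written."""

    def __init__(self):
        self._data = {}
        self._metadata = {}
        self.version = 0

    def define(self, attribute, definition):
        self._metadata[attribute] = definition
        self.version += 1

    def upsert(self, business_key, attributes):
        self._data.setdefault(business_key, {}).update(attributes)
        self.version += 1

    def publish(self):
        # Data and metadata always travel together in one snapshot.
        return {
            "version": self.version,
            "metadata": copy.deepcopy(self._metadata),
            "data": copy.deepcopy(self._data),
        }


class LocalCopy:
    """A downstream, read-only copy held close to a consuming system."""

    def __init__(self):
        self.version = -1
        self.metadata = MappingProxyType({})
        self.data = MappingProxyType({})

    def sync(self, snapshot):
        if snapshot["version"] <= self.version:
            return False  # nothing newer upstream
        self.version = snapshot["version"]
        self.metadata = MappingProxyType(snapshot["metadata"])
        self.data = MappingProxyType(
            {key: MappingProxyType(row) for key, row in snapshot["data"].items()}
        )
        return True
```

Because the local views are wrapped in `MappingProxyType`, an attempt to assign to them raises `TypeError`, so any change has to go back through the hub.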

Are there problems with the Master Data centralization effort?
YES: timing, locking, synchronization, and distribution issues abound. However - networks are getting faster, enabling data to centralize better - and be utilized all over the world. Here's where EII can help - EII can begin to compress data sets, store data sets in virtual tables on the server, AND in the near future (EII vendors, are you listening?) manage Master Metadata!! EII is in a prime position to solve the problem of connecting Master Data to Master Metadata definitions, and act as the delivery point for both.

Before you tackle Master Data, please please please spend time considering the architecture carefully, and I would suggest that your ADW, EII, Metadata, and web-services components are in place for at least one component of the business.

We are a world-class implementation firm that builds solutions that scale to the petabyte range. Come see my MDM night school course at TDWI next week, followed by my VLDW class the next day.

Thoughts? Questions? Comments?

Thanks,
Dan Linstedt
CTO, Myers-Holum, Inc
http://www.MyersHolum.com

Posted by Dan Linstedt on August 19, 2006 6:49 AM
