Blog: Dan E. Linstedt

Dan Linstedt

Bill Inmon has given me this wonderful opportunity to blog on his behalf. I like to cover everything from DW2.0 to integration to data modeling, including ETL/ELT, SOA, Master Data Management, Unstructured Data, DW and BI. Currently I am working on ways to create dynamic data warehouses, push-button architectures, and automated generation of common data models. You can find me at Denver University where I participate on an academic advisory board for Masters Students in I.T. I can't wait to hear from you in the comments of my blog entries. Thank-you, and all the best; Dan Linstedt http://www.COBICC.com, danL@danLinstedt.com

About the author

Cofounder of Genesee Academy, RapidACE, and BetterDataModel.com, Daniel Linstedt is an internationally known expert in data warehousing, business intelligence, analytics, very large data warehousing (VLDW), OLTP, and performance and tuning. He has been the lead technical architect on enterprise-wide data warehouse projects and refinements for many Fortune 500 companies. Linstedt is an instructor of The Data Warehousing Institute and a featured speaker at industry events. He is a Certified DW2.0 Architect. He has worked with companies including IBM, Informatica, Ipedo, X-Aware, Netezza, Microsoft, Oracle, Silver Creek Systems, and Teradata. He is trained in SEI / CMMi Level 5, and is the inventor of The Matrix Methodology and the Data Vault data modeling architecture. He has built expert training courses, trained hundreds of industry professionals, and is the voice of Bill Inmon's blog on http://www.b-eye-network.com/blogs/linstedt/.

OK, we've all heard the term: MDM, or Master Data Management, but what the heck does it mean? My opinion is different from most, and in my search for the ultimate compliant warehouse I constantly battle with new acronyms... What EXACTLY is MDM? What about CDI (Customer Data Integration)? OK, how about these: PDM, BAM, BPR, LI, CRM, ERP, SEI, CMM, PMP, HIPAA, SARBOX, and so on? Well, I went overboard again (oh happy days...). Acronym soup is nothing new to me. I've been in the industry for over 15 years, and I've been through the '80s and Business Process Reengineering (BPR), also known as Lean Initiatives (LI), Six Sigma quality improvements, and ISO (International Organization for Standardization) setup.

Now come BAM (Business Activity Monitoring), MDM (Master Data Management), CDI (Customer Data Integration), PDM (Parts Data Management), SCM (Supply Chain Management), CML (Customer Master Lists), and so on... Maybe one day we will realize that some of these acronyms are just new names for OLD (but valid) business goals. In this case, it's all about business management: better quality, shorter lead times, reduced overhead, and increased revenue. After all, if I'm not making money, then why am I in business? In this entry I will focus on MDM and explore just what it is.

MDM, aka Master Data Management, is likened to the production of "master lists" of centralized data sets. The notion is that these data sets are "gospel" to the enterprise, that they are agreed upon all the way to the top (the board of directors), and that they are single consolidated lists of data based on business keys and descriptions. That notion only furthers the effort to use them as THE MASTER, the GOLDEN KEY, the LIST of: parts, services, products, deliverables, people, customers... you name it, there can be a master list for it.

BUT MDM takes it seriously as GOSPEL. In other words, many organizations and executive decision makers take an MDM to mean: it's the only SOURCE of all the integrated data of that type (customer data in the case of CDI, for example) that exists within the organization. The executives have just turned an MDM into a SYSTEM OF RECORD. Like it or not, it's now got to be compliant with the WAY the business does business. It's a centralized system of data that has been cleansed, integrated, and archived. By the way, when completed, it also feeds 90% of the source systems from its Master List.

BUT WAIT!!!!! What is an MDM really? And if I change the data on the way in, is it really compliant?
Here is where the discrepancies pop up. If the corporation were ever to be audited on the MDM they've built, they might be surprised to find that using it as an audit trail, much less a compliant audit trail, is extremely difficult, and in fact it won't stand up in court. Why? Because the data within the MDM is essentially FLAWED: it has been altered to "meet today's expectations of what the TRUTH is" (and you know my feelings about the TRUTH: truth is subjective no matter how you look at it).

Now hold on, I'm not saying there's _no_ intrinsic value to the business in striving towards an MDM... I'm actually questioning whether it really deserves the term MASTER DATA MANAGEMENT. I prefer to think of it as a MASTER DATA MART. Why? Because my definition of data warehouse has changed. My definition of a Data Warehouse (which is truly MASTER DATA) is essentially integrated, consistent, and EXACT copies of data as it stood on the source systems. (I define this much better in the Data Vault data model architecture at: http://www.DanLinstedt.com)

In other words, there are such things as Master Key Lists (MKL) and horizontally integrated (but unchanged) data sets across the enterprise, but in my mind, the minute we put CHANGES (quality, cleansing, profiling, aggregation, merging, purging, etc.) into the data set, we've effectively produced a Data Mart. CHANGES TO THE DATA MADE BY ANY PROGRAMMATIC SYSTEM WITHIN THE WAREHOUSE CONSTITUTE THE CREATION OF A DATA MART, in which data is seen by, and created for, specific user purposes.

Hence, I would like to say that MDM is really a Master Data Mart: an enterprise-accepted view of the data that is not necessarily auditable or traceable without the "warehouse" to back it up, in other words, without the warehouse data that describes HOW we got to this level of integration.
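As a rough illustration of the distinction drawn above, here is a minimal Python sketch, not the Data Vault model itself. The names `RawRecord`, `MartRecord`, `load`, and `build_master_mart`, and the name-cleansing rule, are all hypothetical. The "warehouse" is an append-only list of exact source copies; the "master data mart" is computed from it and carries pointers back to the raw rows it was derived from.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RawRecord:
    """An exact copy of a source row; never modified after load."""
    source_system: str
    business_key: str
    customer_name: str
    loaded_at: datetime


@dataclass(frozen=True)
class MartRecord:
    """A cleansed, merged row plus the warehouse rows it was derived from."""
    business_key: str
    customer_name: str
    derived_from: tuple  # indexes into the warehouse list


def load(warehouse, source_system, business_key, customer_name):
    """Append-only load: store the value exactly as the source had it."""
    warehouse.append(
        RawRecord(source_system, business_key, customer_name,
                  datetime.now(timezone.utc))
    )


def build_master_mart(warehouse):
    """Apply a hypothetical cleansing rule (collapse whitespace, title-case)
    and merge rows that share a business key, keeping the latest value."""
    merged = {}
    for idx, rec in enumerate(warehouse):
        clean = " ".join(rec.customer_name.split()).title()
        previous = merged.get(rec.business_key)
        lineage = (previous.derived_from if previous else ()) + (idx,)
        merged[rec.business_key] = MartRecord(rec.business_key, clean, lineage)
    return merged


warehouse = []
load(warehouse, "CRM", "C-100", "  acme   corp ")
load(warehouse, "ERP", "C-100", "ACME CORP")
mart = build_master_mart(warehouse)
```

The mart row for `C-100` holds the altered, agreed-upon name, while the two warehouse rows still hold what each source system actually said. Dropping the warehouse would leave only the altered view, which is the situation the post warns about.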

Hold on, does MDM have value to the business?
ABSOLUTELY. There's no question that an MDM (master data mart) has value to the business, or we wouldn't see things like master parts lists, master customer lists, master services, master employees, master suppliers, etc... There is definitely value in consolidation and cleansing of data FROM the warehouse TO the data marts, and there is definitely value in producing data marts to meet the enterprise needs, answer questions, and provide a basis of historical data for pattern searching. But please don't confuse MDM with a Data Warehouse. MDM is a data mart even if it's called Master Data Management, and the end result of building one of these should be viewed as a data mart and nothing more.

Now, can data marts feed operational systems?
Yes, as long as the physical source data from which it was computed is actually available, auditable, and compliant. I can't stress enough how Data Warehouses must be compliant. The only time a Data Warehouse need not be compliant, or probably does not need this level of source detail, is when the warehouse is used for Strategic purposes, is ONE WAY IN (data comes in and is cleansed, integrated, and altered), and NEVER feeds source systems, NEVER feeds operational activities, and NEVER is involved in operational decisions.

CDI (customer data integration) is a form of MDM, and is just another data mart. If you treat it this way, and keep your warehouse compliant and auditable, you are bound to have huge success moving forward. Contact me for success stories I've assisted with in the industry; I'd be happy to talk to you about this whole notion.

I'd also love to hear from you, even if you disagree, or have other opinions on what MDM means to you. Please elaborate below, all comments are welcome.

Thank-you very kindly,
Dan L


Posted February 7, 2006 10:03 AM

8 Comments

I've wondered about the same question. Fundamentally, they look more like master dimensions to me, rather than a data mart. It harkens back to Kimball and having customer or product dimensions that can conform across the business.

Hi Julius,

I would agree - whether you call them master dimensions or a full master data mart, they are basically for delivery of ALTERED (non-compliant) information.

I would hope that those vendors using the terms do not continue to mix "compliant data warehousing" with MDM. If you see any articles referencing MDM as a compliant data set, I'd love to read through it.

Cheers,
Dan L

Hi Dan,

I think that I agree partially with your views on MDMs, Marts and Warehouses; however, with my background, I perhaps have a different perspective.

I treat MDM not only as a linked view of the truths of each of the various enterprise systems (ERP, CRM, SCM etc), but also as a collection of real-time services to be provided to the enterprise.
Therefore, I agree with the push-pull model you describe, with enterprise systems - but I also see it as a key component of an SOA architecture.
The services I see it providing are master record population and retrieval (key for BPMS type activities), Master Object Reference Lookups (key for real-time systems integration scenarios), and the source of meta data on which all services should be based (i.e. a method for identifying the single definition of key data entities).

What then becomes more interesting for me, is how this then links to a federated master data ownership, and lifecycle data management approach - to take the management of data as a business asset, away from IT, and back into the hands of the business.

Is this a rant? Possibly!
I would be interested to know whether you also see the linkage of these areas as the future.

Cheers

David J

I don't think there is any question that data warehousing and MDM principles are almost identical - it's just the purpose that might differ. I believe this is supported by the number of data warehouse solution vendors that have made an almost seamless transition to MDM, e.g. Kalido and Hyperion.
However, the application of MDM doesn't have to be limited to data warehouses, marts, dashboards and the like. Its extension to transactional applications is appropriate and there is value in achieving it.
I believe the issue is not whether MDM within transactional applications is appropriate, rather the issue is how best to implement across all applications in the enterprise, to achieve the MDM nirvana. I would argue that nirvana is unlikely to be achieved, but there is considerable value in undertaking the journey.

Dan,

It might be helpful to view a "master data mart" as the end result of a "master data management" process. The MDM process accepts whatever data is fed to it from the source systems - the good, the bad, and the ugly. Turn the crank and the data cleansing business rules are applied to produce the MDM mart.

It is possible - as my most recent DW project is doing - to maintain traceability between the "real" (though flawed) data in the source systems and the "clean" (though transformed) data in the MDM mart. Hence the clean data in my DW effectively becomes "gospel" - but the agnostics and skeptics can trace back to the "real" source system as necessary.

I am not comfortable with your implication - if I am understanding your argument - that "gospel" data in your MDM mart isn't a true reflection of the system of record. The "gospel" data in your MDM mart is a better approximation of reality. The business rule transformations in the data cleansing process are a CORRECTION to a previously incomplete or incorrect understanding of a value in the system of record.

I think that the MDM mart, as you describe it, is the BEST system of record - or at least a better one than the "real" system of record. You can incorporate traceability data structures into this MDM mart to tie back to systems of record.

At any rate I believe strongly that MDM - or RDM or CDI or PDM or whatever you want to call this new DW tool - is a very good thing.

Bill

Thank-you all for your comments, they are very thought provoking - I've added MDM Part Deux (II) for those interested in this thread of discussion.

Bill, you raise a question: what is the definition of system of record? I think the lines have blurred and will continue to do so going forward.

I agree that there is value in cleansed and quality merged data. However, I stand by what I say: that a System Of Record "data warehouse" must be an audit trail of raw data (both good and bad) reflected and traceable in the source systems.

Why? Because there's a ton of value in understanding what is BROKEN in the business capture and business processes, and not just understanding what good/better/best data looks like from an analytics standpoint. Also, because when the auditor walks in, it's me that ends up in court because I built the system, and the business users I know will point at IT implementation staff and say: "They didn't implement my business rule properly." OK, maybe I'm jaded just a bit, but I come from a government environment where this actually happened.

Luckily for my team, we implemented a fully compliant and traceable warehouse, and passed the audits with flying colors. We had built "MDM - master data marts" based on the warehouse detail, and the auditor found the business rules to be in conflict. Furthermore, the business had "data check errors" built into its operational systems that had been there for 15 years. The auditors found a billing error that amounted to millions of dollars.

This was 12 years ago for me... I've been implementing compliant systems for at least that long, and I'm surprised that it took so long for the commercial market to catch up.

After the audit, even the nay-sayers began stating how good the data warehouse was, and how bad the operational systems were at integrating the data. For the first time in 15 years, we saw a RISE in accountability in the business, and a rise in change requests to FIX the business process and business capture problems in the source systems, thus completing the cycle of information quality.

All because we separated the concepts of Master Data Marts and Auditable / Compliant Data Warehousing. I have additional cases like this that I've worked on. One of the secrets to our data warehousing success is the Data Vault data modeling architecture, which you can check out at: http://www.DanLinstedt.com

Great Comments! Thanks,
Dan L

Dan:

I think we are in basic agreement. The MDM without traceability to the "real" source system is something other than the system of record. However, my current project BUILDS this traceability into the MDM - the MDM persists the source values, a source system identifier, time stamp, and the transformed value. Anyone - user or auditor - can easily trace the MDM "clean" data back to its source - within the MDM. Correct me if I am wrong, but I don't think your MDM included this history/traceability in the MDM. I contend that - if this traceability is built into the MDM - then the MDM becomes the system of record. I am not yet sure if I am being pedantic or if my project team has come up with something new and of some significance. Personally I lean towards the second interpretation ... I am often wrong but never in doubt!

Bill
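An illustrative aside on the structure Bill describes above: the sketch below, in Python, persists the source value, a source system identifier, a time stamp, and the transformed value side by side so that a "clean" value can be traced back to what each source said. It is only a guess at the shape of such a structure, not his project's design; `TracedValue`, `record`, `trace`, and the phone-number rule are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class TracedValue:
    """One persisted observation: what the source said and what we made of it."""
    source_system: str
    source_value: str
    transformed_value: str
    recorded_at: datetime


def normalize_phone(raw):
    """Hypothetical cleansing rule: keep digits only."""
    return "".join(ch for ch in raw if ch.isdigit())


def record(history, key, source_system, source_value, transform):
    """Append a traced entry for a business key; nothing is overwritten."""
    entry = TracedValue(source_system, source_value,
                        transform(source_value),
                        datetime.now(timezone.utc))
    history.setdefault(key, []).append(entry)
    return entry


def trace(history, key):
    """Return (source system, source value, clean value) for every entry."""
    return [(e.source_system, e.source_value, e.transformed_value)
            for e in history.get(key, [])]


history = {}
record(history, "C-100", "CRM", "(303) 555-0100", normalize_phone)
record(history, "C-100", "Billing", "303.555.0100", normalize_phone)
lineage = trace(history, "C-100")
```

Here both source values survive next to the single cleansed value, so a user or auditor can see exactly which rule input produced it. Whether that is enough to make the structure a system of record is the question the thread is debating.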

Hi Bill,

Actually I do need to correct you - the "MDM" that we built was in fact compliant, and maintained a full audit trail of every delta of every record across the warehouse - at the lowest level of grain. Again, the government audit would not have been passed had we not had this layering of data over time, in the warehouse "as it stood on the source system". The only things "master" about our data warehouse were the integrated visions of parts, suppliers, accounts, customers, plans, contracts, employees, and so on. Business keys that made sense at the same semantic level were integrated.

I'd like to chat with you, drop me a personal email and we'll see if we can connect via phone.

Thank-you very much for the kind comments,
Dan Linstedt
