Business Intelligence Network business intelligence resources

Blog: Dan E. Linstedt

Clarifying MDM - Setting Standards

I've had a lot of great feedback on the MDM blogs that I've been adding lately, and one kind individual sent me an email asking for a couple of things: a definition, practical criteria, a practical taxonomy, and a picture simple enough for organizations to use. In this entry I will do my best to offer my *opinion* on the subject. I am open to comments, corrections, and thoughts from all of you - again, this will be only my opinion. Please note that my opinion is biased towards compliance, accountability of data, traceability, and accountability of business users, and arises from my experiences with SEI/CMM, PMP, Six Sigma, TQM, BPR, Lean Initiatives and Cycle Time Reduction. I can't wait to hear back from you.

+ First, a definition of master data that "holds true" across all scenarios.

Ok, I'll take a crack at it. In my last blog entry I showed multiple definitions of Master Data Management from various friends. Here are my thoughts on Master Data itself:

Master Data - Information housed in a single, consistent, quality-cleansed reference table at a single location. All elements except the surrogate key in the master data set are defined by Master Metadata at a global (enterprise, or sometimes industry-wide) level. Master Data is not localized (with multiple copies), does not contain duplicates, and may reside anywhere in the corporation, but is exposed through SOA to the enterprise and possibly outside the enterprise. Master Data and Master Metadata are both delivered through a service in the SOA; Master Metadata is exposed to the service layer so that the meaning of the Master Data can be interrogated. In other words, Master Data are reference tables that exist in a single copy somewhere in the enterprise, but with enterprise-level visibility.

Master Metadata - Information describing the elements / attributes that make up the Master Data Structure, and the utilization of those attributes. These metadata are agreed upon by the business users to be universal in definition. However, if localized utilization descriptions are required, then the executive staff must sign off on the changes and overrides, and be capable of defining the impact of the change to the bottom line (financial figures). Master Metadata can be edited and maintained by business users in accordance with Governance policies created at the enterprise level. Master Metadata is also exposed through SOA to the enterprise, and possibly outside the enterprise. Master Metadata is delivered every time Master Data is delivered - they are conjoined in order to increase the level of understanding and provide context at the time of interrogation of the Master Data Set.
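To make the "delivered together" idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `MasterDataResponse` class, the `deliver` function, the `id` surrogate key and the sample vehicle attributes are all hypothetical names, not part of any product or standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MasterDataResponse:
    """Payload of a hypothetical master data service: rows plus their metadata."""
    rows: list
    metadata: dict  # attribute name -> enterprise-wide definition


def deliver(rows, definitions, surrogate_key="id"):
    """Bundle master data rows with the definition of every attribute they use.

    The surrogate key is skipped because it is not defined by Master Metadata.
    """
    attributes = {name for row in rows for name in row if name != surrogate_key}
    missing = sorted(attributes - set(definitions))
    if missing:
        # Refuse to hand out data whose meaning has not been agreed upon.
        raise ValueError(f"no Master Metadata for: {missing}")
    metadata = {name: definitions[name] for name in sorted(attributes)}
    return MasterDataResponse(rows=rows, metadata=metadata)


# Illustrative data only.
definitions = {
    "vin": "Vehicle identification number assigned by the manufacturer",
    "model": "Marketing name of the vehicle model",
}
response = deliver([{"id": 1, "vin": "VIN-0001", "model": "Roadster"}], definitions)
```

The design choice worth noting is that the sketch fails loudly when an attribute has no agreed definition, rather than delivering data without context.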

* By the way, Bob Seiner and I had a wonderful discussion yesterday about this topic, and I owe quite a bit of this information to Bob's thought processes. He and I agree that Master Data and Master Metadata require Governance policies and procedures, along with Data Management approaches, and Data Administration roles and responsibilities. It will be nearly impossible to have a successful Enterprise Master Data / Master Metadata Management effort without the Governance embedded in the processes. You can see my first article on Governance here, on B-Eye, Bill Inmon's Newsletter.

+ Second, practical criteria for determining whether a piece of information "qualifies" as master data.

This one is a bit harder to qualify. My opinion here is based on my personal experience 12 years ago in a government organization where "master lists" were a huge part of lean initiatives, cycle time reduction, and business process understanding. In order to understand the deterministics of Master Data we should remember that frequently there exist multiple copies of this data across our source systems when we start our efforts. We should also remember that the bottom line is: it's up to the business users, NOT I.T., to fully approve master data sets and master "systems".

Deterministics for Master Data Qualification:
* Data that is entered in the source system via an application, and is assigned a FULL business key within the organization.
* These business keys are UNIQUE within a given time-slice. If they are not unique (for all time), then there may be a broken business process (business is losing money), or it isn't the right starting point for master data.
* Data that doesn't exist anywhere else in the organization. Yes, as unfortunate as it sounds, even Excel spreadsheets on users' desktops that organize, consolidate, and aggregate / define can be considered master data sets, or perhaps master metadata. It is our job to get these loaded in raw detail into a data warehouse, as they stand, so that the BI tools and integration tools can roll the data for them, and the business can have better visibility as to how their business works.
* Data related to non-composite business keys. Composite business keys often reflect relationships to other data, or inter-dependencies. Data related to non-composite business keys are a dead giveaway (like the VIN for vehicles). Of course a VIN is a SMART business key, and can be broken down at the manufacturer into additional codes, but to the outside world (you and me) a VIN is a single key that indicates master data (descriptives about the vehicle). Some composite keys can be seen or used as Master Data sets, but more often than not, composite keys broken into their respective components reveal more than one "master data set". Remember that the business keys are in fact the KEYS to the business processes, and reflect how the data travels through the business, what impacts it, and how it changes. Without business keys, the data in the systems are useless (they can't be searched, located, edited, or managed). I discuss more about business keys in the Data Vault data modeling concepts (Hub Table Definitions).
* Keep in mind that while master data STARTS with the business key identification, it includes the descriptive data that goes with it. Look for hierarchical structures where multiple business keys are "listed" or housed; these should be broken apart to help create the taxonomy of separate master data sets.
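The uniqueness criterion above is easy to check mechanically. The following is a small Python sketch, assuming source records are available as dictionaries; the field names `period` and `contract_no` and the sample values are hypothetical.

```python
from collections import Counter


def duplicate_keys(records, key_field, slice_field):
    """Return (time slice, business key) pairs that occur more than once."""
    counts = Counter((r[slice_field], r[key_field]) for r in records)
    return sorted(pair for pair, n in counts.items() if n > 1)


# Illustrative records only: the same contract number entered twice in one period.
records = [
    {"period": "2006-Q1", "contract_no": "C-100"},
    {"period": "2006-Q1", "contract_no": "C-100"},
    {"period": "2006-Q2", "contract_no": "C-100"},
    {"period": "2006-Q1", "contract_no": "C-200"},
]
suspects = duplicate_keys(records, key_field="contract_no", slice_field="period")
```

Any pair returned here points either at a broken business process or at a data set that is not the right starting point for master data; deciding which is still a question for the business users.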

If Master Data is left in hierarchical format with a number of business keys embedded, chances are that over time the master list will lose value (quickly, possibly on an exponential scale), because editing meaning and maintaining definitive "lists" of individual master data sets will become impossible. For example, a bill of material containing work-order numbers, contract numbers, and parts numbers MUST be broken into separate components, each with its own master data set: Work Orders Master, Contracts Master, Parts Master, etc.
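As a rough sketch of that break-up in Python: the function below pulls the distinct business keys out of flattened bill-of-material lines, one set per master. The field names and sample values are hypothetical, and a real master would also carry the descriptive data that goes with each key.

```python
def split_into_masters(bom_lines, key_fields):
    """Collect the distinct business keys for each master data set."""
    masters = {field: set() for field in key_fields}
    for line in bom_lines:
        for field in key_fields:
            value = line.get(field)
            if value is not None:
                masters[field].add(value)
    return masters


# Illustrative bill-of-material lines only.
bom_lines = [
    {"work_order_no": "WO-1", "contract_no": "C-100", "part_no": "P-7"},
    {"work_order_no": "WO-1", "contract_no": "C-100", "part_no": "P-9"},
    {"work_order_no": "WO-2", "contract_no": "C-100", "part_no": "P-7"},
]
masters = split_into_masters(bom_lines, ["work_order_no", "contract_no", "part_no"])
```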

Then you can re-examine the locality of where EACH of these master data sets comes from, where it needs to reside, and what parts of the business it affects - along with the individual definitions of Master Metadata. You can also then examine the nature of the business process and really begin to understand where these "business keys" originate, how they change in the business processes, who uses which keys, and who changes the keys as they course through the business. Again, Master Data is all about BPR and Lean Initiatives (overhead reduction, quality improvement, cycle time reduction).

There is a lot more I can say about this, and I will. I have a book on the Data Vault data modeling architecture that will be published here on B-Eye Network soon; the first two chapters are dedicated to the Business, and all the notions I'm discussing here, along with the impact of Data Architecture, and the value of clear definitions.

+ Third, a practical taxonomy (or matrix) of classes of data, for which "master data" is just one class...what are all the other classes of data...and are their inter-relations truly hierarchical, or are they matrixed (or "multi-matrixed")?

Ok, we've come to the third of three questions. I began (in the last section) to discuss the normalization of hierarchical data structures according to true business keys. The business users are a great source of knowledge on the "keys" they use to track, file, and relate to the data, as are the applications they use on a daily basis. Business keys are NOT necessarily surrogate keys, or sequence numbers (although in some cases, these sequences and surrogates have been presented to the business for utilization).

In order to understand the taxonomy or matrix of master data / master metadata, we need to first understand two things: WBS (work breakdown structure), and OBS (organizational breakdown structure). Each of these structures must then be cross-hatched to create a lattice of where the work overlays the organization. Below the lattice (in each cell), are the business processes - which must be outlined, highlighted, and understood. The business processes dictate the flow of data, the origination points, and essentially the "master systems", or what should be the master systems anyhow... The taxonomy of the data follows the taxonomy of the business, and the business processes.

For example: I have an organizational structure: CEO->(VP)->(Director)->Management->Line Manager->Worker
I have a work breakdown structure which defines the roles and responsibilities of each, but shows the matrix of work, and how the work flows. There may be several WBS overlays depending on the business task at hand.
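One way to picture the cross-hatching is as a grid keyed by (organizational unit, work item), with the business processes sitting in each cell. The Python sketch below assumes exactly that representation; the unit names, work items and process names are placeholders for illustration, not a prescribed model.

```python
from itertools import product


def build_lattice(org_units, work_items, processes):
    """Overlay the WBS on the OBS and file each business process in its cell."""
    lattice = {cell: [] for cell in product(org_units, work_items)}
    for org_unit, work_item, process in processes:
        lattice[(org_unit, work_item)].append(process)
    return lattice


# Placeholder structures only.
org_units = ["Contracts", "Planning"]
work_items = ["Estimate", "Ratify"]
processes = [
    ("Planning", "Estimate", "produce cost and schedule estimate"),
    ("Contracts", "Ratify", "ratify contract with customer"),
]
lattice = build_lattice(org_units, work_items, processes)

# Cells with no process are worth a second look: either nothing happens there,
# or a process has not been discovered yet.
empty_cells = sorted(cell for cell, found in lattice.items() if not found)
```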

Let's say that I have the following parts of the organization: Sales, Contracts, Finance, Procurement, HR, Planning, and Manufacturing. Let's say that this organization sells very large widgets that cost billions of dollars. Let's say we discover that the business process flow is as follows:
1. Sales sells a contract.
2. Contracts ratifies the contract and passes it to Planning for estimation.
3. Planning makes an estimate of what it will take, how long, how many people, parts, cost, etc., and passes it back to Contracts.
4. Contracts contacts the customer, ratifies it again, and seals the deal. Contracts passes the contract to Finance.
5. Finance begins the billing cycle for each phase, and expects payment in exchange for delivery. Finance sets up charge numbers for the employees, and passes the contract to HR. Finance continues to watch the cost, planning cycle, budgeting, etc.
6. HR authorizes specific departments and employees to work on the contract, tracks the hours to the estimate, and ensures (governs) which employees are actually working on the contract. HR passes the contract to Planning.
7. Planning re-plans the contract based on the skill set of the employees involved, and the phases that Finance and Contracts have set up. Planning assigns part numbers, suppliers, and contractors to the planned bill of material. Planning produces a full planned bill of material for Manufacturing to build, and hands it off to Manufacturing.
8. Manufacturing then builds the widget, tracking actuals against planned hours, assigns work-orders to parts of the contract and to work stations, and executes agreements with the suppliers (working with Finance to ensure the right pricing and delivery grade is provided by the suppliers and contractors). Manufacturing completes the deliverable, and hands the Phase I / II / III deliverable back to Contracts and Finance to complete this portion of the transaction.
9. Contracts then needs to compare as-built to as-planned to as-contracted in order to understand if what was built and delivered was actually contracted.
10. Finance needs to compare as-sold to as-contracted to as-financed to as-planned to as-built (manufactured) in order to understand if the estimates match, if sales is selling beyond their means, if contracts has the right dollar figures, in other words: is the business losing or gaining money?
11. HR needs to compare as-assigned, to as-planned, to as-built (actuals versus estimated) for hours, employees, and variations to know how to better bid next time, and to review employee proficiency.
12. Manufacturing needs to compare as-planned to as-built to as-delivered, because parts break, are replaced, plans don't often match actuals, and the whole point is to narrow the gap between planned and built so that next time it costs less money, is easier to build, and what is delivered is what was built (less parts replacement).

This is a huge example of a business process re-engineering effort I was involved in. We created taxonomies of master data starting with "as-contracted, as-financed, as-manufactured, as-built, as-planned, etc."; each of these had different systems identified where the data flowed, was constructed and was processed. One of the things the business discovered was that EACH silo of business was using its "own" version of contract number, which made it nearly impossible to trace the entire contract through the full cycle of the business. Our data warehouse integrated all the contract numbers, and showed where they changed, what systems changed them, and demonstrated the "consistent patterns of change" across the business keys.
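The kind of tracing described here can be sketched with a simple cross-reference. In the Python below, each entry ties one enterprise-level contract to the local number a given system uses for it, listed in process order. The cross-reference layout, the `ENT-1` key and the local numbers are all made up for illustration; this is not the actual warehouse design.

```python
def trace_contract(enterprise_key, crossref):
    """List (system, local contract number) pairs for one contract, in process order."""
    return [(system, local) for key, system, local in crossref if key == enterprise_key]


def points_of_change(trail):
    """Return the hand-offs at which the local contract number changed."""
    changes = []
    for (prev_system, prev_number), (system, number) in zip(trail, trail[1:]):
        if number != prev_number:
            changes.append((prev_system, system, prev_number, number))
    return changes


# Illustrative cross-reference only: (enterprise key, system, local contract number).
crossref = [
    ("ENT-1", "Sales", "S-778"),
    ("ENT-1", "Contracts", "S-778"),
    ("ENT-1", "Planning", "PL-0042"),
    ("ENT-1", "Finance", "PL-0042"),
]
trail = trace_contract("ENT-1", crossref)
changes = points_of_change(trail)
```

With this sample data the only change shows up at the hand-off from Contracts to Planning, which is the sort of pattern worth taking back to the business.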

Our taxonomy flowed from there - understanding the business processes (even across sectors, and across business units) is absolutely VITAL to creating taxonomies of Master Data. Discover the Business Keys, and the business key FLOW, and the taxonomy of Master Data will become clear quite quickly.

+ Trying to keep the picture simple enough, or at least organized enough, so everyone (top-to-bottom in the enterprise) can understand...and we can develop an effective information management strategy and set of approaches.

We ended up with a single PowerPoint presentation consisting of 10 slides for the executive staff and board of directors at this corporation (consisting at that time of 7 sectors, 23 companies and 150,000 employees) which showed just this one portion of business from top to bottom. Using the WBS, OBS, and BPS (business process flow), we showed the taxonomy of the business, and therefore the taxonomy of the Master Data origination points. We then organized the Master Data to "test" the hypothesis: that in fact, what were supposed to be the "master systems" actually were responsible for producing the master data.

In some cases we found that the planning system was actually producing new contract numbers when the business had told us "that is the contracts system's job, planning should never be producing new contract numbers." We found the data problem, were able to demonstrate when in the business process this was happening, and thus the business was able to fix the problem by issuing change requests to both systems - saving time and money in the long run.
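Testing that hypothesis boils down to asking, for each business key, which system recorded it first. Here is a minimal Python sketch, assuming key-creation events can be extracted as (timestamp, system, key) tuples; the event layout and sample values are hypothetical.

```python
def unexpected_originators(creation_events, master_system):
    """Map each key to the system that first recorded it, if that is not the master."""
    first_seen = {}
    for _timestamp, system, key in sorted(creation_events):
        # sorted() orders by timestamp first, so the first hit per key is the earliest.
        first_seen.setdefault(key, system)
    return {key: system for key, system in first_seen.items() if system != master_system}


# Illustrative events only: (ISO date, system, contract number).
events = [
    ("2006-01-10", "Contracts", "C-100"),
    ("2006-01-15", "Planning", "C-100"),
    ("2006-02-01", "Planning", "C-200"),
    ("2006-02-03", "Contracts", "C-200"),
]
violations = unexpected_originators(events, master_system="Contracts")
```

In this made-up sample, contract `C-200` first appears in Planning, which is exactly the kind of finding that would lead to a change request.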

Whew, sorry about the long explanation, but I felt it necessary. If you have thoughts, or comments - I'd love to hear from you. This process works at ALL levels of data integration for master data, even government.

More questions? Send them in...
Thanks,
Dan L

Posted by Dan Linstedt on April 7, 2006 5:11 AM

Comments

I just finished reading this and the other postings/comments and the one thing that strikes me is that we can talk the theories of master data/metadata all we want and probably can come to a consensus on what it -should- be. I think Dan and his peers have done a good job here. In reality, it's the business side that ultimately defines it (or how it's implemented) for each of us based upon their pain threshold as it relates to their pocketbook, willingness to take it on, and willingness to cooperate -with each other-.

Without an open, front-ended MDM solution (tool) overseen by a governance body, regulated by enterprise rules and *PROCESSES*, and operated by lines of business, IT is forced to do the conformance thing out of necessity the same way they've done it since the first two applications ever had to be integrated: code driven or data driven using the standard lookup tables. As usual it is up to IT to justify that there are better processes and tools today that are more cost effective, and why they (the business) should be in the driver's seat.

Another person made the comment that they've been doing this for decades under different titles. He's right. MDM is just another, logical phase in the evolution of application integration and data quality.

The latest tools (Kalido, Stratature, et al) are making it easier to convince the business that they can afford MDM, they should do it themselves, and for their own good. The benefits to the business are remarkable enough to just justify a centralized approach. Satisfying aspects of regulatory compliance and data warehousing are just icing.

I'm on the MDM bandwagon because I've got the scars from the lack of it. I'm a poster child. A poster child drowning in a sea of legacy approaches, hoping the business will see MDM as my life saver.

I have been asked to compile a list of vendors which have MDM products. We are currently looking at the Kalido suite of products. Can you send me a list of the vendors that have MDM products?

Did you ever publish the MDM Vendor ScoreCard you mentioned in April 2006? I can't find it.
