Master Data Management: Part 3 of a Continuing Series on Technologies Changing Our View of the Data Warehouse
Originally published November 9, 2011
If you thought you were already managing big data with the data warehouse, Hadoop may have ruined your day. If you also believe the data warehouse is the place where master data finally “comes together” in an organization, master data management (MDM) could ruin your day again.
The use of enterprise master data is not limited to the post-operational, analytical environment of the data warehouse. Master data is a corporate resource with vast potential in all types of operational systems, including enterprise resource planning (ERP) systems, sales force automation systems, customer relationship management (CRM) systems and other transaction systems. As such, master data needs to be created and used as early as possible. Waiting for master records to form in a batch data warehouse creates unnecessary delay. In addition, the warehouse tends to be ill-suited to real-time data distribution. It’s a great final destination for data, or the lock-and-load point for post-operational analytical systems, but it’s best as a one-way street.
Due to these timing and distribution issues, companies began building hybrid constructs in the mid-1990s. The vendor community soon followed, and we now have the emerging field of master data management.
Some view master data management as the process for building the master data, which includes enormously valuable hierarchy management and governance capabilities. These capabilities compete less directly with the data warehouse, which typically sources and cleanses data from its “source systems.” The functions of enterprise modeling, data integration and data quality are transferable to the operational environment in the form of master data management. Data integration with master data management is typically done using a “publish and subscribe” approach on an enterprise service bus.
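To make the “publish and subscribe” idea concrete, here is a minimal sketch in Python. It is an in-process toy, not an actual enterprise service bus product, and every name in it (MasterDataBus, the "customer" subject area, the sample record) is a hypothetical chosen for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class MasterDataBus:
    """Toy in-process stand-in for an enterprise service bus (hypothetical)."""

    def __init__(self) -> None:
        # subject area name -> handlers registered by subscribing systems
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, subject_area: str, handler: Handler) -> None:
        """Register a system's handler for one subject area."""
        self._subscribers[subject_area].append(handler)

    def publish(self, subject_area: str, record: dict) -> int:
        """Deliver a master record to every subscriber; return the delivery count."""
        handlers = self._subscribers.get(subject_area, [])
        for handler in handlers:
            # Each subscriber gets its own copy so one cannot alter another's view.
            handler(dict(record))
        return len(handlers)


bus = MasterDataBus()
crm_customers: List[dict] = []        # stands in for a CRM system
warehouse_customers: List[dict] = []  # stands in for the data warehouse
bus.subscribe("customer", crm_customers.append)
bus.subscribe("customer", warehouse_customers.append)

delivered = bus.publish("customer", {"customer_id": "C-1001", "name": "Acme Corp"})
```

The publisher never needs to know which systems consume the record; adding another subscriber is a single subscribe call.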
Master data management is built around business subject areas, or data domains. These are business-oriented subjects, unrelated to their originating source, an analogous concept to the “subject orientation” of the data warehouse. Examples of subject areas are customer, product, parts, vendor and partner. MDM is extremely beneficial for:
Simpler subject areas can be handled on an application-by-application basis and perhaps don’t need an MDM approach.
For an MDM subject area, the data model should be built to encompass enterprise needs. It should not be completely beholden to any single usage or application. To achieve this goal, it needs to be a superset of corporate attributes, modeled at a low level of granularity and supportive of the governance process, which may need to produce interim versions of the master record. MDM data should also be built for ease of use and with an application programming interface (API) mentality. Each publisher or subscriber will have to be mapped to the existing data in MDM, but from that point forward, the integration of the data should be straightforward and should use the enterprise service bus.
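The one-time mapping of a publisher to the master model might look something like the following Python sketch. The attribute names, the two publishers and their field names are all invented for illustration; a real MDM product would supply its own model and mapping tools.

```python
from typing import Dict

# Hypothetical master attributes for the customer subject area:
# a superset of what any single application needs.
MASTER_CUSTOMER_ATTRIBUTES = {"customer_id", "legal_name", "country", "email"}

# Hypothetical per-publisher mappings: source field name -> master attribute.
PUBLISHER_MAPPINGS: Dict[str, Dict[str, str]] = {
    "erp": {"CUST_NO": "customer_id", "NAME1": "legal_name", "LAND": "country"},
    "crm": {"id": "customer_id", "company": "legal_name", "contact_email": "email"},
}


def validate_mappings() -> None:
    """Fail fast if a mapping targets an attribute the master model lacks."""
    for publisher, mapping in PUBLISHER_MAPPINGS.items():
        unknown = set(mapping.values()) - MASTER_CUSTOMER_ATTRIBUTES
        if unknown:
            raise ValueError(f"{publisher} maps to unknown attributes: {sorted(unknown)}")


def to_master(publisher: str, source_record: dict) -> dict:
    """Translate one publisher record into master attribute names.

    Source fields with no mapping are dropped rather than guessed at.
    """
    mapping = PUBLISHER_MAPPINGS[publisher]
    return {
        mapping[field]: value
        for field, value in source_record.items()
        if field in mapping
    }


validate_mappings()
erp_view = to_master(
    "erp", {"CUST_NO": "C-1001", "NAME1": "Acme Corp", "LAND": "US", "TERMS": "NET30"}
)
crm_view = to_master("crm", {"id": "C-1001", "contact_email": "ap@acme.example"})
```

Neither system supplies every attribute, which is why the master model has to be a superset rather than a copy of any one application's schema.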
Many MDM solutions come with a prepackaged data model, with extensibility ranging from none to copious. Although data warehousing teams are accustomed to developing the enterprise model themselves, not using an MDM solution’s prepackaged model would be too disruptive to that solution. Extensibility is therefore important. Adopting a prepackaged MDM data model requires serious “buy in” to the model, or at least to its basics.
The MDM hub, where the enterprise master data is stored and updated on a real-time basis, also serves whatever direct query needs exist for master data. Most modern business intelligence tools can be pointed at the master data. Many of these queries target the analytical data created by master data management. Analytic fields are created in master data management in real time, either by absorbing analytical data from another originating system, such as a CRM system, or by pulling intelligence out of transactional data and keeping the analytics refreshed. Regardless of how the analytic MDM data is created, it becomes part of the master data made available to all enterprise systems for consumption.
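As a rough illustration of the second route, pulling intelligence out of transactional data, the Python sketch below keeps two analytic fields on a customer master record refreshed as each order arrives. The record layout, field names and order amounts are assumptions made for the example, not features of any particular MDM hub.

```python
from dataclasses import dataclass


@dataclass
class CustomerMaster:
    """Hypothetical master record carrying analytic fields alongside identity data."""
    customer_id: str
    name: str
    order_count: int = 0          # analytic field, refreshed per transaction
    lifetime_value: float = 0.0   # analytic field, refreshed per transaction

    @property
    def average_order_value(self) -> float:
        # Derived on read so it can never drift from the two stored fields.
        return self.lifetime_value / self.order_count if self.order_count else 0.0


def apply_order(master: CustomerMaster, amount: float) -> CustomerMaster:
    """Refresh the analytic fields from a single order transaction."""
    if amount < 0:
        raise ValueError("order amount must be non-negative")
    master.order_count += 1
    master.lifetime_value += amount
    return master


acme = CustomerMaster(customer_id="C-1001", name="Acme Corp")
for order_amount in (100.0, 50.0):
    apply_order(acme, order_amount)
```

Once refreshed, these fields travel with the rest of the master record to every subscribing system, the data warehouse included.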
One of those systems is the data warehouse. Extensive, historical and transactional reporting will continue to come from the data warehouse. Hence, the data warehouse becomes an important subscriber to master data management, whether the integration is real-time or not.
This “build once, use many times” feature of master data management is a large part of the allure. If each department is left to develop its own master data, that is not only redundant effort, but very likely to create multiple sources of the truth. Harmonizing multiple operational systems to a single enterprise view of data is valuable. Master data management also de-risks and de-scopes data-using projects by providing data as a service.
Though it gets lumped in with ERP systems more often than data warehousing systems, master data management is clearly a separate project from an individual ERP project. Although there are integration points with every publishing and subscribing system, master data management merits separate funding, governance and – if you’re doing Scrum – separate Scrum masters, project backlogs and sprint schedules.
I’ve reviewed several ways in which master data management is claiming a function once wholly reserved for the data warehouse. Analytics are real-time. A real-time data warehouse approaches some of the functionality of master data management, but it lacks master data management’s ability to govern data into a successful “golden record” and to deliver that record in real time across the enterprise.
Recent articles by William McKnight
Copyright 2004 — 2020. Powell Media, LLC. All rights reserved.
BeyeNETWORK™ is a trademark of Powell Media, LLC