Blog: Dan E. Linstedt

Dan Linstedt

Bill Inmon has given me this wonderful opportunity to blog on his behalf. I like to cover everything from DW2.0 to integration to data modeling, including ETL/ELT, SOA, Master Data Management, Unstructured Data, DW and BI. Currently I am working on ways to create dynamic data warehouses, push-button architectures, and automated generation of common data models. You can find me at Denver University, where I participate on an academic advisory board for master's students in I.T. I can't wait to hear from you in the comments of my blog entries. Thank you, and all the best, Dan Linstedt http://www.COBICC.com, danL@danLinstedt.com

About the author

Cofounder of Genesee Academy, RapidACE, and BetterDataModel.com, Daniel Linstedt is an internationally known expert in data warehousing, business intelligence, analytics, very large data warehousing (VLDW), OLTP, and performance and tuning. He has been the lead technical architect on enterprise-wide data warehouse projects and refinements for many Fortune 500 companies. Linstedt is an instructor of The Data Warehousing Institute and a featured speaker at industry events. He is a Certified DW2.0 Architect. He has worked with companies including IBM, Informatica, Ipedo, X-Aware, Netezza, Microsoft, Oracle, Silver Creek Systems, and Teradata. He is trained in SEI / CMMi Level 5 and is the inventor of The Matrix Methodology and the Data Vault data modeling architecture. He has built expert training courses, trained hundreds of industry professionals, and is the voice of Bill Inmon's blog on http://www.b-eye-network.com/blogs/linstedt/.

July 2006 Archives

My last entry introduced the concept of Master Metadata. Making it work is the hard part. One of the comments I received asked about handling business changes, politics, and a variety of other circumstances. In this entry we'll begin diving into the differences between technical and business metadata a little more, and discuss a small portion of the politics that surround them. I'll be presenting some of this information at the upcoming TDWI conference in August 2006 and at the Teradata Partners conference in November.

First, one must understand (and agree) that technical and business metadata are to be separated at a conceptual level. Then we can begin to implement projects that tackle finding and joining technical metadata. There are several repositories on the market today that are capable of housing technical metadata. My two personal favorites are MetaIntegration and SuperGlue (an Informatica product). There are also MetaStage (IBM/Ascential), Platinum (CA), and a few others that stretch the definition.

We've had quite a few successes in the industry over the years building technical metadata repository stores inside our 90-day prototypes and larger projects. It is important to remember that technical metadata projects must be scoped just as data warehousing projects are. Both need attainable goals, a project charter, buy-off, a statement of work, and so on. In fact, one of the best practices we employ is to treat the technical metadata project as a technical metadata data warehousing project. This works quite well: we capture metadata from different tools, stitch it together, and deliver it to business users from the BI side of the house.

Wrapping it in a DW project allows us to use our best practices to get the technical metadata in place and delivered via reporting tools as quickly and as accurately as possible. Standards, guidelines, and lessons learned are all part of the ongoing project. Our successful deployments include a large travel-based company, two different branches of the federal government, a manufacturing sector of a large government contractor, and a large cable television provider.

From a business perspective, managing the ongoing requirements is the same as on the EDW project or any other project: risk analysis, mitigation strategies, and best-practice policies. Technical metadata is the easy part because, just as in data warehousing, the data "arrives" from a source system in standard formats and can be retrieved and merged into a master metadata ontology. The one thing SARBOX (Sarbanes-Oxley) is missing is the ability to count technical metadata as auditable, so by putting both in a warehouse we also begin to make the technical metadata live up to the promises of compliance.
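As a rough illustration of what "retrieved and merged" can look like, here is a minimal Python sketch. It is not the approach used on the projects described here; the tool names, object names, and attributes are all hypothetical, and a real repository export would be far richer. The sketch keeps the originating tool next to every value so the merged record can still be audited.

```python
from collections import defaultdict


def merge_technical_metadata(*tool_exports):
    """Merge (tool_name, records) pairs into one master view keyed by object name.

    Each `records` dict maps an object name (e.g. a table) to a dict of
    attributes. Both the shape and the names are assumptions for illustration.
    """
    master = defaultdict(dict)
    for tool_name, records in tool_exports:
        for object_name, attributes in records.items():
            for attr, value in attributes.items():
                # Record which tool supplied each value so the result stays traceable.
                master[object_name].setdefault(attr, []).append((tool_name, value))
    return dict(master)


# Hypothetical exports from two different tools describing the same table.
etl_export = ("etl_tool", {"customer": {"row_count": 1200, "last_load": "2006-07-30"}})
db_export = ("database_catalog", {"customer": {"row_count": 1200, "indexes": ["pk_customer"]}})

master = merge_technical_metadata(etl_export, db_export)
print(master["customer"]["row_count"])
```

Where two tools report the same attribute, both values are kept side by side rather than one overwriting the other, which makes disagreements between tools visible in reporting.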

Business metadata is a completely different story. It is extremely complex, can reside in a hundred different sources, and lives in many different types of documents. Business metadata can (and must) be standardized against a master registry, something that defines the layers and trees of metadata within the business along with the dependency chains. We have set up such registries at two large branches of the federal government, and have other clients considering the same. Registries are just the first step: they define structure and dependency in a standard format, along with generic terminology (mostly data elements, business definitions, and how-used definitions) within an organization at multiple levels. This MUST be part of an MDM initiative. If it is not, MDM will lose its way, and its meaning, quickly after productionalization.
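To make the idea of a registry with dependency chains a little more concrete, here is a small Python sketch. It is only an illustration under my own assumptions: the `RegistryEntry` fields, the entry names, and the `dependency_chain` helper are hypothetical and do not describe the registries mentioned above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RegistryEntry:
    """One registry term: a business definition plus its place in the tree."""
    name: str
    definition: str
    how_used: str = ""
    parent: Optional[str] = None  # the layer or tree node this term sits under
    depends_on: List[str] = field(default_factory=list)


def dependency_chain(registry: Dict[str, RegistryEntry], name: str) -> List[str]:
    """Return every term that `name` depends on, directly or transitively."""
    chain: List[str] = []
    pending = list(registry[name].depends_on)
    while pending:
        current = pending.pop(0)
        if current == name or current in chain:
            continue  # skip repeats and guard against circular definitions
        chain.append(current)
        pending.extend(registry[current].depends_on)
    return chain


# Hypothetical entries for illustration only.
registry = {
    "customer": RegistryEntry("customer", "A party that buys from us", parent="sales"),
    "order": RegistryEntry("order", "A request to purchase", parent="sales",
                           depends_on=["customer"]),
    "revenue": RegistryEntry("revenue", "Value of fulfilled orders", parent="finance",
                             depends_on=["order"]),
}

print(dependency_chain(registry, "revenue"))
```

Walking the chain this way shows which definitions would need review if an upstream term such as "customer" changed.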

The second step, usually another phase of the project, is to take the registries and use EII tools to locate, scan, and feed in, from unstructured and semi-structured data sources, any metadata we cannot get from the technical side of the house. We use it to augment the registries (which are already housed within the metadata data warehouse). Again, best practices, detailed processes, risk assessments, mitigation strategies, and user communication are key to the success of such projects. The business metadata is then "standardized" according to today's rules and logged as a snapshot (a "this is what we found" kind of thing). Only EII tools seem to be able to gather this kind of information automatically (once set up).

Of course there are complex tools, like the Rational suite of products, that also handle business metadata, but they require a lot of investment in time and money to get them working to your enterprise's advantage. They pull requirements and business metadata from Word documents and so on. Today, however, we find it much more useful to implement EII tools that can grab both unstructured and semi-structured information and interrogate and interpret it for metadata context. The EII and data vendors we commonly work with, and that help make our job easier, include Ipedo, Informatica, Silver Creek Systems, and IBM.

Handling the politics requires governance policies (see my articles on governance here on B-Eye-Network). My firm has a proven track record of successful implementations of Master Metadata (including both technical and business metadata). Let us help you get your effort under way successfully.

Do you have an initiative under way? Do you have lessons learned you would like to share? Please comment on the blog and let's discuss it. Or send me your questions privately.

Thank you for your time,
Daniel Linstedt
Daniel.Linstedt@myersHolum.com


Posted July 31, 2006 7:18 AM

Master Metadata is the term I use these days to define enterprise-level metadata efforts. In fact, Master Metadata (MM) is just as important as Master Data Management (MDM). OK, maybe it should be called Master Metadata Management (MMDM). But really, just what is metadata? Data about data. Wow, that's cool. Well, not really; it's just the nature of the business. In this short entry we'll explore technical versus business metadata definitions.

Are there differences between the two? Not really; it's all metadata. So why separate them? We want to classify the different types of metadata so that we understand how to work with each, set it up, and govern it, and how tools should work with it.

Technical Metadata: data about the processes, the tool sets, the repositories, the physical layers of data under the covers. Data about run-times, performance averages, table structures, indexes, constraints; data about relationships, sources and targets, up-time, system failure ratios, system resource utilization ratios, performance numbers.

I think you get the picture... Technical metadata includes metrics and KPAs for IT: the business of doing IT work, and monitoring and managing IT and physical environments. Technical metadata is produced by tool sets, tool vendors, and their repositories. Technical metadata is, in essence, everything we deal with "under the covers" that hides complexity from the business users but makes the systems run.

Business Metadata: Data about the Technical Metadata, data layers that provide definition of the functionality, definition of the data, definition of the elements, and definition of how the data is used within the business. Business Metadata is definitional in nature. One might also say that Business Metadata includes business requirements, time-lines, business metrics, business process flows, and business terminology.

The point is, we need both to do business properly. Many vendors who offer metadata solutions do not properly differentiate between the two. Business metadata is often left behind: vendors tout that they have metadata in their solutions, but upon closer inspection it ends up being just technical metadata.
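One way to picture the distinction, and the gap that appears when only technical metadata is captured, is the small Python sketch below. It is purely illustrative: the `MetadataKind` and `MetadataItem` types, the helper function, and the sample catalog are hypothetical names of my own, not part of any vendor's product.

```python
from dataclasses import dataclass
from enum import Enum


class MetadataKind(Enum):
    TECHNICAL = "technical"  # structures, indexes, run-times, performance numbers
    BUSINESS = "business"    # definitions, requirements, how the data is used


@dataclass(frozen=True)
class MetadataItem:
    subject: str      # the thing being described, e.g. a table
    attribute: str
    value: str
    kind: MetadataKind


def subjects_missing_business_metadata(items):
    """Return subjects that have technical metadata but no business metadata."""
    technical = {i.subject for i in items if i.kind is MetadataKind.TECHNICAL}
    business = {i.subject for i in items if i.kind is MetadataKind.BUSINESS}
    return sorted(technical - business)


# Hypothetical catalog contents for illustration only.
catalog = [
    MetadataItem("customer", "index", "pk_customer", MetadataKind.TECHNICAL),
    MetadataItem("customer", "definition",
                 "A party that has placed at least one order", MetadataKind.BUSINESS),
    MetadataItem("orders", "avg_load_seconds", "42", MetadataKind.TECHNICAL),
]

gaps = subjects_missing_business_metadata(catalog)
print(gaps)
```

A check like this is one simple way to see whether a "metadata solution" is in practice holding only the technical side.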

There are ways to bridge the gap today, but they require standards, best practices, manual effort (to some degree), and sometimes multiple tool sets. Our firm has constructed best practices and a metadata tool vendor scorecard that looks at all these categories. Contact me for more information.

Cheers,
Dan L
daniel.Linstedt@MyersHolum.com


Posted July 24, 2006 3:56 AM