Context fascinates me, and I love a good challenge. Perhaps it's the fever I have tonight, or perhaps it's just a wandering mind. In this entry I'm going to explore a couple of perspectives I've been developing lately, along with a few theories I'm proving out. They have to do with (what else?) the obvious: metadata drives our data model structures, yet metadata is rarely synchronized with definitional context (derived automatically from unstructured data sources), and rarely visualized beyond the standard two-dimensional data models we are so used to seeing and working with.
This entry is a thought experiment that dives into a land of "what-if" analysis, and attaches it to what I call Dynamic Data Warehousing - which in turn leads to Dynamic Automated Architecture Manageability. The problem is: how can we build a consistent, standardized, and solid foundational data model that will adapt itself going forward as the business changes and its needs change? (All of this, of course, without losing sight of the history that has already been collected.) Impossible, you say? Not at all...
A few of my good friends (much brighter than I) discuss semantic notions of context on a continuous basis. You can read about some of these phenomenal thoughts here. One of them discusses notions of semantic neutrality, and the fact that examining semantic reconciliation is the first step to success.
It is important to address semantic reconciliation before other analytical processes (e.g., statistical analysis, market segmentation, link analysis, etc.). This is a "first things first" principle because semantic reconciliation makes secondary analytic and computational problems that much easier and that much more accurate.
All too many "semantic-driven engines" on the market today treat the data as the starting point, performing statistical analysis, market segmentation, and so on before ever addressing _any_ of the semantics of the data model (i.e., the naming conventions, prefixes, suffixes, abbreviations, definitions, correlations, and so on). This is fine if what you want is to build a "model that will hold your current, but not your future nor your past, data set."
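To make the semantic-reconciliation step concrete, here is a minimal sketch of normalizing column names before any analysis runs. The abbreviation dictionary and the prefix list are hypothetical examples of a shop's naming conventions, not a real standard:

```python
# Hypothetical naming conventions: abbreviation expansions and
# table/staging prefixes to strip during semantic reconciliation.
ABBREVIATIONS = {"cust": "customer", "addr": "address", "nbr": "number", "dt": "date"}
PREFIXES = ("tbl_", "stg_")

def reconcile(name: str) -> str:
    """Return a semantically normalized column name."""
    n = name.lower()
    for prefix in PREFIXES:
        if n.startswith(prefix):
            n = n[len(prefix):]
    # Expand each underscore-delimited token if it is a known abbreviation.
    parts = [ABBREVIATIONS.get(tok, tok) for tok in n.split("_")]
    return "_".join(parts)

print(reconcile("tbl_CUST_ADDR_NBR"))  # -> customer_address_number
```

Only after names have been reconciled this way does it make sense to ask whether two columns in different models are "the same" attribute.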
An engine that builds a data model "based on just the data set" is completely ignoring 80% of the problem. Any semantic engine that "scans the data" to build a model might build the "most correct model" for today's snapshot of data, but it fails to be dynamic, adaptable, or even consistent (in its structural layers) going forward.
The problem with basing metadata or newly constructed models on "data" is that this captures only the values of TODAY. If you're lucky enough to have history, it might also form a picture of yesterday; then again, the tool may make compromises based on outliers and strange patterns in the data (errors, in other words). Or worse yet, it may not be able to make heads or tails of the data at all, and reach no conclusion about what the model should be.
There is a solution: Data Model Architectural Consolidation based on the metadata, or the definitional elements held within. Coupling the existing models with semantic definitions and a few other elements not only increases the value of the metadata, but can soon provide enough context to decide what the model really should be. This is what "Model Driven Architecture" is truly about; MDA is NOT about "data driven architecture," as some vendors claim.
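As a toy illustration of consolidating models by their definitional elements rather than their data values, consider two model fragments whose attributes carry textual definitions. The attribute names and definitions below are invented for the sketch; a real engine would compare definitions semantically rather than by exact string match:

```python
# Two hypothetical model fragments; each attribute carries a definition.
model_a = {"cust_nm": "the legal name of the customer",
           "ord_dt":  "the date the order was placed"}
model_b = {"customer_name": "the legal name of the customer",
           "ship_dt":       "the date the order shipped"}

def consolidate(a: dict, b: dict) -> dict:
    """Merge attributes whose definitions match; keep the rest distinct."""
    merged, seen = {}, {}
    for name, definition in {**a, **b}.items():
        if definition in seen:
            # Same definition already in the consolidated model:
            # record this name as an alias instead of a new attribute.
            merged[seen[definition]]["aliases"].append(name)
        else:
            seen[definition] = name
            merged[name] = {"definition": definition, "aliases": []}
    return merged

result = consolidate(model_a, model_b)
print(result["cust_nm"]["aliases"])  # -> ['customer_name']
```

The point is that `cust_nm` and `customer_name` are unified by their shared definition, something no amount of scanning the data values alone could establish reliably.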
What I'm proposing here is that not only can an engine be developed to automate the consolidation of data models, but that it can in fact apply new changes to an existing consolidated data model based on semantic discovery and associability. This can lead to a common data model that would last for years within a business and provide a solid, repeatable foundation on which to build.
It is also reasonable to assume that in order to reach Dynamic Data Warehousing, one must be willing to accept that data model changes can be "automated" and applied dynamically to everything: queries, load routines, web-services interfaces, and yes (eventually) with security in place.
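The idea of changes propagating automatically can be sketched very simply: if the queries are *derived from* the model's metadata rather than hand-written, then changing the model regenerates them. The table and column names here are hypothetical:

```python
# A hypothetical model definition from which queries are generated.
model = {"table": "customer",
         "columns": ["customer_id", "customer_name", "customer_address"]}

def build_select(m: dict) -> str:
    """Derive a SELECT statement from the model's metadata."""
    return f"SELECT {', '.join(m['columns'])} FROM {m['table']}"

print(build_select(model))

# A change to the model is automatically reflected in every
# generated query downstream -- no hand-edited SQL to chase.
model["columns"].append("customer_email")
print(build_select(model))
```

This is a deliberately tiny example; the same generation principle would extend to load routines and web-service interfaces.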
Metadata is what ties all this together. When the business changes, the models change; when the models change, the context of the data (in most cases) changes as well. Data discovery to build a model is a crucial technology that can lead to better understanding, but it should *not* be the driving factor in building common data models for the EDW or common web services going forward. Using models to drive models, and then making statements through data discovery, is the proper way to forge ahead in a model-driven world.
All of this should change the way we view metadata. We should begin to realize just how important metadata and models are, and the impact that naming conventions can have on our business for years to come.
I'd love to hear your thoughts, whimsical or otherwise - again this is a thought experiment.
All the best,
Posted May 28, 2007 8:54 PM