It seems people have taken the term "Dynamic Data Warehousing" and abused it. They've made it out to be about "Dynamic Data" and completely ignored "Dynamic Modeling", or dynamic restructuring as the case may be. Automorphic means self-changing, self-adapting. In this entry we'll talk about different capabilities of Dynamic Data Warehousing and the changes to data models as they grow.
First, let's define what we mean by Dynamic Data Warehousing:
My definition of DDW has come to mean:
* Data models that can adapt to the incoming data based on A.I. rule sets, learnt data patterns, linguistics, metadata, and associativity.
* Load patterns that are driven by the changes, to move data from "point a" (the source) directly to point b (the target, or the dynamic model)
* Indexes that shift, are created and dropped on the fly based on load patterns and query patterns
* Learning systems attached to the firm-ware of the device to watch, and learn about the metadata - tying the metadata together is an important step.
* Adaptable cubes - dynamic cubes or in-memory aggregations based on "temperature data" (hot, cold, warm). In other words, in-memory ROLAP solutions that are built based on metadata and cube structures, attribution of data sets, and queries or the questions being asked.
Now this may all sound really interesting, and extremely future based - but I can assure you - it's not. I am currently working on solutions in my lab which entail the execution of portions of these elements. The hardest one (you might think) is the dynamic modeling, or dynamic restructuring of the database... well, let me tell you - nope! That’s not the hardest piece (when the Data Vault modeling architecture is used)...
Keep in mind that the DDW is one or two steps beyond the Operational Data Warehouse, which I've just begun writing about. Also remember that the term: DDW retains all the responsibilities of the "Data Warehouse", as in: time-variant, non-volatile, granular, etc... That said, the question for Dynamic Data Modeling then becomes: how do you keep history on massive volumes of information without losing value, and without "reorganizing" or altering existing structures?
The answer is to come later, but with the Data Vault modeling methodology it CAN be done...
So how about 3rd normal form?
Sorry, it seems to be incapable of handling dynamic structure change. Why? The same reason that it fails as a Data Warehousing architecture in the first place. Parent-Child relationships embedded in the tables, and then: placing the structure over time. changing the structure of a 3NF DW is bad enough, let alone trying to alter it on-the-fly during loading and maintain existing history. This requires super-human strength, massive amounts of disk (to copy the elements), and sometimes changes the MEANING of the data when the structure changes.
"Danger Will Robinson!" (quote from a U.S. T.V. show ... lost in space, from the 1960's) http://en.wikipedia.org/wiki/Danger,_Will_Robinson
Ok, so what about Star Schema's?
Well, if you read through the definitions of Star Schemas AS ENTERPRISE DATA WAREHOUSES you quickly find that it's not the right fit, hence the new Generation 2, DW2.0(tm), and other new modeling concepts like the Data Vault.
Have you ever tried to change the structure of a conformed dimension? Does it indeed get harder as the system grows, and/or the more conformity it has? Does it slow down your development efforts?
Yes to all of these (at least from my personal experience). Does that make Star Schema bad? NO! Star Schemas are AWESOME, WONDERFUL, and the ONLY solution to work for OLAP, and Drill Down... Do they have a place in the DDW? YES! ABSOLUTELY! well then, where?
They have a place as adaptable cubes. Something funny happens when Star Schemas are used as SINGULAR STARS to LOGICALLY define VIRTUAL marts. They work extremely well, and as long as they are logical (not physically implemented), then dynamic memory cubes can become a reality. That's right! IN-MEMORY CUBING, it's happening already in certain DB engines, but it's not yet dynamic.
However, as a DDW foundational structure, we need something else. The Data Vault Model seems to be (today) the only other choice available that is actually capable of executing on this dream. We'll talk more about this in my Data Vault blog on http://www.BetterDataModel.com
Cheers for now,
Dan Linstedt
DanL@DanLinstedt.com
Posted April 26, 2008 7:56 AM
Permalink | 1 Comment |