Blog: Dan E. Linstedt

Dan Linstedt

Bill Inmon has given me this wonderful opportunity to blog on his behalf. I like to cover everything from DW2.0 to integration to data modeling, including ETL/ELT, SOA, Master Data Management, Unstructured Data, DW and BI. Currently I am working on ways to create dynamic data warehouses, push-button architectures, and automated generation of common data models. You can find me at Denver University where I participate on an academic advisory board for Masters Students in I.T. I can't wait to hear from you in the comments of my blog entries. Thank-you, and all the best; Dan Linstedt http://www.COBICC.com, danL@danLinstedt.com

About the author

Cofounder of Genesee Academy, RapidACE, and BetterDataModel.com, Daniel Linstedt is an internationally known expert in data warehousing, business intelligence, analytics, very large data warehousing (VLDW), OLTP and performance and tuning. He has been the lead technical architect on enterprise-wide data warehouse projects and refinements for many Fortune 500 companies. Linstedt is an instructor of The Data Warehousing Institute and a featured speaker at industry events. He is a Certified DW2.0 Architect. He has worked with companies including IBM, Informatica, Ipedo, X-Aware, Netezza, Microsoft, Oracle, Silver Creek Systems, and Teradata. He is trained in SEI / CMMi Level 5, and is the inventor of The Matrix Methodology and the Data Vault data modeling architecture. He has built expert training courses, trained hundreds of industry professionals, and is the voice of Bill Inmon's blog on http://www.b-eye-network.com/blogs/linstedt/.

I've blogged about this topic for many years now; my first mention of it was in my www.TDAN.com articles regarding the Data Vault Modeling architecture. Since then I've blogged on everything from autonomic data models to dynamic data warehousing, but in my research I've come to realize I've left out some very critical components. I've lately been experimenting with building a self-adapting structured data warehouse. There are many moving pieces and not all the experiments are finished, so I cannot write (yet) about any of the findings. But here, I'll expose some more of the underbelly, as it were, that is necessary to make Dynamic Data Warehousing (DDW) a reality (in my labs anyhow).

I've tried and tried to find a new name for this thing, but alas, it just seems to elude me. Dynamic Data Warehousing has a nice ring and is quite a nice fit. The term, however, evokes all kinds of different meanings for different companies and different people. So much so that I've had open discussions with IBM in the past about their use of the term! Oh well, water under the bridge.

But that brings me to my next point. There are missing components in my definition of DDW. I didn't get it all, and I'm sure that this is just another step in the definition (and that the definition will not be completed for another year or two). If I look back at what's going on, I see the following:

Convergence of:
* Operational Processing and Data Warehousing.
* Master Data and the Metadata needed to use the Master Data properly
* Tactical decisions backed by strategic result sets
* Business, Technical, Architectural, and Process Metadata
* Real-Time and Batch processing
* Standard reporting technologies and "Live animated scenarios" with walk-throughs and 3D imagery
* Human-machine interfaces
* MPP RDBMS systems and Column Based Database solutions

Why then shouldn't we see convergence of "data models" and "business processes"? Or of "data models" and "systems architecture"?

The point is: WE ARE (or at least I am). Not only is this happening in my labs, but it's being requested of me when I visit client sites. The customers want "1 solution", or better yet, they want a solution that "appears to learn" based on the demands put upon the system.

Why do I say "appears to learn"? Because machine learning and the appearance of a machine translating context are two totally different things. I cannot and will not claim to have made a machine think. However, I can and have made a machine's enterprise data warehouse responsive to external stimuli, at least when it comes to the data model, loading routines, and queries. Please do NOT mistake this for anything more than AI applied in a new manner: mining metadata (structure, queries, load code and web services) rather than just mining the data sets themselves. (More on that later, much later; I still have a LOT of research to do.)

Ok - so what's missing from the Dynamic Data Warehouse definition?
* Use of metadata: business, technical, and process during the model learning/adaptation phase
* Use of an ontology (part of business and technical metadata as described above)
* Use of a training model: all good neural nets need to be trained over time, and then corrected.
* Use of the queries to examine and compare HOW the data sets are being used and accessed against the current data model
* Use of a minimal load-code parser, again to assist in training the neural net to recognize the correct structure.
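
To make the "compare the queries against the current data model" item concrete, here is a minimal sketch in Python. It is not the lab implementation (none of that has been published); it is only an illustration under loud assumptions: queries are available as plain SQL strings, join predicates are written with full table names as `table.column = table.column` (no aliases), and the names `observed_joins`, `candidate_relationships` and `min_count` are hypothetical. It uses simple counting rather than a neural net.

```python
import re
from collections import Counter

# Assumption: join predicates look like table.column = table.column,
# written with real table names rather than aliases.
JOIN_PREDICATE = re.compile(
    r"([a-z_]\w*)\.([a-z_]\w*)\s*=\s*([a-z_]\w*)\.([a-z_]\w*)"
)

def observed_joins(queries):
    """Count how often each pair of table.column references is equated."""
    counts = Counter()
    for sql in queries:
        for t1, c1, t2, c2 in JOIN_PREDICATE.findall(sql.lower()):
            if t1 == t2:
                continue  # ignore predicates within a single table
            pair = tuple(sorted([f"{t1}.{c1}", f"{t2}.{c2}"]))
            counts[pair] += 1
    return counts

def candidate_relationships(queries, declared, min_count=2):
    """Return join pairs used at least min_count times that the
    current model does not declare as relationships."""
    declared_norm = {tuple(sorted(s.lower() for s in p)) for p in declared}
    return sorted(
        pair
        for pair, n in observed_joins(queries).items()
        if n >= min_count and pair not in declared_norm
    )
```

Given a query log in which `orders.id = shipments.order_id` appears repeatedly while the model only declares the orders-to-customers relationship, `candidate_relationships` would return the shipments pair as a relationship worth reviewing. A human (or a later training step) still decides whether it belongs in the model.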

Anyhow, you get the point. Dynamic Data Warehousing is about a back-office system that responds to changes in the structured data world: as the queries change, the indexes change. As the incoming data set changes, the model needs to change. Some queries (if consistent enough) can actually express new relationships that need to be built.
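
The "incoming data set changes, so the model changes" idea can be sketched in a few lines of Python. Again, this is an assumption-heavy illustration and not the approach used in the lab: it uses SQLite from the standard library as a stand-in warehouse, treats incoming records as plain dictionaries, only handles newly appearing fields, and the names `current_columns`, `proposed_changes` and `SQL_TYPES` are hypothetical.

```python
import sqlite3

# Assumption: a crude mapping from Python value types to SQL column types;
# anything unrecognized falls back to TEXT.
SQL_TYPES = {int: "INTEGER", float: "REAL", str: "TEXT"}

def current_columns(conn, table):
    """Read the table's column names from the database catalog."""
    # PRAGMA table_info returns one row per column; the name is at index 1.
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}

def proposed_changes(conn, table, records):
    """Return ALTER TABLE statements for fields that appear in the
    incoming records but not in the current table structure."""
    known = current_columns(conn, table)
    statements = []
    for record in records:
        for field, value in record.items():
            if field not in known:
                sql_type = SQL_TYPES.get(type(value), "TEXT")
                # Identifiers are interpolated directly, so this sketch is
                # only safe for trusted table and field names.
                statements.append(
                    f"ALTER TABLE {table} ADD COLUMN {field} {sql_type}"
                )
                known.add(field)
    return statements
```

With a `customer (id, name)` table and an incoming batch that suddenly carries `loyalty_tier` and `lifetime_value`, `proposed_changes` would return two `ALTER TABLE ... ADD COLUMN` statements. Returning the statements instead of executing them keeps a review step in the loop, which seems prudent until the automatic path has been proven.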

This is an adaptable system. This is a dynamic system. This will eventually become a true Dynamic Data Warehouse.

Thoughts?
Dan Linstedt
DanL@DanLinstedt.com


Posted September 21, 2008 9:52 PM

3 Comments

It's an appealing architectural style (I am also a bit unsure where to position it). I just wonder whether, in massive multi-terabyte EDW environments, the technical infrastructure can handle these changes automatically and still perform at a satisfactory level.

To be frank: at the architectural and design level I can totally commit to this concept of DDW. But at the technical level I just don't think we are ready.

What do you think?

I have heard about the Kalido DIW, which claims to be a dynamic data warehouse that is able to adapt to changes in business requirements. Does this fit the DDW terminology described above?

We are not ready for DDW at the technical level; this is why it's still in the labs on my workbench.

Kalido (as with IBM and other vendors) CLAIM DDW, but they DO NOT have a DDW. Kalido CAN adapt quickly to change, but it requires human intervention.

They do not automatically alter structure, queries and load routines by "recognizing" element changes, nor do they automatically adapt the changes based on utilization.

Kalido's structure under the covers consists of one to three tables, each of which is a KEY=VALUE pair table. More on this later.

Dan L
