Blog: Dan E. Linstedt

Dan Linstedt

Bill Inmon has given me this wonderful opportunity to blog on his behalf. I like to cover everything from DW2.0 to integration to data modeling, including ETL/ELT, SOA, Master Data Management, Unstructured Data, DW and BI. Currently I am working on ways to create dynamic data warehouses, push-button architectures, and automated generation of common data models. You can find me at Denver University where I participate on an academic advisory board for Masters Students in I.T. I can't wait to hear from you in the comments of my blog entries. Thank-you, and all the best; Dan Linstedt http://www.COBICC.com, danL@danLinstedt.com

About the author

Cofounder of Genesee Academy, RapidACE, and BetterDataModel.com, Daniel Linstedt is an internationally known expert in data warehousing, business intelligence, analytics, very large data warehousing (VLDW), OLTP, and performance and tuning. He has been the lead technical architect on enterprise-wide data warehouse projects and refinements for many Fortune 500 companies. Linstedt is an instructor of The Data Warehousing Institute and a featured speaker at industry events. He is a Certified DW2.0 Architect. He has worked with companies including IBM, Informatica, Ipedo, X-Aware, Netezza, Microsoft, Oracle, Silver Creek Systems, and Teradata. He is trained in SEI / CMMi Level 5, and is the inventor of The Matrix Methodology and the Data Vault data modeling architecture. He has built expert training courses, trained hundreds of industry professionals, and is the voice of Bill Inmon's blog on http://www.b-eye-network.com/blogs/linstedt/.

I've been writing (scantily) about Dynamic Data Warehousing (DDW) in the past. In this entry we will take a look at what the definition appears to be in the industry, and then I will offer my opinion on what I think the definition _should_ be. If vendors believe that they have a DDW, or a DDW solution, then I wholeheartedly invite them to contact the COBICC board members and give us all a demonstration, along with definitions of what they've produced.

Dynamic Data Warehousing: what does it mean to you?
Throughout the industry we've been getting up to speed on Active or Near-Real Time Warehousing lately, and recently we've also begun experimenting with getting to the next level: DW2.0 (which includes an ADW, structured and unstructured information, metadata, and so on). So what are researchers and folks in the industry saying DDW is?

The first link is to a student research project that presents its own view of DDW:
http://64.233.167.104/search?q=cache:KGeKi0tFzFQJ:www.dblab.ntua.gr/~dwq/p44.pdf+dynamic+data+warehouse&hl=en&ct=clnk&cd=1&gl=us

DWs are dynamic entities that evolve continuously over time. As time passes, new queries need to be answered by them. Some of the new queries can be answered by the views already materialized in the DW. Other new queries, in order to be answered by the DW, necessitate the materialization of new views. In any case, in order for a query to be answerable by the DW, there must exist a complete rewriting of it over the old and new materialized views.

Ok, this is an interesting look - but certainly not the complete picture of what I see DDW to be. To give them credit, they are attacking a difficult problem: how to answer a new query that doesn't have the appropriate data set available - by building new materialized views. The concept is decent, but the words "materialized view" make the approach locked in to Oracle, as other databases do not have the notion of a materialized view. They go on to discuss how to create new views that are needed, and they do a good job of expressing the mathematics behind the desire. Again, this is only one piece of Dynamic Data Warehousing.
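To make the "answer from an existing view, otherwise materialize a new one" idea concrete, here is a deliberately tiny Python sketch. It is not the method from the paper: a real complete rewriting reasons about joins, predicates, and aggregates, while this toy only checks column coverage over a single table. The `View` class, the `answer_or_materialize` function, and the table and column names are all hypothetical, made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class View:
    """Hypothetical stand-in for a materialized view over one table."""
    name: str
    table: str
    columns: frozenset


def answer_or_materialize(query_table, query_columns, views):
    """Return (view, created) for a query over a single table.

    Toy rule (an assumption, far weaker than a complete rewriting):
    a view can answer the query if it is defined over the same table
    and exposes every column the query needs.
    """
    needed = frozenset(query_columns)
    for view in views:
        if view.table == query_table and needed <= view.columns:
            return view, False  # an existing view already covers the query
    # No existing view covers the query, so record a new one for it.
    new_view = View(f"mv_{query_table}_{len(views) + 1}", query_table, needed)
    views.append(new_view)
    return new_view, True


views = [View("mv_sales_1", "sales", frozenset({"region", "amount"}))]

# Covered by the existing view, so nothing new is created.
first, created_first = answer_or_materialize("sales", ["region"], views)

# Needs a column no view exposes, so a new view is added to the list.
second, created_second = answer_or_materialize("sales", ["region", "product"], views)
```

Even in this reduced form the design question is visible: every new query either reuses what is already materialized or grows the set of views, which is exactly the piece of the problem the paper concentrates on.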

Here's another project:
http://xml.coverpages.org/xyleme.html
While they discuss some notions of dynamic data warehousing, they do not disclose all the pieces they will manage. They seem more interested in the fact that they can store vast quantities of XML, and they rely on the notion that an XML query can change as the XML document structure changes. True, but this still doesn't answer the questions about dynamic restructuring, dynamic indexing (changing indexes when a new one is needed), dynamic query building, dynamic security, and so on. I'll provide this list a little later. However, they are closer to a holistic solution than the first reference.

Here's another interesting look. They start out sounding very promising, but when it comes down to brass tacks they are merely discussing Dynamic View Generation. That is still a worthy cause, but not quite a DDW (as they originally claim).
http://davis.wpi.edu/dsrg/EVE/idm2002-eve.html

IBM has been at it a while, and in this definition they are describing (you guessed it) an appliance with bundled software. But in their press release, which I blogged on yesterday, they said DDW is not a tool, a product, or a service... yet again they contradict themselves. Besides that, what they really have is an ADW, not a DDW... read on: http://www.intelligententerprise.com/showArticle.jhtml;jsessionid=MHPVP5URAXGSAQSNDLRSKH0CJUNN2JVN?articleID=198000675

And another post by Doug Henschen agrees with me.

My friend Lou Agosta does a decent job of discussing some of the background pieces involved in Dynamic Data Warehousing. I think what's missing here is the definition of what Dynamic really means... Should Dynamic mean the data warehouse is dynamic with near-real time data? Should it mean it is dynamic with query changes? Dynamic with unstructured data? What does it mean?

Here's another vendor (Axiom Software Labs) that claims to have a DDW. They are probably closer to the mark, but again, all of these solutions say they have dynamic abilities, yet none of them talk about HOW these abilities work, nor do they disclose what a true DDW needs to be. Oh yes, a new acronym is emerging (unfortunately): DyDa - what?

Ok, here are my thoughts on what is required in order to be "the next level" or to be a DDW. We require that all of the following be recognized as dynamic:
* Structural changes to structured data sets are recognized and applied as they become available, on an automated back-room basis.
* Views are adapted as needed when structures change.
* Active and batch data loads occur on the same system at the same time.
* Procedural load routines are adapted to the structure changes when they occur.
* Data mining occurs to build new models against the data in a dynamic fashion.
* Architecture mining occurs to determine if the structural changes are attached in the right place.
* Unstructured data is attached and searched - all data which can be inserted into a structural matrix will be.
* BI reports and dashboards are dynamically altered to include the new elements.
* Web services are versioned and re-released to include the new elements.
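
As a rough illustration of the first two items in the list (recognizing a structural change, then adapting a view to it), here is a minimal Python sketch. It assumes a table's structure can be captured as a plain column-to-type mapping and that the target database accepts `CREATE OR REPLACE VIEW`. The function names `diff_schema` and `build_view_sql`, and the `customer` table, are hypothetical and not taken from any product mentioned in this post.

```python
def diff_schema(old, new):
    """Compare two {column: type} mappings and report what changed."""
    added = {c: t for c, t in new.items() if c not in old}
    removed = {c: t for c, t in old.items() if c not in new}
    retyped = {c: (old[c], new[c]) for c in old if c in new and old[c] != new[c]}
    return {"added": added, "removed": removed, "retyped": retyped}


def build_view_sql(view_name, table, schema):
    """Regenerate a pass-through view so it exposes every current column."""
    columns = ", ".join(sorted(schema))
    return f"CREATE OR REPLACE VIEW {view_name} AS SELECT {columns} FROM {table}"


# Yesterday's and today's structure for a made-up customer table.
old_schema = {"customer_id": "int", "name": "text"}
new_schema = {"customer_id": "int", "name": "text", "email": "text"}

change = diff_schema(old_schema, new_schema)
view_sql = build_view_sql("v_customer", "customer", new_schema)
```

Here `change["added"]` holds the new `email` column and `view_sql` is the regenerated view definition. The remaining items in the list (load routines, reports, web services) would need the same change record fed into their own generators, which is where most of the real work lies.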

And so on. Dynamic is a very versatile word, and DDW (in my mind) encompasses a whole lot more than just one piece of the pie (Dynamic Data or Dynamic Views). While these are noble efforts and steps in the right direction, they are _not_ qualified to be called a DDW environment, because they are only pieces of a larger puzzle.

I welcome your comments as always. Do you have a definition of DDW that you can share? What is it in your mind?

Thank-you,
Daniel Linstedt
http://www.COBICC.org/


Posted June 6, 2007 4:04 AM

5 Comments

So, I am trying to get my mind wrapped around this. I can see you are trying to get to something beyond ADW and real-time DW. However, I need one clarification (for today) - when you say "dynamically" (as in: "BI Reports and dashboards are dynamically altered to include the new elements" ), do you mean "automatically" (as in without human intervention)?

Thanks for the post.

Hi Kent,

Yes, what I am discussing is similar to temperature of data, but applied to the temperature of change to the structures. If the change appears to be measured as "harmless" then it may be colored green, and be applied automatically.

If the change is more than expected, but not a major shift (like new structures, new relationships, etc.) then it's colored yellow, and an email is sent stating that a change occurred.

If the change is quite abrupt, or "large" by some scale, it is colored red hot. The email goes out, and the change is put on hold until human intervention can figure it out.

Hope this helps,
Dan L

Yes that helps and clarifies it. So dynamic is an automatic reaction to a change. The type of reaction is determined by the type or scope of the change (and by defined business rules).
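
The green / yellow / red triage described in this exchange could be sketched in Python roughly as follows. The scoring rule and the threshold are pure assumptions chosen for illustration: the reply above does not say how "harmless", "more than expected", or "large" would be measured, and reading yellow as "apply the change and send an email" is one interpretation of it. `Color`, `classify`, and `react` are hypothetical names.

```python
from enum import Enum


class Color(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


def classify(added, removed, retyped, max_quiet_additions=2):
    """Color a structural change given counts of affected columns.

    Assumed rule: anything that removes or retypes a column is abrupt
    (red); a burst of additions is more than expected (yellow); a few
    additions are harmless (green).
    """
    if removed or retyped:
        return Color.RED
    if added > max_quiet_additions:
        return Color.YELLOW
    return Color.GREEN


def react(color):
    """Return the reaction for a color as (apply_change, send_email)."""
    reactions = {
        Color.GREEN: (True, False),   # apply automatically, stay quiet
        Color.YELLOW: (True, True),   # apply, and email that a change occurred
        Color.RED: (False, True),     # email, and hold for human review
    }
    return reactions[color]


apply_change, send_email = react(classify(added=1, removed=0, retyped=0))
```

In practice the threshold would come from the defined business rules mentioned above rather than from a hard-coded default.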

Dan,

I honestly believe that DDW is just a theory and will remain as theory for the next decade.

Take for example OLTP systems, which have been in the market for at least two decades and still are not dynamic. I'm not sure if there is an acronym such as dynamic OLTP databases or DTB (Dynamic Transaction Base), etc.

You just can't make these databases (OLTP, DW, ODS, etc.) dynamic. The case you described above with RED, YELLOW and GREEN sounds good in theory, but in practice all the scenarios will fall into the RED category (according to your definition), as impact analysis needs to be done in every case and requires a human eye to evaluate the results.

Enterprises are struggling today to get to a basic data warehouse. For DDW to become real and practical, the metadata management layer in a given enterprise has to be very strong, robust and proven. Most enterprises today are at level 2 or level 3 (10 being best) in managing their metadata.

So I would say DDW is a good academic subject to write articles upon, but its real-life applicability is in the distant future.

Chandra Kapireddy
Chief BI Architect, D2I3 Inc

Very interesting viewpoint. The same was said (that it was all hypothetical) about managing terabytes of data in a traditional RDBMS. The same was said about Strategic Analysis being theoretical when Bill first discussed the need for Data Warehousing. And the same was said about Artificial Intelligence, learning systems and neural networks. All of these things have come to pass, and in fact AI, DW, and neural networking are all a part of daily life today.

I think DDW is _NOT_ theoretical. I think there are parts of it in existence today; things are being done at the semantic level of integration that were previously unattainable. However, DDW is some years off (in reality). Like anything else, it too will come to pass, because there are those of us developing the parts and pieces of the architecture today.

Thank-you for your thoughtful comments. People said the same thing about quantum physics and nano technology...

Dan Linstedt
DanL@RapidACE.com
