
Blog: Dan E. Linstedt

Dan Linstedt

Bill Inmon has given me this wonderful opportunity to blog on his behalf. I like to cover everything from DW2.0 to integration to data modeling, including ETL/ELT, SOA, Master Data Management, unstructured data, DW, and BI. Currently I am working on ways to create dynamic data warehouses, push-button architectures, and automated generation of common data models. You can find me at Denver University, where I serve on an academic advisory board for Masters students in I.T. I can't wait to hear from you in the comments of my blog entries. Thank you, and all the best, Dan Linstedt http://www.COBICC.com, danL@danLinstedt.com

About the author

Cofounder of Genesee Academy, RapidACE, and BetterDataModel.com, Daniel Linstedt is an internationally known expert in data warehousing, business intelligence, analytics, very large data warehousing (VLDW), OLTP, and performance and tuning. He has been the lead technical architect on enterprise-wide data warehouse projects and refinements for many Fortune 500 companies. Linstedt is an instructor for The Data Warehousing Institute and a featured speaker at industry events. He is a Certified DW2.0 Architect. He has worked with companies including IBM, Informatica, Ipedo, X-Aware, Netezza, Microsoft, Oracle, Silver Creek Systems, and Teradata. He is trained in SEI/CMMI Level 5 and is the inventor of the Matrix Methodology and the Data Vault data modeling architecture. He has built expert training courses, trained hundreds of industry professionals, and is the voice of Bill Inmon's blog at http://www.b-eye-network.com/blogs/linstedt/.

May 2008 Archives

My book on the business of Data Vault modeling, approach, and architecture is finally available after 7 years. If you'd like to purchase it, you can grab a copy from LULU.com (here: http://www.lulu.com/content/1371769). If you'd like a signed copy, please contact me directly with all your information. Bill Inmon has kindly written the foreword for the book.

The Data Vault Model and approach to implementation is the next paradigm shift in accordance with DW2.0

It has been a long time coming, but we finally have all the right pieces in place. The Data Vault model and approach to implementing EDWs for real-time, unstructured data, operational data warehouses, and strategic data warehouses is critical to success moving forward. If you want to find out why (from a business perspective), the book is a great resource for guiding you through the mitigation strategies that exist today.

The Data Vault modeling principles are based on, and backed by, SEI/CMMI Level 5, Six Sigma, Lean initiatives, cycle-time reduction, and Business Activity Management components.

Let me know what you think of the book.

I've got new blog entries coming soon.

Dan Linstedt

Posted May 25, 2008 10:01 PM
Permalink | 1 Comment |

Before we get to Dynamic Data Warehousing, we first need to reach Operational Data Warehousing. Now, I realize that I'm not the first, nor will I be the last, to use or even possibly abuse this term. In fact, if you search on the term today, you'll get tons and tons of hits. I do, however, believe that Data Warehousing and BI as an industry have gotten slow, becoming somewhat of a laggard in terms of keeping up with technology. Just look at the adoption curve of DW2.0... It simply isn't there yet (wish it were). Anyhow, in this blog let's take another look at the ODW as Bill Inmon and I are beginning to discuss it.

First, I must say thank you to Bill for not only being a great friend, but a wonderful mentor to me. I must also say thank you to Claudia Imhoff and Colin White for writing about Operational BI lately, and of course to all my other friends out there who continually amaze me by answering my simplistic and absurd questions.

On that note, I've been pondering (and asking Bill for help with) Operational Data Warehousing. I've also been blogging on the subject lately, and as of last week I had the wonderful opportunity to share my questions with my good friend Jeff Jonas (see his blog here). More on that in my thought experiments section, where I'll also be blogging on Form versus Function and some new advances in computing sciences.

So, to the point: Operational Data Warehousing, as it were, requires:
* good form
* strong functionality
* streaming real-time data
* scalability, and flexibility

When I talk about real-time data, I'm not talking about "every 3 to 5 seconds, I get 500 transactions or so..." No, I'm talking about the warehouse receiving burst-rate mini-batches of 500 to 10,000 transactions across multiple feeds every 2 to 5 microseconds... In other words, AS the transaction is created and pushed across to other source systems, it is pushed directly into the warehouse on an operational basis.
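The push-as-it-arrives feed described above can be sketched as a producer/consumer pair, a minimal illustration assuming a hypothetical in-memory landing table and feed queue (the names `warehouse_rows`, `feed`, and `odw_loader` are mine, not from any product):

```python
import queue
import threading

# Hypothetical sketch: a feed handler that pushes burst-rate mini-batches
# into the warehouse the moment they arrive, instead of on a batch schedule.

warehouse_rows = []   # stand-in for the ODW landing structure
feed = queue.Queue()

def source_system(feed, batches):
    """Simulate a source pushing mini-batches of transactions."""
    for batch in batches:
        feed.put(batch)   # each batch: 500-10,000 transactions in the post's terms
    feed.put(None)        # sentinel: feed closed

def odw_loader(feed, warehouse_rows):
    """Load each mini-batch into the warehouse as it arrives."""
    while True:
        batch = feed.get()
        if batch is None:
            break
        warehouse_rows.extend(batch)

t = threading.Thread(target=odw_loader, args=(feed, warehouse_rows))
t.start()
source_system(feed, [[{"txn_id": i} for i in range(3)] for _ in range(2)])
t.join()
print(len(warehouse_rows))  # 6 transactions loaded as they arrived
```

The point of the sketch is only the shape: no staging window, no nightly batch, just a loader that drains the feed continuously.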

In some cases, the objects doing the data collection and generating the transactions do NOT keep a copy of the transaction. In these instances, it is important to realize that the real-time data fed to the ODW IS a system of record. Now, in a DW2.0-compliant architecture, we are housing SOR data and non-SOR data in the same structure, in the same place, at the same time. By non-SOR data I mean anything defined as "arriving from an operational source system which keeps transactional history." *** It has NOTHING to do with whether the data arrives in batch or non-batch ***
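One way to picture SOR and non-SOR rows living in the same structure is a per-row flag derived from the record source. This is a hypothetical sketch, assuming an illustrative set of source systems that keep their own transactional history (the names `classify_record`, `LINE_COLLECTOR`, and `ERP` are mine):

```python
# Hypothetical sketch: SOR and non-SOR rows co-located in one structure,
# distinguished by a flag derived from the record source, per the post's
# definition: non-SOR data arrives from an operational source system
# that keeps its own transactional history.

def classify_record(row, sources_keeping_history):
    """Tag a row as SOR when its source keeps no copy of the transaction."""
    row = dict(row)
    row["is_sor"] = row["record_source"] not in sources_keeping_history
    return row

sources_keeping_history = {"ERP"}   # assumed: the ERP retains its transactions
rows = [
    {"txn_id": 1, "record_source": "LINE_COLLECTOR"},  # collector keeps no copy
    {"txn_id": 2, "record_source": "ERP"},
]
tagged = [classify_record(r, sources_keeping_history) for r in rows]
print([(r["txn_id"], r["is_sor"]) for r in tagged])  # [(1, True), (2, False)]
```

Both rows land in the same structure; only the flag records whether the warehouse copy is the system of record.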

Ok, so there was a comment on my blog from a good friend, Walter Smetsers, requesting clarification of a statement about whether "we will continue to need operational systems" once we build an ODW... The answer today is: maybe. In the future, as new systems are built, convergence will take hold and the answer may become: no.

In an ODW, we have not only the capacity but also the capability to program the operational applications themselves directly on top of the ODW. However, in order to make this happen, we also need a Master Data layer inside the ODW, along with a Metadata layer and a Master Metadata layer. All of this MUST be coupled together and managed through an ontological function.

Ok - enough blathering, do we have one of these or don't we?
Yes, we've built one, but it doesn't YET have all the components it needs.

Where is it?
Unfortunately I can no longer share the customer information; however, it's a Data Vault modeling architecture called the Serialization Vault that we've built for the national E-Pedigree mandates from the FDA. I've heard that the WHO (World Health Organization), among other parties, is interested.

How does it work?
We accept data in real-time from collectors on the manufacturing lines into a central Data Vault modeled data warehouse. We keep the data itself separate, and through a logical model and metadata layer we can re-assemble the disparate data sets to provide the drug manufacturers with a complete picture.

We also have a layer of operational systems on top of the ODW, allowing data to be logically updated by the application. I say logically because, in keeping with DW2.0, the ODW stores the history of the original transaction and merely inserts new information rather than updating in place.

There are a few other customers who've had one of these in process for years. I'd be happy to put you in touch with them.

BACK to ODW...
Are vendors supporting these concepts today? Not directly. You can build one on any RDBMS or column-based database, as long as application programming logic can access the data underneath directly. Operational BI is coming to the forefront, and there are a number of young tech vendors coming to the table to meet the challenge, but it will be a while before the market space "grows up".

Does this mean my ODW is treated just like an operational system?
YES! It is an OPERATIONAL SYSTEM that has converged with a DATA WAREHOUSE, therefore it has the same requirements as an operational system and an EDW at the same time. The best type of data modeling for this is a normalized format, for scalability, flexibility, and auditability.
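To make the normalized format concrete, here is a minimal sketch of the hub/link/satellite separation that Data Vault modeling uses: hubs hold business keys, links hold relationships between hubs, and satellites (shown in the previous example) hold descriptive history. All table and key names here are illustrative assumptions, not the schema of any real system:

```python
# Hypothetical sketch of a normalized Data Vault-style structure:
# hubs store each business key exactly once; links store relationships
# between hub keys. Names (hub_product, PLANT-7, etc.) are invented.

hub_product = {}          # business_key -> surrogate hub id
hub_site = {}
link_produced_at = set()  # (product_hub_id, site_hub_id) relationships

def ensure_hub(hub, business_key):
    """Insert the business key on first sight; always return its hub id."""
    if business_key not in hub:
        hub[business_key] = len(hub) + 1
    return hub[business_key]

pid = ensure_hub(hub_product, "NDC-0001-1234")   # product business key
sid = ensure_hub(hub_site, "PLANT-7")            # manufacturing site key
link_produced_at.add((pid, sid))                 # relate the two hubs

print(sorted(link_produced_at))  # [(1, 1)]
```

Because keys, relationships, and descriptive history live in separate structures, each can grow and change independently, which is where the scalability, flexibility, and auditability claims come from.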

I'd love to hear your opinions, thoughts and questions.

Dan L

Posted May 4, 2008 9:30 PM
Permalink | 3 Comments |

I've just completed Bill Inmon's brand new course on Unstructured Data using his new unstructured data ETL tool. It's been very eye-opening. Every time I meet Bill I'm always learning something new. There was a discussion at the end of the class that asked the question: WHAT do you do if you FIND "structural definition elements" in unstructured data that AREN'T represented in the EDW?

So what is interesting here is the notion that mining unstructured data yields knowledge or metadata, by association, that provides definitional elements _about_ the information held within. What we need here is truly Dynamic Data Warehousing - which includes the ability to BRIDGE structures, along with creating new structures.

To get to the unstructured data mart, we should be going through the Dynamic Data Warehouse. Remember, I'm using the term Dynamic to include STRUCTURAL CHANGE, INDEX CHANGE, etc... So what that means is: the structure discovered through unstructured data surfing is built via neural networks, then subsequently graded on strength and confidence, and finally optimized and adjusted by the neural net as new unstructured data is fed in.
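The "graded on strength and confidence" step can be illustrated with a toy stand-in: score each candidate structural element by how consistently it appears across documents, and promote only the strong ones. This is a deliberately simple frequency score, not the neural network described above, and every name in it (`grade_candidates`, `lot_number`, the 0.5 cutoff) is an assumption of mine:

```python
from collections import Counter

# Hypothetical sketch: grade candidate structural elements discovered in
# unstructured data, then promote only those above a confidence cutoff.
# A simple document-frequency score stands in for the neural-net grading.

def grade_candidates(documents):
    """Score each candidate term by the fraction of documents it appears in."""
    counts = Counter()
    for doc in documents:
        counts.update(set(doc))       # count each term once per document
    n = len(documents)
    return {term: counts[term] / n for term in counts}

docs = [
    ["lot_number", "expiry_date", "free_text"],
    ["lot_number", "expiry_date"],
    ["lot_number", "note"],
]
scores = grade_candidates(docs)
promoted = sorted(t for t, s in scores.items() if s > 0.5)  # confidence cutoff
print(promoted)  # ['expiry_date', 'lot_number']
```

As new documents are fed in, re-scoring shifts the grades, which is the adjust-over-time behavior the dynamic warehouse needs before it commits a structural change.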

This is what the Dynamic Data Warehouse looks like to me. The follow-on is dynamic cubes, and once the unstructured data reaches the dynamic cubes, we (I'm sure) will be surprised at what we find...

Stay tuned for more info later.

Dan L

Posted May 2, 2008 1:42 PM
Permalink | No Comments |