Blog: Dan E. Linstedt

The new evolution of Data Modeling

Bill Inmon and I sat down the other day to discuss a system that we are building. We didn't have a good "name" for it, but what it amounts to is: Operational Data Warehousing. If you can believe it, what we've done is take the operational specifics of systems capturing data and place them on top of the Data Warehouse as a single integrated historical and operational data store. We are currently using the Data Vault model for this componentry. Some folks have called this "Active Data Warehousing" in the past, but we feel that this is one step beyond, in that it actually IS the operational store at the same time as being the Data Warehouse.

Convergence has arrived...

I've blogged about convergence in the past. It's no secret that the world is converging, and I.T. is no different. It is also no secret that EDW technology is converging with operational technology. Well, if we look behind us (hindsight is 20/20) we can see the divergence path of data warehousing and operational systems, and the re-convergence of these systems. Active Data Warehousing coupled with SOA, and real-time alerts coming back from the ADW, have begun to turn the tables. We have closed the gap on this one.

Using the principles of Data Vault modeling (http://www.DanLinstedt.com) we've constructed an Operational Data Warehouse (right now, Bill and I do not have a better term for this; Bill also thought that this is a new approach).

What does an Operational Data Warehouse do?

Another way to describe it is: as a data warehouse with operational (raw) data.

Why do it this way?

The structures of the Data Vault have been set up within the databases in such a way as to allow tremendous scalability and flexibility. We have physically partitioned the machines for both security and scalability.
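For readers who have not seen Data Vault structures before, here is a minimal sketch of the general pattern: hubs hold business keys, links hold relationships between hubs, and satellites hold descriptive attributes with a load timestamp so history is never overwritten. This is not the schema of the system described in this post. Every table name, column name, and function below (hub_customer, hub_order, link_customer_order, sat_order_status, build_vault, current_order_status, status_history) is a hypothetical chosen for illustration, and SQLite stands in for the real database only so the example is self-contained.

```python
import sqlite3

# Hypothetical Data Vault style schema: two hubs, one link, one satellite.
SCHEMA = """
CREATE TABLE hub_customer (
    customer_hk   INTEGER PRIMARY KEY,   -- surrogate key
    customer_bk   TEXT NOT NULL UNIQUE,  -- business key from the source system
    load_dts      TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE hub_order (
    order_hk      INTEGER PRIMARY KEY,
    order_bk      TEXT NOT NULL UNIQUE,
    load_dts      TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE link_customer_order (
    link_hk       INTEGER PRIMARY KEY,
    customer_hk   INTEGER NOT NULL REFERENCES hub_customer(customer_hk),
    order_hk      INTEGER NOT NULL REFERENCES hub_order(order_hk),
    load_dts      TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE sat_order_status (
    order_hk      INTEGER NOT NULL REFERENCES hub_order(order_hk),
    load_dts      TEXT NOT NULL,         -- ISO-8601 text, so it sorts by time
    status        TEXT NOT NULL,
    record_source TEXT NOT NULL,
    PRIMARY KEY (order_hk, load_dts)     -- one row per change: history is kept
);
"""


def build_vault(conn):
    """Create the hypothetical hub, link, and satellite tables."""
    conn.executescript(SCHEMA)


def current_order_status(conn, customer_bk):
    """Operational-style question: latest status of each order for a customer."""
    return conn.execute(
        """
        SELECT o.order_bk, s.status
        FROM hub_customer c
        JOIN link_customer_order l ON l.customer_hk = c.customer_hk
        JOIN hub_order o           ON o.order_hk = l.order_hk
        JOIN sat_order_status s    ON s.order_hk = o.order_hk
        WHERE c.customer_bk = ?
          AND s.load_dts = (SELECT MAX(load_dts)
                            FROM sat_order_status
                            WHERE order_hk = o.order_hk)
        ORDER BY o.order_bk
        """,
        (customer_bk,),
    ).fetchall()


def status_history(conn, order_bk):
    """Warehouse-style question: every status an order has ever had, in order."""
    rows = conn.execute(
        """
        SELECT s.status
        FROM hub_order o
        JOIN sat_order_status s ON s.order_hk = o.order_hk
        WHERE o.order_bk = ?
        ORDER BY s.load_dts
        """,
        (order_bk,),
    ).fetchall()
    return [status for (status,) in rows]
```

The point of the sketch is that the same tables answer both an operational question (what is the current state?) and a historical one (how did it get there?), which is the dual role described above.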
We can join 800M rows to 300M rows to 100M rows, and bring back 10k rows in under 10 seconds when we know what we're looking for. This setup is housed on SQL Server 2005 on Windows Server 2003 R2, 32-bit, with two dual-core CPUs at 2.8 GHz and 2 GB of RAM.

So what's this got to do with Operational Data Warehousing?

So what are the technical requirements?

And of course it MUST follow the DW2.0 requirements:

So what? How do I get there?

So you mean to say there "is no operational system"?

Next time we'll dive in a little deeper as to what it means to construct one of these, and how they work. You might already have one of these; if you do, I'd love to hear about it.

As always, thoughts, comments, and corrections are welcome.

Cheers,
Comments
Dan,
Nice - very nice. I so agree with the convergence scenario between the transactional and informational worlds. I see, however, two scenarios developing at the moment: the guys and gals who believe that integration should be done by application integration (the federated approach - a lot of the SOA guys are keen on this one) and the guys and gals who believe in data integration. Needless to say... I vote for the second one.
The first is the federated one. Data is stored where data is created. Federation software (EII) retrieves the data when asked.
To be frank, I don't like this one... In fact, it's intrinsically wrong. How about traceability? History? I think this ain't doable, especially if you hold it against the stated requirements you posted.
The second one is something you seem to have cracked. It is a (as I call it) data-centric scenario where transactional and informational are converging. The transactional (read: operational) system only stores the current transaction; once committed, it is fed directly into the data service/ODS/data warehouse/?? Voilà... your data warehouse is active (you can now use this data for almost any purpose - not just BI!!)
In this future (second scenario), operational OLTP databases supporting the operational system will die eventually. They will at least be very thin. They will focus on the services they need to provide - good thing I should say.
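A minimal sketch of this second scenario, under stated assumptions: the operational side holds only the in-flight transaction, and on commit the record is handed to an append-only warehouse that keeps every version. The class and method names (Warehouse, ThinOperationalStore, begin, commit, load, current, versions) are hypothetical, and in-memory Python structures stand in for real databases; this is not the implementation discussed in the post.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class Warehouse:
    """Append-only store: each committed version is kept, nothing is updated."""
    history: List[Tuple[str, datetime, Dict[str, Any]]] = field(default_factory=list)

    def load(self, key: str, committed_at: datetime, payload: Dict[str, Any]) -> None:
        # Store a copy so later changes to the caller's dict cannot alter history.
        self.history.append((key, committed_at, dict(payload)))

    def current(self, key: str) -> Optional[Dict[str, Any]]:
        """Operational view: the most recently committed version, if any."""
        versions = [(ts, p) for k, ts, p in self.history if k == key]
        return max(versions, key=lambda v: v[0])[1] if versions else None

    def versions(self, key: str) -> List[Dict[str, Any]]:
        """Historical view: every committed version, oldest first."""
        ordered = sorted(self.history, key=lambda row: row[1])
        return [p for k, _, p in ordered if k == key]


@dataclass
class ThinOperationalStore:
    """Holds only the current, uncommitted transaction for each key."""
    warehouse: Warehouse
    pending: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def begin(self, key: str, payload: Dict[str, Any]) -> None:
        self.pending[key] = dict(payload)

    def commit(self, key: str, now: Optional[datetime] = None) -> None:
        # On commit the record leaves the operational store and goes
        # straight into the warehouse; nothing lingers on the OLTP side.
        payload = self.pending.pop(key)
        self.warehouse.load(key, now or datetime.now(timezone.utc), payload)
```

After two commits for the same key, `current` returns only the latest version while `versions` still returns both, which is the traceability and history the federated approach is said to lack.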
This scenario opens up a lot of doors of opportunity now. I can't even begin to imagine the impact of such an architecture on an Enterprise Architecture.
I am of course very interested in the technical solution you chose and the model underneath it.
Great post!
Posted by: Ronald Damhof | February 25, 2008 12:42 PM
Dan (and Bill),
I must admit the concept looks appealing, but do I understand you both correctly if I say the following:
"We don't need an operational system any more with this concept. We only need an interface to a very small database, where the information is 'stored' until it is 'extracted' by the ODW?"
The closed loop that will originate from this approach is very nice. Data Warehouse supporting the Operational Interface, which delivers the information for this support...
But... doesn't this shift responsibility for data quality more and more towards the data warehouse? You might say data quality becomes more visible (and more quickly adjustable), but making mistakes is human. Using the operational interface means human intervention...
My feeling is that in the future it might be a nice opportunity for highly mature companies, but with companies on the lower part of the BI / DWH ladder, we might face problems using this ODW. In my opinion it asks for a very strong BI / DWH vision, which I miss at some customers (sorry for that).
Posted by: Walter Smetsers | March 10, 2008 6:20 AM