Recently I've been asked about Active Data Warehousing and (Real-Time) Right-Time Data Warehousing: what do these mean to the enterprise? In this short blog entry, I offer my opinion on the definition of each. In future entries I will define the basics of building one, the questions to ask, and the potential value to the enterprise. I lead an effort in Active Data Warehousing and Right-Time Data Warehousing for Myers-Holum, Inc. We have best practices surrounding these efforts, and will soon offer tips and tricks for free on our site.
Too often, we are confused by marketing literature and vendor hype. I'm going to set the line and offer my opinion in DEFINING just what an Active Data Warehouse is, and just what a Right-Time Data Warehouse is. Are they different? Yes. Why? We'll get to that in a minute. For now, this is the way I define each:
Active Data Warehousing (ADW)
The technical ability to capture transactions when they change and integrate them into the warehouse - along with maintaining batch or scheduled cycle refreshes.
Right-Time Data Warehouse (RTDW) (Not REAL-TIME)
The ability to answer a specific, justifiable business question at the time it is asked. In other words: a pre-designed business question that requires heaps of pre-integrated data (from the warehouse) in order to answer it. The answer to the question drives a competitive business decision, or a decision that already has a cost / benefit analysis tied to the answer (in other words: a quantifiable answer).
RTDW: If the business needs to answer a question at the end of the day, every day - then an RTDW would refresh on a daily basis. If the business needs to answer a question every 30 minutes, then an RTDW would refresh every 30 minutes - assuming the data is available.
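To make the refresh rule concrete, here is a minimal sketch in Python. The names `BusinessQuestion`, `refresh_interval`, and `meets_right_time` are hypothetical and only illustrate the idea; they do not come from any product. The sketch assumes the refresh cadence is driven by how often the question is asked, and is limited by how often the source data can actually be delivered.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class BusinessQuestion:
    """A pre-designed, justifiable business question (hypothetical model)."""
    name: str
    asked_every: timedelta           # how often the business needs the answer
    data_available_every: timedelta  # how often source data can be delivered


def refresh_interval(q: BusinessQuestion) -> timedelta:
    """Refresh as often as the question is asked, but no faster than data arrives."""
    return max(q.asked_every, q.data_available_every)


def meets_right_time(q: BusinessQuestion) -> bool:
    """True when the data arrives at least as often as the question is asked."""
    return q.data_available_every <= q.asked_every


daily = BusinessQuestion("end-of-day sales", timedelta(days=1), timedelta(hours=1))
half_hourly = BusinessQuestion("stock alerts", timedelta(minutes=30), timedelta(minutes=10))
starved = BusinessQuestion("fraud check", timedelta(minutes=30), timedelta(days=1))

for q in (daily, half_hourly, starved):
    print(q.name, refresh_interval(q), meets_right_time(q))
```

The third question shows the "assuming the data is available" caveat: the business asks every 30 minutes, but the data only arrives daily, so the warehouse cannot be right-time for that question.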
Is an RTDW an Active Data Warehouse?
In my opinion: not always. The way I see it: an ADW refreshes WHEN TRANSACTIONS CHANGE (it also runs pre-scheduled batch cycles). An RTDW is an ADW when the two are in sync - in other words, if transactions change every 10 minutes, and are captured and integrated into the warehouse as they change, and I have a business question that must be answered every 10 minutes, then I have an ADW and an RTDW.
Is an ADW also an RTDW?
Not necessarily, although 99% of the time, yes - it should be. Why? Because it costs a lot of money to build and feed an ADW. An ADW shouldn't be constructed without solid, quantifiable business questions, and once it has those questions, it meets the RTDW criteria.
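The two questions above can be restated as simple checks. This is only a sketch of my definitions, assuming a hypothetical `Warehouse` record; the field names are invented for illustration. An ADW is judged by HOW data gets in (captured as transactions change), an RTDW by WHETHER a justified business question can be answered on time.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class Warehouse:
    """Hypothetical description of a warehouse, for illustration only."""
    captures_changes_as_they_occur: bool     # transactions integrated when they change
    data_latency: timedelta                  # how stale the integrated data can be
    question_due_every: Optional[timedelta]  # None if no justified business question


def is_adw(w: Warehouse) -> bool:
    # Active: transactions are captured and integrated as they change.
    return w.captures_changes_as_they_occur


def is_rtdw(w: Warehouse) -> bool:
    # Right-Time: a justified question exists and the data is fresh enough for it.
    return w.question_due_every is not None and w.data_latency <= w.question_due_every


# Transactions captured every 10 minutes, question due every 10 minutes: both.
in_sync = Warehouse(True, timedelta(minutes=10), timedelta(minutes=10))
# Nightly batch only, question due daily: RTDW but not ADW.
nightly = Warehouse(False, timedelta(days=1), timedelta(days=1))
# Active feed with no justified question: ADW but not RTDW (the expensive 1%).
no_question = Warehouse(True, timedelta(minutes=10), None)

for w in (in_sync, nightly, no_question):
    print(is_adw(w), is_rtdw(w))
```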
So then, what is a Real-Time Data Warehouse?
I've blogged before on my interpretation of Real-Time data warehousing. I still maintain that it does not exist, and that timing just can't be fast enough (due to the laws of physics) to make Real-Time decisions. That, of course, is the technical definition.
Is the business definition of Real-Time different than Right-Time?
Yes, there is a distinction between the two. Real-Time is geared more towards what I've defined as ADW (here) than towards Right-Time (as defined here).
Going forward, please don't use these terms loosely. Develop best practices, metadata definitions, and terminology standards (business metadata). Too often our businesses are confused by all the vendor hype and marketing material that "throws a term in" just because it sounds cool.
I'd love to hear your thoughts on this topic, even if you disagree. How do you define Real-Time, Right-Time, and Active DW?
Posted January 30, 2006 12:47 PM