Question: "If we adopt data virtualization, can we throw away the data warehouse, because we can access the data in the production databases straight on, right?"
Wrong! Data virtualization is not some data warehouse killer. In most projects where data virtualization is deployed, you will still need a data warehouse. In many systems, if no data warehouse is developed, it won't be possible to meet the information needs of many reports. Let me give the two key reasons:
- Most production systems do not contain historical data. They were not designed to keep track of historical data. If a value is changed, the old value is deleted. For reports that need to do trend analysis, those deleted values may be needed. Thus, those values have to be stored somewhere. And this is where the data warehouse comes in: data warehouses are needed to store historical data.
- Production systems may contain inconsistent data. One system may say that a customer is based in New York, while another system indicates that he is based in Boston. Inconsistencies can't always be resolved by software; sometimes human intervention is required to indicate what the correct value is. The result of that intervention must be stored somewhere, so that it can be reused. Again, that's where a data warehouse comes in.
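To make the first reason concrete, here is a minimal sketch in Python. All names in it (`production`, `warehouse`, `update_city`, `city_as_of`) are hypothetical and invented for this illustration; they do not come from any product. It assumes a production system that keeps only the current value, and a warehouse that keeps one row per version of that value with a validity period.

```python
from datetime import date

# Hypothetical production system: one current value per customer.
production = {"C1": "New York"}

# Hypothetical warehouse table: one row per version of the value.
# valid_to is None for the version that is currently in effect.
warehouse = [
    {"customer": "C1", "city": "New York",
     "valid_from": date(2011, 1, 1), "valid_to": None},
]


def update_city(customer, new_city, on):
    """Apply a change to both systems."""
    # Production overwrites: the old value is gone after this line.
    production[customer] = new_city
    # The warehouse closes the current version and appends a new one.
    for row in warehouse:
        if row["customer"] == customer and row["valid_to"] is None:
            row["valid_to"] = on
    warehouse.append({"customer": customer, "city": new_city,
                      "valid_from": on, "valid_to": None})


def city_as_of(customer, day):
    """Return the city that was valid for the customer on the given day."""
    for row in warehouse:
        if (row["customer"] == customer
                and row["valid_from"] <= day
                and (row["valid_to"] is None or day < row["valid_to"])):
            return row["city"]
    return None


update_city("C1", "Boston", date(2012, 6, 1))
```

After the update, `production` can only report Boston, whereas `city_as_of("C1", date(2012, 1, 1))` can still answer the historical question from the warehouse. A data virtualization server sitting directly on the production system would have nothing to query for that earlier date.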
It is worth mentioning that if a data warehouse system consists of a data warehouse and also deploys data virtualization, then (physical) data marts that contain data derived from the data warehouse may no longer be needed. Such data marts can be simulated by the data virtualization server. We usually refer to them as virtual data marts.
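The idea of a virtual data mart can be sketched with an ordinary database view. This is only an analogy: an in-memory SQLite database stands in for the data warehouse, and a view stands in for what a data virtualization server would define. The table and view names (`dw_sales`, `mart_sales_by_region`) and the sample rows are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# dw_sales plays the role of a data warehouse table (hypothetical data).
# mart_sales_by_region is the "virtual data mart": a definition only,
# with no copy of the data stored anywhere.
conn.executescript("""
CREATE TABLE dw_sales (sale_date TEXT, region TEXT, amount REAL);
INSERT INTO dw_sales VALUES
  ('2012-08-03', 'East', 100.0),
  ('2012-08-17', 'East', 50.0),
  ('2012-09-05', 'West', 75.0);
CREATE VIEW mart_sales_by_region AS
  SELECT region, SUM(amount) AS total_amount
  FROM dw_sales
  GROUP BY region;
""")

query = ("SELECT region, total_amount "
         "FROM mart_sales_by_region ORDER BY region")

# A report queries the virtual mart as if it were a physical table.
before = conn.execute(query).fetchall()

# New data arrives in the warehouse; no mart reload is needed.
conn.execute("INSERT INTO dw_sales VALUES ('2012-09-20', 'West', 25.0)")
after = conn.execute(query).fetchall()
```

The report sees the new sale on its next query because the mart is derived on demand from the warehouse, which is the property that makes a separate physical data mart unnecessary in this scenario. Real data virtualization servers add capabilities this sketch ignores, such as combining several sources and caching.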
So, introducing data virtualization in a data warehouse system does not imply throwing away the data warehouse. The data warehouse is still needed.
Note: For more information on data virtualization, I refer to my new book "Data Virtualization for Business Intelligence Systems" available from Amazon.
Posted September 27, 2012 12:04 PM