Blog: Krish Krishnan

Where Is My Data? - Part 2

In my previous posting, we started looking at auditing the data in the data warehouse and some of the benefits of doing so. I am continuing on the same topic here because of its depth and breadth.

Every time anybody asks me why the data within the data warehouse needs to be audited, and whether it only solves the compliance need, my response is no: it goes beyond making your warehouse compliant. It gives you insight into the business rules applied to the data and how effective they are, and it helps establish rigor around how the ETL / ELT process is working, from both a performance perspective and a logic perspective. The data collected from this process, if reported on, tells business users about the data content they have in their hands, any differences in numbers between the source and the data warehouse, why data was rejected, and so on, all of which is invaluable.

But above and beyond all these tangible benefits, there are two key points that need to be understood. By implementing an auditing process you get profound insight into:

1. Data lineage - you get the ability to trace data from the data warehouse all the way back to the source systems.

Apart from the benefits mentioned above, you will now also have information on things like ETL performance, server utilization and network latency, which will start providing helpful insight into the overall solution architecture of the data loading process.

Let's assume that you do decide to embark on this process. If you are still building your data warehouse, it is another change request; but if you have already deployed your data warehouse, it is a different challenge. How to start, and how to go about making decisions, will be the topic of discussion for next time.
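
To make the idea of audit data a little more concrete, here is a minimal Python sketch of what a per-batch audit record might capture: source versus loaded row counts, reject reasons, and lineage tags on each loaded row. Everything in it is an assumption made for illustration, including the names `LoadAudit`, `load_batch` and `require_amount`, the `_batch_id` and `_source_system` columns, and the sample business rule. It is not taken from any particular ETL tool.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class LoadAudit:
    """Audit record for one ETL batch (all field names are hypothetical)."""
    batch_id: str
    source_system: str
    source_rows: int = 0
    loaded_rows: int = 0
    reject_reasons: Counter = field(default_factory=Counter)

    @property
    def rejected_rows(self) -> int:
        return sum(self.reject_reasons.values())

    @property
    def unaccounted_rows(self) -> int:
        # Rows read from the source that were neither loaded nor rejected.
        return self.source_rows - self.loaded_rows - self.rejected_rows


def load_batch(batch_id, source_system, rows, validate):
    """Load rows, tagging each one with its batch and source for lineage.

    `validate` returns None for a good row, or a reason string for a reject.
    """
    audit = LoadAudit(batch_id, source_system)
    warehouse_rows = []
    for row in rows:
        audit.source_rows += 1
        reason = validate(row)
        if reason is not None:
            audit.reject_reasons[reason] += 1
            continue
        # Lineage columns let a warehouse row be traced back to its source.
        warehouse_rows.append(
            {**row, "_batch_id": batch_id, "_source_system": source_system}
        )
        audit.loaded_rows += 1
    return warehouse_rows, audit


def require_amount(row):
    # Hypothetical business rule: every row needs a non-negative amount.
    if row.get("amount") is None:
        return "missing amount"
    if row["amount"] < 0:
        return "negative amount"
    return None


source = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": None},
    {"order_id": 3, "amount": -5.0},
    {"order_id": 4, "amount": 7.5},
]
loaded, audit = load_batch("batch-001", "orders_db", source, require_amount)
print(audit.source_rows, audit.loaded_rows, audit.rejected_rows)
print(dict(audit.reject_reasons))
```

Reporting on records like `audit` is what would give business users the source-versus-warehouse numbers and the rejection reasons, while the two lineage columns are one simple way to trace a warehouse row back to the batch and system it came from.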