Blog: Krish Krishnan

Where Is My Data?

Every business user has asked this question of their IT counterpart at least once. IT users often feel frustrated that they are held responsible for every byte of data that has been processed and are compelled to provide an audit of what they received from sources against what was processed and loaded into the data warehouse or datamart. The question is not whether IT or business owns this process, but what the process itself is. If you have not thought of the answer by now, we are talking about the need to audit your data warehouse.

Auditing the data warehouse provides insight into all areas of the data warehouse, beginning with the source systems and culminating in the datamart or reports. Whether you run it as a large-scale enterprise initiative or a small group-level initiative, it will give business and IT users answers on how data moves through the system, where potential data flaws lie, and how business rules are executed. Other benefits include the ability to provide counts by each type of data and totals for sales data, and the ability to explain discrepancies and correct them at the source system.

To execute this type of process in your data warehouse, you need to enforce rules of engagement such as control files for source data, reference data for all the common entities across the data warehouse, and common data format definitions. A business sponsor with an IT enabler will be required to make this initiative a success, and they will need the backing and business case from a steering committee or a data warehouse governance body within the organization.

The success of this type of initiative can be measured on the following:

1. A data flow scorecard, which will provide a traceability matrix.
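To make the idea of control files and count/total checks concrete, here is a minimal sketch in Python. It assumes a source system sends a control record stating how many rows it sent and the sales total for the feed; the names `ControlRecord`, `audit_feed`, and the `daily_sales` feed are hypothetical and only illustrate the comparison, not any particular tool or the method the later posts will describe.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ControlRecord:
    """Hypothetical control-file entry the source system sends with each feed."""
    feed: str
    row_count: int
    sales_total: Decimal


def audit_feed(control, loaded_amounts):
    """Compare what the source says it sent with what was actually loaded."""
    loaded_count = len(loaded_amounts)
    loaded_total = sum(loaded_amounts, Decimal("0"))
    return {
        "feed": control.feed,
        "count_diff": loaded_count - control.row_count,
        "total_diff": loaded_total - control.sales_total,
        "balanced": loaded_count == control.row_count
        and loaded_total == control.sales_total,
    }


# Illustrative data: the source reports 3 rows, but only 2 reached the warehouse.
control = ControlRecord("daily_sales", 3, Decimal("600.00"))
loaded = [Decimal("100.00"), Decimal("250.00")]
result = audit_feed(control, loaded)
```

A non-zero `count_diff` or `total_diff` is the kind of discrepancy that would then be explained and, where appropriate, corrected at the source system. `Decimal` is used rather than floating point so that currency totals compare exactly.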
Overall, the process of auditing the data warehouse is a linchpin to the success of the data warehouse itself. It is interesting to note that in most cases this effort is overlooked due to initial cost concerns, and only when there is mayhem in the data warehouse is it undertaken to solve the issue. How we execute this process, when we start, and how we integrate are all aspects that will be covered in forthcoming blogs.