One task in virtually every BI project seems to cause the most surprises and to be the biggest schedule buster: estimating how long the ETL or data integration process will take. The reasons are numerous, but a lack of business understanding seems to lead the pack.
My friend, James Schardt, Chief Technologist for Advanced Concepts Center, sent me a suggestion for how he mitigates this serious problem in his DW projects.
From James Schardt:
"Here's the situation: A data model has been developed and the appropriate physical schemas for the DW have been designed. (This may or may not have been done using gathered, documented requirements for the DW.) The design team has some idea of which source system data records will be used to populate the DW. Now the design team must develop a set of transformations that properly clean, transform, and/or merge the data into the right location in the DW. The team discovers they simply do not understand the business well enough to correctly specify many of the more complex transformations. They make assumptions. They get it wrong. The users are not happy with what they get.
When I talk about gathering and documenting requirements for the DW, I mention the 'Reverse Engineering' technique. During the Requirements and Analysis phases of DW development, have a couple of developers reverse engineer the important source systems. The result is a conceptual-level ER or UML class diagram that forms the basis of bottom-up domain modeling. (This is then combined with top-down domain modeling to form the basis for subsequent data warehouse and ODS design.)
While the development team is reverse engineering source systems, they will uncover problems with the data and the structure of the data. When they combine multiple reverse-engineered models into an integrated bottom-up domain model, they will uncover many other problems with the operational data sources. But these problems are discovered during analysis, before design starts. The DW team has an opportunity to resolve these issues with users before they begin design work. Users have input into, and more buy-in to, what the DW will provide them. We get the users to specify their requirements for clean and integrated data using business rules that make sense to them.
By performing reverse engineering early in the process, the developers should be able to perform the ETL development without a lot of surprises."
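To make the idea a little more concrete, here is a minimal sketch of what an early reverse-engineering pass over a source system might look like. It is not James' method, just one possible illustration. It assumes a SQLite source (built in memory here), and the `customer` and `orders` tables, their deliberately dirty rows, and all function names are hypothetical. The script reads the table and foreign-key structure from the database catalog, which is the raw material for a bottom-up model, and then counts two common problems: null values and child rows that point at a parent that does not exist.

```python
import sqlite3


def build_sample_source(conn):
    """Create a tiny, hypothetical source system with deliberately dirty data."""
    conn.executescript("""
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT,
            region      TEXT
        );
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customer(customer_id),
            amount      REAL
        );
        INSERT INTO customer VALUES (1, 'Acme', 'East'), (2, 'Globex', NULL);
        -- Order 12 points at customer 99, which does not exist.
        INSERT INTO orders VALUES (10, 1, 250.0), (11, 2, NULL), (12, 99, 75.0);
    """)


def reverse_engineer(conn):
    """Read tables, columns, and declared foreign keys from the SQLite catalog."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    model = {}
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        # PRAGMA foreign_key_list rows: (id, seq, parent_table, from_col, to_col, ...)
        references = [(row[3], row[2], row[4])
                      for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')]
        model[table] = {"columns": columns, "references": references}
    return model


def profile(conn, model):
    """Return (table, column, problem, row_count) tuples for nulls and orphans."""
    findings = []
    for table, info in model.items():
        for col in info["columns"]:
            nulls = conn.execute(
                f'SELECT COUNT(*) FROM "{table}" WHERE "{col}" IS NULL'
            ).fetchone()[0]
            if nulls:
                findings.append((table, col, "null values", nulls))
        for col, parent, parent_col in info["references"]:
            # Child rows whose foreign key has no matching parent row.
            orphans = conn.execute(
                f'SELECT COUNT(*) FROM "{table}" AS c '
                f'LEFT JOIN "{parent}" AS p ON c."{col}" = p."{parent_col}" '
                f'WHERE c."{col}" IS NOT NULL AND p."{parent_col}" IS NULL'
            ).fetchone()[0]
            if orphans:
                findings.append(
                    (table, col, f"orphaned references to {parent}", orphans))
    return findings


conn = sqlite3.connect(":memory:")
build_sample_source(conn)
model = reverse_engineer(conn)
for table, column, problem, count in profile(conn, model):
    print(f"{table}.{column}: {problem} ({count} rows)")
```

A findings list like this is only a conversation starter. Whether a missing region or an order with no customer is an error, a legacy convention, or a legitimate business case is exactly the kind of question to put to the users before design work begins.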
Please add your comments on James' suggestion, or offer your own suggestions for taming this universal BI project schedule killer.
Yours in BI success!
Posted June 24, 2005 1:29 PM