
Blog: Dan E. Linstedt


Why do business changes impact my EDW so much?

Well, if you're like most of the world, you have an EDW that is built from a Federated Star Schema... This could be part of the problem, but definitely not all of it. A large part of the problem is where the business rules sit when processing data going IN to the EDW. Regardless of architecture, if the business rules sit between the source systems and the EDW (rather than in the source systems themselves), then we (I.T.) are doing a great disservice to the business community, and to ourselves. If you're like me, and have been around the block more than once with a Data Warehouse, then you know that changes to the business usually don't come cheap within the EDW environment - and that usually translates into high dollars, high impact, and eventually high risk (resulting in a Super-Nova of Star Schemas as an EDW).

So what... I don't have this problem, my EDW is fast, nimble, and changes quickly...
Great, then this particular entry is not for you (though it may make for light, entertaining reading), but for those of you who have issues when the business changes, read on...

So much of what we do in I.T. in building an EDW is to "help the business", but what we have failed to realize is that the industry has for years been feeding us an interesting way of getting that done. We've heard time and time again that we must implement "business rules" on the way IN to our data warehouses; that if we deliver "bad" data, our warehouse efforts have failed; that we are to "fix data", fudge data, change data, or in other disguised terms: "quality check data, aggregate data, fill in blank values, compute data based on a loose set of business rules" or my personal favorite: "just get it to match the operational report..."

Wow, what if the operational report is wrong? They usually don't even consider this a possibility... Well, who went ahead and blessed the operational report as "king"? Has the operational report been audited? Can the operational report give us the same answer that it gave us 15 years ago? Has it changed? If it has changed, then how, when, and why? Usually because... the business rules changed.

The architectural image used today looks a little like this (again, regardless of the data model):
TraditionalBusnEDWProcess.jpg

Ok - so the interpretation of the data changed, which means that the operational report (if run against historical data) won't give us the same result that it gave us 15 years ago BASED ON THE ORIGINAL DATA. This is part of the point. When we (I.T.) take it upon ourselves to change data on the WAY IN to the EDW, we are subjecting the data in the EDW to the BUSINESS RULES OF TODAY... we're setting our EDW up for "failure" in no uncertain terms.
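
To make this concrete, here is a minimal Python sketch. The record layout, the function name load_with_rules, and both "rules" are hypothetical, invented purely for illustration. It shows that once a rule is applied on the way in, the stored value depends on whichever rule was in force at load time, and the original value is no longer available.

```python
# Hypothetical example: an order amount is "cleansed" on the way IN to the warehouse.
# The field names and both rules below are assumptions made for illustration.

def rule_2006(row):
    # Rule of the day: a missing amount is treated as zero.
    amount = row["amount"] if row["amount"] is not None else 0.0
    return {"order_id": row["order_id"], "amount": amount}

def rule_2007(row):
    # The business changed its mind: a missing amount becomes the list price.
    amount = row["amount"] if row["amount"] is not None else row["list_price"]
    return {"order_id": row["order_id"], "amount": amount}

def load_with_rules(source_rows, rule):
    # Only the transformed rows are kept; the raw values are not stored.
    return [rule(row) for row in source_rows]

source = [
    {"order_id": 1, "amount": 100.0, "list_price": 120.0},
    {"order_id": 2, "amount": None, "list_price": 80.0},
]

warehouse_then = load_with_rules(source, rule_2006)
warehouse_now = load_with_rules(source, rule_2007)

# Same source data, two different "truths", and neither warehouse
# can tell us that order 2 originally arrived with no amount at all.
total_then = sum(r["amount"] for r in warehouse_then)
total_now = sum(r["amount"] for r in warehouse_now)
print(total_then, total_now)
```

The two totals differ even though the source never changed, which is the kind of mismatch that makes a report from years ago impossible to reproduce.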

Our architectures, data models, load routines and queries all match "today's version of the truth", and the minute the business changes, or asks to see today's data in comparison with historical data, the warehouse and all it contains become invalid (to some degree), because the business now interprets data differently.

How on earth can we expect to audit such a system? It becomes increasingly difficult (even with audit trails of raw data sets) as the business continues to change, and we continue to implement those changes into our system GOING IN to our EDW.

How on earth do we mitigate this? This is a huge risk, and a huge concern, right?
Well, there is one way, but it requires a fundamental shift in thinking as to WHAT an EDW is supposed to be, and how to load it. The fundamental shift is: the EDW should be an INTEGRATED (by business key only), raw view of historical, unchanging data at the grain in which it arrived. The data itself should not change even when the business changes.

The other thing this means is: move the BUSINESS PROCESSING of that data downstream, on the way from the EDW into the Data Marts. This is the right place for these business functions. This not only makes I.T. more nimble (moving the business logic closer to the business) but also enables the EDW to stay consistent over time, regardless of any business changes - no matter how big or how small. An architecture for this might look as follows:

NewBusnEDWProcess.jpg
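
The same idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: raw_edw, load_raw, build_mart, the RULES table, and the two rules themselves are hypothetical names and are not part of any product or method described here. The EDW keeps each row exactly as it arrived, keyed by business key and load date, and the business rules are applied only when a data mart is built.

```python
from datetime import date

# Raw EDW: rows are stored as they arrived, integrated by business key only.
raw_edw = []

def load_raw(business_key, payload, load_date):
    # No business rules on the way in: the payload is copied unchanged.
    raw_edw.append({
        "business_key": business_key,
        "load_date": load_date,
        "payload": dict(payload),
    })

# Business rules live downstream, one entry per version (hypothetical rules).
def missing_as_zero(payload):
    return payload["amount"] if payload["amount"] is not None else 0.0

def missing_as_list_price(payload):
    return payload["amount"] if payload["amount"] is not None else payload["list_price"]

RULES = {"2006": missing_as_zero, "2007": missing_as_list_price}

def build_mart(rule_version):
    # The mart is derived from the raw rows using the chosen rule version.
    rule = RULES[rule_version]
    return [
        {"business_key": r["business_key"], "amount": rule(r["payload"])}
        for r in raw_edw
    ]

load_raw("ORDER-1", {"amount": 100.0, "list_price": 120.0}, date(2007, 9, 1))
load_raw("ORDER-2", {"amount": None, "list_price": 80.0}, date(2007, 9, 1))

# Either interpretation can be rebuilt at any time from the same raw data.
mart_2006 = build_mart("2006")
mart_2007 = build_mart("2007")
```

When the business changes a rule, only the rule table and the mart build change; the rows in raw_edw are never touched, so an older interpretation can still be reproduced on demand.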

I'm also suggesting that "exposing" the data flaws, and showing what the systems are producing "as-is", is just as important for fixing money loss at a business level as showing cleansed and quality-driven data. We (I.T.) need to put accountability for the data itself back into the hands of the business, so that they can align what the source systems are doing with what they envision the business actually doing.

Exposing the flaws in the data, and the gaps between what the business believes its processes are doing and what is actually happening, can lead to real changes in the way the business operates.
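
One way to picture "exposing" rather than fixing is a small gap report run against the raw data. The sketch below is a hedged illustration only: find_gaps, the expectation names, and the sample rows are all hypothetical. It lists which raw rows break which business expectation and leaves the rows themselves untouched.

```python
def find_gaps(raw_rows, expectations):
    # Report (order_id, expectation name) for every failed check.
    # The raw rows are only read, never modified.
    gaps = []
    for row in raw_rows:
        for name, check in expectations.items():
            if not check(row):
                gaps.append((row["order_id"], name))
    return gaps

# Hypothetical expectations the business believes the source system enforces.
expectations = {
    "amount_present": lambda r: r["amount"] is not None,
    "amount_non_negative": lambda r: r["amount"] is None or r["amount"] >= 0,
}

raw_rows = [
    {"order_id": 1, "amount": 100.0},
    {"order_id": 2, "amount": None},
    {"order_id": 3, "amount": -5.0},
]

gaps = find_gaps(raw_rows, expectations)
for order_id, name in gaps:
    print(f"order {order_id} fails {name}")
```

A report like this goes back to the business, who can then decide whether the source system or the expectation is what needs to change.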

Again, is there value in "cleansing, cleaning, changing, standardizing data"?
Absolutely yes, but not on the way IN to the EDW, only on the way OUT to the data marts. This type of thinking allows us to become more nimble, and responsive. It lessens the impact on the EDW systems we've built when the business wants to change, and it provides the business with a manner in which they can be auditable, compliant, and accountable for finding and fixing the operational gaps in their source systems.

I'd love to hear your thoughts, and as always, if you disagree, I'd love to find out why. There's a good chance I haven't looked at it in that light.

Thank you very much,
Daniel Linstedt

Posted by Dan Linstedt on September 1, 2007 4:10 PM
