
Blog: Dan E. Linstedt


What is the TRUTH anyway?

We've all heard of "Single Version of the Truth", and we've all seen shows, presentations, and even read a paper or two on this topic. In fact, searching Google for this phrase returns 36,900 hits! But this raises the question: what the heck is "TRUTH"? How can you hold "TRUTH" accountable? Is a "Single Version of the Truth" compliant?

I stand here sure as the sun will rise today to tell you that I believe TRUTH is subjective in nature. Of course I think it also depends on how you define "Truth" in your enterprise integration efforts.

My friend and mentor Bill Inmon wrote about this here. I would tend to say, after reading and re-reading this article, that SVT (single version of the truth) is in fact a GOAL, nothing more. I would also tend to say that truth itself is in the eye of the beholder (or money-holder as the case may be), because as we all know - one person's truth is not necessarily coherent with another’s.

If all truth were equal we would not have discussions about metadata or common meta definitions. Nor would we fight over what the master system is for customer information, nor would we re-state or alter data in the warehouse when one major money-holder leaves and another takes their place.

The operational systems' data tells one story (which for the most part matches the business expectations), while the metadata and business requirements usually tell a different story. The "gold", or real profitability, within the organization lies in resolving the discrepancies: find and fix the business issues causing them. This will also put you on the road to full compliance, auditability, and end-user accountability.

I would tend to say that SVT is a wonderful goal - but ONLY achievable within a specific point in time as viewed and presented to the current money holder who agrees with and accepts the definitions set before them.

What then really makes a truly robust, scalable integration platform, be it for metadata, data, or business rules?
The real answer can only be found by a close approximation. For lack of a better term, I call it a "single integrated statement of data facts" (SISDF). It is certainly not as clean and "saleable" as SVT, but it works nonetheless. To define it, let me clarify for a moment the differences between SVT and SISDF.

SVT says: we have truth when data is munged and agrees with whatever the current business rules state it should match.

SISDF says: we have truth when data is integrated by common semantically defined sets of business keys, and is the raw data that arrived in accordance with the source system feed - both on the same level of detail, and with exacting traceability.

SVT goes on to say: cleansed, merged, mixed and matched data should be put in the integration store.

SISDF takes a different approach: data will be loaded to the integration store AS IT STANDS, a 1-for-1 match with what arrived on the source feeds.
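To make the "as it stands" idea a little more concrete, here is a minimal sketch in Python. It is only an illustration, not a prescribed implementation: the RawRecord class, the load_as_is function, the feed names, and the sample rows are all hypothetical names I'm making up for the example. The point is simply that every row is appended untouched, tied to a business key, and stamped with where and when it arrived.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RawRecord:
    business_key: str    # key shared across systems, e.g. a customer number
    record_source: str   # which source feed delivered the row
    load_time: datetime  # when the row arrived in the integration store
    payload: dict        # the source row exactly as it arrived


def load_as_is(store, feed_name, rows, key_field):
    """Append every source row unchanged: no cleansing, merging, or overwriting."""
    now = datetime.now(timezone.utc)
    for row in rows:
        store.append(RawRecord(str(row[key_field]), feed_name, now, dict(row)))
    return store


# Two feeds disagree about the same customer; both versions are kept.
store = []
load_as_is(store, "crm", [{"cust_no": "C-17", "name": "ACME corp "}], "cust_no")
load_as_is(store, "billing", [{"cust_no": "C-17", "name": "Acme Corporation"}], "cust_no")
```

Both rows survive side by side under the same business key, trailing space and all, so the disagreement between the feeds stays visible and traceable instead of being merged away on the way in.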

This pushes the notion of SVT downstream from the SISDF integration store - and puts the burden of proof (to be labeled an SVT) in the delivery mechanisms - which are usually data marts. In other words: Sales and Finance don't agree with each other on what "revenue" means, so one data mart has a sales SVT, and the other data mart contains a finance SVT.

Here we have a case where both SVTs are correct at the same time, but depending on who's using the data, either could be wrong - meaning that "truth" falls apart because of interpretation. Typically though, this case is overcome through hard work, data stewards, metadata management, and SLAs with the end-user base to agree on two definitions: SALES REVENUE and FINANCE REVENUE, each calculated differently.
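A rough sketch of what two definitions over one set of raw facts might look like follows. The two revenue rules here are invented purely for illustration (I'm assuming Sales counts what was booked and Finance counts invoiced minus refunded); your own stewards would define the real ones. The order rows and field names are hypothetical too.

```python
from decimal import Decimal

# Raw order facts as they arrived from the source feed (illustrative data).
raw_orders = [
    {"order_no": "O-1", "booked": Decimal("100.00"),
     "invoiced": Decimal("100.00"), "refunded": Decimal("0.00")},
    {"order_no": "O-2", "booked": Decimal("250.00"),
     "invoiced": Decimal("200.00"), "refunded": Decimal("50.00")},
]


def sales_revenue(rows):
    """Hypothetical Sales rule: revenue is everything that was booked."""
    return sum((r["booked"] for r in rows), Decimal("0"))


def finance_revenue(rows):
    """Hypothetical Finance rule: revenue is invoiced minus refunded."""
    return sum((r["invoiced"] - r["refunded"] for r in rows), Decimal("0"))


# Each mart applies its own rule on the way OUT; the raw facts are shared.
sales_mart = {"SALES REVENUE": sales_revenue(raw_orders)}      # 350.00
finance_mart = {"FINANCE REVENUE": finance_revenue(raw_orders)}  # 250.00
```

Neither number is "wrong". They are two named, agreed-upon calculations over the same untouched rows, and if a definition changes later, only the mart-side function changes, not the stored data.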

On the other hand I've seen this happen first hand: SVT is built "on the way in" to the enterprise data warehouse. All is fine and dandy until the current money-holder is replaced with a new money-holder. The new money-holder says: hey, my sky isn't blue - it's white, and the SVT you think you have is WRONG until you change it. They are henceforth left with "restating" the data within the enterprise integration store (data warehouse).

They've not only broken the SVT, they've broken compliance, traceability, auditability, and strangely enough - the SVT that WAS in place? It WAS correct for the time it WAS in place. A conundrum I think is what they call this.

Well, with the onslaught of SOA, enterprise integration efforts are hotter than ever (as is metadata definition and management), right along with data quality. It's high time we fed ourselves with an architecture that presents SISDF rather than SVT, and moved our SVTs downstream (to the point where we load data delivery platforms such as data marts).

In other words - it's high time we STOPPED all this transformation, cleansing, and data alteration on the way in to our enterprise warehouses, and STARTED all the transformation, cleansing, and data alteration on the way OUT of our EDW and into our marts.

I'm not saying "don't deliver SVT", I'm not saying "don't cleanse or quality check your data", I'm not saying "it's bad to integrate or complete your metadata efforts."

I AM saying: if you want a real, compliant, single/consistent integrated version of the DATA within your enterprise - you need to move the SVT notions downstream - and produce the "truth" as you produce the delivery mechanisms. I am also saying: take a hard look at producing an ERROR MART or two, or more - move the dirty data through the system, and put it in the hands of savvy business users who will begin seeing "who can clean up their data fastest" as a game.
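For what it's worth, an error mart can start out very simply. The sketch below is one possible shape, assuming quality rules are applied on the way OUT of the integration store; split_for_delivery, the rule names, and the sample rows are all hypothetical. Rows that pass go to the data mart, and rows that fail go to the error mart tagged with the rules they broke.

```python
def split_for_delivery(raw_rows, rules):
    """Return (clean_rows, error_rows); the raw rows themselves are not modified."""
    clean, errors = [], []
    for row in raw_rows:
        failed = [name for name, check in rules.items() if not check(row)]
        if failed:
            # Copy the row and record why it failed, so business users can fix it.
            errors.append({**row, "failed_rules": failed})
        else:
            clean.append(dict(row))
    return clean, errors


# Hypothetical quality rules, applied downstream of the integration store.
rules = {
    "has_customer_name": lambda r: bool(str(r.get("name", "")).strip()),
    "non_negative_amount": lambda r: r.get("amount", 0) >= 0,
}

raw_rows = [
    {"cust_no": "C-17", "name": "Acme", "amount": 120},
    {"cust_no": "C-18", "name": "  ", "amount": -5},
]

data_mart, error_mart = split_for_delivery(raw_rows, rules)
```

The dirty row for C-18 lands in the error mart with both failed rule names attached, while the copy in the integration store stays exactly as it arrived.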

Your enterprise might just be surprised what they uncover when this type of effort is implemented. I know of several companies that found and saved $15M to $45M in the first six months of operation of a SISDF.

You can read more about the data architecture behind these notions on my web site. Thoughts?

  Posted by Dan Linstedt on October 14, 2005 1:39 AM |
