Blog: Claudia Imhoff

Claudia Imhoff

Welcome to my blog.

This is another means for me to communicate, educate and participate within the Business Intelligence industry. It is a perfect forum for airing opinions, thoughts, vendor and client updates, problems and questions. To maximize the blog's value, it must be a participative venue. This means I will look forward to hearing from you often, since your input is vital to the blog's success. All I ask is that you treat me, the blog, and everyone who uses it with respect.

So...check it out every week to see what is new and exciting in our ever changing BI world.

About the author

A thought leader, visionary, and practitioner, Claudia Imhoff, Ph.D., is an internationally recognized expert on analytics, business intelligence, and the architectures to support these initiatives. Dr. Imhoff has co-authored five books on these subjects and writes articles (totaling more than 150) for technical and business magazines.

She is also the Founder of the Boulder BI Brain Trust, a consortium of independent analysts and consultants (www.BBBT.us). You can follow them on Twitter at #BBBT.

June 2007 Archives

I just finished reading an interesting article in the latest TDWI newsletter (dated today) titled "Data Cleansing versus Auditability: Achieving Competitive Advantage in a Cautious Era." It was written by Michael Mansur, Simon Terr, and David Stanvick from HP Services' Information Management Practice. It got me thinking about the notion that there may be times when we destroy the usability of data by fixing it up too much...

If you are interested in reading their article, click here and sign up for the newsletter. It is worth reading.

I have always heard, and I am sure you have too, the data warehousing mantra that we must clean up the data flowing into the warehouse as much as possible before turning the users loose on it. Most of the time this is true: correcting the many errors created by the operational systems, which could otherwise distort analytics, inspires confidence in the business community.

However, the point made in the article is that data cleansing also pretty much destroys the audit trail for a piece of data. So in this era of regulations, compliance issues, security and privacy breaches, losing the audit trail of a piece of data can be a serious problem. There should be a way to balance the need to cleanse the data to achieve the goals of the users while preserving the original values in case of an audit.

Let me give you an interesting example of when quality processes destroy the value of the data. Several years ago, I worked with a credit card company on their data warehouse. They stored all the transactions from credit card users for quite a long period of time, making them available to analysts to slice and dice, do fraud models, perform regression tests, and perform other analytics.

All was going well until I ran into a lady who was responsible for determining when the credit card readers were beginning to wear out. I guess the card readers don't just suddenly die -- they slowly lose their ability to "read" the magnetic strip on your card. What happens is that they begin sending in some of the data but not all. And the missing bits change with every swipe.

The lady told me that she did her own extractions against the operational database that collected all these transactions rather than use the data warehouse data. I was puzzled why she would go to all that trouble to redundantly do what others were already doing. I decided to pursue it more.

Turns out that the implementation team was really efficient at creating integration and quality processes -- so efficient that, by the time the transactions hit the data warehouse, all the missing or partial data had been overwritten with the correct fields. The data that would indicate a reader going bad was completely gone... This was a case of too many, or inappropriately placed, data quality processes.
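To make the problem concrete, here is a minimal Python sketch of the kind of signal that was lost. Everything in it is an assumption for illustration: the field names, the reader IDs, the sample swipes, and the `cleanse` step are hypothetical and are not taken from the credit card company's actual systems. It simply measures, per reader, the share of fields that arrived empty, first on the raw swipes and then on the cleansed ones.

```python
from collections import defaultdict

# Hypothetical fields a reader is expected to capture from the magnetic strip.
TRACK_FIELDS = ("card_number", "expiry", "name")

# Hypothetical raw swipes; None marks a field the reader failed to read.
RAW_SWIPES = [
    {"reader_id": "R1", "card_number": "4111", "expiry": "0909", "name": "A SMITH"},
    {"reader_id": "R1", "card_number": "4222", "expiry": "1008", "name": "B JONES"},
    {"reader_id": "R2", "card_number": "4333", "expiry": None, "name": "C BROWN"},
    {"reader_id": "R2", "card_number": "4444", "expiry": "0310", "name": None},
    {"reader_id": "R2", "card_number": None, "expiry": "0511", "name": "E WHITE"},
]


def missing_rate_by_reader(swipes):
    """Return the fraction of expected fields that are missing, per reader."""
    missing = defaultdict(int)
    total = defaultdict(int)
    for swipe in swipes:
        for field in TRACK_FIELDS:
            total[swipe["reader_id"]] += 1
            if swipe[field] is None:
                missing[swipe["reader_id"]] += 1
    return {reader: missing[reader] / total[reader] for reader in total}


def cleanse(swipe):
    """Stand-in for an over-eager quality step: every gap gets filled in."""
    return {key: ("<corrected>" if value is None else value)
            for key, value in swipe.items()}


raw_rates = missing_rate_by_reader(RAW_SWIPES)
clean_rates = missing_rate_by_reader([cleanse(s) for s in RAW_SWIPES])

print("raw:     ", raw_rates)    # R2 stands out as a failing reader
print("cleansed:", clean_rates)  # every reader looks perfectly healthy
```

On the raw swipes, reader R2 is missing a third of its fields; once the gaps are filled, both readers report a rate of zero and the warning sign is no longer in the data at all.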

The article in TDWI's newsletter suggests four options to overcome this problem:

1. Flat-file archive - which they say is the simplest technique. You capture all the operational data extracted into a flat file and archive it somewhere. Downside is of course that this is not a real DBMS so analytics and comparisons get to be difficult.

2. Minimal duplicates - you retain both the original data and its fixed up counterpart through the ETL layer to the presentation layer. In other words the original data is stored in its own redundant column. Not as elegant and certainly not as simple as the first solution but it does have some advantages since the data is actually in the warehouse database and readily available. However, redundancy comes with its own set of problems in a growing warehouse.

3. ETL Express - the authors say this is a variation of the flat file idea. You use the same flat file technique with the addition of a "straight through" or ETL express process that can be run on the source data when needed. The express jobs populate special audit tables, each having duplicated primary keys tied directly to the cleaned up warehouse data. Redundant data is brought into the warehouse only when it's needed for audit purposes. Again somewhat limiting and complex but better than options 1 or 2 in most cases.

4. ABC Tables - The last option is an implicit approach which involves creating audit, balance and control (ABC) tables. These tables are used for more than just auditing since they support the balancing and control functions as well. Rather than explicitly storing the original and corrected data values and the history of changes, this option "makes business rules available so source and intermediate values of data can be constructed from the final target values". Obviously this is a complex process and may not suit your audit purposes exactly.
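As an illustration of option 2 (minimal duplicates), here is a rough Python sketch using an in-memory SQLite database. The table, the column names (`state`, `state_original`), the sample rows, and the cleansing rule are all hypothetical; the TDWI article does not prescribe any of them. The idea is only that the cleansed value and the value as extracted travel together, so an auditor can see exactly what was changed.

```python
import sqlite3

# Hypothetical source extract: (customer_id, state code as entered).
SOURCE_ROWS = [(1, " co "), (2, "OR"), (3, None)]


def cleanse_state(raw):
    """Hypothetical rule: trim and upper-case; blanks become 'UNKNOWN'."""
    return raw.strip().upper() if raw and raw.strip() else "UNKNOWN"


def load(conn, source_rows):
    """Load each row with its cleansed value next to the original value."""
    conn.execute(
        """
        CREATE TABLE customer (
            customer_id    INTEGER PRIMARY KEY,
            state          TEXT NOT NULL,  -- cleansed value, used for analytics
            state_original TEXT            -- value as extracted, kept for audit
        )
        """
    )
    conn.executemany(
        "INSERT INTO customer (customer_id, state, state_original) "
        "VALUES (?, ?, ?)",
        [(cid, cleanse_state(raw), raw) for cid, raw in source_rows],
    )


def changed_rows(conn):
    """Return the rows where cleansing altered the source value."""
    return conn.execute(
        "SELECT customer_id, state_original, state FROM customer "
        "WHERE state_original IS NOT state ORDER BY customer_id"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load(conn, SOURCE_ROWS)
    for row in changed_rows(conn):
        print(row)
```

The price is visible even in this toy: every audited column is stored twice, which is the redundancy problem the authors warn about as the warehouse grows.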

My advice is to do a thorough study of the audit needs in your enterprise and then craft the option (or combination of options) that supports your requirements AND is feasible in your technological environment. It may not be simple but, in the long run, you will create a BI environment that satisfies everyone -- even the auditors!

Yours in BI success,

Claudia


Posted June 28, 2007 10:57 AM

It's June -- My birth month. I always require humor at these times. And have I got one for you...

I got the following gem from my good friend, Shelley S - a long-time Oregon resident. Turns out Oregon is one strange state... or at least has some very strange ideas about "waste disposal". The "powers that be" most definitely needed an injection of BI analytics before they decided on how to dispose of a dead whale...

Thirty-seven years ago, a dead whale washed up on the beach of a small Oregon town (Florence, OR). It sat there for many days (the smell was apparently overpowering) while the Highway Division and a bunch of civil engineers decided how to dispose of said carcass. Why the Highway Division got this little problem is beyond me.

Their options were:

1. Bury the 45 foot, 8 ton (that's 16,000 pounds of rotting whale to you) body. That was thrown out because the engineers and Oregon's Highway Division determined that it would just get dug up again. I can't help but wonder who or what would dig up a putrefied whale but, like I said, Oregon is one strange state.

2. Cut the whale up and haul off the pieces. They quickly discovered that no one in their right mind would do this so the idea was jettisoned.

3. Burn the whale. See reason 2 for why that didn't happen. Again the idea was thrown out (or up!).

The civil engineers then hit on the bright idea to simply blow the sucker into a billion little pieces. Yes, you read correctly -- disintegrate 8 tons of rancid, decaying whale by stuffing it with a half ton of dynamite. Woohoo!

The video says it all. Watch this and then we will return to the discussion...

From my friend, Shelley: "The part they don’t talk about is how the entire town was coated in “rotten whale mist” and it took weeks to wear off… Guess there wasn’t enough tomato juice to wash down the entire town."

She goes on, "To get esoteric, they could have created drift models for any resulting fallout. Basically, my biggest question is WHAT WERE THEY (NOT) THINKING? It’s the LACK of intelligence that makes this so funny, although not for the guy whose car was smashed (can you imagine that insurance claim?) or the town that stunk for weeks. These guys were civil engineers, for heaven’s sake! Even if they’re used to blowing up things a little more solid, like hillsides or boulders, you would think ONE of them would have said “We pause here to ask the question, what is inside a rotten whale? Stinky, gooey stuff. Right! And what will happen when we atomize this stinky, gooey stuff with explosives? It will drift through the air, covering everything in its path. But wait, which way will it drift? Well, at the beach, the airflow is usually from the water to the land. Correct! So to summarize, we are going to explode a huge whale full of stinky, gooey stuff so that we can cover the town in rotten whale mist. Brilliant! Let’s get to it!!!”"

You might think they would at least analyze how far people would have to move away from the site to ensure that they would be out of the "strike zone". But alas, it was not to be -- no BI for these folks.

Unfortunately, history has a nasty way of repeating itself. ANOTHER dead whale just recently washed up on the shores of Oregon, a mere 40 or 50 miles from the original dead whale debacle. Apparently the Oregonian authorities are looking for the people who removed bits and pieces of this rotting whale (you have got to be kidding...). There have been several proposals put forth about how to dispose of this stinker. Shelley informs me, "One was to pull the whale back into the water and let it decompose at sea or (much more likely) come ashore some place else (where it would be someone else’s problem). This from a government official! And you wonder why blowing up the last one was considered a good idea…"

It was 80 degrees in Oregon when she sent me her note, and expected to get up to 90. Ah, the pungent smell of ripe whale in the morning.

It appears the Highway Division learned at least what not to do (as the reporter foretold 37 years ago). The current plan is to bury the whale this time...

I can't wait to see what digs it up.

How could you not love a state like Oregon?

Yours in BI and whale exploding success.

Claudia


Posted June 7, 2007 2:27 PM