Blog: Claudia Imhoff

Claudia Imhoff

Welcome to my blog.

This is another means for me to communicate, educate and participate within the Business Intelligence industry. It is a perfect forum for airing opinions, thoughts, vendor and client updates, problems and questions. To maximize the blog's value, it must be a participative venue. This means I will look forward to hearing from you often, since your input is vital to the blog's success. All I ask is that you treat me, the blog, and everyone who uses it with respect.

So...check it out every week to see what is new and exciting in our ever changing BI world.

About the author >

A thought leader, visionary, and practitioner, Claudia Imhoff, Ph.D., is an internationally recognized expert on analytics, business intelligence, and the architectures to support these initiatives. Dr. Imhoff has co-authored five books on these subjects and writes articles (totaling more than 150) for technical and business magazines.

She is also the founder of the Boulder BI Brain Trust, a consortium of independent analysts and consultants (www.BBBT.us). You can follow them on Twitter at #BBBT.


Ah – the much maligned and misunderstood Operational Data Store (ODS)… I am at the Data Warehousing Institute conference in San Diego this week and I heard an astounding thing – a declaration that the ODS was dead. Gee, I must have missed the obit…

Why was the ODS declared dead? Because many people mistakenly believe that the ODS is a staging area for the data warehouse. Whether or not you need a staging area for the raw data coming into a data warehouse is another discussion altogether, but let’s stick with the ODS for this blog.

OK – the basics. What is an ODS? In the mid-1990s, Bill Inmon and I introduced the ODS in our book, “Building the Operational Data Store”. In it, we defined the ODS as a “subject-oriented, integrated, current, volatile collection of data used to support the tactical decision-making process for the enterprise.”

Let’s go through it slowly to make sure everyone gets the definition:

1. Subject-oriented – The ODS is built similarly to the data warehouse in that its data content is organized around different data subjects. Examples of data subjects are customer, product, and order data.

2. Integrated – The data from each subject area is fully integrated, again much like the data in the data warehouse. It is cleaned up as much as is technologically and humanly possible during the ETL process before it is loaded into the store.

3. Current – The data in the ODS is as current as we can technologically make it, which is a significant difference from a traditional data warehouse. Current means as close to “real time” as we can get. The data probably won’t be synchronous with the operational source system because of the latencies inherent in integration, but it will be close.

4. Volatile – This is a major differentiator from a data warehouse. The new or changed data flowing into the ODS updates the appropriate records or fields exactly as you would update a record in an OLTP operational system. For example, when a new customer address is brought in, the old customer address is overwritten. An audit trail is created to trace the change, but the old data is gone from the ODS itself. If you want the history of the customer’s moves, you will have to go either to the audit trail or to the data warehouse, where history is preserved in snapshots.

It is this last aspect of the ODS that differentiates it significantly from a data warehouse and puts it in the camp of operational systems. Because of this feature, referential integrity must be fully implemented. Cascading updates and deletes, complete edit checks, and so on are needed just as they are in operational systems.
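
To make the volatile behavior concrete, here is a minimal sketch in Python using the standard sqlite3 module. It is an illustration only, not a design from the book: the table names (customer, customer_order, address_audit), the sample data, and the change_address helper are all invented for this example. It shows an address being overwritten in place with an audit row written in the same transaction, and a cascading delete enforced through a foreign key.

```python
import sqlite3


def build_ods() -> sqlite3.Connection:
    """Create a toy in-memory ODS with two subject tables and an audit trail."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            address     TEXT NOT NULL
        );
        CREATE TABLE customer_order (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL
                REFERENCES customer(customer_id) ON DELETE CASCADE,
            item        TEXT NOT NULL
        );
        -- Hypothetical audit table: it traces changes but is not the current record.
        CREATE TABLE address_audit (
            audit_id    INTEGER PRIMARY KEY AUTOINCREMENT,
            customer_id INTEGER NOT NULL,
            old_address TEXT NOT NULL,
            new_address TEXT NOT NULL
        );
        """
    )
    return conn


def change_address(conn: sqlite3.Connection, customer_id: int, new_address: str) -> None:
    """Overwrite the current address and record the change in the audit trail."""
    with conn:  # one transaction: the audit row and the overwrite commit together
        row = conn.execute(
            "SELECT address FROM customer WHERE customer_id = ?", (customer_id,)
        ).fetchone()
        if row is None:
            raise KeyError(customer_id)
        conn.execute(
            "INSERT INTO address_audit (customer_id, old_address, new_address) "
            "VALUES (?, ?, ?)",
            (customer_id, row[0], new_address),
        )
        # Volatile update: the old value is replaced, not versioned.
        conn.execute(
            "UPDATE customer SET address = ? WHERE customer_id = ?",
            (new_address, customer_id),
        )


conn = build_ods()
with conn:
    conn.execute("INSERT INTO customer VALUES (1, 'Pat Example', '1 Old Road')")
    conn.execute("INSERT INTO customer_order VALUES (10, 1, 'widget')")

change_address(conn, 1, "2 New Street")
current = conn.execute(
    "SELECT address FROM customer WHERE customer_id = 1"
).fetchone()[0]
audit = conn.execute(
    "SELECT old_address, new_address FROM address_audit"
).fetchall()

# Deleting the customer cascades to the dependent order rows.
with conn:
    conn.execute("DELETE FROM customer WHERE customer_id = 1")
orders_left = conn.execute("SELECT COUNT(*) FROM customer_order").fetchone()[0]
```

After the update, the customer row holds only the new address; the old one survives only in the audit table. A data warehouse would instead add a new snapshot and keep the earlier one.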

I just don’t get it. Does this sound like a holding area for extracts from operational systems? Don’t think so… In fact, it sounds an awful lot like the newly invented customer hub (of customer data integration, or CDI, fame) if that hub contains only current customer data. However, the ODS was originally developed to integrate both current master data and current transactional data, and including the latter makes it broader in purpose than the CDI hub. In any case, it was NEVER defined as a staging area for the data warehouse. I suggest that those calling their staging areas an ODS start calling those components by their rightful name. That would certainly help clear up the confusion.

The purpose of the ODS ranges from producing operational reports to propagating operational data to downstream operational systems to supporting the migration of legacy systems. And, of course, like any other operational database, it can be a source of data for the data warehouse. Perhaps it was ahead of its time when we first introduced it, but to malign such a useful component in your arsenal of integration capabilities is just a shame.

Hopefully this short tutorial on what an ODS is has helped dispel the incorrect notions floating around. I feel like Mark Twain, who said “The reports of my death are greatly exaggerated…” The ODS is not dead – it is alive and kicking more than ever, helping companies integrate their operational data for operational BI and master data management.

If you are interested in a white paper I wrote on the ODS (lots more detail about what it is, how to build one, its uses, and how to get going), just send me an email at CImhoff@IntelSols.com. I'll be happy to send it to you.

Long live the ODS!

Yours in BI (and ODS) success,

Claudia


Posted August 23, 2006 10:40 PM

4 Comments

Hello Claudia,
I read your recent blog article on the role of the ODS ("NOT a staging area for the data warehouse"). I agree with your definition and it was useful to see each part of the definition explained.

My special area of interest is Information Quality Management (that's Data Quality Management at the IT level + Business Information Quality Management at the knowledge-user interface). You could say that DQM is "bottom-up", while IQM is "top-down".

Do you believe that the ODS is a good place to straighten out any existing data quality issues (such as format discrepancies, invalid or missing data, data not compliant with Business Rules, etc.), or should this quality check (and improvement where possible) take place elsewhere? (If so, where?)

I know that we should really be focussing on preventing low quality data rather than identifying and improving it where possible, but most companies are not starting from scratch and already have a set of data sources which are not of high quality and are difficult to integrate (not to mention the fact that they seldom have up-to-date metadata or a single, published corporate-wide set of Business Rules which people can refer to).

(By the way, I used to work for Bill at Prism Solutions way back....)

kind regards,

Alan Snow
Senior Consultant
Ordina VisionWorks www.ordinavisionworks.nl
Ringwade 1, (Toren B-3), 3439 LM Nieuwegein
Tel: (+31)30-6066336 Fax: (+31)30-6638810
Mob. +31-6-55880529 Email: alan.snow@ordina.nl
Presenting "Measuring and Improving the Business Value of Information" at the Data Management, Meta Data & Information Quality Conference, London, Oct. 30 to Nov. 2 2006

The ODS is a perfect place to handle post-operational system data quality problems. As the data is being integrated, it can certainly be cleaned up as much as possible before entering the ODS. It is why some people consider the ODS to be the master data store for current system of record master data.

I also agree with you that a better place to do this is where the errors happen -- that is, back in the operational system of entry -- fix them before they get propagated like a virus. And yes, you are correct too that it may be very difficult to change those systems or processes to improve the quality there.

So you are left with where the data gets integrated - either the ODS or possibly the data warehouse.

Thanks for your thoughtful comment.

Claudia

Claudia, good discussion out here and very timely for us. I have a question that extends beyond the initial intended use of an ODS and maybe one that picks more at the maturity of an ODS. Since the ODS is intended to be that integrated/current source of information, the tendency over time is to make it your authoritative source of key subject-oriented information (i.e., customer demographics: name, addresses, phone, etc.). As this happens, data begins to be handled primarily at the ODS level and becomes a secondary concern in the operational stores, to the point of complete movement to an ODS. This then puts the ODS in the role of also addressing and meeting operational needs, which means it must take on a whole new role in the organization, with additional availability expectations.

Is this a normal thing you see happening within organizations that have been leveraging ODS repositories successfully?

How do you combat or continue to support the true intended use of an ODS without forcing redundant operational stores for the sake of keeping the ODS out of an operational role?

Claudia,
please send me articles about the ODS. I'm very interested in other white papers about the ODS.
Kind regards,
Ana Claudia
