Blog: Claudia Imhoff

Claudia Imhoff

Welcome to my blog.

This is another means for me to communicate, educate and participate within the Business Intelligence industry. It is a perfect forum for airing opinions, thoughts, vendor and client updates, problems and questions. To maximize the blog's value, it must be a participative venue. This means I will look forward to hearing from you often, since your input is vital to the blog's success. All I ask is that you treat me, the blog, and everyone who uses it with respect.

So...check it out every week to see what is new and exciting in our ever changing BI world.

About the author

A thought leader, visionary, and practitioner, Claudia Imhoff, Ph.D., is an internationally recognized expert on analytics, business intelligence, and the architectures to support these initiatives. Dr. Imhoff has co-authored five books on these subjects and writes articles (totaling more than 150) for technical and business magazines.

She is also the Founder of the Boulder BI Brain Trust, a consortium of independent analysts and consultants (www.BBBT.us). You can follow them on Twitter at #BBBT.

Editor's Note:
More articles and resources are available in Claudia's BeyeNETWORK Expert Channel. Be sure to visit today!

August 2006 Archives

Data integration has come a long way since the early days of data warehousing. We can now identify three major data integration initiatives in our IT environments. And – guess what? They require a clear and concise architecture. Read on to understand the three initiatives making up an enterprise data integration architecture.

Data seems to fall into three categories: master data, operational transaction data, and data used for decision making. Each of these categories requires specialized technologies and architectural components. I would like to briefly go over these in this blog.

1. Master Data – Master or reference data has been in the news a lot lately. This data consists of information about your customers, products, locations, and other major subject areas of interest to the enterprise. Today we have master data management (MDM) technologies that assure this form of data – both the current version as well as all of its history – is the best that it can be. The integrated and consolidated reference data is stored in a master data hub store.

2. Operational Transaction Data – This data consists of all the information about the activities occurring throughout the enterprise – basically it is what tracks the enterprise as it conducts its business. Purchases, call detail records, campaign activities, supplier orders, call center contacts, claims, and so on, are all examples of operational transaction data. This data must be managed much the same way as the master data and then stored in an operational data store (ODS).

3. Decision Support Data – This data consists of the historical snapshots of data used in strategic and tactical analyses. Trends, patterns, mining, multi-dimensional analytics -- all depend on this vast store of decision-making data. The snapshots are loaded into a data warehouse for ultimate delivery into the various BI applications and data marts.

You can see why an architecture is needed. All three forms of data require similar processes – the data must be collected, cleaned up, integrated, and populated into its appropriate store. All this data must also be accessible to the other environments and, in many cases, back to the operational sources themselves. The architecture becomes your road map, demonstrating data flows into and out of each data integration capability.

In addition, the three forms of data integration share many of the same technologies – EII, ETL, hardware, software, and even application software may be reused for each initiative. Whether you create a physically distinct set of components for each initiative or create some form of mixed workload situation (combining two or more of these initiatives into the same component) is up to you. Study what each vendor has to offer and determine how the technology will fit into your architecture and, ultimately, support your overall data integration environment.
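
To make the shared processing concrete, here is a minimal sketch in Python of one collect, clean, integrate, and populate pipeline reused for all three categories. Everything in it is an assumption made for illustration: the store names, the function names, and the simple "later non-empty value wins" merge rule are hypothetical and do not describe any vendor's product.

```python
# Hypothetical mapping of each data category to its target store.
STORE_FOR_CATEGORY = {
    "master": "master_data_hub",
    "transaction": "operational_data_store",
    "decision_support": "data_warehouse",
}


def collect(sources):
    """Gather raw records from every source system into one list."""
    return [record for source in sources for record in source]


def clean(record):
    """Illustrative cleanup only: trim whitespace from string values."""
    return {
        key: value.strip() if isinstance(value, str) else value
        for key, value in record.items()
    }


def integrate(records, key):
    """Merge records that share a key; later non-empty values win."""
    merged = {}
    for record in records:
        target = merged.setdefault(record[key], {})
        target.update({k: v for k, v in record.items() if v not in ("", None)})
    return list(merged.values())


def populate(stores, category, records):
    """Append integrated records to the store assigned to the category."""
    store_name = STORE_FOR_CATEGORY[category]
    stores.setdefault(store_name, []).extend(records)
    return store_name


def run_pipeline(stores, category, sources, key):
    """Run the same four steps regardless of the data category."""
    cleaned = [clean(record) for record in collect(sources)]
    return populate(stores, category, integrate(cleaned, key))
```

Calling run_pipeline with "master", "transaction", or "decision_support" changes only the destination store. That is the point of the architecture: the processes are shared and the stores are distinct.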

Yours in BI success,

Claudia


Posted August 30, 2006 3:13 PM

Ah – the much maligned and misunderstood Operational Data Store (ODS)… I am at the Data Warehousing Institute conference in San Diego this week and I heard an astounding thing – a declaration that the ODS was dead. Gee, I must have missed the obit…

Why was the ODS declared dead? Because many people mistakenly believe that the ODS is a staging area for the data warehouse. Whether or not you need a staging area for the raw data coming into a data warehouse is another discussion altogether but let’s stick with the ODS for this blog.

OK – the basics. What is an ODS? In the mid-1990s, Bill Inmon and I introduced the ODS in our book, “Building the Operational Data Store”. In it, we defined the ODS as a “subject-oriented, integrated, current, volatile collection of data used to support the tactical decision-making process for the enterprise.”

Let’s go through it slowly to make sure everyone gets the definition:

1. Subject-oriented – The ODS is built similarly to the data warehouse in that its data content is organized around different data subjects. Examples of data subjects are customer, product, and order data.

2. Integrated – The data from each subject area is fully integrated – again much like the data in the data warehouse. It is cleaned up as much as is technologically and humanly possible during the ETL process before it is populated into the store.

3. Current – The data in the ODS is as current as we can technologically make it – a significant difference from a traditional data warehouse. Current as in as close to “real time” as we can get. The data probably won’t be synchronous with the operational system source because of the latencies inherent in integration, but it will be close.

4. Volatile – A major differentiator from a data warehouse. The new or changed data flowing into the ODS updates the appropriate records or fields exactly as you would update a record in an OLTP operational system. For example, when a new customer address is brought in, the old customer address is overwritten. An audit trail is created to trace the change but the old data is gone. If you want the history of the customer’s moves, you will have to either go to the audit trail or the data warehouse where history is preserved in snapshots.
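
The difference between a volatile store and a snapshot store is easy to show in a few lines. The following is a minimal Python sketch, not an implementation of any real ODS product; the class names, method names, and record layout are all hypothetical and chosen only to mirror the customer address example above.

```python
from datetime import datetime, timezone


class OperationalDataStore:
    """Hypothetical ODS: keeps only the current value, overwriting in place."""

    def __init__(self):
        self.current = {}       # customer_id -> current address
        self.audit_trail = []   # trace of every change made

    def update_address(self, customer_id, new_address):
        # Record the change for traceability, then overwrite the old value.
        self.audit_trail.append({
            "customer_id": customer_id,
            "old": self.current.get(customer_id),
            "new": new_address,
            "changed_at": datetime.now(timezone.utc),
        })
        self.current[customer_id] = new_address


class DataWarehouse:
    """Hypothetical warehouse: appends snapshots and never overwrites."""

    def __init__(self):
        self.snapshots = []     # (customer_id, address, as_of) tuples

    def load_snapshot(self, customer_id, address, as_of):
        self.snapshots.append((customer_id, address, as_of))

    def history(self, customer_id):
        return [
            (address, as_of)
            for cid, address, as_of in self.snapshots
            if cid == customer_id
        ]
```

After two address changes for the same customer, the ODS holds one address and two audit entries, while the warehouse holds two snapshots. The history lives in the audit trail or the warehouse, never in the ODS record itself.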

It is this last aspect of the ODS that differentiates it significantly from a data warehouse and puts it in the camp of operational systems. Because of this feature, referential integrity must be fully implemented. Cascading updates and deletes, complete edit checks, and so on are needed just as they are in operational systems.
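
For readers who want to see what "fully implemented" referential integrity looks like, here is a small sketch using Python's built-in sqlite3 module. The table and column names are hypothetical, and SQLite is used only because it is self-contained; the same constraints exist in any relational database you might choose for an ODS.

```python
import sqlite3


def build_ods():
    """Create a tiny in-memory ODS schema (hypothetical tables and columns)."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            -- Edit check: a customer must have a non-blank name.
            name TEXT NOT NULL CHECK (length(trim(name)) > 0)
        );
        CREATE TABLE customer_order (
            order_id INTEGER PRIMARY KEY,
            -- Cascading updates and deletes keep orders consistent
            -- with the customer they belong to.
            customer_id INTEGER NOT NULL
                REFERENCES customer(customer_id)
                ON UPDATE CASCADE ON DELETE CASCADE,
            -- Edit check: no negative order amounts.
            amount REAL NOT NULL CHECK (amount >= 0)
        );
    """)
    return conn
```

With this schema, changing a customer's key carries through to that customer's orders, deleting the customer removes the orders, and an order for a nonexistent customer is rejected with sqlite3.IntegrityError. A staging area rarely needs any of this, which is one more sign that the two are different components.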

I just don’t get it. Does this sound like a holding area for extracts from operational systems? Don’t think so… In fact, it sounds an awful lot like the newly invented customer hub (CDI fame) if that hub contains only current customer data. However, the ODS was originally developed to integrate both current master data and current transactional data – the latter inclusion making it broader in purpose than the CDI hub. In any case, it was NEVER defined as a staging area for the data warehouse. I suggest that those calling their staging areas an ODS start calling those components by their rightful name. That would certainly help clear up the confusion.

The ODS’ purpose ranges from producing operational reports to propagating operational data for downstream operational systems to supporting the migration of legacy systems. And, of course, like any other operational database – it can be a source of data for the data warehouse. Perhaps it was ahead of its time when we first introduced it but to malign such a useful component in your arsenal of integration capabilities is just a shame.

Hopefully this short tutorial on what an ODS is has helped dispel the incorrect notions floating around. I feel like Mark Twain who said “The reports of my death are greatly exaggerated…” The ODS is not dead – it is alive and kicking more than ever helping companies integrate their operational data for operational BI and master data management.

If you are interested in a white paper I wrote on the ODS (lots more detail about what it is, how to build one, its uses, and how to get going), just send me an email at CImhoff@IntelSols.com. I'll be happy to send it to you.

Long live the ODS!

Yours in BI (and ODS) success,

Claudia


Posted August 23, 2006 10:40 PM

For the past five years, I have been privileged to participate in Scott Humphrey's highly anticipated Pacific Northwest BI Summit. Without a doubt, this is the highlight of my year. Read on to find out why and what we discussed at the comfortable and charming Weasku Inn (pronounced We Ask You Inn... Cute huh?).

First, the players. The Pacific Northwest BI Summit is made up of:

1. Four industry leaders -- myself, Colin White, president of BI Research, William McKnight, Senior Vice President of Data Warehousing, and Jill Dyche, partner with Baseline Consulting.

2. A group of outstanding vendor representatives -- Frank Dravis from Business Objects (formerly FirstLogic), Brian Staff from StrataVia, Steve Smythe from Celequest, Donald Farmer from Microsoft, Val Rayzman from Informatica, Kim Dossey from Teradata, Jeff Dandridge from InfoCentricity, and Kim Stanick from Baseline.

3. Media representatives -- Dave Stodder, Editor of Intelligent Enterprise, Ron Powell, publisher of the B-EYE-Network (and my blog!), and Shawn Rogers, Executive Editor from the B-EYE-Network.

4. And our terrific host -- Scott Humphrey. For those of you perhaps not familiar with Scott, he is one of the best in BI Public Relations.

The Summit is two days over a weekend -- yes, weekend -- in which we have frank, informal, and informative discussions about the state of BI -- its current and future state, emerging trends, new problem areas, complications, successes, and even failures. This year, we focused on three main areas. Podcasts of our opening statements introducing the topics are also available.

1. Master Data Management (MDM) and its sub-application, Customer Data Integration (CDI). Colin White and Jill Dyche started the discussion for these topics. The discussion demonstrated rather clearly that there is still a good deal of confusion about what MDM and CDI really are. Applications? Software? Hardware? Fully managed environments? Just a storage hub for cleaned up and integrated data?

My own thoughts track along with Colin White's in his recent blog in which he states that "master data integration doesn't always equate to master data management. There are many good CDI solutions out there, but many of them don't have sound management capabilities for supporting master data hierarchies, master data versioning, historical master data, master metadata, data lineage reporting, and so on. Many CDI deployments have evolved from customer ODS and/or customer analytics projects. My concern here is that CRM and data warehousing designers and experts are frequently driving these projects in isolation. They often don't have the required knowledge, business transaction expertise, and enterprise perspective to move the CDI project toward true enterprise MDM, and the result will therefore be a CDI silo." I think as the MDM projects begin to unfold, we will see the need to elevate MDM from just an integration project to a fully managed data integration environment.

2. The second topic was presented by William McKnight -- a most interesting and, at times, frightening overview of RFID and its uses. He pointed out that RFID technology will have a profound impact on most data warehouses due to the immense increase in volumes of data. Think of all the ways that RFIDs can impact your life, your kids' lives, the lives of terrorists and citizens alike. It makes for thought-provoking discussions on security, privacy, and other such hot topics. We will have to wait for a while to see what futuristic applications will come from these tiny transmitters because they are still too expensive for most applications -- but it won't be long before "Big Brother" can truly peer into your life...

3. The last discussion was mine and was a good follow-on to William's talk. I discussed Operational BI or the push today to speed up BI analytics and to integrate them into the operational processes. I brought up three areas where I see potential and real problems with implementing operational BI. First, BI implementers must understand the operational processes, something that we have not studied much before. If operational BI is to be useful, it must be integrated at the appropriate points in our operational processes. Second, we must understand the impact of integrating operational BI into the operational systems. Some of these systems are fragile, possibly inflexible, and mucking around with them trying to interface our BI analytics with them could cause real damage to their performance and functionality. Tread lightly until you understand the landscape. And finally, for operational BI to be completely immersed in operations, the operational personnel may need retraining or retooling so that they understand when, how and where to bring these new services into play.

The weighty topics were then followed by an afternoon of sheer entertainment. Saturday, we took a fast boat ride down to the impressive Hellgate Canyon on the Rogue River (see Shawn's blog --- complete with video!). What a spectacular sight to behold. Then we were treated to a wonderful cookout on the porch of the Weasku. Sunday afternoon, Scott really put us to the test by having all of us paddle down the river in kayaks. My arms will never be the same! Gotta get to the gym more often...

The conference ended Sunday night with another wonderful meal at the Weasku and a lot of late night discussions on the health and well-being of BI, MDM, technologies, and what we want to do at next year's conference. A hearty thank you, Scott, for delivering yet another fantastic and enjoyable Summit. I can't wait for next July's event!

Yours in BI success,

Claudia


Posted August 8, 2006 4:04 PM