Three Types of Data Models

Originally published May 6, 2010

Recently, I was working with an organization on a data warehouse. There came a time when the organization needed a data model, which is absolutely normal in the life of a data warehouse. So this organization set out to find a data model for their business.

Indeed, after searching the Internet a bit, a data model for their industry was found. And the organization that had the data model allowed us to take a look at it.

At this point, we made a discovery – there are different kinds of data models. In this case, the data model that was presented was an operational data model. The operational data model was appropriate for the day-to-day operational needs of the organization. But when it came to building a data warehouse, the operational data model simply was a misfit.

With that short perspective in mind, it seems that there are at least three very different kinds of data models – operational data models, data warehouse (or analytical) data models, and global data models. In fact, there are probably even other types of data models.

So what is the difference between an operational data model and a data warehouse data model? There are in fact many differences. Some of those differences are:

  • Operational data models contain data that is purely operational, whereas data warehouse data models contain only information useful for analytical processing. As an example, consider telephone number. There is no question that telephone number is of operational importance. But a telephone number being used for analytical processing? We don’t make analytical decisions on the basis of the digits of a person’s telephone number.

  • Date stamping. All data warehouse data is time stamped. But only some operational data is time stamped. As data passes from the operational world to the data warehouse world, if a time stamp is not present, it is added.

  • Granularity. Data in the data warehouse world is always granular. Always. But data in the operational world is not granular at all.

  • Subject orientation. Data in the operational world is organized by function. But data in the data warehouse world is organized by subject area. This very real difference in orientations shows up in the data model.
Now when it comes to the global data model, it seems that there are even more fundamental differences. In general, the data model reflects commonality between organizations. It is in the commonality of the business that organizations find their need for shared data. The problem is that when looking at business across the globe, there often is very little common data. Across the globe – even for the same company – business practices are different. Who a customer is, what a product is, how a transaction is conducted are all very different. In many organizations, about the only commonality across the organization is a common balance sheet.

It is for these reasons then that the global data model tends to be:
  • Simple, and

  • Focused on finance, particularly profit and loss.
These then are some of the differences between the different types of data models.
  • Bill InmonBill Inmon

    Bill is universally recognized as the father of the data warehouse. He has more than 36 years of database technology management experience and data warehouse design expertise. He has published more than 40 books and 1,000 articles on data warehousing and data management, and his books have been translated into nine languages. He is known globally for his data warehouse development seminars and has been a keynote speaker for many major computing associations.

    Editor's Note: More articles, resources and events are available in Bill's BeyeNETWORK Expert Channel. Be sure to visit today!

Recent articles by Bill Inmon

 

Comments

Want to post a comment? Login or become a member today!

Posted May 17, 2010 by Anonymous

I came across this article in search of a definition for 'Data Intgeration Model'.  I've been in the Industry since Pine Cone's ETL tool PRISM and I still have my Tech Topics.  Point is, it does seem more often then not we see new terms applied to old problems.  To me, when working with a case tool like PowerDesigner, where source models 'intgrate/link' with a target 'integrated' model, then essentialy you have a data integrated model.  You know, so the case tool reports data lineage.  Anyhow, maybe that's too simplistic?  Bill, can you provide a defintion for a 'Data Integration Model'?

Is this comment inappropriate? Click here to flag this comment.

Posted May 6, 2010 by

This is a nice summary of the different types of model.  However, I think Bill misses a point when he says that "data warehouse data models contain only information useful for analytical processing".  Once the analysis is done, you need to be able to take action.  (Or else why did you bother with the analysis?)  To take action, you need some indication of the entities that affected a particular analytical conclusion.  For example, if toothpaste is selling poorly, you need to know what region or perhaps even which stores need more marketing attention.  If telephone expenses are too high, you need to know who is making the excessive calls, and perhaps what numbers they are calling.  Region names, store names, telephone numbers - none of these is used directly for analytical processing.

Is this comment inappropriate? Click here to flag this comment.

Posted May 6, 2010 by

This is absolutely our experience here Bill. We model top down by subject area and bottom up by data needed by the business. In every case it gets down to dollars and numbers as measures and Things (parties, locations, products etc) as they pertain to organisational business units (Sales, Marketing Finance etc). Phil

Is this comment inappropriate? Click here to flag this comment.