
Data Warehousing: The Generic Entity

Originally published April 28, 2011

“I just can’t ever get my organization to agree on anything, especially definitions. And I have to get them to agree on everything before I build my data warehouse.”

How many times have we heard this? And, of course, no one agrees on anything, especially definitions. One organization thinks customers are all current customers. Another organization thinks customers are current customers and prospective customers. Another organization thinks customers are all retail customers but not commercial customers. And trying to get the organization to collectively think one way and agree on a single definition is a Herculean task. In fact, it is doubtful that you will EVER get your organization to agree.

And if you think that you won’t build your data warehouse until everybody agrees on everything, then you have a long, thankless, never-ending task in front of you. It is going to be a long time before you ever start building your data warehouse.

However, there is an easy and quite effective way around this problem. You do not need your entire organization to agree on definitions. What you need to do is create “generic entities.”

In order to understand what a generic entity is and what to do with it, consider the customers of an organization. There are, in fact, many different interpretations of who a customer is. So you create a generic customer. A generic customer is a customer that has ALL the characteristics of a customer, regardless of which part of the organization is describing the customer.

In order to do a database design for a generic customer, you create a customer data structure in your database. You attach all the attributes that you would expect for a customer. Then you add an extra set of attributes (called “discerning attributes”) to distinguish one type of customer from the next. The resulting structure holds two sets of attributes.


The first set of attributes contains attributes that you use in the normal course of business when dealing with a customer.

The second set of attributes, the “discerning” attributes, can be used to distinguish between different types of customers. The first of these, first purchase date and last purchase date, can be used to determine whether the customer is an existing customer and how long they have been a customer. The next discerning attributes, first contact date and last contact date, can be used to determine whether people are existing customers and to tell how long it has been between contacts. A further discerning attribute, commercial/retail, can be used to determine whether a customer is commercial or retail, and so forth.
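
As a rough illustration, the generic customer described above might be sketched as a single record type. This is only a sketch under stated assumptions: the ordinary attributes (customer_id, name, address) and the field name segment are hypothetical stand-ins, since the article names only the discerning attributes.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class GenericCustomer:
    # First set: ordinary business attributes.
    # These three names are hypothetical examples, not taken from the article.
    customer_id: int
    name: str
    address: str

    # Second set: the discerning attributes named in the article.
    # Each is optional because not every kind of customer has a value for it
    # (a prospect, for example, has no purchase dates yet).
    first_purchase_date: Optional[date] = None
    last_purchase_date: Optional[date] = None
    first_contact_date: Optional[date] = None
    last_contact_date: Optional[date] = None
    segment: Optional[str] = None  # assumed encoding: "commercial" or "retail"


# A prospect: contacted, but no purchase recorded yet.
prospect = GenericCustomer(
    customer_id=1,
    name="Acme Ltd",
    address="1 Main St",
    first_contact_date=date(2011, 3, 1),
    last_contact_date=date(2011, 4, 1),
    segment="commercial",
)
```

Leaving the discerning attributes optional is one possible design choice: it lets the same structure hold prospects, current customers and lapsed customers without forcing any department's definition onto the data.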

There may be many other discerning attributes. By including the discerning attributes, the analyst who looks at customer data can tailor his or her own definition of a customer. One department can look at inactive retail customers. Another department can look at current upscale customers. Yet another department can look at prospects who have not yet made their first purchase.

At the point of analysis, the analyst can determine the class of customer that is desired. Furthermore, because the discerning attributes are stored with the data, the analyst can change his or her mind later.
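
To make the idea of choosing a definition at analysis time concrete, here is a minimal sketch in which each department's definition is just a filter over the same generic records. The record layout, the function names and the 365-day activity window are all assumptions made for illustration; the article does not prescribe any of them.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Customer:
    # Trimmed-down generic customer; field names are hypothetical.
    name: str
    segment: str  # assumed encoding: "commercial" or "retail"
    first_purchase_date: Optional[date] = None
    last_purchase_date: Optional[date] = None


def is_prospect(c: Customer) -> bool:
    # A prospect has not yet made a first purchase.
    return c.first_purchase_date is None


def is_current(c: Customer, as_of: date, window_days: int = 365) -> bool:
    # "Current" is assumed here to mean a purchase within the window.
    return (
        c.last_purchase_date is not None
        and as_of - c.last_purchase_date <= timedelta(days=window_days)
    )


def is_inactive_retail(c: Customer, as_of: date, window_days: int = 365) -> bool:
    # Retail customers who have purchased before, but not recently.
    return (
        c.segment == "retail"
        and not is_prospect(c)
        and not is_current(c, as_of, window_days)
    )


customers = [
    Customer("Ann", "retail", date(2008, 1, 5), date(2009, 2, 1)),
    Customer("Bolt Corp", "commercial", date(2010, 6, 1), date(2011, 4, 1)),
    Customer("Cy", "retail"),
]
as_of = date(2011, 4, 28)

# Three departments, three definitions, one set of records.
inactive_retail = [c.name for c in customers if is_inactive_retail(c, as_of)]
current = [c.name for c in customers if is_current(c, as_of)]
prospects = [c.name for c in customers if is_prospect(c)]
```

If a department later decides that "current" should mean a purchase within 90 days, only the window_days argument changes; the stored data stays as it is.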

By creating a generic entity and including discerning attributes, the data warehouse designer does not have to get all the different organizations within a company to agree on a single definition. Instead, the data warehouse designer creates a generic entity that can be used by everyone, regardless of their interpretation or their specific needs for looking at data.

Generic entities, then, can save HUGE amounts of design time that would otherwise be spent needlessly trying to untie the Gordian knot. The data warehouse designer should simply cut the Gordian knot into pieces.


  • Bill Inmon

    Bill is universally recognized as the father of the data warehouse. He has more than 36 years of database technology management experience and data warehouse design expertise. He has published more than 40 books and 1,000 articles on data warehousing and data management, and his books have been translated into nine languages. He is known globally for his data warehouse development seminars and has been a keynote speaker for many major computing associations.




Copyright 2004 — 2020. Powell Media, LLC. All rights reserved.
BeyeNETWORK™ is a trademark of Powell Media, LLC