
Blog: Rick van der Lans

Rick van der Lans

Welcome to my blog where I will talk about a variety of topics related to data warehousing, business intelligence, application integration, and database technology. Currently my special interests include data virtualization, NoSQL technology, and service-oriented architectures. If there are any topics you'd like me to address, send them to me at rick@r20.nl.

About the author >

Rick is an independent consultant, speaker and author, specializing in data warehousing, business intelligence, database technology and data virtualization. He is managing director and founder of R20/Consultancy. An internationally acclaimed speaker who has lectured worldwide for the last 25 years, he is the chairman of the successful European Enterprise Data and Business Intelligence Conference held annually in London. In the summer of 2012 he published his new book Data Virtualization for Business Intelligence Systems. He is also the author of one of the most successful books on SQL, the popular Introduction to SQL, which is available in English, Chinese, Dutch, Italian and German. He has written many white papers for various software vendors. Rick can be contacted by sending an email to rick@r20.nl.

Editor's Note: Rick's blog and more articles can be accessed through his BeyeNETWORK Expert Channel.

Many times we criticize users for having poor or no definitions at all for their concepts, and we can even get upset if different users of the same organization use different definitions for the same concept. However, can we say with certainty that we are doing a good job with respect to definitions in our own field? I am not so sure. It's more like the pot calling the kettle black. In the world of business intelligence and data warehousing, many concepts have been defined poorly or not at all, including those concepts we use daily. Obviously, this always leads to confusing discussions.


A good definition of a concept satisfies several requirements, one of which is reversibility. Suppose that we have the following abstract definition: "A is text". Reversibility means that everything that satisfies the text is also an A. Take for example the concept of an African elephant (Loxodonta). A possible definition of the African elephant would go along the lines of "a big herbivore with a trunk, tusks, big ears, and big feet". So each animal satisfying these requirements is an African elephant by definition. Only having a trunk is not sufficient; it must have tusks, big ears, and big feet as well.


With a decent definition we want to include the correct concepts and exclude the wrong ones. For example, from the above definition of the African elephant we can conclude that the savannah elephant is indeed an African elephant. However, by including big ears as a requirement, we rightly exclude the Asian elephant. By demanding that a concept's definition is reversible, we ensure that the wrong concepts are excluded.


Unfortunately, in our world not all the definitions are reversible. Let's take as an example Bill Inmon's well-known and frequently used definition of a data warehouse: "A data warehouse is a subject oriented, integrated, non volatile, time variant collection of data for management's decision making". This definition is not reversible. Suppose a user creates a spreadsheet containing customer data (subject-oriented) that have been brought together from different systems (integrated), that remain unchanged the entire time (non-volatile), and that contain historical data (time-variant). If, in addition, this spreadsheet has been developed to support decision making, then it satisfies all the requirements in the definition. Ergo, this spreadsheet is a data warehouse. In fact, a lot of data marts that have been created would also satisfy this definition. However, I don't think this is Inmon's intention. In short, the definition is too "wide".
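To make the reversibility test concrete, here is a minimal sketch that treats the definition as a predicate and checks which candidates it accepts. Everything in it is an assumption made for illustration: the DataCollection class, its field names, and the two example objects are hypothetical, and reducing each requirement to a simple yes/no flag is a simplification of what Inmon's terms actually mean.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCollection:
    """Hypothetical model: one yes/no flag per requirement in the definition."""
    name: str
    subject_oriented: bool
    integrated: bool
    non_volatile: bool
    time_variant: bool
    supports_decision_making: bool


def satisfies_definition(c: DataCollection) -> bool:
    """True if the collection meets every requirement listed in the definition."""
    return all([
        c.subject_oriented,
        c.integrated,
        c.non_volatile,
        c.time_variant,
        c.supports_decision_making,
    ])


# The spreadsheet from the text: it meets all five requirements.
spreadsheet = DataCollection(
    name="customer spreadsheet",
    subject_oriented=True,
    integrated=True,
    non_volatile=True,
    time_variant=True,
    supports_decision_making=True,
)

# A made-up counterexample that is updated in place and keeps no history.
operational_table = DataCollection(
    name="operational order table",
    subject_oriented=True,
    integrated=False,
    non_volatile=False,
    time_variant=False,
    supports_decision_making=False,
)

candidates = [spreadsheet, operational_table]

# Reversibility says: everything the predicate accepts must be a data warehouse.
accepted = [c.name for c in candidates if satisfies_definition(c)]
print(accepted)
```

The predicate accepts the spreadsheet, so under a reversible reading the spreadsheet would have to be a data warehouse. Tightening the definition amounts to adding requirements to the predicate until such unintended candidates are rejected.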


Note that it's not only the definition of the concept data warehouse that is not reversible. It applies to definitions of many other popular concepts as well.


Isn't it about time we scrutinize all our definitions? If disciplines such as chemistry, physics, and economics are able to come up with sound definitions, we should be able to do so as well. By the way, I am not even mentioning the fact that for certain concepts we don't have a definition at all.

Posted December 15, 2010 6:25 AM