Data Warehousing: Relational vs. Multi-Dimensional Data
Published: April 6, 2005
The differences between Inmon's relational data design and Kimball's multi-dimensional data design are at the heart of the controversy over their architectures.

Continuing our look at the similarities and differences between the Inmon and Kimball designs, we turn to a "mixed" view, or controversy: whether data in the data warehouse should be relationally designed (Inmon) or multi-dimensionally designed and used in a logical collection of data marts (Kimball).

Each architect addresses the level of granularity of the data required for his design: the Corporate Information Factory (CIF) for Inmon and the data warehouse bus architecture (BUS) for Kimball.

According to Bill Inmon: "The paradigm for relational data in the data warehouse should be at a low level of granularity and should be in third normal form... then it is possible to 'lightly denormalize' the data if [it] is commonly used in the denormalized form...

Relationally designed data warehouse data stored at a low level of granularity can be used in a wide variety of ways... [It]... can be used to support a wide variety of structures of data:

  • Exploration warehouses;
  • Data mining warehouses; and
  • OLTP data bases, etc."

The Inmon data warehouse is a physical repository of data, which can be used to build data marts in which the data can be in multi-dimensional or other forms.
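To make the relational approach concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. It assumes a hypothetical three-table schema (customer, product, sale_line) held in third normal form at the level of a single sale line, and then derives a "lightly denormalized" table of the kind a data mart might use. The table names, column names and sample rows are invented for illustration and do not come from Inmon's writing.

```python
import sqlite3


def build_warehouse() -> sqlite3.Connection:
    """Create a tiny third-normal-form store at sale-line granularity."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
        CREATE TABLE product (
            product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE sale_line (
            sale_line_id INTEGER PRIMARY KEY,
            sale_date    TEXT,
            customer_id  INTEGER REFERENCES customer(customer_id),
            product_id   INTEGER REFERENCES product(product_id),
            quantity     INTEGER,
            amount       REAL);
        INSERT INTO customer VALUES (1, 'Acme', 'East'), (2, 'Birch', 'West');
        INSERT INTO product VALUES
            (10, 'Widget', 'Hardware'), (20, 'Manual', 'Books');
        INSERT INTO sale_line VALUES
            (100, '2005-04-01', 1, 10, 2, 20.0),
            (101, '2005-04-01', 2, 10, 1, 10.0),
            (102, '2005-04-02', 1, 20, 3, 45.0);
    """)
    return con


def build_sales_mart(con: sqlite3.Connection) -> None:
    """Derive a lightly denormalized table, still one row per sale line."""
    con.execute("""
        CREATE TABLE sales_mart AS
        SELECT s.sale_line_id, s.sale_date, c.region, p.category,
               s.quantity, s.amount
        FROM sale_line s
        JOIN customer c ON c.customer_id = s.customer_id
        JOIN product  p ON p.product_id  = s.product_id
    """)


con = build_warehouse()
build_sales_mart(con)
rows = con.execute(
    "SELECT region, SUM(amount) FROM sales_mart "
    "GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('East', 65.0), ('West', 10.0)]
```

Because the normalized tables keep every sale line, the same store could feed other structures as well; the flattened sales_mart is only one possible derivative.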

Mr. Kimball also considers granularity and atomic level data to be key to his multi-dimensional design data warehouse or BUS. Kimball University Design Tip #21, "Declaring the Grain," states: "The most important step in a dimensional design is declaring the grain of the fact table. Declaring the grain means saying EXACTLY what a fact table represents... When you make a grain declaration, you can have a very precise discussion of which dimensions are possible and which ones are not...

Atomic data has the most dimensionality and so it can be constrained and rolled up in every way that is possible for that data source. Atomic data is a perfect match for the dimensional approach... higher levels of aggregation will almost always have smaller dimensions...

Since useful aggregations necessarily shrink dimensions and remove dimensions, it leads to the realization that aggregated data must always be used in conjunction with base atomic data, because aggregated data has less dimensional detail."
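The effect of a grain declaration can be sketched in a few lines of Python. The example assumes a hypothetical grain of one row per product per store per day, with invented sample data; the roll_up helper is illustrative and is not taken from the design tip. It shows that atomic rows can be rolled up along any combination of the declared dimensions, and that each aggregate keeps only the dimensions it was built with.

```python
from collections import defaultdict

# Hypothetical grain declaration: one row per product per store per day.
GRAIN = ("date", "store", "product")

atomic_facts = [
    {"date": "2005-04-01", "store": "S1", "product": "widget", "units": 2},
    {"date": "2005-04-01", "store": "S2", "product": "widget", "units": 1},
    {"date": "2005-04-01", "store": "S1", "product": "manual", "units": 3},
    {"date": "2005-04-02", "store": "S1", "product": "widget", "units": 4},
]


def roll_up(facts, keep):
    """Sum units over every dimension of the grain not listed in `keep`."""
    unknown = set(keep) - set(GRAIN)
    if unknown:
        raise ValueError(f"not part of the declared grain: {sorted(unknown)}")
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[dim] for dim in keep)] += row["units"]
    return dict(totals)


by_store = roll_up(atomic_facts, ("store",))
by_date_product = roll_up(atomic_facts, ("date", "product"))

print(by_store)         # {('S1',): 9, ('S2',): 1}
print(by_date_product)  # product and date survive, store is gone
```

The by_store result no longer carries date or product, so a question about units per product cannot be answered from it alone; only the atomic rows can answer it.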

At this point in the comparison between Inmon and Kimball, it seems they agree on the need for atomic data and the need for it to be available when aggregated data is being used.

Mr. Kimball is emphatic on this point in defending data marts: "Some authors get confused on this point and after declaring that data marts necessarily consist of aggregated data, they criticize the data marts for 'anticipating business questions.'"

He then points out that the misunderstanding can be clarified by providing the atomic data along with the derivative aggregated data.
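One way to picture atomic data sitting alongside its derivative aggregates is a small lookup that uses the aggregate when it covers the question and falls back to the atomic rows when it does not. The Python sketch below is an assumption-laden illustration, not Kimball's implementation: the dimension names, the sample rows, and the summarize and answer helpers are all hypothetical.

```python
from collections import defaultdict

# Hypothetical atomic rows: (date, store, product, units).
DIMS = ("date", "store", "product")
atomic = [
    ("2005-04-01", "S1", "widget", 2),
    ("2005-04-01", "S2", "widget", 1),
    ("2005-04-02", "S1", "manual", 3),
]


def summarize(rows, dims, keep):
    """Sum the trailing units column, grouped by the dimensions in `keep`."""
    positions = [dims.index(dim) for dim in keep]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in positions)] += row[-1]
    return dict(totals)


# A prebuilt aggregate that keeps only the date dimension.
AGG_DIMS = ("date",)
aggregate = [
    key + (units,)
    for key, units in summarize(atomic, DIMS, AGG_DIMS).items()
]


def answer(keep):
    """Use the aggregate if it has every requested dimension, else atomic."""
    if set(keep) <= set(AGG_DIMS):
        return "aggregate", summarize(aggregate, AGG_DIMS, keep)
    return "atomic", summarize(atomic, DIMS, keep)


print(answer(("date",)))   # served from the aggregate
print(answer(("store",)))  # not anticipated by the aggregate, served from atomic
```

With the atomic rows available, a question the aggregate did not anticipate (units by store) still gets an answer, which is the substance of the clarification described above.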

The next and last article will be a summary and will provide reader feedback and answers to readers' questions received during this series.


Katherine Drewek -

Katherine has more than 30 years of experience in the editorial and corporate law environment. She has been responsible for the content review, editing and formatting of international newsletters focusing on business intelligence and data warehousing. She is a frequent lecturer and panelist for the American Bar Association, the National Association of Credit Managers and the Corporate Practice Institute. She has been a mentor for the Women's Leadership Conference. Katherine serves as Managing Editor for the Business Intelligence Network. Katherine can be reached at kdrewek@b-eye-network.com.
