Data Warehouse: Ralph Kimball’s Vision

Originally published March 16, 2005

Mr. Kimball’s concept of the data warehouse has evolved over the last 20 or so years in response to the ever-changing information technology environment. The complex business communities that need to access and analyze their data for accurate and profitable decision making have refined their requirements and reconfigured their queries as they themselves have evolved.

What started out being termed a “data mart” or a collection of “data marts” used star schema data models to give a specified class of user direct access to the stored data. The star schema approach has been viewed as a “Bottom Up” approach by those outside the Kimball group, in contrast to the Bill Inmon approach, which has been termed “Top Down.”

The most accurate description of the Kimball approach, in the author’s opinion, comes directly from the Kimball website, in “Design Tip #49, ‘Off the Bench’”:

    “When we wrote ‘The Data Warehouse Lifecycle Toolkit,’ we referred to our approach as the Business Dimensional Lifecycle. In retrospect, we should have probably just called it the Kimball Approach as suggested by our publisher. We chose the Business Dimensional Lifecycle label instead, because it reinforced our core tenets about successful data warehousing based on our collective experiences since the mid-1980s.

  1. First and foremost, you need to focus on the business …. You must have one eye on the business’ requirements, while the other is focused on broader enterprise integration and consistency issues.
  2. The analytic data should be delivered in dimensional models for ease-of-use and query performance. We recommend that the most atomic data be made available dimensionally so that it can be sliced and diced “any which way.”
  3. While the data warehouse will constantly evolve, each iteration should be considered a project life cycle consisting of predictable activities with a finite start and end ...”

As the above-referenced Design Tip was written expressly to refute the “Bottom Up” label for the Kimball approach, it went on to explain that the Kimball approach recommends developing an “enterprise data warehouse bus matrix.”

Design Tip 49 continues:

“Finally, we believe conformed dimensions (which are logically defined in the bus matrix and then physically enforced through the staging process) are absolutely critical to data consistency and integration. They provide consistent labels, business rules/definitions and domains that are re-used as we construct more fact tables to integrate and capture the results from additional business processes/events.”

The above excerpts from the design tip describe the more current Kimball approach, which is called the “data warehouse bus architecture.”
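As a rough illustration of the bus matrix idea, the sketch below records which dimensions each business process uses and lists the dimensions shared by more than one process, which are the most visible candidates for conformance. The process and dimension names are hypothetical and do not come from the Design Tip; a real bus matrix is a planning document, not code.

```python
# Hypothetical bus matrix: each business process (row) lists the
# dimensions (columns) it uses. All names are illustrative only.
BUS_MATRIX = {
    "retail_sales": {"date", "product", "store", "customer"},
    "inventory": {"date", "product", "store"},
    "purchase_orders": {"date", "product", "vendor"},
}


def shared_dimensions(matrix):
    """Return, sorted, the dimensions used by two or more processes."""
    counts = {}
    for dims in matrix.values():
        for dim in dims:
            counts[dim] = counts.get(dim, 0) + 1
    return sorted(dim for dim, n in counts.items() if n >= 2)


def render(matrix):
    """Render the matrix as text, marking each process/dimension pair with X."""
    columns = sorted(set().union(*matrix.values()))
    lines = ["process".ljust(16) + " ".join(c.ljust(8) for c in columns)]
    for process, dims in matrix.items():
        marks = " ".join(("X" if c in dims else "-").ljust(8) for c in columns)
        lines.append(process.ljust(16) + marks)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(BUS_MATRIX))
    print("Shared dimensions:", shared_dimensions(BUS_MATRIX))
```

Here a dimension such as "product" appears in every row, so each fact table built later would reuse one agreed definition of it rather than inventing its own.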

This architecture consists of:

  • A staging area (which can use a normalized 3NF entity-relationship design or a flat-file format), which end users of the data warehouse bus cannot access.
  • The Data Warehouse Bus itself, which includes several atomic data marts, several aggregated data marts and a personal data mart, but no single or centralized data warehouse component.

The Data Warehouse Bus:

  • Is dimensional;
  • Contains transaction and summary data;
  • Includes data marts, each organized around a single subject or fact table; and
  • Can consist of multiple data marts in a single database.
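The characteristics listed above can be sketched in a few lines. The example below is a minimal illustration, assuming an in-memory SQLite database and hypothetical table and column names (dim_date, dim_product, fact_sales, fact_sales_monthly); it places an atomic data mart and an aggregated data mart, both dimensional, in a single database.

```python
import sqlite3


def build_bus(conn):
    """Create two dimensions, an atomic fact table and an aggregate fact table."""
    conn.executescript("""
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT);
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
        -- Atomic data mart: one row per sale line (transaction detail).
        CREATE TABLE fact_sales (
            date_key INTEGER REFERENCES dim_date(date_key),
            product_key INTEGER REFERENCES dim_product(product_key),
            quantity INTEGER, amount REAL);
        -- Aggregated data mart: the same facts summarized by month and category.
        CREATE TABLE fact_sales_monthly (
            month TEXT, category TEXT, total_amount REAL);
    """)


def load_sample(conn):
    """Insert a handful of made-up rows for illustration."""
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)", [
        (20050301, "2005-03-01", "2005-03"),
        (20050302, "2005-03-02", "2005-03"),
        (20050401, "2005-04-01", "2005-04"),
    ])
    conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
        (1, "Widget", "Hardware"),
        (2, "Gadget", "Hardware"),
        (3, "Manual", "Books"),
    ])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", [
        (20050301, 1, 2, 20.0),
        (20050302, 2, 1, 15.0),
        (20050302, 3, 1, 5.0),
        (20050401, 1, 3, 30.0),
    ])


def refresh_aggregate(conn):
    """Rebuild the summary table from the atomic fact table."""
    conn.execute("DELETE FROM fact_sales_monthly")
    conn.execute("""
        INSERT INTO fact_sales_monthly
        SELECT d.month, p.category, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_date d ON d.date_key = f.date_key
        JOIN dim_product p ON p.product_key = f.product_key
        GROUP BY d.month, p.category
    """)


def monthly_totals(conn):
    """Read the aggregated data mart in a stable order."""
    return conn.execute(
        "SELECT month, category, total_amount FROM fact_sales_monthly "
        "ORDER BY month, category").fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_bus(conn)
    load_sample(conn)
    refresh_aggregate(conn)
    for row in monthly_totals(conn):
        print(row)
```

Deriving the summary table from the atomic rows, rather than loading it separately, is one way to keep the transaction and summary data consistent; it is a design choice of this sketch, not a requirement stated by the architecture.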

In their article “Differences of Opinion” (Intelligent Enterprise, March 2004), Ralph Kimball and Margy Ross describe the Data Warehouse Bus Architecture as follows:

“… a dimensional model contains the same information as a normalized [3NF] model, but packages it for ease-of-use and query performance …. It includes both atomic detail and summarized information …. Queries descend to progressively lower levels of detail, without reprogramming ….

Dimensional models are built by business processes … not business departments. Once foundation business processes are available in the warehouse, consolidated, dimensional models deliver cross-process metrics. The enterprise data warehouse identifies and enforces the relationship between business process metrics (facts) and descriptive attributes (dimensions).”
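The cross-process metrics mentioned in the passage above can be illustrated with a small sketch. It assumes two hypothetical fact tables, fact_sales and fact_inventory, that share one product dimension; each fact table is summarized separately to the shared attribute (category), and only then are the two result sets combined. The table names, columns and figures are invented for the example.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database with one shared dimension and two fact tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE fact_sales (product_key INTEGER, units_sold INTEGER);
        CREATE TABLE fact_inventory (product_key INTEGER, units_on_hand INTEGER);
        INSERT INTO dim_product VALUES (1, 'Hardware'), (2, 'Hardware'), (3, 'Books');
        INSERT INTO fact_sales VALUES (1, 4), (2, 6), (3, 5), (1, 10);
        INSERT INTO fact_inventory VALUES (1, 40), (2, 10), (3, 5);
    """)
    return conn


def sales_versus_inventory(conn):
    """Summarize each business process by category, then join the summaries."""
    sql = """
        SELECT s.category, s.units_sold, i.units_on_hand
        FROM (SELECT p.category, SUM(f.units_sold) AS units_sold
              FROM fact_sales f JOIN dim_product p USING (product_key)
              GROUP BY p.category) AS s
        JOIN (SELECT p.category, SUM(f.units_on_hand) AS units_on_hand
              FROM fact_inventory f JOIN dim_product p USING (product_key)
              GROUP BY p.category) AS i
          ON s.category = i.category
        ORDER BY s.category
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in sales_versus_inventory(make_demo_db()):
        print(row)
```

Summarizing each fact table before combining avoids joining the two fact tables row to row, which would repeat inventory figures once per matching sales row and inflate the totals. The join on category only works because both processes use the same product dimension.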

A fundamental concept of the Kimball Data Warehouse Bus design is that the data warehouse is not a single physical repository of data, as it is in the Inmon approach. It is “virtual”: a collection of data marts, each built on a star schema design.

Our next article will describe Mr. Inmon’s concept of the data warehouse and set the stage for the Corporate Information Factory.

  • Katherine Drewek

    Katherine (1950-2010) had more than 30 years of experience in the editorial and corporate law environment. She was responsible for the content review, editing and formatting of international newsletters focusing on business intelligence and data warehousing. She was a frequent lecturer and panelist for the American Bar Association, the National Association of Credit Managers and the Corporate Practice Institute. She had been a mentor for the Women's Leadership Conference and served as Managing Editor for BeyeNetwork.
