<-- Back to full color view

Data Warehouse Metrics

Originally published December 16, 2004

Everything has metrics. Roads have speed limits. People have a weight. Cars have a tachometer. Days have temperatures.

Metrics help us organize our thoughts and make meaningful comparisons. Without even thinking about it, metrics allow us to size up an event or a condition. And data warehouses are no different in this regard. We need meaningful measurements of a data warehouse if we are to have a basis for comparing one company’s warehouse vs. another company’s warehouse. It simply is human nature to want to have a few meaningful measures that quickly tell their own story.

With that in mind, here is a list of metrics for the data warehouse environment:

Size:

   Bytes for the data warehouse;

      Raw storage;

      Index space; and

      Work space.

ETL:

   Number of ETL programs;

   Frequency of execution;

   Amount of data passing through the ETL environment; and

   Where the ETL program is executed.

Data marts:

   Number of data marts attached to the data warehouse;

   Volume of data found in the data marts; and

   Frequency of replenishment. 

Data warehouse data:

   Frequency of access per byte; and

   Size of physical record.

Data warehouse structure:

      Number of tables;

      Number of rows per table; and

      Average size of row per table.

Data egress:

   By table, number of rows of data exiting the data warehouse;

   By table, frequency of exiting; and

   Criteria for egress.

Query access:

   Number of queries handled per day; and

   Average amount of data of each query.

History:

   Length of time data resides in the data warehouse.

Other structure supported:

   Exploration warehouse;

   Adaptive data mart;

   DSS applications;

   ERP applications;

   ODS;

   Near line storage; and

   Archival storage.

There you have it—a suggestion as to how to measure your data warehouse.

Of course this list can be expanded and altered in many ways. One problem with this list is that it refers to data only as of one moment in time. It is often useful to look at these metrics over time. So. you could add:

  • Current volume; and
  • Rate of growth of volume.

Or, you could section up the list by internal sub organizations. You might have something like this:

   Inventory table:

      Inv10229…..

      Inv569934…….

      Inv466510……

      Inv338761……

Often, it makes sense to logically group data together when the physical implementation does not have such a grouping.

Of course, you also could add the type of dbms the data warehouse is written in:

  • Table ABC—Teradata; and
  • Table BCD—IBM UDB, etc.

The modifications and additions are endless.

So, who is it that finds metrics useful?

  • Database administrators;
  • Database designers;
  • Database developers;
  • Users;
  • Contract administrators; and
  • System programmers and many more.

SOURCE: Data Warehouse Metrics

  • Bill InmonBill Inmon

    Bill is universally recognized as the father of the data warehouse. He has more than 36 years of database technology management experience and data warehouse design expertise. He has published more than 40 books and 1,000 articles on data warehousing and data management, and his books have been translated into nine languages. He is known globally for his data warehouse development seminars and has been a keynote speaker for many major computing associations.

    Editor's Note: More articles, resources and events are available in Bill's BeyeNETWORK Expert Channel. Be sure to visit today!

Recent articles by Bill Inmon

 

Comments

Want to post a comment? Login or become a member today!

Be the first to comment!

 

Copyright 2004 — 2019. Powell Media, LLC. All rights reserved.
BeyeNETWORK™ is a trademark of Powell Media, LLC