Business Intelligence Network business intelligence resources

Blog: William McKnight

« 5 Success Factors for 2007 for Item-Level RFID Deployment | Main | Fields of Study for Careers to Come »

Logical Modeling and Physical Modeling

I’ve been putting some thought into data warehouse design issues in my recent three (and one more to go) DM Review column, but today I was thinking about what inhibits good data modeling. In other words, what are the practices leading to poor data warehouse data modeling?

The main ones really all have to do with the same thing – not doing a logical model before the physical model. The major reasons for this are:


· Thinking physical too early - The hardest part of modeling is getting the grain of each entity and the relationships right. It is difficult to tell the grain and “real” relationships (versus those that are being added for performance) when physical constructs are thrown in too early.

· Thinking usage over relationships too early - Sure, there are those target queries that you will test the model with. But if you model to them, you can easily make the model very difficult to accomplish future requirements with – those sitting below the peak of the proverbial iceberg.

· Thinking denormalization before the normalization is done - Anticipating downstream query performance shortcomings and considering them too early into the modeling process is a reason models fail. By the way, I'm using normalization here generically to mean normalizing or dimensional modeling - whichever you are doing.

As a rule, you should try to stay close to the logical model during implementation. Physical modeling should be iterative and based on performance testing. It’s not something you necessarily do with your modeling tool. If you “over-spec” the machine a little bit, and arm your toolbelt with ability to index, cluster, partition (which does not stray from the logical model) and build cubes, summaries and push complexities in a mart or at least outside the core model, what you gain in model understandability and manageability usually outweigh the added complexities of numerous rule exceptions, like denormalizations, in the model.

Technorati tags: Data Modeling, Data Warehouse, Normalization

  Posted by William McKnight on April 13, 2007 2:41 PM |

Post a comment