
Blog: Krish Krishnan


A Scalable Architecture

Why does scalability need to be considered when you are selecting an architecture for your data warehouse?

Whenever you see performance problems in the data warehouse, the knee-jerk reaction is to start tuning the data warehouse RDBMS platform, the OS platform, the queries and the end-user applications. While this is a good stopgap, you will start running in circles on query tuning and platform settings, since tuning for one application or user type will adversely affect someone else. This is a day-to-day situation faced by all of us who work with a data warehouse.

In hindsight, all of us start examining what type of growth, from both a data and a user perspective, has occurred in the data warehouse, and what workload the data warehouse is executing on a daily basis.

This is where the scalability question arises. When you start the design process for a data warehouse, you need to examine the types of queries and applications that will use the solution and what kind of mixed workload you will need to anticipate. The reason you need this exercise is that once you can predict the overall volumetric growth in terms of data, you can also predict how much of your infrastructure will be used in supporting the different types of workload, by running sample workload queries and simulating the users.
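As a rough illustration of the volumetric growth prediction described above, here is a minimal sketch in Python. It assumes data grows at a constant compound monthly rate, which is a simplification of my own and not something a real warehouse is guaranteed to follow. The function names, the starting volume, the growth rate and the capacity figure are all hypothetical placeholders that you would replace with numbers measured on your own system.

```python
def project_volume(current_tb, monthly_growth_rate, months):
    """Return the projected data volume in TB for month 0..months.

    Assumes simple compound growth: volume * (1 + rate) ** month.
    """
    return [current_tb * (1 + monthly_growth_rate) ** m
            for m in range(months + 1)]


def months_until_capacity(current_tb, monthly_growth_rate, capacity_tb,
                          horizon=120):
    """Return the first month in which the projected volume exceeds
    capacity_tb, or None if that does not happen within the horizon."""
    projection = project_volume(current_tb, monthly_growth_rate, horizon)
    for month, volume in enumerate(projection):
        if volume > capacity_tb:
            return month
    return None


if __name__ == "__main__":
    # Hypothetical inputs: 10 TB today, 10% growth per month, 20 TB capacity.
    print(months_until_capacity(10, 0.10, 20))
```

A projection like this only covers the data side. The workload side still needs the sample queries and simulated users mentioned above, since the same volume can stress the infrastructure very differently depending on the query mix.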

If you did not go through the scalability exercise when you designed the data warehouse, for whatever reason (best RDBMS, vendor best practices, etc.), then whenever you come across this problem, start doing the scalability exercise on your infrastructure. At the end of this exercise you will start to understand the limitations that infrastructure constraints impose on the traditional data warehouse architecture, leaving you quite frustrated and your CFO fuming at the dollars already spent and at any proposed spend.

An alternative way to approach the infrastructure limitation is to explore the newer advances in technology, one of which is the data warehouse appliance. There are multiple articles and white papers on the subject for academic reading. The data warehouse appliance is intended to augment the data warehouse architecture in order to address the scalability issue. It is built from the ground up to address the question of sustained performance at lower cost.

While I'm not saying that implementing a data warehouse appliance gives you a silver bullet for your scalability needs, I can assure you that it is worth your while to start looking at this addition to your data warehouse architecture, to ensure that your scalability needs are met in the future.

Posted by kkrishnan on August 22, 2007 10:10 AM
