Blog: Krish Krishnan

Krish Krishnan

"If we knew what it was we were doing, it would not be called research, would it?" - Albert Einstein.

Hello, and welcome to my blog.

I would like to use this blog for constructive communication and exchange of ideas in the business intelligence community, on topics from data warehousing to SOA to governance and everything under the umbrella of these subjects.

To maximize this blog's value, it must be an interactive venue. This means your input is vital to the blog's success. All that I ask from this audience is to treat everybody in this blog community and the blog itself with respect.

So let's start blogging and share our ideas, opinions, perspectives and keep the creative juices flowing!

About the author

Krish is a recognized expert worldwide in the strategy, architecture and implementation of high performance data warehousing solutions. He is a visionary data warehouse thought leader and an independent analyst, writing and speaking at industry leading conferences, user groups and trade publications. He has authored two eBooks, more than 75 articles, viewpoints and case studies on business intelligence, data warehousing, and data warehouse appliances and architectures. In his 19 plus years of professional experience, he has been solving complex architecture problems spanning all aspects of data warehousing and business intelligence for Fortune 1000 clients. He has designed and tuned some of the world’s largest data warehouses.

The Vice President of Strategy at Chicago Business Intelligence Group, Krish teaches regularly at TDWI, DAMA, IRM UK and other conferences, and is helping drive and mature the data warehouse appliance market. Krish also serves as Associate Vice President of Programs for DAMA Chicago and is Ethics and Governance Advisor to DAMA International.

Editor's Note: More articles and resources are available in Krish's BeyeNETWORK Expert Channel. Be sure to visit today!

Recently in Data Warehouse Category

Data warehouses evolve over time, and data within the warehouse eventually reaches a maturity point beyond which it is obsolete. Is the simple answer to just delete or archive this data? No. Deleting the data frees up space but does not solve the underlying problem, which is the value of the data (read: attribute, in the E-R world) itself.

Sit back and take a look at the data evolution in your data warehouse. When you examine it, you see that your data architecture has constantly evolved to accommodate source system changes and end-user demands, while some data elements from legacy source systems no longer even reach the data warehouse.

When you scream, "My goodness, why do we have all this extra data?" you will realize that data within the data warehouse does lose its value and thereby reaches the end of its lifecycle. It then becomes clear that archiving the data is not the solution.

How do you determine the lifecycle of data in the data warehouse? To answer this question, you will need information about the following:
• Data Lineage
• Data Dependency Matrix
• Data Usage (This is an activity that needs more definition and methodology and will be addressed in subsequent blogs)
• Metadata – Business and Technical
• Report usage activity for reports with the legacy or obsolete data

Once you have gathered the relevant information about the data, you will compile a findings and recommendations document, meet with the data governance committee and get final approval for the removal of the obsolete data.
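As a rough illustration of how the inputs listed above might feed the findings document, here is a minimal Python sketch. It is not a method prescribed by DW2.0 or by any tool. The `AttributeProfile` record, its field names, the sample attributes and the 365-day idle threshold are all hypothetical assumptions chosen for the example. The idea is simply that an attribute becomes a candidate for review when lineage shows no source still feeds it, the dependency matrix shows nothing downstream relies on it, and neither queries nor reports have touched it recently.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class AttributeProfile:
    """Hypothetical summary of what is known about one warehouse attribute."""
    name: str
    source_still_feeds: bool                     # from data lineage
    downstream_dependents: List[str] = field(default_factory=list)  # from the dependency matrix
    last_queried: Optional[date] = None          # from data usage tracking
    last_report_run: Optional[date] = None       # from report usage activity


def obsolete_candidates(profiles, as_of, idle_days=365):
    """Return names of attributes worth raising with the governance committee."""
    cutoff = as_of - timedelta(days=idle_days)
    candidates = []
    for p in profiles:
        # "Recently used" means a query or a report touched it on or after the cutoff.
        recently_used = any(
            d is not None and d >= cutoff
            for d in (p.last_queried, p.last_report_run)
        )
        if not p.source_still_feeds and not p.downstream_dependents and not recently_used:
            candidates.append(p.name)
    return candidates


# Illustrative data only; these attribute names are made up.
profiles = [
    AttributeProfile("legacy_region_code", False, [], date(2005, 3, 1), date(2005, 6, 30)),
    AttributeProfile("customer_segment", True, ["sales_mart"], date(2007, 7, 30), date(2007, 8, 1)),
    AttributeProfile("old_tax_flag", False, ["finance_report_view"], None, None),
]

print(obsolete_candidates(profiles, as_of=date(2007, 8, 6)))  # ['legacy_region_code']
```

The output is only a list of candidates. Business and technical metadata, and the governance committee's judgment, still decide whether an attribute is actually retired.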

Removing obsolete data, along with its definition, benefits the data warehouse in the following areas:
• Data Movement
• Data Processing
• Data Presentation
• Data Warehouse Performance

Remember that this is not a simple task; it requires elaborate planning and execution. You will be changing the data definitions for dimensions in the data warehouse, which requires testing and user approval before the change goes to production.

Fortunately for us from a methodology perspective, Bill Inmon’s DW2.0 will serve as a blueprint in this exercise. For more information on DW2.0 please visit www.inmoncif.com or watch this space in upcoming months for articles around this topic.

A second part of this blog will address some technology specific ideas.


Posted August 6, 2007 12:31 PM

In recent years, there has been a surge in the volume of information that organizations need to store, primarily because of changes in regulatory policies, and also because of mergers and acquisitions, global business expansion and other activities. Against this background, organizations have started looking at their current investments in data warehousing and leveraging the existing solution to accommodate more data. This is the correct approach, in keeping with the definition of the data warehouse as the "single, integrated, non-volatile version of truth."

We all know that every organization engages teams to work on this task, and that a lot of planning, development, testing, etc. happens before and after the integration is completed. The question that begs to be answered is: Does the data warehouse serve all facets of the organization with this data, or is this data needed by everybody across the organization for any reason? If the answer is no, then why do we need to load this data into the data warehouse? Why can we not just store it offsite and access it as needed?

A different way to approach this matter is to consider the solution architecture for your data warehouse. By this we do not mean the design methodology, database technology or BI tools; we mean the overall approach to solving your data warehouse requirements. Look for further reading on this in an upcoming article in my Business Intelligence Network Data Warehouse Appliance Expert Channel.

Krish


Posted August 2, 2007 6:30 AM