

Blog: Krish Krishnan

Krish Krishnan

"If we knew what it was we were doing, it would not be called research, would it?" - Albert Einstein.

Hello, and welcome to my blog.

I would like to use this blog to have constructive communication and exchanges of ideas in the business intelligence community on topics from data warehousing to SOA to governance, and all the topics in the umbrella of these subjects.

To maximize this blog's value, it must be an interactive venue. This means your input is vital to the blog's success. All that I ask from this audience is to treat everybody in this blog community and the blog itself with respect.

So let's start blogging and share our ideas, opinions, perspectives and keep the creative juices flowing!

About the author

Krish Krishnan is a worldwide-recognized expert in the strategy, architecture, and implementation of high-performance data warehousing solutions and big data. He is a visionary data warehouse thought leader and is ranked as one of the top data warehouse consultants in the world. As an independent analyst, Krish regularly speaks at leading industry conferences and user groups. He has written prolifically in trade publications and eBooks, contributing over 150 articles, viewpoints, and case studies on big data, business intelligence, data warehousing, data warehouse appliances, and high-performance architectures. He co-authored Building the Unstructured Data Warehouse with Bill Inmon in 2011, and Morgan Kaufmann will publish his first independent writing project, Data Warehousing in the Age of Big Data, in August 2013.

With over 21 years of professional experience, Krish has solved complex solution architecture problems for global Fortune 1000 clients, and has designed and tuned some of the world’s largest data warehouses and business intelligence platforms. He is currently promoting the next generation of data warehousing, focusing on big data, semantic technologies, crowdsourcing, analytics, and platform engineering.

Krish is the president of Sixth Sense Advisors Inc., a Chicago-based company providing independent analyst, management consulting, strategy and innovation advisory and technology consulting services in big data, data warehousing, and business intelligence. He serves as a technology advisor to several companies, and is actively sought after by investors to assess startup companies in data management and associated emerging technology areas. He publishes with the BeyeNETWORK.com where he leads the Data Warehouse Appliances and Architecture Expert Channel.

Editor's Note: More articles and resources are available in Krish's BeyeNETWORK Expert Channel. Be sure to visit today!

Recently in Data Warehouse Category

Data warehouses evolve over time, and data within the warehouse eventually reaches a maturity point beyond which it is obsolete. Is the simple answer to just delete or archive this data? No. Deleting the data frees up space but does not solve the underlying problem, which is the value of the data (read: attribute, in the E-R world) itself.

Sit back and take a look at the data evolution in your data warehouse. When you examine it, you see that your data architecture has been changing constantly to accommodate source system changes and end-user demands, while some data elements from legacy source systems no longer even reach the data warehouse.

When you scream, "My goodness, why do we have all this extra data?", you will realize that data within the data warehouse does lose its value and thereby reaches the end of its lifecycle. Suddenly it makes sense that archiving the data is not the solution.

How do you determine the lifecycle of data in the data warehouse? To answer this question, you will need information about the following:
• Data Lineage
• Data Dependency Matrix
• Data Usage (This is an activity that needs more definition and methodology and will be addressed in subsequent blogs)
• Metadata – Business and Technical
• Report usage activity for reports with the legacy or obsolete data
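As a rough illustration of how these inputs might be combined, here is a minimal Python sketch that flags attributes as candidates for an obsolescence review. Everything in it is an assumption made for illustration: the `AttributeProfile` fields, the `obsolescence_candidates` function, and the 365-day idle threshold are hypothetical and not part of any particular tool or methodology. The output is only a list to take to the data governance committee, not a decision.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class AttributeProfile:
    """Hypothetical summary of one warehouse attribute (assumed structure)."""
    name: str
    source_feed_active: bool       # data lineage: does a source still populate it?
    dependents: int                # dependency matrix: downstream objects using it
    last_queried: Optional[date]   # data usage: None means never queried
    active_reports: int            # report usage: reports still run against it


def obsolescence_candidates(
    profiles: List[AttributeProfile],
    today: date,
    idle_days: int = 365,          # assumed threshold, tune per organization
) -> List[str]:
    """Return names of attributes that look obsolete on every input."""
    cutoff = today - timedelta(days=idle_days)
    candidates = []
    for p in profiles:
        unused = p.last_queried is None or p.last_queried < cutoff
        if (
            not p.source_feed_active
            and unused
            and p.dependents == 0
            and p.active_reports == 0
        ):
            candidates.append(p.name)
    return candidates
```

The sketch is deliberately conservative: an attribute is flagged only when lineage, dependencies, query usage, and report usage all point the same way. An attribute that still has a single dependent or active report stays off the list.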

Once you have gathered the relevant information about the data, compile a findings and recommendations document, meet with the data governance committee, and get final approval for removing the obsolete data.

Removing obsolete data, along with its definition, benefits the data warehouse in the following areas:
• Data Movement
• Data Processing
• Data Presentation
• Data Warehouse Performance

Remember that this is not a simple task; it requires elaborate planning and execution. You will be changing the data definitions for dimensions in the data warehouse, which requires testing and user approval before the change goes to production.

Fortunately for us, from a methodology perspective, Bill Inmon’s DW2.0 will serve as a blueprint for this exercise. For more information on DW2.0, please visit www.inmoncif.com or watch this space in the upcoming months for articles on this topic.

A second part of this post will address some technology-specific ideas.


Posted August 6, 2007 12:31 PM

In recent years, there has been a surge in the volume of information that organizations need to store, primarily due to changes in regulatory policies, but also due to mergers and acquisitions, global business expansion, and other activities. Against this background, organizations have started looking at their current investments in data warehousing and leveraging the existing solution to accommodate more data. This is the correct approach, in keeping with the definition of the data warehouse as the "single, integrated, non-volatile version of truth."

We all know that every organization engages teams to work on this task, and that a lot of planning, development, and testing happens both before and after the integration. The question that begs to be answered is: Does the data warehouse serve all the facets of the organization with this data, or is this data needed by everybody across the organization for any reason? If the answer is no, then why do we need to load this data into the data warehouse? Why can we not just store it offsite and access it as needed?

A different way to approach this matter is to consider the solution architecture for your data warehouse. By this, we do not mean the design methodology, database technology, or BI tools. We mean the overall approach to solving your data warehouse requirements. Look for further reading on this in an upcoming article in my Business Intelligence Network Data Warehouse Appliance Expert Channel.

Krish


Posted August 2, 2007 6:30 AM