
Blog: Krish Krishnan

Krish Krishnan

"If we knew what it was we were doing, it would not be called research, would it?" - Albert Einstein.

Hello, and welcome to my blog.

I would like to use this blog for constructive communication and exchanges of ideas in the business intelligence community on topics from data warehousing to SOA to governance, and all the topics under the umbrella of these subjects.

To maximize this blog's value, it must be an interactive venue. This means your input is vital to the blog's success. All that I ask from this audience is to treat everybody in this blog community and the blog itself with respect.

So let's start blogging and share our ideas, opinions, perspectives and keep the creative juices flowing!

About the author >

Krish Krishnan is a globally recognized expert in the strategy, architecture, and implementation of high-performance data warehousing solutions and big data. He is a visionary data warehouse thought leader and is ranked as one of the top data warehouse consultants in the world. As an independent analyst, Krish regularly speaks at leading industry conferences and user groups. He has written prolifically in trade publications and eBooks, contributing over 150 articles, viewpoints, and case studies on big data, business intelligence, data warehousing, data warehouse appliances, and high-performance architectures. He co-authored Building the Unstructured Data Warehouse with Bill Inmon in 2011, and Morgan Kaufmann will publish his first independent writing project, Data Warehousing in the Age of Big Data, in August 2013.

With over 21 years of professional experience, Krish has solved complex solution architecture problems for global Fortune 1000 clients, and has designed and tuned some of the world’s largest data warehouses and business intelligence platforms. He is currently promoting the next generation of data warehousing, focusing on big data, semantic technologies, crowdsourcing, analytics, and platform engineering.

Krish is the president of Sixth Sense Advisors Inc., a Chicago-based company providing independent analyst, management consulting, strategy and innovation advisory and technology consulting services in big data, data warehousing, and business intelligence. He serves as a technology advisor to several companies, and is actively sought after by investors to assess startup companies in data management and associated emerging technology areas. He publishes with the BeyeNETWORK.com where he leads the Data Warehouse Appliances and Architecture Expert Channel.

Editor's Note: More articles and resources are available in Krish's BeyeNETWORK Expert Channel. Be sure to visit today!

October 2007 Archives

As the year comes to an end in the Data Warehouse landscape, if we assess the ups and downs of the industry and the successes and failures of companies, one bright spot in the entire spectrum is the attention that Data Warehouse Appliances are finally getting.

In my opinion, this concept goes beyond a solution for addressing data warehousing needs or problems with speed and cost. It is a platform that can address data-related problems in general. When we talk of speed and cost, we get so mired in looking at cost that we often decide to hedge our bets on known technologies with several limitations, since the Appliance is yet to be proven at large scale.

The recent spate of activity and postings from vendors, analysts, and other industry experts on how the appliance has been implemented, and the fact that the TDWI 2007 Q4 conference is dedicating sessions to speakers on this subject, prove that the appliance is coming of age and that adoption of this technology will gain momentum in the next few years.

We will be seeing reports from the TDWI conference posted on www.beyenetwork.com by my colleagues who attend the event. While you wait for those reports, I can assure you of one fact: the appliance, right here and right now, is the next best thing in Data Warehousing. Let's live the moment and applaud all of those individuals who dreamt of this concept and brought it to life.

Posted October 26, 2007 11:17 AM

In my previous posting, we started looking at the topic of auditing the data in the data warehouse and some of the benefits of doing so. Given the topic's depth and breadth, let's continue with it here.

Every time anybody asks me about the need to audit the data within the data warehouse, and whether it only solves the compliance need, my response is no; it goes beyond making your warehouse compliant. It gives you insight into the business rules processing around data and how effective it is, and it helps establish rigor in how the ETL/ELT process is working, from both a performance perspective and a logic perspective. The data collected from this process, if reported on, gives the business user information about the data content they have in their hands, the differences, if any, between the source and the data warehouse from a numbers perspective, and some information on why data was rejected, all of which is invaluable.

But above and beyond all of these tangible benefits, there are two key points to understand. By implementing an auditing process, you gain profound insight into:

1. Data lineage - the ability to trace data from the data warehouse all the way back to the source systems.
2. Data quality - a large benefit of the auditing exercise is the ability to handle the quality issues surrounding data.

Apart from the benefits mentioned above, you will now also have information on things like ETL performance, server utilization, and network latency, which will start providing helpful insight into the overall solution architecture of the data loading process.
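The source-versus-warehouse reconciliation described above can be sketched in a few lines. This is a minimal illustration, not any particular product's audit facility; the counts and field names are hypothetical.

```python
# Minimal audit-reconciliation sketch: compare the row count received from a
# source system against the rows loaded and rejected by the warehouse load.
# Every number and key name below is a hypothetical illustration.

def reconcile(source_count, loaded_count, rejected_count):
    """Return an audit record for one feed, flagging any unexplained gap."""
    accounted = loaded_count + rejected_count
    return {
        "source_rows": source_count,
        "loaded_rows": loaded_count,
        "rejected_rows": rejected_count,
        "unexplained_gap": source_count - accounted,
        "balanced": source_count == accounted,
    }

audit = reconcile(source_count=10_000, loaded_count=9_950, rejected_count=50)
print(audit["balanced"])       # True: every source row is accounted for

bad = reconcile(source_count=10_000, loaded_count=9_900, rejected_count=50)
print(bad["unexplained_gap"])  # 50 rows unaccounted for -> investigate the ETL
```

Reporting the `unexplained_gap` and rejection counts per feed is what gives the business user the "numbers perspective" on source-versus-warehouse differences.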

Let's assume that you do decide to embark on this process. If you are still building your data warehouse, it is another change request; but if you have already deployed your data warehouse, it is a different challenge. How to start, and how to go about making the decisions involved, will be the topic of a future posting.

Posted October 23, 2007 2:27 PM

Every business user has asked this question of their IT counterpart at least once. IT users often feel frustrated that they are held responsible for every byte of data that has been processed and are compelled to provide an audit of what they received from the sources versus what was processed and loaded into the data warehouse or datamart.

The question is not whether IT or the business owns this process, but what the process itself is. If you have not guessed the answer by now, we are talking about the need to audit your data warehouse. The process of auditing the data warehouse will provide insight into all areas of the data warehouse, beginning with the source systems and culminating in the datamart or reports. Whether you undertake this as a large-scale enterprise initiative or a small group-level initiative, its key benefits are clear answers for business and IT users on how data moves through the system, potential data flaws, and business rules execution. Other benefits include the ability to provide counts by each type of data, totals for sales data, and the ability to explain discrepancies and correct them at the source system.

To execute this type of process in your data warehouse, you need to be able to enforce rules of engagement such as control files for source data, reference data for all the common entities across the data warehouse, and common data format definitions. A business sponsor with an IT enabler will be required to make this initiative a success, and they will need the backing and business case of a steering committee or a data warehouse governance body within the organization.
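The control-file rule of engagement mentioned above can be sketched as follows: the source side ships a small control record alongside each data file, and the load process verifies it before accepting the feed. This is a minimal sketch under stated assumptions; the control-record fields (`row_count`, `md5`) are illustrative, not a standard.

```python
# Sketch of a control-file gate: verify a source feed's row count and
# checksum against the control record supplied by the source system.
# Field names in the control record are hypothetical conventions.
import hashlib

def file_checksum(path):
    """MD5 of the file contents, matching what the source side computed."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()

def verify_feed(data_path, control):
    """control is a dict such as {"row_count": 3, "md5": "..."}.
    Returns a list of audit errors; an empty list means the feed passes."""
    with open(data_path, "r", encoding="utf-8") as f:
        rows = sum(1 for _ in f)
    errors = []
    if rows != control["row_count"]:
        errors.append(f"row count {rows} != expected {control['row_count']}")
    if file_checksum(data_path) != control["md5"]:
        errors.append("checksum mismatch")
    return errors
```

A feed that fails the gate is quarantined and reported rather than loaded, which is exactly the kind of rejection information the audit reports back to the business.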

The success of this type of initiative can be measured by the following:

1. A data flow scorecard that provides a traceability matrix.
2. The ability to detect and report anomalies.
3. The ability to fix anomalies.
4. Consistency in the statistics of good and bad data.
5. Visibility into the peaks and valleys of data volumes.
6. The ability to infer data quality from the reported data.
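One of the measures above, tracking the peaks and valleys of data volumes, lends itself to a simple sketch: flag any day whose load volume deviates sharply from a short moving average. The window, tolerance, and volumes below are hypothetical choices, not a recommended standard.

```python
# Sketch of a volume-anomaly check for the audit scorecard: flag days
# whose load volume deviates from a simple moving average of the
# preceding days. Window and tolerance are illustrative assumptions.
def flag_anomalies(daily_volumes, window=3, tolerance=0.5):
    """Return indices of days whose volume deviates more than `tolerance`
    (as a fraction) from the average of the preceding `window` days."""
    flags = []
    for i in range(window, len(daily_volumes)):
        baseline = sum(daily_volumes[i - window:i]) / window
        deviation = abs(daily_volumes[i] - baseline) / baseline
        if deviation > tolerance:
            flags.append(i)
    return flags

volumes = [100, 105, 98, 102, 300, 101]  # day 4 is a spike
print(flag_anomalies(volumes))           # [4]
```

A flagged day does not by itself mean bad data, but it tells the audit team exactly where to look first.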

Overall, the process of auditing the data warehouse is a linchpin to the success of the data warehouse itself. It is interesting to note that in most cases this effort is overlooked due to initial cost concerns, and only when there is mayhem in the data warehouse is it undertaken to solve the issue. How we execute this process, when we start, and how we integrate it are all aspects that will be covered in forthcoming blogs.

Posted October 10, 2007 9:00 AM