Blog: Krish Krishnan

Krish Krishnan

"If we knew what it was we were doing, it would not be called research, would it?" - Albert Einstein.

Hello, and welcome to my blog.

I would like to use this blog to have constructive communication and exchanges of ideas in the business intelligence community on topics from data warehousing to SOA to governance, and all the topics in the umbrella of these subjects.

To maximize this blog's value, it must be an interactive venue. This means your input is vital to the blog's success. All that I ask from this audience is to treat everybody in this blog community and the blog itself with respect.

So let's start blogging and share our ideas, opinions, perspectives and keep the creative juices flowing!

About the author

Krish Krishnan is a worldwide-recognized expert in the strategy, architecture, and implementation of high-performance data warehousing solutions and big data. He is a visionary data warehouse thought leader and is ranked as one of the top data warehouse consultants in the world. As an independent analyst, Krish regularly speaks at leading industry conferences and user groups. He has written prolifically in trade publications and eBooks, contributing over 150 articles, viewpoints, and case studies on big data, business intelligence, data warehousing, data warehouse appliances, and high-performance architectures. He co-authored Building the Unstructured Data Warehouse with Bill Inmon in 2011, and Morgan Kaufmann will publish his first independent writing project, Data Warehousing in the Age of Big Data, in August 2013.

With over 21 years of professional experience, Krish has solved complex solution architecture problems for global Fortune 1000 clients, and has designed and tuned some of the world’s largest data warehouses and business intelligence platforms. He is currently promoting the next generation of data warehousing, focusing on big data, semantic technologies, crowdsourcing, analytics, and platform engineering.

Krish is the president of Sixth Sense Advisors Inc., a Chicago-based company providing independent analyst, management consulting, strategy and innovation advisory and technology consulting services in big data, data warehousing, and business intelligence. He serves as a technology advisor to several companies, and is actively sought after by investors to assess startup companies in data management and associated emerging technology areas. He publishes with the BeyeNETWORK.com where he leads the Data Warehouse Appliances and Architecture Expert Channel.

Editor's Note: More articles and resources are available in Krish's BeyeNETWORK Expert Channel. Be sure to visit today!

September 2007 Archives

Whenever a debate comes up among RDBMS vendors about who has the fastest database on which platform, you are directed to TPC results and how each vendor can surpass the others in controlled tests. For transaction processing that is one thing, but for the data warehouse it takes on a whole different meaning.

I do not mean to belittle the TPC and its benchmarks, having been a DBA myself for a number of years. But when we talk of the data warehouse ecosystem, we are not talking about small discrete transactions; we are talking about mixed queries that need processing power from the underlying database, storage and network.

The current stack of database, storage and network platforms, with all its power and processing capabilities, has not yet solved the issue of query speed and sustained performance. This is where end users are left frustrated and IT often helpless, since the underlying platform cannot cope with the expanding demands. That limitation arises from the fact that the underlying architecture of these traditional solutions is SMP based.

The net result is more spending all around to ensure sustained speed and performance, while also paying for database and code redesign and deployment. Does this cycle ever slow down, let alone stop? Is there an alternative?

We do have an answer when we talk of less spend for more performance. Before you go on your next splurge, take a look at the Data Warehouse Appliance. If you need answers, ask the hard questions. This is one compiled stack of database, storage and network, all built for performance and sustained scalability and, most importantly, based on MPP architecture.

There are multiple vendors in the market with Data Warehouse Appliance offerings, and each of them has a solution that can be applied to solve specific problems. Yes, they are all up-and-coming technologies, but rewind the clock and so were the database, storage and network vendors in their early years.

No early adopter of this technology has claimed failure so far. That in itself is a testimonial to the fact that Appliance technology is here to stay and can help solve the problems that cannot be solved by the SMP stack. It is not just being MPP that matters; building the right solution offering at the right price is where the Appliances are making inroads.

Whether we choose to look at this technology or choose to ignore it, for whatever reason, it is becoming clear that SMP architecture alone cannot fuel the data warehouse ecosystem for long and will need the MPP architecture to co-exist with it and provide a combined platform.


Posted September 23, 2007 8:13 PM

Historical data (more than 3 years old): is it needed all the time by your users? Can you afford to move your historical data to an offsite location? Will you be able to get it back and have it available in the data warehouse when required? If the answer is yes, then you are in good shape. If the answer to any of these questions is no, then read on.

Look at the overall cost of what historical data does to your bottom line:

1. Increases the cost of storage
2. Increases the response time of your queries
3. Increases the time you take to load data to the warehouse

Most of us know the obvious issues listed above, but there are other issues that are often overlooked:

1. Metadata and Master Data for the historical data need to be maintained in the data warehouse.
2. If ETL code was developed in increments over the historical data, it is an additional overhead that needs to be maintained.
3. If the historical data is archived and needs to be brought back, the metadata may be missing, and with it the interface to the data.
4. Integration of the historical data with the new data always causes issues.
5. Data content in historical and current data fields might have changed, which has an impact when you want to do comparison reporting.
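To illustrate the last point, here is a minimal Python sketch of how changed field content can distort comparison reporting. The region codes, the mapping table and the function names are all hypothetical, made up for this example; a real warehouse would normally hold such a mapping in its metadata or master data.

```python
from collections import defaultdict

# Hypothetical mapping from retired region codes to the codes used today.
LEGACY_TO_CURRENT = {"NE": "NORTHEAST", "SW": "SOUTHWEST"}

def normalize_region(code):
    """Return the current code for a region, translating legacy values."""
    return LEGACY_TO_CURRENT.get(code, code)

def sales_by_region(rows, normalize=True):
    """Sum the amount per region across (region, amount) rows."""
    totals = defaultdict(float)
    for region, amount in rows:
        key = normalize_region(region) if normalize else region
        totals[key] += amount
    return dict(totals)

# Made-up sample rows: historical data uses the old codes.
historical = [("NE", 100.0), ("SW", 50.0)]
current = [("NORTHEAST", 120.0), ("SOUTHWEST", 70.0)]

# Without the mapping, the same region is reported under two names.
raw = sales_by_region(historical + current, normalize=False)
# With the mapping, historical and current rows roll up together.
fixed = sales_by_region(historical + current)
```

Without the mapping, the report splits each region into two rows, which is the kind of problem that surfaces when historical data is brought back without its metadata.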

When you start looking at the overall impact of both maintaining and not maintaining the historical data, you may be confused as to which way to proceed. Making the right decision requires you to look at the value that this data provides to your business. In other words, before you decide how to manage the data lifecycle within the data warehouse, involve the business by taking this problem to your data governance and steering committee. Show the business the pain it is facing from the volume of data, and explain the different options for mitigating this pain. One such option is to consider a new platform to store your historical data onsite at a lower cost of ownership.

A data warehouse appliance will let you store your historical data without paying the storage and reloading costs described above. All of the appliance vendors are ANSI SQL compatible, and you should have no issues in pointing your BI tools to this platform for analyzing historical data. If you need to do comparison analysis, you can still build a special dataset on this platform and then bring it to your data warehouse for further reporting.
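As a rough sketch of deciding which rows would move to such a platform, the Python below splits records at the three-year mark used earlier in this post. The row layout, the function names and the sample dates are assumptions for illustration only. In practice this split would more likely be done in SQL or in the ETL tool.

```python
from datetime import date

def cutoff_date(today, years=3):
    """Return the date `years` before `today` (Feb 29 falls back to Feb 28)."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        return today.replace(year=today.year - years, day=28)

def split_by_age(rows, today, years=3):
    """Partition (row_date, payload) rows into (historical, current)."""
    cutoff = cutoff_date(today, years)
    historical = [r for r in rows if r[0] < cutoff]
    current = [r for r in rows if r[0] >= cutoff]
    return historical, current

# Made-up sample rows for illustration.
rows = [
    (date(2003, 5, 1), "order-1"),
    (date(2004, 9, 13), "order-2"),
    (date(2006, 1, 15), "order-3"),
]
historical, current = split_by_age(rows, today=date(2007, 9, 13))
```

Rows dated exactly on the cutoff are treated as current here. That is an arbitrary choice, and the business should confirm where the boundary belongs.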

The bottom line is that there are new and flexible technology options for your data warehouse solution architecture. How soon you adopt them depends on how intense your pain and financial drain are.


Posted September 13, 2007 10:28 AM