Blog: Krish Krishnan

Krish Krishnan

"If we knew what it was we were doing, it would not be called research, would it?" - Albert Einstein.

Hello, and welcome to my blog.

I would like to use this blog to have constructive communication and exchanges of ideas in the business intelligence community on topics from data warehousing to SOA to governance, and all the topics under the umbrella of these subjects.

To maximize this blog's value, it must be an interactive venue. This means your input is vital to the blog's success. All that I ask from this audience is to treat everybody in this blog community and the blog itself with respect.

So let's start blogging and share our ideas, opinions, perspectives and keep the creative juices flowing!

About the author

Krish is a recognized expert worldwide in the strategy, architecture and implementation of high performance data warehousing solutions. He is a visionary data warehouse thought leader and an independent analyst, writing and speaking at industry leading conferences, user groups and trade publications. He has authored two eBooks, more than 75 articles, viewpoints and case studies on business intelligence, data warehousing, and data warehouse appliances and architectures. In his 19 plus years of professional experience, he has been solving complex architecture problems spanning all aspects of data warehousing and business intelligence for Fortune 1000 clients. He has designed and tuned some of the world’s largest data warehouses.

The Vice President of Strategy at Chicago Business Intelligence Group, Krish teaches regularly at TDWI, DAMA, IRM UK and other conferences, and is helping drive and mature the data warehouse appliance market. Krish also serves as Associate Vice President of Programs for DAMA Chicago and is Ethics and Governance Advisor to DAMA International.

Editor's Note: More articles and resources are available in Krish's BeyeNETWORK Expert Channel. Be sure to visit today!

There has been a lot of debate recently on the question "do we need a database to build a DW?" In my personal opinion, we will continue to need a database to build a DW, whether columnar, traditional or hybrid. The rationale builds from the fact that we need to persist, prepare and deliver data to analytical and BI applications, and storing this data in a database provides more benefits than linking and associating multiple files.

The next question is whether we need SQL to use a database, or whether we can live in a world of NoSQL. The answer, in this case, is that the future EDW world will probably involve minimal to no SQL. The reason is technologies like Hadoop and MapReduce, which are taking workload complexity away from applications. These applications are increasingly being built and delivered on cloud and mobile platforms, which require a very light front-end footprint and heavy back-end processing power.
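To make the MapReduce point a little more concrete, here is a minimal sketch in plain Python of the map, shuffle and reduce steps. It is not Hadoop code and does not use any Hadoop API; the sample records and the function names (`map_phase`, `shuffle`, `reduce_phase`, `total_by_region`) are assumptions made up for illustration. It computes totals per region, roughly what a SQL `GROUP BY` with `SUM` would return, without the application writing any SQL.

```python
from collections import defaultdict

# Hypothetical sample records (region, amount); on a real cluster these
# would be splits of much larger input files.
SALES = [
    ("east", 100),
    ("west", 250),
    ("east", 50),
    ("north", 75),
    ("west", 25),
]


def map_phase(record):
    """Emit one (key, value) pair for a single input record."""
    region, amount = record
    yield region, amount


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single total."""
    return key, sum(values)


def total_by_region(records):
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(total_by_region(SALES))
```

In a real framework the map and reduce functions would run in parallel across many nodes; the application only supplies the two small functions, which is the sense in which the heavy lifting moves to the back end.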

Another driving trend is the increasing adoption of semantic technologies. Semantic technologies propel another trend, "in-memory analytics", whereby the overhead of SQL on query performance is minimized. Back-end systems will be SQL intensive and will use a database.

A third trend is to integrate "unstructured" or "semi-structured" data and query that result set, which is largely semantics driven.

In conclusion, we will use the DW as a back-end, number-crunching platform, and slowly move away from "SQL" dependency on the front end for building out analytical and BI applications.

Virtualization and cloud will definitely be drivers, but I do not see EDWs or even large DWs being run on pure-play cloud platforms.


Posted December 25, 2010 3:21 PM

1 Comment

I believe the real question is: what is considered to be a database today?

Before Hadoop and MapReduce, a database was a packaged product. It had several features to persist and retrieve data, but it had clear boundaries.

Files or source data were still termed files; in other words, they were outside the territory of a 'database'. While a database is still a set of files, it is the collective operation on them through a few control structures that we have commonly referred to as a database.

With the need to store and analyze more information in the fastest time possible, the definition of a 'database' has been stretched, and it is going through a transformation of its own.

The need of the day is a database 'setup' that can give a collective image of both database structures and file structures (what you referred to as unstructured data).

The other thing that has changed from the old database world is how a database was set up. The underlying infrastructure for a database was mostly local, and remote access and collective access were a "feature" in that world.

That has totally shifted today. While the necessity to see them as one (big) database has remained the same (and is more important than the others), infrastructure costs and the constraints around them paved the way for cloud computing. This shift has been so strong that we now look back at our needs of the day and tier them: spring up a database only when you need one.

On SQL, again, I think we got some clues from SAS on the possibility of interacting with database structures without a free-form query layer such as SQL.
