

Blog: Wayne Eckerson

Wayne Eckerson

Welcome to Wayne's World, my blog that illuminates the latest thinking about how to deliver insights from business data and celebrates out-of-the-box thinkers and doers in the business intelligence (BI), performance management and data warehousing (DW) fields. Tune in here if you want to keep abreast of the latest trends, techniques, and technologies in this dynamic industry.

About the author

Wayne has been a thought leader in the business intelligence field since the early 1990s. He has conducted numerous research studies and is a noted speaker, blogger, and consultant. He is the author of two widely read books: Performance Dashboards: Measuring, Monitoring, and Managing Your Business (2005, 2010) and The Secrets of Analytical Leaders: Insights from Information Insiders (2012).

Wayne is founder and principal consultant at Eckerson Group, a research and consulting company focused on business intelligence, analytics and big data.

For a while, the Hadoop community was proselytizing the new open source distributed file system as a relational database killer. But wiser minds have prevailed, notably Mike Olson, a long-time database executive and current CEO of Cloudera, a leading distributor of Hadoop and related open source add-ons.

I recently sat down with Olson and Jon Kreisa, Cloudera VP of Marketing, and heard loud and clear that Hadoop plays a complementary role to relational-oriented data warehouses and BI tools. "It would be foolish for us to duplicate the functionality of a relational database which has more than 20 years of development behind it," says Olson.

According to Olson, Hadoop's sweet spot is processing large volumes of semi-structured and unstructured data in batch-oriented programs written by developers. Many BI architects see Hadoop as a perfect environment for staging and processing large volumes of clickstream and other unconventional data not commonly stored in a data warehouse.

In effect, Hadoop serves as a staging area and ETL system to filter and process "big data" so it can be loaded into a data warehouse and joined with other corporate data for reporting and analysis purposes. Hadoop also makes a terrific low-cost archival system that enables companies to keep all their data online without having to summarize it or migrate it to tape.
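To make the staging-and-ETL role concrete, here is a minimal sketch in plain Python of the kind of batch job a developer might write: a map step that filters raw clickstream lines and a reduce step that totals page views into rows small enough to load into a warehouse. It does not use Hadoop itself. The log layout, the sample data, and the function names (`map_phase`, `reduce_phase`, `to_warehouse_rows`) are assumptions made up for illustration, not anything Cloudera or Olson described.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical raw clickstream lines: timestamp, user id, URL, HTTP status,
# separated by tabs. Real logs will differ.
RAW_LOG = [
    "2010-12-01T10:00:00\tu1\t/home\t200",
    "2010-12-01T10:00:05\tu2\t/home\t200",
    "2010-12-01T10:00:09\tu1\t/cart\t500",
    "malformed line",
    "2010-12-01T10:01:00\tu1\t/cart\t200",
]


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Filter step: drop malformed or failed requests, emit (url, 1)."""
    for line in lines:
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # skip lines that do not match the assumed layout
        _timestamp, _user, url, status = fields
        if status != "200":
            continue  # keep only successful page views
        yield url, 1


def reduce_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Aggregate step: sum the counts emitted for each URL."""
    totals: Dict[str, int] = defaultdict(int)
    for url, count in pairs:
        totals[url] += count
    return dict(totals)


def to_warehouse_rows(totals: Dict[str, int]) -> List[Tuple[str, int]]:
    """Shape the summary as sorted (url, page_views) rows for loading."""
    return sorted(totals.items())


rows = to_warehouse_rows(reduce_phase(map_phase(RAW_LOG)))
for url, page_views in rows:
    print(f"{url}\t{page_views}")
```

In a real deployment the raw lines would sit in Hadoop's file system and the two steps would run in parallel across many machines; only the compact summary rows would travel on to the data warehouse.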

Last year, Cloudera notched partnerships with a bevy of relational database vendors, who also see Hadoop as complementary to their data warehousing business. This year, Olson says, Cloudera will establish partnerships with multiple ETL and BI vendors, solidifying Hadoop's position as a key component in a large-scale BI architecture. Already, Cloudera has partnered with database, ETL, and BI vendors to create bridges between the two worlds. Database partners of Cloudera include Aster Data, Greenplum, Membase, Netezza, Quest, Teradata, and Vertica. Its ETL partners include Informatica, Pentaho Data Integration, and Talend. And its BI partners include Jaspersoft, MicroStrategy, and Pentaho.

In closing, Olson admitted that despite the current cooperation between the Hadoop and BI communities, each is aggressively developing capabilities offered by the other, which will eventually minimize the need for such partnerships. In fact, many large Internet companies, including eBay, which recently spoke on a Cloudera Webcast, say they are using Hadoop for reporting and analysis as well as staging, archiving, and preprocessing.

So, while the two camps are playing nice today, the battle has only just begun!


Posted December 2, 2010 2:52 PM