
Blog: Wayne Eckerson

Wayne Eckerson

Welcome to Wayne's World, my blog that illuminates the latest thinking about how to deliver insights from business data and celebrates out-of-the-box thinkers and doers in the business intelligence (BI), performance management and data warehousing (DW) fields. Tune in here if you want to keep abreast of the latest trends, techniques, and technologies in this dynamic industry.

About the author

Wayne has been a thought leader in the business intelligence field since the early 1990s. He has conducted numerous research studies and is a noted speaker, blogger, and consultant. He is the author of two widely read books: Performance Dashboards: Measuring, Monitoring, and Managing Your Business (2005, 2010) and The Secrets of Analytical Leaders: Insights from Information Insiders (2012).

Wayne is founder and principal consultant at Eckerson Group, a research and consulting company focused on business intelligence, analytics and big data.

Say you have a ton of data in Hadoop and you want to explore it, but you don't want to move it into another system. (After all, it's big data, so why move it?) Nor do you want to go through the hassle and expense of creating table schemas in Hadoop to support fast queries. (After all, this is not supposed to be a data warehouse.) So what do you do?

You Hunk it. That is, you search it using Splunk software that creates virtual indexes in Hadoop. With Hunk, you don't have to move the data out of Hadoop and into an outboard analytical engine (including Splunk Enterprise). And you don't need to create table schemas in advance or at run time to guide (and limit) queries along predefined pathways. With Hunk, you point and go. It's search for Hadoop, but more scalable and manageable than open source search engines, such as Solr, according to Splunk officials.

Hunk generates MapReduce jobs under the covers, so it's not an interactive query system. However, it does stream results immediately once the job starts, so an analyst can see whether his search criteria generate the desired results. If not, he can stop the search, change the criteria, and start again. So, it's as interactive as batch can get.
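To make the "stream, peek, stop" workflow concrete, here is a minimal Python sketch. It is not Hunk's API or implementation; the function `streaming_search`, the sample records, and the idea of treating each list as one map task's input split are all assumptions made up for illustration. It only shows why streaming partial results lets an analyst abandon a batch search early.

```python
from typing import Iterable, Iterator, List


def streaming_search(splits: Iterable[List[str]], term: str) -> Iterator[str]:
    """Yield matching records one at a time, split by split (hypothetical helper)."""
    for split in splits:  # each split stands in for one map task's input
        for record in split:
            if term in record:
                # The caller sees this hit before later splits are scanned.
                yield record


# Made-up sample data: three "splits" of web log lines.
splits = [
    ["GET /home 200", "GET /cart 500"],
    ["POST /pay 500", "GET /home 200"],
    ["GET /cart 500"],
]

preview = []
for hit in streaming_search(splits, "500"):
    preview.append(hit)
    if len(preview) == 2:  # the analyst has seen enough and stops the search
        break

print(preview)
```

Because the generator is lazy, the third split is never scanned once the loop breaks, which is the batch-world equivalent of cancelling a job after the first results look wrong (or right).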

Also, since Hunk is a Hadoop search engine, you cannot do basic things you can do with SQL, such as join tables or add up columns easily or store data in a more compressed format. But it does let you search or explore data without specifying schema or other advanced setup.
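Searching without a predefined schema can be pictured as extracting fields from raw records at search time. The Python sketch below is an illustration under stated assumptions, not how Hunk works internally: the `key=value` record layout, the `extract_fields` and `search` helpers, and the sample lines are all hypothetical.

```python
import re
from typing import Dict, Iterable, Iterator

# Assumed record layout: whitespace-separated key=value pairs.
KEY_VALUE = re.compile(r"(\w+)=(\S+)")


def extract_fields(line: str) -> Dict[str, str]:
    """Pull whatever key=value pairs a raw line happens to contain."""
    return dict(KEY_VALUE.findall(line))


def search(lines: Iterable[str], **criteria: str) -> Iterator[Dict[str, str]]:
    """Yield the extracted fields of every line matching all criteria."""
    for line in lines:
        fields = extract_fields(line)
        if all(fields.get(key) == value for key, value in criteria.items()):
            yield fields


# Made-up raw records with different shapes; no table definition exists anywhere.
raw = [
    "ts=2014-03-17 host=web1 status=500 path=/cart",
    "ts=2014-03-17 host=web2 status=200 path=/home",
    "user=alice action=login host=web1",
]

hits = list(search(raw, host="web1"))
print(hits)
```

The two matching records have different fields, and nothing had to be declared up front. The trade-off noted above still applies: every search re-parses the raw text, and operations such as joins have no natural home in this model.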

And unlike Splunk Enterprise, which only runs against log and sensor data, Splunk Hunk (gotta love that product name) can run against any data because it processes data using MapReduce. For instance, Hunk can search for videos with lots of red in them by invoking a MapReduce function that identifies color patterns in videos. You can also run queries that span indexes created in Splunk Enterprise and Hunk, making Hunk a federated query tool. And like Splunk Enterprise, Hunk supports 100+ analytical functions, making it more than just a Hadoop search tool.

So, if you're in the market for a bona fide exploration tool for Hadoop, try Hunk.

For more information, see www.splunk.com.

Posted March 17, 2014 7:22 PM