

Accelerating the Adoption of Hadoop: A Spotlight Q&A with Jim Vogt of Zettaset

Originally published July 25, 2012

This BeyeNETWORK spotlight features Ron Powell's interview with Jim Vogt, CEO of Zettaset. Ron and Jim talk about how Zettaset is increasing adoption of Hadoop by providing the resiliency, high-availability, security, monitoring, and management required for production environments.

BeyeNETWORK Spotlights focus on news, events and products in the business intelligence ecosystem that are poised to have a significant impact on the industry as a whole; on the enterprises that rely on business intelligence, analytics, performance management, data warehousing and/or data governance products to understand and act on the vital information that can be gleaned from their data; or on the providers of these mission-critical products.

Presented as Q&A-style articles, these interviews conducted by the BeyeNETWORK present the behind-the-scenes view that you won’t read in press releases.

Jim, please give our audience an overview of Zettaset and what sets you apart from all the other Hadoop vendors.

Jim Vogt: Hadoop is a great tool whose main function is to support the distributed processing of large data sets across clusters of computers using MapReduce, and to get all types of raw data into a controlled and structured environment so it can be mined for analytic purposes. Zettaset is an enterprise software company, and we take big data and Hadoop to the next level. We put it in a product package called Orchestrator™, which is actually a consumable, enterprise-grade solution for the enterprise to deploy in production environments. If you look at the early Hadoop projects, you will see a lot of evaluations, but only a very small percentage of implementations that are actually in production environments. The reason is that open source Hadoop by itself really doesn't meet enterprise requirements in terms of resiliency, high availability, security, monitoring, and management – all the things that enterprises need for deployment in a production environment.

As an enterprise software company, we know how to package Hadoop because we know what enterprises want in terms of big data features and requirements. Our Orchestrator™ product is easy to install, doesn’t require human intervention, is highly secure, and is highly available.

What I mean by high availability is that our solution doesn't require human intervention to keep the cluster up and running. In other words, it's all done through software, and that's very different from our competition, namely Cloudera, in that they talk about high availability but human intervention is required.

In terms of security, the biggest piece of that is role-based access control, which means that you're able to control who's looking at what jobs and who's doing what. You can do it on a user and group level and really set policy, which is powerful for the enterprise in terms of controlling a solution like this.
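To make the idea of user- and group-level policy concrete, here is a minimal Python sketch of role-based access control. It is an illustration only and does not reflect how Zettaset Orchestrator implements it; the `Policy` class, the role names, the users, and the `job:fraud-report` resource are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    # role name -> set of (action, resource) pairs the role allows
    role_permissions: dict = field(default_factory=dict)
    # user or group name -> set of roles granted to it
    grants: dict = field(default_factory=dict)
    # user name -> set of groups the user belongs to
    memberships: dict = field(default_factory=dict)

    def is_allowed(self, user: str, action: str, resource: str) -> bool:
        # A user acts as themselves and as every group they belong to.
        principals = {user} | self.memberships.get(user, set())
        roles = set()
        for principal in principals:
            roles |= self.grants.get(principal, set())
        # Access is granted if any held role permits the action on the resource.
        return any(
            (action, resource) in self.role_permissions.get(role, set())
            for role in roles
        )


# Hypothetical policy: analysts may view a job, operators may also kill it.
policy = Policy(
    role_permissions={
        "analyst": {("view", "job:fraud-report")},
        "operator": {("view", "job:fraud-report"), ("kill", "job:fraud-report")},
    },
    grants={"risk-team": {"analyst"}, "alice": {"operator"}},
    memberships={"bob": {"risk-team"}},
)

print(policy.is_allowed("bob", "view", "job:fraud-report"))
print(policy.is_allowed("bob", "kill", "job:fraud-report"))
```

In this sketch, bob inherits the analyst role through his group, so he can view the job but not kill it, while alice holds the operator role directly.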

The Zettaset big data software is definitely portable as you've seen in our announcements with STEC, with Fusion-io through AMAX, and Hyve. We've been able to integrate a solution with high-performance computing technology, with SSD, because this is how people are building networks. Essentially, being able to mix in those hybrid environments is very key to us in terms of servicing the enterprise, and it’s also very key to the enterprise.

The last piece is that we’re distro agnostic, which means that Zettaset Orchestrator™ works with multiple distributions of Hadoop. As you know, we have established a commercial licensing relationship with IBM for integration of the Zettaset Orchestrator™ solution with the IBM BigInsights distro. There are also distributions from Cloudera, Hortonworks, and MapR. We can work with any of those distros, and this creates more flexibility and better options for the enterprise. The value that we add really goes beyond Hadoop because it's going to take quite a while for Hadoop to catch up with these types of enterprise functions and needs. We do that in our software.

What do you see as the key barriers to Hadoop adoption for the enterprise? You mentioned some of them, but are there others?

Jim Vogt: Well, the first barrier is really usability of the technology, just getting Hadoop up and working because it is open source. That’s not well documented, and I think that's what everybody's trying to fix at the distro level. But even above the distro level, getting Hadoop to work in your existing environment, getting it to work on your iron or on a different piece of iron – those are the startup issues that we take care of in an automated way. Zettaset Orchestrator™ is actually able to load up on top of Red Hat/Linux; and without human intervention, we can bring the clusters up all the way through the distro level. Beyond that, we also can auto-load our software.

The next barrier is going to be meeting the functional needs that make an enterprise comfortable enough to deploy us in the data center. Those are some of the things I was talking about in terms of managing, monitoring, logging, role-based access control, security, encryption, and high availability. It’s not just the namenode failover, which people talk about. There are at least 30 known exploits within Hadoop, and any one of those processes going down could actually take the cluster down with it. We focus on monitoring those processes so that we can actually keep the cluster up. Here's an interesting fact: Kerberos was adopted by the community to address security. Number one, that alone does not address security. Number two, if Kerberos fails and stops issuing tickets, you can no longer process jobs and the cluster goes down. So actually Kerberos is a vulnerability point for high availability.
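The general pattern of monitoring cluster processes and recovering them without human intervention can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Zettaset's mechanism: the `check_services` function, the probe callables, and the service names are hypothetical, and a real system would probe live daemons and handle failover rather than call a stub.

```python
from typing import Callable, Dict, List


def check_services(
    probes: Dict[str, Callable[[], bool]],
    restart: Callable[[str], None],
) -> List[str]:
    """Run each health probe once and restart any service that fails it."""
    restarted = []
    for name, is_healthy in probes.items():
        try:
            healthy = is_healthy()
        except Exception:
            # A probe that raises is treated the same as an unhealthy service.
            healthy = False
        if not healthy:
            restart(name)
            restarted.append(name)
    return restarted


# Hypothetical probes: the ticket-issuing service is down, the others are up.
probes = {
    "namenode": lambda: True,
    "jobtracker": lambda: True,
    "kerberos-kdc": lambda: False,
}

restart_log: List[str] = []
restarted = check_services(probes, restart_log.append)
print(restarted)
```

The point of the sketch is that the ticket-issuing service is watched like any other process, since a failure there would stop job processing even when the Hadoop daemons themselves are healthy.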

Those are the types of barriers facing people who are trying to deploy Hadoop. Not only does it have to be easy to use, but it also has to meet enterprise requirements for putting it in the data center. I like to say that you don't need to be a mechanic to drive a car – you don't really have to know how the engine works. In this case, Hadoop is the engine, and Zettaset is building the car around the engine so people can drive it.
 
That's a really good analogy. In the next five years, how do you see the Hadoop market evolving?

Jim Vogt: I think that it's going to be very much like the local area network market. Once we started running, for example, local area network technology over phone wire versus coax, it exploded. The challenge is getting Hadoop in a usable format because it really is the best tool available for getting all of the varied sources of data – structured and unstructured – into a repository so it can be mined and used. Hadoop is going to be a key part of that big data infrastructure, and a lot of these enterprise requirements will get into the community and eventually end up in the distribution. But there is a long list of enterprise requirements that will develop over the next few years, and I think that's where the focus will be. You know what's funny is that people talk about big data, data scientists and the analytics. In reality, people already have the analysis tools and the algorithms in place. What Hadoop offers for them is a broader set of information from varied sources that we can actually put in a manageable and mineable format. It’s the classic scenario: I'll make a better decision if I have more information. I think we're going to see Hadoop expanding because of the data explosion and the need to keep larger and larger repositories that people can use and mine in a very controllable manner.

Why do you feel unstructured data is so important to the enterprise with Hadoop?

Jim Vogt: If you look at some of the data that people are gathering, especially in the banking industry – the transactional type data that's hard to capture and structure, mobile phone data, and social networking data – those are typically not well-structured data sources. Being able to include those new points of data in the decision process is where you really benefit from Hadoop. We’ve all heard about big data in terms of velocity, volume, and variety of data, right? We always talk about variety in terms of structured and unstructured, but it's also the velocity and volume that need to be dealt with. Being able to manage that and mine it to a usable dataset is where Hadoop really provides benefit. I think that is a critical piece of the solution.

What are the vertical markets that you're focusing on with Zettaset and why?

Jim Vogt: Well, for Zettaset, a key strategy is working with system integrators who focus on specific verticals. Also, our system partners play into multiple verticals as well. But what's interesting about big data is that it actually blankets the verticals for different reasons.

In healthcare, the focus is on compliance, being able to gain efficiencies in processing patients, and limiting liability. You have merchandisers trying to predict buyer paths and purchasing patterns based on unstructured and structured data. In the petroleum industry, they want to make faster decisions on sites in terms of where they’re extracting their oil. Across the board, there are very different applications. In financials, for example, it's all about security and being proactive in detecting fraudulent acts. It all comes down to broader sources of data, mining the data, and making decisions with existing algorithms. In insurance it’s all about better efficiency and claim processing. It really is across the board in terms of the verticals that are covered.
 
What does it take to install Zettaset? I would think with Hadoop the data sizes could vary significantly. Does size have any impact?

Jim Vogt: Installing Zettaset Orchestrator™ is very straightforward. We can have the clusters up and running with Hadoop within 20 minutes in an automated fashion. From there, we have standard configurations that exist in our software but obviously, you can customize that. Typically, when we consult with customers, we ask them to identify their sources of data and the types of jobs they are trying to do. We assist them in customizing Zettaset to their application. But that's a matter of hours, not days or weeks, and it's something that we offer as a service which is included with our products.

Jim, it appears that Zettaset really accelerates taking Hadoop from more of a prototype non-production environment into actual production. Is that right?

Jim Vogt: Absolutely, and our experience actually shows that. The team at Zettaset has tremendous experience selling and developing different pieces of technology for the enterprise, and it has spanned across network infrastructures, security, and the cloud. We have a great deal of experience with all these different types of solutions and technology, and we have delivered those to enterprises. Now we’ve listened to customers who have told us what they need to put Hadoop into big data production environments, and we’ve built a product around those needs. Our approach will absolutely increase and accelerate the adoption of Hadoop.

Jim, it's been a pleasure talking to you today to learn how Zettaset is taking Hadoop to the next level.

  • Ron Powell
    Ron is an independent analyst, consultant and editorial expert with extensive knowledge and experience in business intelligence, big data, analytics and data warehousing. Currently president of Powell Interactive Media, which specializes in consulting and podcast services, he is also Executive Producer of The World Transformed Fast Forward series. In 2004, Ron founded the BeyeNETWORK, which was acquired by TechTarget in 2010. Prior to founding the BeyeNETWORK, Ron was cofounder, publisher and editorial director of DM Review (now Information Management). He maintains an expert channel and blog on the BeyeNETWORK and may be contacted by email at rpowell@powellinteractivemedia.com.

    More articles and Ron's blog can be found in his BeyeNETWORK expert channel.
