Built for Big Data - a Spotlight Q&A with Michael Maxey of EMC Greenplum

Originally published December 19, 2011

BeyeNETWORK Spotlights focus on news, events and products in the business intelligence ecosystem that are poised to have a significant impact on the industry as a whole; on the enterprises that rely on business intelligence, analytics, performance management, data warehousing and/or data governance products to understand and act on the vital information that can be gleaned from their data; or on the providers of these mission-critical products.

Presented as a Q&A-style article, these interviews conducted by the BeyeNETWORK present the behind-the-scenes view that you won’t read in press releases.

This BeyeNETWORK spotlight features Ron Powell's interview with Michael Maxey, Senior Director of Product Marketing at EMC Greenplum. Ron and Mike talk about how organizations are tackling their "big data" analytics opportunities as they follow the lead of the Internet giants.

Mike, the landscape of enterprise architecture has obviously changed dramatically over the years. In the last 12 months we've seen some significant developments. Not only has big data become the most talked about phenomenon in our industry in recent memory, but also we've seen the introduction of alternative databases from Apache Hadoop to NoSQL. We've always had columnar approaches for handling not only big data but also data analytics for big data. Let's start by having you tell us why big data is getting so much attention and how it fits in your plans.

Michael Maxey: I think mainstream business is starting to recognize what the Internet companies already have figured out. If you think about Facebook and Google, these companies are fundamentally data-driven, agile and they've figured out a way to extract business value and, in some cases, actually monetize vast amounts of data. What enterprise, what company wouldn't want that? I think that's why we're seeing this craze around big data today. The enterprises are recognizing that there is something there, and they want to follow the lead of the Internet companies.

Well, it's no secret that having data – big or small – is of little value if it isn't available for timely and cost-effective analysis. As you're working with potential customers, what challenges do they tell you they're facing as they try to develop or improve their analytics efforts?

Michael Maxey: That's a great question, and you probably think my answer is going to be about technology but actually it's not. I think the biggest challenge our customers face is the challenge of change. What has evolved here is kind of human nature. There are companies who are excited about it. They're the early adopters, the companies that are jumping head first into this. But for the rest of the world, the early majority or the majority in the market space, change can be difficult. I think that fundamentally we are going through an evolution. We're going beyond BI, and it's a tough change for companies to really digest. We can get into specifics – there are silos everywhere, or this tool won't scale, or this database is too slow. Those things are really easily identified and fixed, but what's harder to fix is this fear of change. That's the big challenge we see.

Mike, you talk about changes and challenge. If I were a customer looking at this big data effort coming my way, how would you help handle that challenge as well as other challenges?

Michael Maxey: As I said, the problem is change, and the important answer is to get started now. You need to get started in a way that's easy to comprehend and easy for the business to say yes to. What we have is a powerful set of what we call analytic labs that we offer to our customers. What's interesting about these labs is they're really big data proofs of concept (POCs). It's not the classic POC of speeds and feeds in data warehousing. With big data, it's a new kind of POC, and we do it with an analytic lab. Let me explain our analytics labs. It's the combination of our unified analytics platform and, more importantly, our experts, the team of data scientists. They're experts in the analytic tools, and many are PhDs. We bring that team along with our platform into the organization and work on the customer’s data with the customer's team, doing a project and extracting real value out of it. We think a lab is the right way to do it because it brings everybody to the table, it brings value from day one, and that's really how we're helping customers overcome that challenge of change.

Well, that seems very innovative. Basically, they can derive a benefit from doing one of these labs in just a couple of days when your team comes in. Is that right?

Michael Maxey: Absolutely. We've had great success with these. It's a different view for the business. They're very used to vendors trying to sell them a product and then walking away. This is a very different approach.

Are there any industries that have more big data challenges than others, and can you give us some examples?

Michael Maxey: When we think of big data, we like to think of it as an opportunity, not a challenge, but I do want to answer your question. If there are challenges, they're probably coming from the industries that require an additional layer of governance with what they do. Two sectors come to mind – one is healthcare and the other is the finance sector. It's a pretty big challenge to think about how patient privacy fits into the world of big data and how to protect that privacy but still be able to mine all that information. With all the governance and security requirements around those two sectors, healthcare and finance are probably the industries that have bigger opportunities – or bigger challenges – in front of them.

Do you see other industries that will be looking at big data as well?

Michael Maxey: Well, I think all industries are really looking at it. I'm actually in Chicago today for a conference around predictive analytics, and it's really interesting the wide array of different companies. There's everything from the retail sector, which has been doing analytics for 30+ years, to new emerging startups and everything in between. I think to your earlier question around the hype and the noise, I think it's definitely permeating lots of industries. But I think the challenges are greater for those industries that have the heavy regulation and tougher privacy rules.

Now I recently read that you and other vendors have announced a strategy related to Apache Hadoop. Hadoop is everywhere. Can you tell us about it and why Greenplum chose to focus on Apache Hadoop?

Michael Maxey: I think Hadoop is really big, and we made a decision around that as we support our customer needs. We try to make it clear that our company, Greenplum, is not just a database. We've been using a lot of that messaging, and there's a reason for it. Don't get me wrong, we have an award-winning, very powerful MPP database, but it only addresses about half the story. That's because a lot of the data these companies are looking to extract value from is unstructured, and Hadoop is the right platform for unstructured data.

This goes back to my first comment about mainstream companies learning from the Internet companies. It is Hadoop that these Internet companies are using to become data-driven and monetize their data. It's no secret these companies run them on Hadoop or on MapReduce-style frameworks. For the industry, what's exciting about it is that it’s open source. It's a project, and there's a lot of movement behind it. But what's not so exciting about it is that it can be unreliable. For us at Greenplum, we recognize it as a platform for processing unstructured data. It's an excellent complement to our database, and a way of providing this co-processing of both structured and unstructured data. We're really committed to making Hadoop and that technology enterprise-grade, and that's how we see our responsibility for this distribution.

This is really, I think, a titanic shift in our markets. This whole movement toward big data reminds me of when relational databases came into being. It's much bigger than what most people normally look for from a new technology. Would you be in agreement with that as well?

Michael Maxey: I definitely agree with that. I think that there's always been sort of a need and a want to analyze all the data, but for a long time the tools weren’t there. With the emergence of these databases – they're very powerful like Greenplum – and tools like Hadoop, it is very transformational.

Well Mike, I know you've been around the technology world for a long time. I always like to ask people what's going to happen in the future. What recent technology development do you think will have the greatest impact on enterprise business intelligence and analytics going forward?

Michael Maxey: That’s a great question. For us, it's the one thing that we would call unified analytics. As I've just said, it's that combination of the database and Hadoop to do both structured and unstructured data processing and that's almost a complete solution. But if you go back to your earlier question about how do you extract value from the data that isn’t available, that's another very important point.

We recognize that, and what we're building is a very important productivity layer that we call Chorus. It’s built for collaboration, accessibility, visibility of data – both structured and unstructured – for this new team of people that we've been calling the data science team. It's really no longer a business intelligence (BI) team. And by the way, you might want to rename the BeyeNETWORK to the DataSciNetwork! Now it is more than BI. It’s about a team that's bigger than BI that includes the data platform administrator, DBAs, data scientists, analysts, engineers, and – really importantly – also includes the line of business. A platform that allows them to process both structured and unstructured data and allows them to be more productive with that visibility and accessibility is the next stage going forward. We wrap all of that up into something we call our unified analytics platform. That's where we're headed.

I share your thoughts, and I actually think that business intelligence and analytics have now expanded from what we consider the traditional BI world to incorporate the operational side of the world, the process side – embedding analytics into operational processes. I really think that instead of saying that BI is that small component, it now actually encompasses the whole organization. And in reality, that really fits very well into the EMC Greenplum focus. EMC has always been focused on storing all of the data that's been in the enterprise, and then made some very big acquisitions in the past from a document perspective, which is obviously the unstructured world. And now, by incorporating Greenplum into the EMC stack, they really have combined everything that we need to do to go after big data. Do you agree?

Michael Maxey: We certainly agree with that and believe that. I think the other thing to keep in mind is another advantage I see: EMC and Greenplum don't have legacy. We don't have a 30-year-old database or 20-year-old database that we're trying to stretch into a new place. We've architected our database from the ground up over the last eight years, and it’s built for this big data market. And there's some freedom in being a vendor that doesn't have 20 years of legacy that I think others are challenged with. I think we're well positioned. EMC is doing some very big things, and you'll continue to see that as we move forward.

Mike, it's been over a year since EMC acquired Greenplum. How do you feel the effort and the integration is going?

Michael Maxey: It has been fantastic. I can tell you there’s been a ton of growth. We're over 500 people now. We were about 150 at the time of acquisition, so we more than tripled in a year. As you can imagine, a lot of that's going into engineering. And it's just an example of the investment EMC has put into Greenplum. EMC has a good history of acquisitions. VMware is the shining star, but even beyond that they're very good at letting the startups or smaller companies that they acquired continue to run their course rather than change things dramatically. And I've witnessed that from the inside and the outside, and it's been great. We couldn't be happier with it.

Well, that's good to hear. Thank you for taking the time to talk with me today about big data and the capabilities of EMC Greenplum.

  • Ron Powell
    Ron, an independent analyst and consultant, has an extensive technology background in business intelligence, analytics and data warehousing. In 2005, Ron founded the BeyeNETWORK, which was acquired by TechTarget in 2010. Prior to the founding of the BeyeNETWORK, Ron was cofounder, publisher and editorial director of DM Review (now Information Management). Ron also has a wealth of consulting expertise in business intelligence, business management and marketing. He may be contacted by email at rpowell@wi.rr.com.

    More articles and Ron's blog can be found in his BeyeNETWORK expert channel. Be sure to visit today!
