Why Big Data Requires Advanced Analytics: A Q&A with Richard Winter of WinterCorp

Originally published December 2, 2014

This BeyeNETWORK article features Ron Powell's interview with Richard Winter, CEO and founder of WinterCorp. Richard and Ron discuss the effect of big data on analytics, the technologies that enable advanced analytics and how companies are benefiting from big data analytics.

One of the major aspects of big data is variety. It's not talked about a lot, but what are the implications of variety with analytics?

Richard Winter: Variety is a very interesting subject. I think it was one of the key issues that drove the whole development of big data. Users today have many kinds of information in their analytic data environment: not just tabular, structured data, but also image data, textual data and graph-structured data. All of these different types of data call for different storage structures and different analytical techniques. The first generation of big data solutions and platforms took into account that you needed to accommodate this variety. But you could say it stopped there. As analytic work has progressed, it has become increasingly clear that answering any important business question often involves multiple steps, and the various steps may involve different types of data and different analytic techniques.

An example of this is if you're trying to do a promotion in a consumer retail business such as a telecommunications business and you want to get the maximum return on your promotional dollar. You might ask the question, "Who among our customers are both the most happy customers and the most influential customers?" That's really one question. If you can promote to them and you could spend all of your promotional dollars on them, then your dollars would actually be multiplied because the happy customers would buy your new product and then they'd tell their friends about it. So they're, in effect, doing your promotion for you. So to answer this question, the happy part has to do with text analytics. The influential part has to do with graph analytics. These are different structures, and they require literally different engines to do a good job at scale.
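To make the two halves of that question concrete, here is a minimal Python sketch. It is a toy illustration only, not how any production engine works: the customer names, comments, word lists, thresholds and function names are all hypothetical. The "happy" part is approximated with a crude word-count sentiment score (a stand-in for text analytics), and the "influential" part with a count of direct connections (a stand-in for graph analytics).

```python
from collections import defaultdict

# Hypothetical toy data: customer comments and "who talks to whom" edges.
comments = {
    "ana": "love the new plan, great service",
    "bo": "terrible coverage, bad support",
    "cy": "great phone, happy with the price",
    "di": "happy overall",
}
edges = [("ana", "bo"), ("ana", "cy"), ("ana", "di"), ("cy", "di")]

# Hypothetical word lists for a crude lexicon-based sentiment score.
POSITIVE = {"love", "great", "happy"}
NEGATIVE = {"terrible", "bad"}


def sentiment(text):
    """Text step: count positive words minus negative words."""
    words = text.replace(",", " ").split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def degree(edge_list):
    """Graph step: number of direct connections per customer."""
    counts = defaultdict(int)
    for a, b in edge_list:
        counts[a] += 1
        counts[b] += 1
    return counts


def happy_and_influential(comments, edges, min_sentiment=1, min_degree=2):
    """Combine both steps: keep customers who pass both thresholds."""
    deg = degree(edges)
    return sorted(
        name
        for name, text in comments.items()
        if sentiment(text) >= min_sentiment and deg[name] >= min_degree
    )


targets = happy_and_influential(comments, edges)
print(targets)
```

Even in this toy form, the two steps work on differently shaped data (free text versus a list of edges), which is the point being made about needing different engines at scale.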

Is that why we're starting to see the development of new architectures to support this type of analytics?


Richard Winter: Well, that is exactly right. In fact, what has happened in the last year is two architectures along these lines have emerged. Apache Hadoop in version 2 has an architecture called YARN. On the Hadoop platform, YARN enables running and using multiple different engines for different analytic challenges.

The first generation of Hadoop had MapReduce as the only application programmer interface. Everything had to be done with MapReduce. Well, MapReduce is very versatile, but that doesn't mean it's optimal for the huge variety of structures and analytics that you encounter in big data. Apache YARN lets you run not only MapReduce, but also Giraph, which is a graph engine, and a variety of other engines for other specialized needs alongside MapReduce.

SNAP is the new architecture of the Teradata Aster Discovery Platform. It's also a good data platform. Whereas in the past it had one interface for users, it now has SQL, MapReduce, a graph engine, an R engine, and the ability to provision for more engines. So it's also offering this multi-engine approach.

Those are both reactions to this trend.

If you look at SNAP, it is taking advantage of efficiencies for those specific types of data.

Richard Winter: It is, as is YARN, but it's doing something more, which is actually very interesting. It has all these different engines at the top level, but in the middle it has an optimization engine that will optimize across all these different types of analytics. It also has an execution engine that will execute the plans developed by the optimizer. And, it has other components in the middle that work across the whole spectrum. What that means is that when you ask one of these questions that requires a multi-step answer (such as who are the happy and influential customers), with Aster SNAP you can write that question in one SQL query. And then SNAP will optimize the performance of that entire analytic process. That means it will decide the best order in which to do these analytic steps. And it will also optimize within a step. The significance of this is being able to ask this remarkably complex question with one SQL statement. SQL is a higher level, non-procedural language so you don't have to tell the system how to navigate to the data, in what order to do the joins or which kind of join is most efficient. All of those questions are automatically resolved by the system. Using SQL, you just tell it what you want and it figures out the best procedure for doing it. And that's like YARN in the kind of question that you can address with it, but it's different in that the degree of automation is so much higher.
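The non-procedural nature of SQL can be illustrated with a small Python sketch using SQLite from the standard library. This is not Aster SNAP and does not show its syntax or its cross-engine optimizer; the table names, columns and scores are hypothetical and assumed to have been computed beforehand. The sketch only shows the general idea that the query states what is wanted, while the database chooses how to carry out the join.

```python
import sqlite3

# In-memory database with two hypothetical, precomputed result tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sentiment (customer TEXT PRIMARY KEY, score INTEGER);
    CREATE TABLE influence (customer TEXT PRIMARY KEY, connections INTEGER);
    INSERT INTO sentiment VALUES ('ana', 2), ('bo', -2), ('cy', 2), ('di', 1);
    INSERT INTO influence VALUES ('ana', 3), ('bo', 1), ('cy', 2), ('di', 2);
""")

# One declarative statement: no join order or join algorithm is specified.
query = """
    SELECT s.customer
    FROM sentiment AS s
    JOIN influence AS i ON i.customer = s.customer
    WHERE s.score >= 2 AND i.connections >= 2
    ORDER BY s.customer
"""
rows = [r[0] for r in conn.execute(query)]
print(rows)

# The engine, not the query author, picks the access path.
# The plan text differs between SQLite versions, so it is only printed here.
for step in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(step)
```

The query author never says which table to scan first or which index to use; that decision appears only in the plan the engine reports.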

You could almost say this is advancing analytics through automation.

Richard Winter: That's right.

What are you hearing from customers on this?

Richard Winter: At this point, it's the analytic leaders who are pushing the boundaries. Where we're seeing the results is where there is a business strategy that leverages the analytics. An old example of this is when Walmart advanced to neighborhood merchandising. That's a business concept, and that business concept led them to create new analytic capabilities so that they could optimize the inventory of each store for the neighborhood it was in. At the time they did it, they had 2,000 stores in the United States. Once they implemented the concept, they could say that no two stores had the same inventory. And they said that if they had the same inventory in every store, they wouldn't be able to compete in today's market. That kind of concept has its analog today in many industries and markets. There is an analytic business strategy that is driving things forward: the kind of strategy where the key to doing what you want to do is to leverage something such as how customers influence each other, which is not something you could easily have known 10 or 15 years ago. Those strategic concepts in some cases require advanced analytics. Now there are built-in analytics to do that fairly easily.

You gave us a retail example, which could actually be horizontal across many industries. Do you see other industries really taking advantage of analytics as well?

Richard Winter: I think the big new area is use of advanced analytics in the industrial setting. This is something we didn't hear about as much five or ten years ago. You have concepts like the GE Industrial Internet, where there are maybe hundreds of thousands of large industrial turbines operating around the world in factories, jet engines, trains and so on that generate data at an almost unthinkable rate, with terabytes of this data pouring in every day. This data enables companies to better manage their operations, better engineer their products, optimize energy consumption and replace things before they fail. All these kinds of things are coming out of the industrial part of harnessing data. So it's about being able to deal with those enormous volumes of data, most of which is not interesting. Most of the time it's telling you everything is okay, so in this enormous volume of data you have to pick out the bits that are interesting and then use them for analysis and prediction.
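As a rough illustration of picking out the interesting bits from a stream that mostly says everything is okay, the Python sketch below keeps only readings that sit far from the average. It is one simple, assumed approach (a z-score filter) with made-up temperature values; the function name and threshold are hypothetical, and real industrial monitoring would likely use more robust methods.

```python
from statistics import mean, stdev


def interesting_readings(readings, threshold=2.0):
    """Return (index, value) pairs more than `threshold` sample standard
    deviations away from the mean of the batch."""
    mu = mean(readings)
    sigma = stdev(readings)
    if sigma == 0:
        # Every reading is identical, so nothing stands out.
        return []
    return [
        (i, x)
        for i, x in enumerate(readings)
        if abs(x - mu) / sigma > threshold
    ]


# Hypothetical sensor batch: nine normal readings and one spike.
temps = [70.1, 69.8, 70.3, 70.0, 69.9, 70.2, 95.0, 70.1, 69.7, 70.0]
flagged = interesting_readings(temps)
print(flagged)
```

Only the flagged readings would then be passed on for closer analysis or prediction, which keeps the downstream work small relative to the raw volume.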

Richard, if someone was really interested in learning more about advanced analytics, where would you point them?

Richard Winter: I'd tell them to visit WinterCorp.com and get in touch with me if they'd like to talk.

Do you have a specific white paper you have done?

Richard Winter: On this newest topic that we discussed in this interview, I've written a report on SNAP and YARN, two architectures for big data, and that's available through Teradata.

Richard, thank you for sharing your expertise on big data and advanced analytics.

  • Ron Powell
    Ron is an independent analyst, consultant and editorial expert with extensive knowledge and experience in business intelligence, big data, analytics and data warehousing. Currently president of Powell Interactive Media, which specializes in consulting and podcast services, he is also Executive Producer of The World Transformed Fast Forward series. In 2004, Ron founded the BeyeNETWORK, which was acquired by TechTarget in 2010. Prior to the founding of the BeyeNETWORK, Ron was cofounder, publisher and editorial director of DM Review (now Information Management). He maintains an expert channel and blog on the BeyeNETWORK and may be contacted by email at rpowell@powellinteractivemedia.com.

    More articles and Ron's blog can be found in his BeyeNETWORK expert channel.
