Big Data Discovery: A Spotlight Q&A with Manan Goel of Teradata Aster

Originally published October 8, 2013

This BeyeNETWORK spotlight features Ron Powell's interview with Manan Goel, Senior Director of Product Marketing at Teradata Aster. They discuss the next-generation discovery platform architecture of Teradata Aster 6, the only fully integrated and complete solution available for big data analytics.

Teradata recently announced the Aster Discovery Platform 6. Could you give us a high-level summary of what is included?

Manan Goel: Sure. We announced the launch of Aster Discovery Platform 6, and this is the next-generation discovery platform. It’s a platform that has been purpose-built for big data analytics and discovery. It’s the first platform of its kind in the industry to bring these big data analytics and discovery capabilities to market. It primarily introduces capabilities in three major areas. The first major area is the SNAP™ framework for Discovery, which includes processing, optimization and storage innovations to truly deliver integrated, blended analytics. The second area is a new file store called Aster File Store™ that allows our customers to quickly ingest and preprocess big data. The third area is a new analytic engine for graph analysis called SQL-GR™. Aster SQL-GR™ is a native graph processing engine that makes it easy to perform powerful graph analysis, especially complex graph analysis across big data sets.

Can we dig into each of these areas and discuss their capabilities separately? When you’re referring to new processing, optimization and storage innovations, what do you mean by that?

Manan Goel: Ron, what we are hearing from the customers around big data analytics and discovery is that they want to blend multiple, diverse kinds of data together, and also analyze that data using different blended analytics tools – kind of enabling multi-genre analytics, as we call it. That’s what our new processing, optimization and storage innovations do. Aster 6 introduces the SNAP™ framework for Discovery, which enables multiple processing engines like graph, MapReduce and SQL, and multiple data stores like row, column and file stores, to work together through a single user interface, leveraging the integrated optimization, execution and storage capabilities. Essentially, from an end user’s perspective, they can do analytics across these multiple styles of analysis in a truly blended fashion with a single user interface. That’s our key differentiator. These optimization, processing and storage innovations deliver true integration across these analytics engines and data stores, as opposed to just having them co-exist.

So when you say multiple analytic genres, what do you mean?

Manan Goel: Let me give you a use case. Let’s talk about telco churn analysis. The telco industry – for example, cell phone providers – is a pretty competitive industry. It has over a 95% saturation rate. Everybody has a cell phone now. For telcos, retaining their customers is very important. In telcos, any time a key subscriber churns, anybody who is connected to him or her is 6 times more likely to churn. So if you are a telco analyst trying to manage customer churn, you want to be able to do analytics that combine social network analysis that allows you to create social graphs for your subscribers. You also want to bring in their messaging statistics to see if they’re key or not. And then if they are key subscribers, you want to see how their feeling is toward your services or product offerings and whether or not they are on a path to churn. So this analysis includes graph analysis, text sentiment analysis, social media analysis and statistical analysis. It has been really difficult to do this type of analysis with today’s tools. Aster Discovery Platform 6 truly makes that kind of analysis really easy, fast and efficient to do. From an analyst perspective, from a single interface you’d be able to write a query, which includes all of these types of analytics, and transparently the platform figures out how to most efficiently execute this and deliver the results as fast as possible. It’s as simple as that.

You also mentioned a new integrated storage system and services. What do you mean by that?

Manan Goel: I talked about the notion of diverse data sets. So what we keep hearing from customers is they also want the discovery platform to be able to quickly bring in these different data sets, retain fidelity, retain details, and not do a lot of schema work up front. What we’re doing with Aster 6 is introducing a new object-based storage system and common storage services. This common storage system and services provide key storage capabilities like security, snapshots, replication and colocation that will allow us to quickly build type-specific data stores on top of this system. In Aster 6 as a first launch, we’re using this storage system to launch a new file store – Aster File Store™. Aster File Store™ will empower our customers to quickly ingest and preprocess multi-structured data within the Discovery Platform. Our direction going forward will be to invest more and more in these type-specific stores. We are currently looking at use cases that customers give us around email or JSON stores and investing in those. That’s where we’re going, and it will allow our customers to quickly ingest data without losing a lot of fidelity, without doing a lot of schema work up front, and analyze this data quickly through the multiple engines that we provide.

You also mentioned analytic engines. What do you mean by analytic engines and how does the integration work?

Manan Goel: In Aster 6 we have introduced a new graph analytic engine called SQL-GR™. So the Aster Platform had two engines in the past – it had a SQL engine and a SQL MapReduce engine for doing workload-specific queries. Now we’ve added one more engine to the platform. We’ve added a brand new graph engine that will really enable large-scale graph analytic use cases, all through one single SQL interface – really making it easy for analysts. They don’t have to know all the graph ins and outs. They’ll still be able to write all the graph queries and get details on that. We are using a processing framework called BSP for efficiently processing the graph queries. BSP stands for bulk synchronous parallel. The SQL-GR™ engine is in there to handle large-scale, massive, iterative graph analytics use cases. We believe this BSP-based technology will help our customers run large-scale, iterative graph analytics workloads, and deliver the results, much more efficiently. A lot of our competitors are throwing hardware at the problem. They’re putting in more memory and CPUs, whereas we believe by introducing a fundamentally new processing framework, we can deliver a lot more value to our customers in terms of how those workloads are processed.
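For readers unfamiliar with the bulk synchronous parallel model mentioned above, the following is a minimal single-machine sketch in Python. It is not SQL-GR™ code and does not reflect Aster's API or internals; the function name `bsp_connected_components` and the choice of algorithm (labeling connected components by propagating the smallest vertex id) are assumptions made purely for illustration. It shows the defining BSP pattern: in each superstep every vertex reads only the messages sent in the previous superstep, does local work, sends new messages, and then all vertices wait at a barrier.

```python
from collections import defaultdict


def bsp_connected_components(edges):
    """Illustrative BSP-style computation (hypothetical helper, not an Aster API).

    Labels every vertex with the smallest vertex id in its connected
    component and returns (labels, number_of_supersteps).
    """
    # Build an undirected adjacency list.
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Each vertex starts out labeled with its own id.
    labels = {v: v for v in neighbors}

    # Initial messages: every vertex sends its label to its neighbors.
    inbox = defaultdict(list)
    for v, adjacent in neighbors.items():
        for n in adjacent:
            inbox[n].append(labels[v])

    supersteps = 0
    while inbox:  # stop when no vertex has anything left to say
        supersteps += 1
        outbox = defaultdict(list)
        # Compute phase: a vertex sees only messages from the previous superstep.
        for v, messages in inbox.items():
            smallest = min(messages)
            if smallest < labels[v]:
                labels[v] = smallest
                # Communication phase: tell neighbors about the improvement.
                for n in neighbors[v]:
                    outbox[n].append(smallest)
        # Barrier: new messages become visible only in the next superstep.
        inbox = outbox
    return labels, supersteps


if __name__ == "__main__":
    labels, steps = bsp_connected_components([(1, 2), (2, 3), (4, 5)])
    print(labels, steps)
```

In a distributed engine the per-vertex loop would run in parallel across many workers, with the barrier keeping them in step; here it is sequential so the structure is easy to follow.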

BSP – bulk synchronous parallel – when you’re talking about bulk, what size are we looking at?

Manan Goel: This is massive.

Petabytes?

Manan Goel: Yes, petabytes – think of the telco use case. You could be looking at millions and millions of your subscribers. You might be creating their social graph. Think of product recommendations in retail. A retailer may have thousands of products and millions of transactions. So we’re talking millions and millions of customers and their connections, their second-level connections and their third-level connections. Building all those complex social graphs. Putting more analytic techniques on top of it like text/sentiment, path/pattern and getting much richer insights to truly deliver much higher business value.
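To make the idea of first-, second- and third-level connections concrete, here is a small Python sketch under stated assumptions. The function `connections_by_level`, the sample names and the edge list are all hypothetical and are not drawn from Aster or any customer data; the sketch uses a plain breadth-first traversal on one machine, whereas a platform working at the scale described would distribute this work.

```python
from collections import defaultdict


def connections_by_level(edges, subscriber, max_level=3):
    """Group a subscriber's connections by distance (hypothetical helper).

    Returns a dict mapping level -> set of people first reached at that
    level, e.g. level 1 = direct contacts, level 2 = contacts of contacts.
    """
    # Build an undirected adjacency list from (person, person) pairs.
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = {subscriber}
    frontier = {subscriber}
    levels = {}
    for level in range(1, max_level + 1):
        # Everyone adjacent to the current frontier that we have not met yet.
        reached = set()
        for person in frontier:
            reached |= neighbors.get(person, set()) - seen
        if not reached:
            break  # the graph has no connections further out
        levels[level] = reached
        seen |= reached
        frontier = reached
    return levels


if __name__ == "__main__":
    calls = [("ann", "bob"), ("bob", "cy"), ("cy", "dee"), ("dee", "eve")]
    print(connections_by_level(calls, "ann"))
```

Each person appears only at the level where they are first reached, so a contact who is both a direct and an indirect connection is counted once, at level 1.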

As we look at big data, what are you doing around Hadoop integration?

Manan Goel: The file system that we are introducing is completely binary-compatible with the Hadoop file system. So all the third-party applications that support HDFS will seamlessly support the Aster file system. We completely support the HDFS APIs.

To me it would seem that if I am a Teradata customer, it would be obvious that I would do the discovery phase within Aster 6 from the big data perspective, and then I would take that data and operationalize it with the existing Teradata architecture and applications. Is that what’s happening?

Manan Goel: Absolutely right. We go to market with the three key technology stacks – Teradata, Aster and Hadoop – as a unified data architecture (UDA). In the UDA, each system is designed to address specific workloads. Aster is primarily designed for doing discovery analytics on big data, finding interesting trends in data. And once you have found those, you can operationalize them on a Teradata data warehouse. Hadoop is designed primarily as a data lake or a large-scale data archival platform. We have customers today who use both Aster and Teradata together, and they’re finding insights in Aster around customer journey and customer path, and they operationalize them using a Teradata integrated data warehouse. A customer path to churn can be operationalized with the customer lifetime value in a Teradata data warehouse, and that can help organizations drive their operational decisions.

I’ve seen Hadoop identified in many ways, but this is the first time I’ve heard it described as a data lake. That’s pretty unique. If I’m not already a Teradata customer, why would I want to start my analytics and discovery efforts with Aster 6?

Manan Goel: Aster 6 is designed, as I said before, from the ground up for doing discovery. In Aster 6, we have made a lot of investments around having a single, high-level query language for discovery across these multiple execution engines. So SQL is that language through which you can write a graph query, a statistical query, a text sentiment analysis or a path pattern analysis query, as opposed to other technologies in the market where the interfaces are different and the management frameworks are different. You express queries differently. So Aster 6 truly makes it easy for customers to do discovery – not only bringing these different data types together but also providing a single, common interface for writing queries across multiple execution engines. The platform has integrated processing and optimizations that make it seamless and transparent to run these queries most efficiently, and it delivers the results in the easiest, fastest and most intelligent way.

I know you just recently launched Aster 6. Have you received any significant customer feedback yet?

Manan Goel: Yes, it has been pretty exciting for us. We’ve been sharing the news with industry analysts, market makers and customers. The customers are really excited about it and the use cases that we’re talking about. The customers are taking us in new directions and they’re really excited about it, and we truly believe that Aster 6 brings innovative capabilities to the market. As time goes by, we’ll be happy to talk to you more about the use cases and the implementations that we’ve seen around it. But for now, it’s very exciting. It’s too early in the launch right now, but already we have seen a lot of excitement and interest from our customer base.

It’s been great talking with you, Manan. I look forward to speaking with you again in 6 months.

  • Ron Powell
    Ron, an independent analyst and consultant, has an extensive technology background in business intelligence, analytics and data warehousing. In 2005, Ron founded the BeyeNETWORK, which was acquired by TechTarget in 2010. Prior to the founding of the BeyeNETWORK, Ron was cofounder, publisher and editorial director of DM Review (now Information Management). Ron also has a wealth of consulting expertise in business intelligence, business management and marketing. He may be contacted by email at rpowell@wi.rr.com.