High-Performance Data Movement Across All Platforms: A Q&A with Barry Thompson of Tervela

Originally published January 16, 2013

This BeyeNETWORK spotlight features Ron Powell's interview with Barry Thompson, Founder and Chief Technology Officer of Tervela. Ron and Barry talk about Tervela’s entrance into the big data and data warehousing markets.

Barry, for those in our audience who may not be familiar with Tervela, can you give us a brief overview of the company?

Barry Thompson: Tervela is a high-performance data fabric company. We founded it as a research project in 2004 and started building the company out in 2005. Our original focus was on high-performance messaging, and then we later focused on high-performance data movement across all the platforms, whether it be for big data, data warehousing, file movement, etc.

You recently announced Tervela Turbo, your first product directed at the big data and data warehousing market, which is a major focus for our audience. Can you tell us your value proposition for that segment of the market?

Barry Thompson: Tervela Turbo is a core foundational big data movement technology for transactions and high-performance guaranteed data delivery. It extends those technologies, which today are responsible for over four-and-a-half trillion dollars of transactional volume a day, into the data warehousing and big data market with adapters and frameworks that allow it to talk to Oracle, Teradata, and big data solutions like Hadoop.

From an operational perspective, real-time applications are becoming mission-critical. In fact, everything seems to be moving toward real-time. Can you explain the key benefits that Tervela Turbo provides for real-time analytics?

Barry Thompson: The Tervela Turbo infrastructure treats all data – whether it's batch data, non-real-time data, or true real-time data – as real-time data streams. So when the data arrives in the Tervela infrastructure, it's instantly available to any consuming application, whether it's true real-time analytics or batch analytics, simultaneously. A lot of people are actually using Tervela as a gold source of information for their ETL engine, their Hadoop infrastructure, their data warehouse infrastructure to augment the lifecycle, or their true real-time analytics. It allows institutions to provide a single source of data and consume that data in multiple destinations, so you can have augmented real-time analytics to supplement your real-time worldview as well as feed your historical and/or data warehouse environments in batch mode from that single source.
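
To make that single-source, multi-consumer idea concrete, here is a minimal sketch – not Tervela's API, and with purely illustrative names – of a data fabric that fans every published record out to both a real-time consumer and a batch-oriented warehouse loader:

```python
# A minimal, hypothetical sketch of a fan-out data fabric: every published
# record is delivered to all consumers, whether they process it in real time
# or accumulate it for batch loading. Illustrative only, not Tervela's API.
import queue
import threading
import time

class DataFabric:
    def __init__(self):
        self._consumers = []

    def subscribe(self, name):
        q = queue.Queue()
        self._consumers.append((name, q))
        return q

    def publish(self, record):
        for _, q in self._consumers:   # each consumer gets its own copy
            q.put(record)

fabric = DataFabric()
realtime_q = fabric.subscribe("realtime-analytics")
warehouse_q = fabric.subscribe("warehouse-loader")

def realtime_consumer():
    while True:
        print("realtime view:", realtime_q.get())   # e.g., update a dashboard

def batch_consumer():
    batch = []
    while True:
        batch.append(warehouse_q.get())
        if len(batch) >= 1000:                      # flush in bulk to the warehouse
            print("bulk loading", len(batch), "records")
            batch.clear()

threading.Thread(target=realtime_consumer, daemon=True).start()
threading.Thread(target=batch_consumer, daemon=True).start()
fabric.publish({"symbol": "ABC", "price": 101.25})
time.sleep(0.2)   # give the daemon consumers a moment before the demo exits
```

In a real deployment the fabric would, of course, persist and distribute the stream across machines; the point of the sketch is only that a single published stream serves real-time and batch consumers simultaneously.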

Lately we have been seeing tremendous interest in Hadoop and other big data systems. How does Tervela Turbo integrate data from these big data sources?

Barry Thompson: Well, Tervela can effectively be a sandwich around Hadoop. On one side, we can take data and feed Hadoop, providing an enterprise layer of resiliency around it in terms of getting data to one or more instances of Hadoop in disaster recovery situations, etc. But on the other side of Hadoop, we can use real-time analytics in the data fabric itself to trigger when batches are run in Hadoop, to get data out of Hadoop, as well as take the data from Hadoop and do high-performance loading into your data warehouse or data analytics infrastructure – again, whether that is the Teradatas and Oracles of the world or providing batch delivery into ETL or other analytic solutions.

Hadoop has been a struggle for many in our audience because it requires a certain level of talent. How do you help companies integrate data from big data sources like Hadoop?

Barry Thompson: The big challenges with Hadoop are twofold. One is how do I deploy Hadoop, get data into Hadoop and manage it in a way that checks the boxes my enterprise puts on technology adoption – enterprise resiliency, disaster recovery, mission-critical applications, etc. From a data delivery perspective, Tervela ensures the data gets into Hadoop. We address some of the resiliency challenges around the NameNode and the scalability of Hadoop, and ensure that you can check the box under disaster recovery for asymmetric Hadoop clusters.

In addition, for getting data out of Hadoop, there are a lot of tools today that allow you to design batches and run them in Hadoop, but once a batch has run, that data needs to get into your destination system. So we'll get that data out of Hadoop and deliver it into whatever data architecture you want to deal with – the data warehouses and other analytic solutions. But more importantly, a lot of people are using Hadoop to augment their existing ETL infrastructure. Whether it's ETLT or some other model, the existing infrastructure may run the full lifecycle that loads the initial image of the warehouse, and then Hadoop provides trigger-based batches that drive augmentation of that data. I'll give you an example.

Let's say I'm doing interest rate sensitivities in my Teradata implementation for risk analysis on Wall Street, and I need to update those sensitivities every time the interest rates move a tenth of a basis point. When that happens, it will actually trigger a batch in Hadoop, pick up the updated data and update those tables directly in Teradata in real time. As a result, in the world you see from Teradata, although it may have an ETL-based backbone for the initial images, the modifications to that data can be overlaid in real time from Hadoop batches.
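
As a rough sketch of that trigger flow – with stub functions standing in for the Hadoop batch and the Teradata load, since the actual Tervela, Hadoop, and Teradata integration is not shown here – the logic might look like this:

```python
# Hypothetical illustration of the trigger flow described above. The two stub
# functions are placeholders for the real Hadoop job submission and Teradata load.
THRESHOLD = 0.001  # a tenth of a basis point, expressed in percentage points

def run_hadoop_batch(rate):
    """Stand-in for submitting the sensitivity-recalculation batch to Hadoop."""
    return [{"instrument": "BOND-1", "sensitivity": rate * 1.7}]

def update_teradata(rows):
    """Stand-in for the high-performance load that overlays the updated rows."""
    print("updating", len(rows), "sensitivity rows in Teradata")

def on_rate_tick(baseline, new_rate):
    """Trigger a batch whenever the rate has moved at least a tenth of a basis point."""
    if abs(new_rate - baseline) >= THRESHOLD:
        update_teradata(run_hadoop_batch(new_rate))
        return new_rate          # the triggered rate becomes the new baseline
    return baseline

baseline = 3.2500
for tick in (3.2504, 3.2511, 3.2512):   # only the second tick crosses the threshold
    baseline = on_rate_tick(baseline, tick)
```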

You mentioned ETLT. Our audience is very familiar with ETL and ELT, but what is ETLT?

Barry Thompson: There is actually a whole alphabet soup of new models that are coming out. But ETLT is really the construct of taking data from your enterprise and loading it into an intermediary raw data warehouse or lightly augmented data warehouse, something like a Hadoop data repository. Then from there, as the new data analytics come online, your data miners will actually go in and run against the raw data and generate the reports or the data that’s then going to go into the end data warehousing environment. So what you end up with is two layers of transformation – one transformation of data coming out of Hadoop and then another transformation as that data is pushed into your warehouse and is applied to the tables within the actual Teradata or other environment.
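
A minimal sketch of that two-transformation flow – with in-memory lists as hypothetical stand-ins for the raw Hadoop landing zone and the warehouse tables – might look like this:

```python
# Illustrative ETLT sketch: extract and land raw data, then two transformation
# layers – one coming out of Hadoop, one as the data is applied to the warehouse.
raw_landing_zone = []   # raw / lightly augmented store (e.g., a Hadoop repository)
warehouse = []          # end data warehouse tables (e.g., Teradata)

def extract(source_rows):
    return list(source_rows)

def transform_out_of_hadoop(rows):
    # First transformation: data miners run against the raw data and
    # generate the result set destined for the warehouse.
    return [r for r in rows if float(r["amount"]) > 100.0]

def transform_into_warehouse(rows):
    # Second transformation: shape the result as it is applied to the tables.
    return [{"id": r["id"], "amount": float(r["amount"])} for r in rows]

source = [{"id": 1, "amount": "250.00"}, {"id": 2, "amount": "75.00"}]
raw_landing_zone.extend(extract(source))                   # E, then L into the raw store
warehouse.extend(transform_into_warehouse(
    transform_out_of_hadoop(raw_landing_zone)))            # T out of Hadoop, T into the warehouse
print(warehouse)   # [{'id': 1, 'amount': 250.0}]
```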

What is really interesting about where that model is used is that there are some basic questions. How often do your data miners go back to the source? If the answer is more than two or three times a month, all of a sudden it starts to look really attractive from a total cost and flexibility perspective to move from your ETL or ELT model to an ETLT model. The second question is how often do you change or add new ETL jobs? If you're changing or adding ETL jobs at a rate of more than two or three a month, then all of a sudden this ETLT model starts to look very attractive from a cost-benefit perspective and from a responsiveness standpoint – meaning the time from when the business wants something to when they've gotten what they need in the data warehousing world. That response time shrinks drastically by moving to this ETLT model.

Companies today, especially the large enterprises that you work with like financial institutions, have very well-defined information architectures. From an implementation perspective, does Tervela Turbo require major changes to an existing architecture?

Barry Thompson: It does not. Actually, Tervela Turbo will sit alongside your existing ecosystem. It will coexist with it and then augment it with new functionality. One of the things, in an entirely different use case, is when someone wants all of the data coming out of ETL to be drop-copied into, let's say, Hadoop. We'll actually intercept the load process and dual-feed both into Teradata, for example, and into Hadoop from that single job coming out of ETL. We'll also do change data capture out of existing solutions like a Teradata or an Oracle and load Hadoop directly as well. So our architecture is designed on the loading side for intercepting data to fit seamlessly into existing architectures so you don't have to change them, and also to deliver this additional functionality in a very rapid time frame with a very low barrier to entry in terms of human capital.
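
That dual-feed interception can be pictured as a single ETL output stream being drop-copied to every destination. The writer functions below are hypothetical placeholders rather than vendor or Tervela APIs:

```python
# Illustrative sketch of dual feeding: one stream coming out of ETL is
# intercepted and written to both destinations. Writers are placeholders.
def write_to_teradata(record):
    print("teradata <-", record)

def write_to_hadoop(record):
    print("hadoop   <-", record)

DESTINATIONS = (write_to_teradata, write_to_hadoop)

def intercept_load(records):
    """Drop-copy every record from the single ETL job to all destinations."""
    for record in records:
        for write in DESTINATIONS:
            write(record)

intercept_load([{"order_id": 42, "status": "filled"}])
```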

Can you give us some examples of how your customers are benefiting from Tervela Turbo?

Barry Thompson: I have kind of hinted at a couple of those. The first is the ETLT model. That's providing customers with much faster business responsiveness – meaning the time from when a business wants a new model to when they can actually see it – and a much lower total cost around onboarding new analytics and new data areas to be analyzed. It provides a much more cost-effective solution.

The second area is augmentation of real-time data – meaning your ETL infrastructure is great for your batch data, but you need to either push real-time data into your existing analytics, like a Teradata that's getting augmented during the day (where real-time data is part of the solution), or take that data and merge it with real-time data as part of a hybrid data mart or some sort of real-time analytic solution, mixing both the static and batch information with real-time information in a unified solution.
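
Conceptually, that augmentation is just an overlay of real-time deltas on a batch-loaded baseline; the tiny sketch below uses made-up position data to show the idea:

```python
# Hypothetical example: a batch-loaded snapshot augmented by intraday updates.
positions = {"ACME": 1_000, "GLOBEX": 500}      # baseline loaded by the batch/ETL path

def apply_realtime_update(symbol, delta):
    """Overlay an intraday change on top of the batch snapshot."""
    positions[symbol] = positions.get(symbol, 0) + delta

apply_realtime_update("ACME", 250)              # real-time feed augments the view
print(positions["ACME"])                        # 1250: batch baseline plus real-time delta
```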

I think the last area will be around the “enterprise-ification,” if I can use that word, of Hadoop – meaning checking the disaster recovery box and checking the guaranteed data delivery box so it can be used for more than just weblogs. You can start using it for transactional data and other data and ensure that the data got there.

Well, you mentioned disaster recovery. Over the past few years, we have witnessed many natural disasters, such as the recent Hurricane Sandy on the East Coast. To be prepared for events like that hurricane, companies have to be sure their data and applications are protected from loss. How can Tervela Turbo help in those situations?

Barry Thompson: Tervela Turbo helps in a couple of ways. At the highest level, once data is placed on Tervela, it's guaranteed to be available in multiple locations. We handle the underlying infrastructure that ensures data is available in multiple locations and, more importantly, in a high-performance way. That means within half a millisecond of network transmission time, it's persisted in multiple locations. So from a data assurance perspective, that starts to check the box.

And then whether you're in a dual-load situation where you're loading multiple data repositories from a single update, a hot-hot scenario where you're updating data in two databases or two locations and it needs to be available in both, or a change data capture scenario where updates in one database need to be made available in another – all of those are kind of the same from Tervela's technology underpinnings, but they're very different use cases. Let's say we're loading Hadoop. We can guarantee data is persisted in two data centers very, very rapidly, and confirm that the writes into Hadoop succeeded so that your data really is persisted in the underlying data repository. The same is true of enterprise warehouses or change data capture and data replication. We will do that much faster than traditional solutions to ensure there's less risk of data loss, and move a lot more data more cost-effectively than other solutions.
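
The guarantee being described – only confirming a write after it has been persisted in more than one location – can be sketched as follows, with in-memory lists standing in for the data centers (again, this is illustrative, not Tervela's implementation):

```python
# Simplified sketch: a write is acknowledged only once every replica location
# has persisted it. The "locations" are lists standing in for data centers.
class ReplicatedLog:
    def __init__(self, n_locations=2):
        self.locations = [[] for _ in range(n_locations)]

    def write(self, record):
        acks = 0
        for store in self.locations:
            store.append(record)              # persist a copy in each location
            acks += 1
        return acks == len(self.locations)    # confirm only after all copies exist

log = ReplicatedLog()
assert log.write({"event": "trade", "qty": 100})   # persisted in both data centers
```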

Barry, did you get any feedback from the financial institutions you mentioned that were supported by Tervela during the recent hurricane?

Barry Thompson: Most of the institutions that we deal with have Tervela deployed for their high-performance transaction infrastructure, and those folks have had disaster recovery and fault tolerance in place for years, built on top of our platform.

What we found though is that it really drove the next level of discussions, specifically around Hadoop. Where Hadoop had been used in one cluster or two clusters for unique data, it really drove the conversation for those who had been putting off operationalizing Hadoop and putting off creating service-level agreements for their end-user constituents. Those teams realized they really needed to have a disaster recovery solution. I think Hurricane Sandy drove those conversations much more aggressively and shortened the time frames to delivery.

On the data warehousing side, it is less about the hurricane driving those conversations. Analytics are moving much more from what I would call the middle and back office to front-office, decision-making, mission-critical applications. With the increased use of that data and the move to real-time or shorter time windows for that data, they're having to check the disaster recovery box because the data is being used more fully throughout the firm.

There is a question I'm sure our audience will have from a competitive perspective. Why haven’t they heard about the use of Tervela in data warehousing and business intelligence?

Barry Thompson: The reason, in most cases, is that although we've been doing big data, we've been doing it in the real-time transaction space. We're doing ten billion events a day. As our customers in those segments saw the requirements scaling up for their existing data warehousing and traditional big data solutions, they've sort of married us up with those too. With Tervela Turbo, we’re announcing our entrance into the market with packaged solutions, instead of just one-off adapters, across a broad swath of technologies.

Barry, thank you for taking the time to introduce Tervela to our audience and to explain Tervela Turbo’s role in data warehousing and business intelligence.

  • Ron Powell
    Ron is an independent analyst, consultant and editorial expert with extensive knowledge and experience in business intelligence, big data, analytics and data warehousing. Currently president of Powell Interactive Media, which specializes in consulting and podcast services, he is also Executive Producer of The World Transformed Fast Forward series. In 2004, Ron founded the BeyeNETWORK, which was acquired by TechTarget in 2010. Prior to the founding of the BeyeNETWORK, Ron was cofounder, publisher and editorial director of DM Review (now Information Management). He maintains an expert channel and blog on the BeyeNETWORK and may be contacted by email at rpowell@powellinteractivemedia.com.

    More articles and Ron's blog can be found in his BeyeNETWORK expert channel.
