Data Virtualization Now Mainstream: A Q&A Spotlight with Suresh Chandrasekaran of Denodo

Originally published February 20, 2012

BeyeNETWORK Spotlights focus on news, events and products in the business intelligence ecosystem that are poised to have a significant impact on the industry as a whole; on the enterprises that rely on business intelligence, analytics, performance management, data warehousing and/or data governance products to understand and act on the vital information that can be gleaned from their data; or on the providers of these mission-critical products.

Presented as Q&A-style articles, these interviews conducted by the BeyeNETWORK present the behind-the-scenes view that you won’t read in press releases.

This BeyeNETWORK Spotlight features Ron Powell's interview with Suresh Chandrasekaran, Senior Vice President of Denodo. The benefits of data virtualization and its various use cases in today’s enterprises are the topics discussed by Ron and Suresh.

Suresh, data virtualization means many things to many people. Could you explain how Denodo defines data virtualization and how it is used in enterprises today?

Suresh Chandrasekaran: Data virtualization is both a data integration approach and a technology that provides unified data access across disparate data sources to multiple business applications and users, delivered on demand.

I emphasize a few things: unified data access, disparate data, delivered to multiple applications, and on demand. From a business user perspective, they simply have to access this virtual data layer that delivers whatever information they want, the way they want it, and when they want it. It doesn't matter if they access the information directly from their smartphones, through a portal, a web user interface (UI), a business intelligence (BI) tool, or an enterprise application UI. Data virtualization provides more flexible, agile access to information as a service, with the appropriate security and service levels added.

Now, I can also respond from an IT perspective. Data virtualization is definitely becoming a mainstream way of doing data integration to enable faster delivery of data solutions with agility at a lower cost. We are seeing two trends in how data virtualization is used. The first is using data virtualization to address specific project needs for accessing disparate data sources in a more agile, real-time fashion. Think of this as adding another tool to the IT integration toolbox. The second is more architectural adoption of data virtualization to create a unified data services layer in the enterprise. Denodo, given its deep experience, has helped a lot of customers evolve from the first pattern to the second and from simple read-only uses to more complex, read-write enterprise data sharing uses, and that's important to get the most out of the technology.

Suresh, can you please expand on the project level and infrastructure uses for data virtualization and provide some examples of each?

Suresh Chandrasekaran: In the first pattern of use, which is to solve specific project needs, data virtualization is typically replacing a more traditional, more costly way to integrate the data. It could be custom coding; it could be ETL or data replication. There are very specific goals for each project – time to market, access to more sources, fresher or real-time information, etc. Data virtualization is better able to deliver these goals and also makes it easier to respond to changes through the virtual layer. But it is also used to complement other integration tools in the same project.

Agile BI and analytics projects are a great example of this. Biogen Idec started using data virtualization for real-time sales reporting. At Telefonica, they use it for consolidated network monitoring for what's called ITIL [Information Technology Infrastructure Library] performance dashboards. They have all these different networks, phone lines and systems, each being monitored independently, but if they want to analyze causality to determine how the service level is being impacted by different components of the network, those ITIL performance-monitoring dashboards help in real time.

A major U.S. insurance company integrated marketing analytics databases with operational data so they can up-sell and cross-sell insurance policies. In the Harvard hospitals, they have a unit called RMF/CRICO that provides health safety and quality portals, aggregating information from many sources. In all of these use cases, access to real-time data, enabling access to new data sources, and providing flexibility are key drivers.

Denodo also brings additional capabilities to data virtualization, which extend its use to Web, semi-structured and unstructured data, typically in the context of competitive BI, social media and big data integration. For example, Banco Santander, a major bank in Europe, uses data virtualization to automate the extraction of Web data to perform debt tracing. They have a lot of receivables and defaulted debt that they need to collect, and they use Web-based sources, including some social media sources, to gather information about the debtors for their files.

Navteq, which is now a subsidiary of Nokia, uses Denodo to integrate point-of-interest location data, which can be restaurants, gas stations, parks, etc. That data is in both Hadoop and a data warehouse, but they integrate it with the real-time feeds from Facebook, Twitter and Foursquare, which show the activity in those locations. So the user can get a sense of not only the static information about the location, but the dynamic information as well. For example, if I’m a Chicago University student, is this the hottest bar for me to go to tonight?

The second pattern is providing a unifying data services layer across the enterprise, as part of an architecture that supports both business intelligence and business process use interactively. In that sense, virtualized data services become another layer to complement the application services and business services in a loosely coupled, service-oriented architecture providing unified data governance, and become a strategic platform for companies to evolve how they access and provision data across the enterprise. Some examples of adoption are Oppenheimer Funds, Cisco, and Vodafone. As in the case of Biogen Idec, it is also very typical to start with a project-based approach, as they did with sales reporting, and then evolve toward more of a unified data layer architecture approach. Either way, I would say that data virtualization has become a mainstream approach to integrate data and deliver unified data access through data services.

The recent Forrester Wave on data virtualization profiled the entire data virtualization market. The top four companies in data virtualization were IBM, Informatica, Denodo and Composite. Could you explain how an independent company like Denodo could be so favorably ranked with IBM and Informatica in this report?

Suresh Chandrasekaran: That is because Denodo has been a specialist and innovator for over 10 years, with data virtualization being our only focus. In a sense, you can think of Denodo as the Ferrari in this market, at the same time providing the economic advantages and ease of use of a Toyota Prius or a Chevy Volt. But, getting more serious, I think Denodo derives its advantages in this market from three factors. First, we had the architecture and vision for data virtualization right from the beginning. We did not start with a replication-oriented mind-set, and we did not start with a structured data-oriented mind-set. We started with a vision to unify enterprise data with Web/cloud and unstructured data seamlessly, and extended the relational model to support heterogeneous data types natively, better optimize queries and deliver unified data services in many formats. The extended relational integration model gives us an advantage to easily access and combine relational data, hierarchical data, semantic data, the new forms of cloud, NoSQL and big data from a source perspective. We also have an easier time providing more flexibility from the application or user perspective, supporting different query types and data provisioning modes. The key advantage for Denodo was that we started with that fundamental architecture and vision for data virtualization.

The second factor is that as a pure-play data virtualization player, we have been at this longer than anyone else in the market. Composite comes closest and IBM and Informatica have broad integration experience, but for data virtualization specifically Denodo has been doing this longer. Naturally, we have the most demanding and large-scale applications of data virtualization anywhere. Our feature sets are very deep and are being pushed to a high level of maturity by companies like Cisco, Vodafone, Nokia, Chick-fil-A, Banco Santander, Inditex and Oppenheimer.

The third factor, and perhaps a sensitive one, is that we don’t have the conflict of interest and baggage that some of the other companies with a more traditional data integration mind-set have. We don't have to worry that data virtualization will cannibalize redundant data stores and ETL revenue. This means we are really motivated to help our customers achieve the most success with data virtualization, beyond simple use cases to "check the box," and typically we are rated very highly for customer service and best practices support.

If you look at the Forrester Wave's specific scores, there were key categories like distributed data access, the breadth of data sources, data services integration, and data services delivery. In all of those categories, and in revenue growth, we scored a perfect five out of five. That complements the idea of the depth and breadth of the feature set, particularly the integration of the different components.

One of the major trends we're seeing is Agile business intelligence. How do data virtualization and Denodo's product enable that trend?

Suresh Chandrasekaran: So Agile business intelligence, taken in a broader context, has several underlying trends – faster initial time to BI solution, agility to change that, self-service BI, and fresher or real-time data. The traditional waterfall approach of defining, justifying and meeting requirements – building a very large infrastructure, ETL to feed the data warehouse, which then feeds multiple data marts, which then feed BI – is giving way to more of a self-service BI discovery approach to the data. In other words, show me the enterprise data in all its different forms, expose it to me as self-service data services, allow me to integrate other sources easily, and let me do exploratory analytics and build what I need to build.

Also, whether it’s coming from the user or from IT, how do you expose or provide reports for fast time to market and with very rapid changes? In order to do that, you eliminate several layers of physical data replication, which provides the agility to make changes, add new data sources, change the data transformations, etc. That’s one approach that not only enables self-service BI, but also provides the infrastructure for Agile BI.

The second aspect of Agile BI is that business users are always coming up with ways to access information faster than what IT can provide. For example, in almost any company today, you can get a Google Alert faster than you can get a competitive BI report. So how do you now access and integrate all of these additional sources such as unstructured, cloud, and social media data? This is another area where data virtualization, and particularly Denodo, is clearly enabling the trend – we can now provide hooks into all of these sources. For example, at Genentech, if I do clinical trial investigator selection, I can put hooks in to look at the attributes of physicians, what clinical trials they're doing for competitors, etc., from other sources in addition to my own CRM and clinical trial system. The business intelligence to feed the selection of those physicians can be much faster and more business-driven.

The two points I like to make on how Denodo enables the trend toward Agile BI are, first, the infrastructure for agility and self-service and, second, the broadening of the sources from which you can draw business intelligence. Both are key aspects of how we enable that trend.

Enterprises today are also integrating their enterprise data warehouses with the cloud and incorporating data feeds and real-time social media data. Is data virtualization key to making that happen?

Suresh Chandrasekaran: For real-time data, absolutely. And even though for feeding a data warehouse you can use ETL or other code, data virtualization is a necessary technology for easing access and integration with the emerging sources. The traditional data integration technologies associated with the data warehouse were primarily around relational data, whereas data virtualization has the ability to integrate different data models – relational data, semantic data, hierarchical data, NoSQL (not only SQL), Hadoop and Web-based formats. Data virtualization makes it easier to take all these different sources, which are always changing in their shape and format, and model them together to produce an extension to the enterprise data warehouse.

Also, data virtualization provides the capability to combine the delivery of the information in a hybrid mode. In other words, you can take replicated data, which is in the data warehouse, combine it with real-time query optimization, use caching as needed and even publish the information through an event-driven message bus or enterprise service bus (ESB).
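The hybrid mode described above can be illustrated with a small Python sketch. It is not Denodo's API or implementation: the in-memory SQLite table stands in for replicated warehouse data, `fake_live_balance` stands in for a real-time operational source, and `CachedSource`, `customer_view` and the sample rows are all hypothetical names and values invented for illustration.

```python
import sqlite3
import time

def build_warehouse():
    """Stand-in for replicated data that already sits in a data warehouse."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)",
                     [(1, "Acme"), (2, "Globex")])
    conn.commit()
    return conn

def fake_live_balance(customer_id):
    """Hypothetical real-time source, e.g. a call to an operational system."""
    return {1: 250.0, 2: 0.0}[customer_id]

class CachedSource:
    """Wraps a live source and reuses each answer for ttl_seconds."""
    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}   # key -> (expiry time, value)
        self.calls = 0     # number of times the live source was actually hit

    def get(self, key):
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]              # still fresh: serve from cache
        self.calls += 1
        value = self._fetch(key)       # expired or missing: go to the source
        self._cache[key] = (now + self._ttl, value)
        return value

def customer_view(warehouse, live_balance):
    """Virtual view: warehouse rows enriched on demand with live data."""
    rows = warehouse.execute(
        "SELECT id, name FROM customers ORDER BY id").fetchall()
    return [{"id": cid, "name": name, "open_balance": live_balance.get(cid)}
            for cid, name in rows]

if __name__ == "__main__":
    view = customer_view(build_warehouse(), CachedSource(fake_live_balance, 60))
    for row in view:
        print(row)
```

Nothing is copied into a new store here: the view is assembled at query time, and the cache only limits how often the live source is hit.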

As a result, data virtualization enables many new data integration patterns that complement an enterprise data warehouse. For example, the pattern could be an enterprise data warehouse extension, where I take part of the information in the warehouse and extend it with real-time information from other sources. It could be used to create a virtual enterprise data warehouse by combining existing departmental, product-based, or domain-based data warehouses. It could be used to feed an enterprise data warehouse with semi-structured and unstructured data.

Another pattern is delivering the data warehouse analytics and exposing that as a data service to applications and users in the cloud. We have many examples for each of these patterns. But the two key points I mentioned earlier – the ability to integrate different data models and the ability to deliver the information in a hybrid mode – enable Denodo in particular and data virtualization in general to combine data warehouse and traditional data stores with cloud, social media and real-time sources.

Another major trend is mobile business intelligence. How does data virtualization impact an enterprise's efforts to go mobile?

Suresh Chandrasekaran: When we get to mobile, we're now on the consumer side of this equation of data virtualization, and a key aspect is that mobile BI requires browsable data delivered in small chunks by a mobile-friendly interface. Denodo scored 5.0 in the Forrester Wave, the highest score possible, on data services delivery. The reason for that is we support delivery of information in 14 different formats. Not just SQL, but also SOAP Web services, REST (Representational State Transfer), JSON (JavaScript Object Notation), AJAX (Asynchronous JavaScript and XML) widgets, SharePoint webparts, Java portlets, RSS feeds, etc. We have the most publishing options.

Also, mobile BI requires different paradigms on how you want to query the information – not just the classic SQL query. You might want to browse the information using links. You might want to search for some data. You might want to do semantic searches using SPARQL. So we not only publish the data in multiple formats, but we also support multiple query paradigms.
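As a rough illustration of publishing one data set in more than one format, the Python sketch below renders the same rows as JSON and as CSV using only the standard library. The function names and sample rows are hypothetical and say nothing about how Denodo implements its publishing options.

```python
import csv
import io
import json

# Hypothetical result of a query against a virtual view.
ROWS = [
    {"id": 1, "name": "Acme"},
    {"id": 2, "name": "Globex"},
]

def to_json(rows):
    """Render the rows as a JSON array, e.g. for a REST-style client."""
    return json.dumps(rows)

def to_csv(rows):
    """Render the same rows as CSV, e.g. for a spreadsheet or BI tool."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

if __name__ == "__main__":
    print(to_json(ROWS))
    print(to_csv(ROWS))
```

The point of the sketch is that the data is defined once and the format is chosen at delivery time, so adding a new consumer does not require a new copy of the data.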

The last aspect of supporting mobile BI is the delivery aspect which, as I mentioned, is in small chunks. For instance, two of the capabilities are intelligent caching and hybrid delivery modes (push vs. pull, real-time vs. scheduled, asynchronous, etc.). That’s why when people are looking at mobile BI applications, they need to look very closely at which data virtualization platform provides the broadest range of options from a data services delivery perspective. We have several mobile operators and equipment providers as customers, who recognize the advantage that Denodo provides.

You gave us some customer examples of how people are using data virtualization. Could you also highlight how some of your customers are working with your unified data layer approach to data virtualization?

Suresh Chandrasekaran: We talked about the unified data layer approach to data virtualization as the broader approach, where you're incorporating not only real-time data but replicated data stores, SOA, etc., to create that unified data layer across all kinds of information. Some of our customers have taken that approach from the get-go. I'll give two examples contrasted with several other customers who may do very project-specific implementations of data virtualization.

Oppenheimer Funds, one of the largest mutual fund companies, has taken the unified data layer approach to provision business intelligence for their fund managers, reporting into their financial performance portals, attribution, accounting, etc. They're creating a series of data services across several underlying sources, exposing them for multiple users. It can be used for pure analytics in a business intelligence sense. It can be used to support customer service or operational applications, or to deliver operational business intelligence into process applications. There are several combinations of how they can use these data services that are being built. Cisco Systems is doing the same thing with respect to smart services, as they call it, within the Cisco services initiative. Both of these customers have told their stories in the World Series on Data Virtualization, a series of expert talks on this topic, which is accessible from the Denodo website. Also on our website are white papers on high-value use cases segmented into analytical, operational and data management scenarios.

Another example is Biogen Idec, a leading biotech company that didn't start off with the unified data layer approach. They started off by building a set of data services for simple sales reporting, and then they built a set of data services for HR reporting. They recognized that they needed to build these in a reusable fashion, so they started building base-level data services. On top of that, they built subject-matter-oriented data services, and then on top of that they built BI data services and operational data services that are combined with application services. This illustrates the contrasting approach of customers who start off from a tactical implementation of data virtualization and evolve to the unified data layer approach. I highly encourage your readers to access the World Series on Data Virtualization, which has several customer examples and their approaches to data virtualization, as well as the Forrester Wave: Data Virtualization report and webinar, provided compliments of Denodo.
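The layering described for Biogen Idec (base-level services, subject-matter-oriented services, then BI services) can be sketched in Python as plain function composition. Every function name, field and value below is a hypothetical placeholder; the sketch shows only the reuse pattern, not Biogen Idec's or Denodo's actual services.

```python
from collections import defaultdict

# Base-level data services: one thin accessor per underlying source.
def crm_accounts():
    """Hypothetical accounts as they might come from a CRM system."""
    return [
        {"account_id": "A1", "region": "East"},
        {"account_id": "A2", "region": "West"},
    ]

def erp_orders():
    """Hypothetical orders as they might come from an ERP system."""
    return [
        {"account_id": "A1", "amount": 100.0},
        {"account_id": "A1", "amount": 50.0},
        {"account_id": "A2", "amount": 75.0},
    ]

# Subject-matter-oriented service: combines base services into a "sales" subject.
def sales_by_account():
    region_of = {a["account_id"]: a["region"] for a in crm_accounts()}
    return [{**order, "region": region_of[order["account_id"]]}
            for order in erp_orders()]

# BI data service: built only on the subject-level service, never on the sources.
def sales_by_region():
    totals = defaultdict(float)
    for row in sales_by_account():
        totals[row["region"]] += row["amount"]
    return dict(totals)

if __name__ == "__main__":
    print(sales_by_region())
```

Because the BI service depends only on `sales_by_account`, a change to a source system is absorbed in the base layer, and an HR or operational service could reuse the same base accessors.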

Thank you, Suresh, for providing our readers with a solid understanding of data virtualization and Denodo's product capabilities.

  • Ron Powell
    Ron has an extensive technology background in business intelligence, analytics and data warehousing. In 2005, Ron founded the BeyeNETWORK, which was acquired by TechTarget in 2010. Now an associate publisher at TechTarget, Ron continues to lead the BeyeNETWORK, providing editorial direction and supporting the sales team. Prior to the founding of the BeyeNETWORK, Ron was cofounder, publisher and editorial director of DM Review (now Information Management). Ron also has a wealth of consulting expertise in business intelligence, business management and marketing.
