Data virtualization servers have long offered SQL access to non-SQL data sources. Most of them allow SQL access to data stored in spreadsheets, XML documents, sequential files, and pre-relational database servers, to data hidden behind APIs such as SOAP and REST, and to data stored in applications such as SAP and Salesforce.com.
Most current SQL-on-Hadoop engines offer only SQL query access to one or two data sources: HDFS and HBase. Sounds easy, but it's not. The technical problem they have to solve is how to turn all the non-relational data stored in Hadoop, such as variable data, self-describing data, and schema-less data, into flat relational structures.
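To make the flattening problem concrete, here is a minimal sketch in Python. It is not how any particular SQL-on-Hadoop engine works; the `flatten` and `to_table` helpers and the sample records are hypothetical, invented for illustration. It shows the basic idea: nested, self-describing records with differing fields are mapped onto one flat set of columns, with missing values filled in as NULL (`None`).

```python
from typing import Any


def flatten(record: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Turn a nested dict into a single-level dict with dotted column names."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested structures: address -> address.city
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def to_table(records: list[dict[str, Any]]) -> tuple[list[str], list[tuple]]:
    """Derive one column list from all records and emit uniform rows."""
    flat_rows = [flatten(r) for r in records]
    # The "schema" is the union of every field seen in any record.
    columns = sorted({col for row in flat_rows for col in row})
    # Fields a record does not have become None (NULL in SQL terms).
    rows = [tuple(row.get(col) for col in columns) for row in flat_rows]
    return columns, rows


# Hypothetical schema-less input: the two records do not share a structure.
records = [
    {"id": 1, "name": "Ann", "address": {"city": "Oslo"}},
    {"id": 2, "email": "bob@example.com"},
]

columns, rows = to_table(records)
print(columns)
for row in rows:
    print(row)
```

Even this toy version hints at why the problem is hard: the column list is only known after all records have been scanned, and repeating groups (lists inside records) are not handled at all.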
However, the question is whether offering query capabilities on Hadoop is sufficient, because the bar is being raised for SQL-on-Hadoop engines. Some, such as SpliceMachine, offer transactional support on Hadoop in addition to queries. Others, such as Cirro and ScleraDB, support data federation: data stored in SQL databases can be joined with Hadoop data. So offering only SQL query capabilities on Hadoop may no longer be enough in the near future.
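What data federation means can be sketched in a few lines of Python. This is an illustration only and says nothing about how Cirro or ScleraDB are implemented: an in-memory SQLite database stands in for the SQL database, and the `hadoop_clicks` list is a hypothetical stand-in for rows read from Hadoop. The external rows are staged in a temporary table so that a single SQL statement can join both sources.

```python
import sqlite3

# The "SQL database" side: a regular relational table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob")],
)

# Assumption: these rows were read from Hadoop (for example, from files in HDFS).
hadoop_clicks = [(1, "/home"), (1, "/pricing"), (2, "/home")]

# Stage the external rows so the SQL engine can see them as a table.
conn.execute("CREATE TEMP TABLE clicks (customer_id INTEGER, page TEXT)")
conn.executemany("INSERT INTO clicks VALUES (?, ?)", hadoop_clicks)

# One query joins data from both sources.
result = conn.execute(
    """
    SELECT c.name, COUNT(*) AS page_views
    FROM customers AS c
    JOIN clicks AS k ON k.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()

print(result)
```

A real federation engine would avoid copying all external rows up front and would instead push filters and joins down to each source where possible, which is where distributed join optimization comes in.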
Data virtualization servers have started to offer access to Hadoop as well, and with that they have entered the market of SQL-on-Hadoop engines. In doing so, they raise the bar for SQL-on-Hadoop engines even more. Current data virtualization servers are not simply runtime engines that offer SQL access to various data sources. Most of them also offer data federation capabilities for many non-SQL data sources, a high-level design and modeling environment with lineage and impact analysis features, caching capabilities to minimize access to the data source, distributed join optimization techniques, and data security features.
In the near future, SQL-on-Hadoop engines are expected to be extended with these typical data virtualization features, and data virtualization servers will have to enrich themselves with full-blown support for Hadoop. Whatever happens, the two markets will slowly converge into one. Some products will merge and others will be extended. This is definitely a market to keep an eye on in the coming years.
Posted February 24, 2014 3:39 AM