Many of the NoSQL systems have built-in data replication features--data is automatically stored multiple times. In fact, the developers can set how many replicas have to be created. However, the replication features of NoSQL systems are limited to a homogenous environment. It's not possible to use these features when, for example, data has to be replicated from a NoSQL system to a classic SQL system.
Today, most data replication products can't replicate from or to NoSQL systems. However, if they can in the future, what will be important is that they handle the non-relational concepts of NoSQL systems efficiently. The keyword here is efficiently. Most existing data replication tools have been designed and optimized to copy data between SQL systems. So, they have been optimized to efficiently process relatively short records with a fixed structure. However, NoSQL records are not always short and fixed with respect to structure. NoSQL systems support a wide range of concepts:
- Many NoSQL systems, including the key-value stores, the document stores, and the column-family stores support extremely long records. These records can be magnitudes longer than what is common in SQL systems. Current data replicators have been optimized to replicate short records.
- Almost all NoSQL systems support tables in which each record can have a different structure. This is new for data replication products. For example, what will that do to compression algorithms that assume that all records have the same structure?
- Document stores and column family stores support hierarchical structures. If that type of data has to be replicated into SQL systems, it has to be flattened somehow. The challenge is to do that very fast. But can it be done fast enough? Data replicators are usually not strong at transformations, because it slows down the replication process too much.
- Column-family stores support what the relational world used to call repeating groups. The same as for hierarchical structures, how can they be mapped to relational structures by the data replication tools efficiently.
Note: For more on this topic, see the whitepaper Empowering Operational Business Intelligence with Data Replication.
Posted April 29, 2013 1:16 AM
Permalink | No Comments |



