For more and more users OBI is crucial. For example, consider operational management and external parties, such as customers, suppliers, and agents. If we give them access to data to support their decision making processes, in many cases, only operational data is relevant.
But how do we develop BI systems that show operational data? In PowerPoint we can draw an architecture in which BI reports directly access operational databases. And on that PowerPoint slide all seems to work fine. Not in real life, however. Running a BI workload on an operational database can lead to interference, performance degradation, performance instability, and so on. In other words, the operational environment is not going to enjoy this.
This is where data replication can come to the rescue. With data replication we can create and keep a replica of an operational database up to date without interfering with the operational processing. When new data is inserted, updated, or deleted in the original operational database, the replica is updated accordingly and almost instantaneously. This replicated database can then be used for operational reporting and analytics.
Data replication as a technology has been around for a long time, but so far it has been used primarily to increase the availability and/or to distribute the workload of operational systems. My expectation is that data replication will be needed for implementing many new OBI systems. For these products to be ready for BI, besides supporting classic data replication features, such as minimal interference, high throughput, and high availability, they should also support the following three features that are important for BI:
- Easy to use and easy to maintain: Until now, data replication has been used predominantly in IT departments, and not so much in BI departments or BI Competence Centers. So within these BI groups a minimum of expertise exists with data replication and knowledge on how to embed that technology in BI architectures. Because of this unfamiliarity, it's important that these products are easy to install, easy to manage, and that replication specifications can be changed quickly and easily. A Spartan interface is not appreciated.
- Heterogeneous data replication: In many organizations the database servers used in these operational environments are different from the ones deployed in their BI environments. Therefore, data replication tools should be able to move data between database servers of different brands.
- Fast loading into analytical database servers: More and more analytical database servers, such as data warehouse appliances and in-memory database servers, are used to develop data warehouses and/or data marts. These database servers are amazingly fast in running queries. What we don't want is that data is loaded in these products using simple SQL INSERT statements. It will work, but it will be slow. Almost all of these products have specialized interfaces or utilities for fast loading of data. It's vital that data replication products exploit these interfaces or utilities.
Posted March 6, 2013 8:29 PM
Permalink | No Comments |