Is the Data Delivery Platform a Federated Architecture?

Originally published March 16, 2010

In a number of articles published on BeyeNETWORK.com, I introduced, explained, and defined the data delivery platform (DDP). The DDP is a new architecture for developing business intelligence (BI) systems where data consumers are decoupled from the data stores. One of those articles included the definition for the data delivery platform:


The data delivery platform is a business intelligence architecture that delivers data and metadata to data consumers in support of decision making, reporting, and data retrieval, whereby data and metadata stores are decoupled from the data consumers through a metadata-driven layer to increase flexibility and whereby data and metadata are presented in a subject-oriented, integrated, time-variant, and reproducible style.

In addition to publishing articles on the data delivery platform, for the last twelve months I have presented the data delivery platform at many events and conferences. Those publications and presentations led to quite a number of reactions. And, as can be expected, there were many positive and a few somewhat less positive reactions. But there was one issue that kept coming back: Is the data delivery platform a federated architecture and/or is it the same as a federation server? Therefore, to avoid confusion and misconceptions in the future, this article tries to clarify the relationship between, on one hand, the data delivery platform and, on the other hand, the federated architecture and the federation server. Note that federation servers are sometimes called enterprise information integration (EII) tools.

Comparison with the Federated Architecture

Some data warehouse specialists classify the data delivery platform as a federated architecture, which, in a way, is correct. In a federated architecture, a set of possibly heterogeneous data stores is presented to data consumers as one logical data store. To the data consumers, it looks as if they have logged on to one large database. Without knowing it, they can join data from those different data stores. This is what the DDP tries to do as well: multiple data stores are federated into one logical data store. However, this is not the only feature of the DDP; it's just one. So, the DDP can be a federated architecture, but a federated architecture does not automatically adhere to the definition of the DDP.
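
To make the idea of "one logical data store" concrete, here is a minimal sketch in Python. It is only a toy analogy: it uses SQLite's ATTACH DATABASE to expose two separate in-memory databases through a single connection, whereas a real federation server links heterogeneous stores. The function names, table names, and sample rows are assumptions made up for this illustration.

```python
import sqlite3


def build_federated_connection():
    """Open one connection that exposes two separate stores (hypothetical data)."""
    # Store 1: the connection's own in-memory database ("main").
    conn = sqlite3.connect(":memory:")
    # Store 2: a second, independent in-memory database attached as "sales".
    conn.execute("ATTACH DATABASE ':memory:' AS sales")

    conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE sales.orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO main.customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.executemany(
        "INSERT INTO sales.orders VALUES (?, ?)",
        [(1, 100.0), (1, 50.0), (2, 75.0)],
    )
    return conn


def revenue_per_customer(conn):
    """Join across both stores without naming which store holds which table."""
    query = (
        "SELECT c.name, SUM(o.amount) "
        "FROM customers AS c JOIN orders AS o ON o.customer_id = c.id "
        "GROUP BY c.name ORDER BY c.name"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(revenue_per_customer(build_federated_connection()))
```

The query in revenue_per_customer never mentions "main" or "sales": to the data consumer, the two stores look like one database.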

I say that the DDP "can be" a federated architecture and not that it "is" a federated architecture. Imagine that a DDP-based system is developed right on top of one data store, and that data store holds all the data needed by the data consumers. In that case, we do have the decoupling of data stores and data consumers, but there is no federation. So, the data delivery platform doesn't have to be a federated architecture, but in most cases it will be.
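
The single-store case can be sketched as follows. This is a hypothetical illustration, not a prescribed DDP implementation: the DeliveryLayer class, its remap method, and the table and column names are all invented here. It shows a metadata mapping that sits between consumers and one SQLite store, so the physical structure can change without the consumer noticing.

```python
import sqlite3


class DeliveryLayer:
    """Hypothetical metadata-driven layer over a single data store.

    The mapping is the metadata: logical name -> (physical table,
    {logical column: physical column}). Identifiers are assumed to come
    from trusted metadata, not from end-user input.
    """

    def __init__(self, conn, mapping):
        self._conn = conn
        self._mapping = dict(mapping)

    def remap(self, logical_name, table, columns):
        # Only the metadata changes; consumers keep using the logical name.
        self._mapping[logical_name] = (table, columns)

    def fetch(self, logical_name):
        table, columns = self._mapping[logical_name]
        select_list = ", ".join(
            f'"{physical}" AS "{logical}"' for logical, physical in columns.items()
        )
        cursor = self._conn.execute(f'SELECT {select_list} FROM "{table}"')
        names = [description[0] for description in cursor.description]
        return [dict(zip(names, row)) for row in cursor]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cust_v1 (cust_nm TEXT)")
    conn.execute("INSERT INTO cust_v1 VALUES ('Acme')")
    layer = DeliveryLayer(conn, {"customer": ("cust_v1", {"customer_name": "cust_nm"})})
    print(layer.fetch("customer"))
```

There is only one store here, so nothing is federated, yet the consumer is still decoupled from the physical table and column names.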

Sometimes you hear the claim that in a federated architecture, the only data stores accessed are production databases. In his article What a Data Warehouse is Not, Bill Inmon writes about a speaker at a conference who claims exactly that: "...a data warehouse was really all the old legacy systems connected by software that could access the data." If that's the meaning of the term federated architecture, then the DDP should not be classified as one! The DDP does not dictate what type of data stores are accessed. Those data stores can be production databases, but also data warehouses, operational data stores, and data marts.

Comparison with the Federation Server

Next, how does the DDP compare to a federation server? A federation server is a tool that technically links a number of data stores together and makes them accessible as one integrated data store. With a federation server, we can develop a federated architecture. Today there are many federation servers available on the market. Here are just a few (in alphabetical order): Composite Information Server, Denodo Platform, EntropySoft Content Federation Server, IBM InfoSphere Information Integrator, Informatica On-demand Integration, Information Builders EII, Oracle BI Server, Software AG Information Integrator, SQLstream EII, and XAware XA-iServer.

So, a federation server is a specific type of tool. That means that a comparison with the DDP wouldn't make sense, because the latter is a business intelligence architecture: it's a set of guidelines for developing business intelligence systems, and it's not a solution by itself. A federation server, on the other hand, is not an architecture, but a tool that can be used to develop a business intelligence system. It would be like comparing a blueprint of a building with the type of concrete used for the foundation of that same building.

Note that what applies to a federated architecture also applies to a federation server: it does not matter whether a database accessed by a federation server is a production database, a data warehouse, or a data mart. For such a tool, a database is a database.

Although we can't compare the DDP with a federation server, a relationship between them does exist, because the functionality offered by federation servers does resemble the core functionality of the DDP. The expectation is that federation servers will form the heart of many DDP-based business intelligence systems that will be developed in the coming years. Some of these products have been around for quite a number of years and are definitely mature enough. In fact, organizations are employing these products today, even in very large environments.

If we look at the current features of federation servers, a DDP requires more functionality than these products can offer today. Let's describe some of these omissions.

First of all, most of the federation servers support only one query language. In most cases, it's SQL and sometimes XQuery; quite often, MDX is not supported, which is unfortunate, because MDX is pretty popular in the business intelligence space. A requirement of the DDP is that different query languages can be used.

Secondly, some federation servers or federation technologies don't support an open API. These tools can only be accessed by reporting and analysis tools developed by that same vendor. Many BI tools support some kind of semantic layer that is metadata-driven and that hides the structure and the technical details of the underlying data stores. In a way, that layer operates as a federation server. However, most of them have a proprietary API. This means you can't develop reports using tools from other vendors that access the data through the layer and reuse the metadata specifications. The DDP requires that any tool from any vendor should be able to access all the data stores. Only then will we be able to share specifications.

A third omission is that not all forms of metadata are handled by most federation servers. In most tools, only technical metadata is stored and can be queried. A DDP also requires that business and operational metadata is stored and that those forms of metadata are accessible by users and can be used by all kinds of tools.

A fourth omission is that most federation servers only support the pull model. They have not been designed to support pushing data from the data stores to the data consumers. More and more products are appearing on the market that make this push model possible, such as IBM InfoSphere Streams, StreamBase, and Microsoft StreamInsight (which is expected to be released in 2010 together with SQL Server 2008 R2).
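
The difference between the two models can be sketched in a few lines of Python. Both classes below are hypothetical and are not modeled on any of the products mentioned; they only contrast a consumer that has to ask for data with a consumer that is notified when data arrives.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class PullStore:
    """Pull model: data waits in the store until a consumer asks for it."""

    def __init__(self) -> None:
        self._rows: List[Row] = []

    def append(self, row: Row) -> None:
        self._rows.append(row)

    def query(self) -> List[Row]:
        # The consumer initiates the transfer.
        return list(self._rows)


class PushStore:
    """Push model: the store delivers each new row to its subscribers."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[Row], None]] = []

    def subscribe(self, callback: Callable[[Row], None]) -> None:
        self._subscribers.append(callback)

    def append(self, row: Row) -> None:
        # The store initiates the transfer as soon as data arrives.
        for callback in self._subscribers:
            callback(row)


if __name__ == "__main__":
    received: List[Row] = []
    store = PushStore()
    store.subscribe(received.append)
    store.append({"sensor": "a", "value": 1})
    print(received)
```

In the pull case the consumer decides when to look; in the push case the consumer registers interest once and the data comes to it.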

But the most important omission is that some federation servers have limited decoupling capabilities. Imagine that a particular report has to be changed; it might well be that the structure of the data store has to be changed accordingly to make the report change possible. Or, imagine that we want to move data from a relational data store to an MDX-based data store in the hope of improving query performance. Most federation servers will demand that if that change takes place, the reporting tool switches from using SQL to MDX as well. The reason for this demand is simple: they can't translate SQL to MDX. And replacing all the SQL code with MDX code is a major undertaking. And we are assuming that the federation servers support MDX, which most of them don't.
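
One way to picture stronger decoupling is sketched below. Note that this is not SQL-to-MDX translation, which is a much harder problem; instead, the consumer states a language-neutral request and the layer renders it in whatever language the current store speaks. LogicalRequest, to_sql, to_mdx, and render are hypothetical names, and the generated statements are deliberately simplistic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalRequest:
    """A language-neutral request: total one measure per member of one dimension."""

    measure: str
    dimension: str
    source: str


def to_sql(request: LogicalRequest) -> str:
    # Rendering for a relational store.
    return (
        f"SELECT {request.dimension}, SUM({request.measure}) AS {request.measure} "
        f"FROM {request.source} GROUP BY {request.dimension}"
    )


def to_mdx(request: LogicalRequest) -> str:
    # Rendering for a multidimensional store.
    return (
        f"SELECT {{[Measures].[{request.measure}]}} ON COLUMNS, "
        f"[{request.dimension}].MEMBERS ON ROWS FROM [{request.source}]"
    )


RENDERERS = {"sql": to_sql, "mdx": to_mdx}


def render(request: LogicalRequest, store_language: str) -> str:
    """Pick the renderer based on metadata about where the data lives now."""
    return RENDERERS[store_language](request)


if __name__ == "__main__":
    request = LogicalRequest(measure="amount", dimension="category", source="sales")
    print(render(request, "sql"))
    print(render(request, "mdx"))
```

If the data moves from the relational store to the MDX-based store, only the store_language entry in the metadata changes; the request issued by the report stays the same.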

Note that these omissions should not stop organizations from adopting the DDP. As indicated, we can develop business intelligence systems using federation servers today. As we all know, it's quite easy to sit at your desk and come up with a perfect set of features for tools. It's another story if you are a vendor and you have to implement those features.

Summary

To summarize, the DDP can be classified as a federated architecture (but only if a federated architecture allows access to all types of data stores, including data warehouses and data marts). But to say that we can develop a perfect DDP purely by using a current federation server is an oversimplification. If you want to develop a DDP right now using one of the commercially available federation servers, there is a big chance you will still need to develop some extra functionality yourself: maybe you'll have to combine two federation servers, or perhaps you'll have to extend it with some additional tools. In a nutshell:

DDP-based business intelligence system = federation server++

Hopefully, the vendors of federation servers will see the advantages of the DDP for developing business intelligence systems and they will understand the need for the extra features described herein. It might make them decide to upgrade their products soon.

  • Rick van der Lans

    Rick is an independent consultant, speaker and author, specializing in data warehousing, business intelligence, database technology and data virtualization. He is managing director and founder of R20/Consultancy. An internationally acclaimed speaker who has lectured worldwide for the last 25 years, he is the chairman of the successful annual European Enterprise Data and Business Intelligence Conference held in London. In the summer of 2012 he published his new book Data Virtualization for Business Intelligence Systems. He is also the author of one of the most successful books on SQL, the popular Introduction to SQL, which is available in English, Chinese, Dutch, Italian and German. He has written many white papers for various software vendors. Rick can be contacted by sending an email to rick@r20.nl.

    Editor's Note: Rick's blog and more articles can be accessed through his BeyeNETWORK Expert Channel.

Comments

Posted March 17, 2010 by reve@compositesw.com

Rick -

You continue to do the IT industry a good service with your DDP insights. 

And as a provider of a "Federation Server" (Composite Information Server) often deployed to enable a "Federated Architecture", we are pleased to see that you have set a high bar regarding the capabilities required.

If any readers would like to know more about where Composite's offerings stack up on an absolute basis to overall requirements and/or relative to the omissions you list, they can reach out to me directly at reve@compositesw.com

- Bob
