Blog: Rick van der Lans

Rick van der Lans

Welcome to my blog where I will talk about a variety of topics related to data warehousing, business intelligence, application integration, and database technology. Currently my special interests include data virtualization, NoSQL technology, and service-oriented architectures. If there are any topics you'd like me to address, send them to me at rick@r20.nl.

About the author

Rick is an independent consultant, speaker and author, specializing in data warehousing, business intelligence, database technology and data virtualization. He is managing director and founder of R20/Consultancy. An internationally acclaimed speaker who has lectured worldwide for the last 25 years, he is the chairman of the successful European Enterprise Data and Business Intelligence Conference, held annually in London. In the summer of 2012 he published his new book Data Virtualization for Business Intelligence Systems. He is also the author of one of the most successful books on SQL, the popular Introduction to SQL, which is available in English, Chinese, Dutch, Italian and German. He has written many white papers for various software vendors. Rick can be contacted by sending an email to rick@r20.nl.

Editor's Note: Rick's blog and more articles can be accessed through his BeyeNETWORK Expert Channel.

I still hear people say that data virtualization is just a new name for data federation--old wine in new skins. Not true! If we look at the current generation of data virtualization servers, we have to conclude that they offer a lot more functionality than data federation servers.

Data federation is a run-time technology that makes it easy for an application to access a heterogeneous set of data stores. The data federator deals with all the different APIs and database languages, tries to optimize access to those data stores through distributed join optimization, and handles all the issues of distributed transactions.

In my book on data virtualization (Data Virtualization for Business Intelligence Systems), I define data federation as follows:

Data federation is an aspect of data virtualization where the data stored in a heterogeneous set of autonomous data stores is made accessible to data consumers as one integrated data store by using on-demand data integration.
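To make the idea of on-demand integration a bit more concrete, here is a minimal sketch in Python. It is not how any particular data federation product works; the two stores (an in-memory SQLite table of customers and a CSV text of orders), their contents, and all function names are hypothetical and chosen only for illustration. The sketch hides two different access methods behind one function and joins the data at query time, without copying it into a central store. It leaves out what a real federator adds on top, such as distributed join optimization and distributed transactions.

```python
import csv
import io
import sqlite3

# Store 1 (hypothetical): a relational database holding customers.
def make_customer_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers (id, name) VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex"), (3, "Initech")],
    )
    return conn

# Store 2 (hypothetical): a flat file holding orders.
ORDERS_CSV = """customer_id,amount
1,100.0
2,40.0
1,25.5
"""

def fetch_customers(conn):
    # Access path for the SQL store: returns {customer id: customer name}.
    return {row[0]: row[1] for row in conn.execute("SELECT id, name FROM customers")}

def fetch_orders(csv_text):
    # Access path for the file store: yields (customer id, amount) pairs.
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield int(row["customer_id"]), float(row["amount"])

def order_totals_by_customer(conn, csv_text):
    """Present both stores as one result: total order amount per customer name.

    The join happens on demand, each time the function is called.
    """
    names = fetch_customers(conn)
    totals = {}
    for customer_id, amount in fetch_orders(csv_text):
        name = names.get(customer_id)
        if name is None:
            continue  # skip orders that reference an unknown customer
        totals[name] = totals.get(name, 0.0) + amount
    return totals

if __name__ == "__main__":
    print(order_totals_by_customer(make_customer_db(), ORDERS_CSV))
```

The data consumer only calls order_totals_by_customer and never sees that one source speaks SQL and the other is a file. That is the "one integrated data store" view from the definition, in miniature.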

Data virtualization is much more than data federation. Here are some of the features supported by data virtualization servers today:

  • Self-service, iterative, and collaborative development
  • (Canonical) data modeling
  • On-demand data profiling and data cleansing
  • Full support for the entire development life cycle: business glossary, information modeling
  • Extensive data integrity features
  • Extensive master data management features
  • Integration of different data integration styles, including ETL, ELT, and replication

In a nutshell, where data federation is primarily run-time technology, data virtualization supports the entire system development life cycle. So, it supports modeling and design as well. Or, to use popular terminology, data virtualization = data federation++.

If you want to know more about this topic, attend my session at The Data Virtualization Experts Forum.


Posted September 21, 2012 1:41 AM