Blog: Rick van der Lans Subscribe to this blog's RSS feed!

Rick van der Lans

Welcome to my blog where I will talk about a variety of topics related to data warehousing, business intelligence, application integration, and database technology. Currently my special interests include data virtualization, NoSQL technology, and service-oriented architectures. If there are any topics you'd like me to address, send them to me at rick@r20.nl.

About the author >

Rick is an independent consultant, speaker and author, specializing in data warehousing, business intelligence, database technology and data virtualization. He is managing director and founder of R20/Consultancy. An internationally acclaimed speaker who has lectured worldwide for the last 25 years, he is the chairman of the successful annual European Enterprise Data and Business Intelligence Conference held annually in London. In the summer of 2012 he published his new book Data Virtualization for Business Intelligence Systems. He is also the author of one of the most successful books on SQL, the popular Introduction to SQL, which is available in English, Chinese, Dutch, Italian and German. He has written many white papers for various software vendors. Rick can be contacted by sending an email to rick@r20.nl.

Editor's Note: Rick's blog and more articles can be accessed through his BeyeNETWORK Expert Channel.

In a series of blogs I am answering some of the questions a large US-based, health care organization had on data virtualization. I decided to share some of their questions with you, because some of them represent issues that many organizations struggle with.

One question they had was related to their development approach: "Would it make sense to design the topmost layer of our virtual tables as close enough as possible to what the reports need? Then, create tables in Microsoft Access with the same table structure as those virtual tables and let the report programmers start building reports against these Microsoft Access tables. Meanwhile, the core team designs the data virtualization model (mapping the virtual tables to the real data sources). This way the two activities (1) programming of reports and (2) building of the virtualization model can be executed in parallel. In two months, when both activities are complete, they meet, and we switch the reports to point to the virtual views rather than to the Microsoft Access tables. Is this a sound approach, or are we stretching it too far?"

My direct response was: "No, you're not stretching it at all, I think you're getting it." What I added was that the only issue might be that minor SQL dialect differences can exist between that of the data virtualization server and Microsoft Access.

Their response: "Good point. We think this can be overcome. We can create the Microsoft Access tables. Then use that as a data source for the data virtualization server, create virtual tables that correspond 1:1 to the tables in Access. Next, have programmers code reports against those virtual tables that point to the Access tables. Finally, when the reports are ready, we redirect the virtual tables to the real data sources."

Evidently, this is the preferred approach, because this means that the reports always access the same virtual tables, even if the switch is made from Microsoft Access to the real data source. This redirecting of the virtual tables is completely transparent to the reports, and they will run unchanged. In addition, redirecting involves almost no work at all.

Just to be clear, I am not recommending Microsoft Access as the preferred platform for developing virtual tables, but I like how this customer is thinking about how to use the power of data virtualization servers to come up with a very efficient and agile development approach. Because data virtualization servers decouple the real data sources from the reports, changes (even drastic ones) can be made to the data sources without having to change the reports, or vice versa. In fact, this is why it's often said that data virtualization makes business intelligence systems more agile.

Note: For more information on data virtualization, I refer to my new book "Data Virtualization for Business Intelligence Systems" available from Amazon.


Posted September 14, 2012 2:09 AM
Permalink | 1 Comment |

1 Comment

Although pointing virtual tables to temporary source tables is a smart solution for parallel development of reports and mappings, one should not forget the importance of the existence of (preferably consistent and realistic) data in these source tables.

Report definitions are not approved if they are not tested against populated data sources. Only then is the moment when report definitions come alive and perhaps even trigger new insights resulting in more or less drastic definition changes.

Leave a comment