

Blog: Rick van der Lans

Rick van der Lans

Welcome to my blog where I will talk about a variety of topics related to data warehousing, business intelligence, application integration, and database technology. Currently my special interests include data virtualization, NoSQL technology, and service-oriented architectures. If there are any topics you'd like me to address, send them to me at rick@r20.nl.

About the author

Rick is an independent consultant, speaker, and author, specializing in data warehousing, business intelligence, database technology, and data virtualization. He is managing director and founder of R20/Consultancy. An internationally acclaimed speaker who has lectured worldwide for the last 25 years, he is the chairman of the successful annual European Enterprise Data and Business Intelligence Conference held in London. In the summer of 2012 he published his new book Data Virtualization for Business Intelligence Systems. He is also the author of one of the most successful books on SQL, the popular Introduction to SQL, which is available in English, Chinese, Dutch, Italian, and German. He has written many white papers for various software vendors. Rick can be contacted by sending an email to rick@r20.nl.

Editor's Note: Rick's blog and more articles can be accessed through his BeyeNETWORK Expert Channel.

January 2013 Archives

In this series of blogs I'm answering common questions on data virtualization. In this blog I address the question: "Do data virtualization products support data security?" In most cases what is meant is whether these products can hide some of the data that is being made available through the data virtualization server.

The issue is the following. When a data virtualization server has been connected to many different data sources and a user has access to that data virtualization server, he potentially has access to a vast amount of data. Should he be allowed to access all that data, or should certain pieces be hidden? For most organizations the answer is that users are not always allowed to access or change all that data.

All data virtualization servers support a form of data security we usually call authorization. Authorization rules can be defined to control which user is allowed to access which data elements. This is somewhat similar to assigning privileges to users with the GRANT statement in SQL. The following types of privileges are normally supported by data virtualization servers: read, write, execute, select, update, insert, and grant.

Privileges can be granted on the table level, the column level, the row level, and the individual value level. Table-level and column-level privileges are supported by all data virtualization servers. If a user receives a table-level privilege, he can see or access all the data in that table. When the user receives the privilege on only a set of columns, the other columns stay hidden.
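To make the table- and column-level model above concrete, here is a minimal sketch of how such authorization rules could be evaluated. All names (users, tables, columns, the rule store) are hypothetical; real data virtualization servers implement this internally with their own rule definitions.

```python
# Sketch of table- and column-level authorization rules.
# "*" stands for a table-level privilege (all columns visible).

PRIVILEGES = {
    # (user, table) -> set of columns the user may read
    ("manager", "CUSTOMERS"): {"*"},
    ("analyst", "CUSTOMERS"): {"CUSTOMER_ID", "CITY"},  # other columns stay hidden
}

def visible_columns(user, table, all_columns):
    granted = PRIVILEGES.get((user, table), set())
    if "*" in granted:
        return list(all_columns)          # table-level privilege: see everything
    return [c for c in all_columns if c in granted]

columns = ["CUSTOMER_ID", "NAME", "CITY", "PHONE"]
print(visible_columns("manager", "CUSTOMERS", columns))  # all four columns
print(visible_columns("analyst", "CUSTOMERS", columns))  # only CUSTOMER_ID and CITY
```

In SQL terms, this corresponds to granting SELECT on a whole table versus on a named list of columns with the GRANT statement.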

In some situations, authorization rules have to be defined on a more granular level, namely on individual rows. Imagine that two users may query the same virtual table, but they are not allowed to see the same set of rows. For example, a manager may be allowed to see the data of all the customers, whereas an account manager may only see the customers for whom he is responsible. The effect of row-level privileges is that two users retrieving data from the same virtual table may see different sets of rows.
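The manager-versus-account-manager example can be sketched as a per-user filter predicate applied to the rows of a virtual table. The users, rows, and predicates below are invented for illustration; they are not a vendor API.

```python
# Sketch of row-level privileges: two users query the same virtual table
# but see different sets of rows.

CUSTOMERS = [
    {"customer_id": 1, "name": "Jones", "account_manager": "smith"},
    {"customer_id": 2, "name": "Baker", "account_manager": "lee"},
    {"customer_id": 3, "name": "Adams", "account_manager": "smith"},
]

ROW_RULES = {
    "manager": lambda row: True,                               # sees every customer
    "smith":   lambda row: row["account_manager"] == "smith",  # only own customers
}

def query(user, rows):
    predicate = ROW_RULES.get(user, lambda row: False)  # default: no access
    return [row for row in rows if predicate(row)]

print(len(query("manager", CUSTOMERS)))  # 3
print(len(query("smith", CUSTOMERS)))    # 2
```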

The most granular form of a privilege is a value-level privilege. This allows for defining privileges on individual values in a row. The effect is that some users have access to particular rows, but they won't see some of the values in those rows, or they only see a part of the values. Defining value-level privileges is sometimes referred to as masking.
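Masking, as described above, can be sketched as a per-column rule that hides a value entirely or reveals only part of it. The rule names (`hidden`, `last4`, `full`) and the rule store are hypothetical; actual products configure this through their own administration tools.

```python
# Sketch of value-level privileges (masking): a user may see a row,
# but some values in it are hidden or only partly visible.

def mask_value(value, rule):
    if rule == "hidden":
        return None                   # value completely hidden
    if rule == "last4":
        return "****" + value[-4:]    # show only the last four characters
    return value                      # "full": no masking

MASK_RULES = {"analyst": {"credit_card": "last4", "salary": "hidden"}}

def apply_masking(user, row):
    rules = MASK_RULES.get(user, {})
    return {col: mask_value(val, rules.get(col, "full")) for col, val in row.items()}

row = {"name": "Jones", "credit_card": "4929123412341234", "salary": "90000"}
masked = apply_masking("analyst", row)
print(masked["credit_card"])  # ****1234
```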

To summarize, data virtualization products offer a rich set of data security options. Besides the authorization rules mentioned above, encryption is usually supported for messages that are sent and received.

For more information on data security and for more general information on data virtualization I refer to my book "Data Virtualization for Business Intelligence Systems."

Note: If you have questions related to data virtualization, send them in. I am more than happy to answer them.


Posted January 25, 2013 1:19 AM
Permalink | No Comments |
In this series of blogs I'm answering popular questions on data virtualization. In this blog I address the question: "Do data virtualization products support updates and transactions?" Sometimes this question is phrased as "When I use data virtualization, can I only query the data sources, or can I also change the data in the underlying data sources?"

In most situations, data virtualization is deployed for accessing data stored in data warehouses, staging areas, and data marts. Data is read from those data sources by the data virtualization servers, but is not updated, inserted, or deleted. And because, currently, data virtualization servers are predominantly used in BI and reporting environments, some people may get the impression that these products cannot update, insert, or delete data. This is not true.

Some of the data virtualization products were initially designed to be deployed in SOA environments, where they could be used to simplify the development of transactions on databases and applications. So, although the focus of some vendors has shifted to BI environments, the ability to run transactions still exists. Most products allow data in data sources (even non-SQL-based sources) to be changed. The data virtualization server handles all the transactional aspects of those changes. They even support distributed transactions: when data in multiple data sources is changed, those changes are treated as one atomic transaction. And they support heterogeneous distributed transactions: data in different types of data sources can be changed within one transaction.
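The atomicity described above is typically achieved with a protocol such as two-phase commit: first every data source prepares the change, and only if all of them succeed is the change committed everywhere. The sketch below is a heavily simplified illustration of that idea; the classes and methods are hypothetical, not any vendor's API.

```python
# Simplified two-phase commit: changes to multiple data sources are
# treated as one atomic transaction.

class DataSource:
    def __init__(self, name):
        self.name, self.committed, self.pending = name, [], []
    def prepare(self, change):
        self.pending.append(change)   # stage the change; may raise on failure
    def commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()
    def rollback(self):
        self.pending.clear()

def distributed_transaction(sources, change):
    try:
        for s in sources:             # phase 1: prepare on every source
            s.prepare(change)
    except Exception:
        for s in sources:             # any failure: roll everything back
            s.rollback()
        return False
    for s in sources:                 # phase 2: commit on every source
        s.commit()
    return True

warehouse, crm = DataSource("warehouse"), DataSource("crm")
distributed_transaction([warehouse, crm], "update customer 42")
print(warehouse.committed, crm.committed)
```

Either both sources end up with the change committed, or neither does; that is what makes the transaction atomic across heterogeneous sources.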

Evidently, data can only be changed, and distributed transactions can only be executed, on data sources that support such functionality. For example, data can't always be changed if the underlying data source is a web service, a Word document, or an external data source: either the source itself cannot be changed, or you're simply not allowed to change it.

To summarize, data virtualization servers allow data in the data sources to be changed, and they can guarantee the correct handling of transactions. This makes them suitable for, for example, creating a modular set of services that can be used by applications to change the data. These services hide where data is stored, whether the data has to be changed in multiple systems, how the data is stored, and so on. Note, though, that support for updates and transactions differs between data virtualization servers.

For more information on updates and transactions and for more general information on data virtualization I refer to my book "Data Virtualization for Business Intelligence Systems."

Note: If you have questions related to data virtualization, send them in. I am more than happy to answer them.


Posted January 23, 2013 7:30 AM
Permalink | No Comments |
Many people still associate the company Quest Software with Java and C# program development. Products such as Toad for Oracle and Toad for SQL Server have been and still are very popular among programmers. What many do not know is that Quest, now owned by Dell, has a very powerful set of business intelligence tools, including Toad Data Point and Toad Decision Point.

Toad Data Point is a cross-platform query and reporting tool designed for data analysts. It allows for visual query building and workflow automation. In addition, and this is a strength of Quest's BI portfolio, it offers very strong data connectivity capabilities. Besides being able to access many classic SQL databases through standard ODBC/JDBC drivers, it comes with a client version of Quest Data Services. Quest Data Services can be seen as a modern data federation product. It simplifies access to data sources and allows users to join data from multiple data sources, including data from Hadoop via Hive.
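The core idea of data federation described above, joining rows from heterogeneous sources as if they formed one table, can be sketched as follows. The two Python lists stand in for, say, a SQL database and a Hive table; the function and data are invented for illustration and have nothing to do with Quest's actual implementation.

```python
# Sketch of a federated join: rows from two different data sources
# are combined as if they came from a single table.

sql_orders = [                                  # stands in for a SQL database
    {"order_id": 10, "customer_id": 1, "amount": 250},
    {"order_id": 11, "customer_id": 2, "amount": 100},
]
hive_customers = [                              # stands in for a Hive table
    {"customer_id": 1, "name": "Jones"},
    {"customer_id": 2, "name": "Baker"},
]

def federated_join(left, right, key):
    index = {row[key]: row for row in right}    # build a lookup on the join key
    return [{**l, **index[l[key]]} for l in left if l[key] in index]

result = federated_join(sql_orders, hive_customers, "customer_id")
print(result[0]["name"], result[0]["amount"])  # Jones 250
```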

Toad Decision Point is a cross-platform reporting, analytical, and visualization tool designed for decision makers. It's a highly graphical environment. It also comes with a client version of Quest Data Services. However, with this version of Quest Data Services a wider range of data sources can be accessed, including applications such as Salesforce.com, and NoSQL data sources such as Hadoop, MongoDB, and Cassandra. Some of these NoSQL databases support non-relational concepts, such as super-columns. Quest Data Services makes it possible to access those non-relational concepts in a relational style. This opens up all the valuable data stored in these systems to a large set of tools, reports, and users.

The third core product of Quest's BI stack is Toad Intelligence Central, a server version of Quest Data Services (the one supported by Toad Decision Point). Because it runs on a server, it makes data source access much more efficient, and it lets users and reports share specifications. The IT department can enter and manage these centralized specifications, making it easier for users to develop and maintain their reports.

In short, the Quest BI portfolio is definitely worth studying. Especially now that it has the backing of Dell, the product set could become a serious challenger to the established BI vendors.


Posted January 11, 2013 8:28 AM
Permalink | 2 Comments |