The Requirements of the Data Delivery Platform
Originally published February 4, 2010
This article describes a list of requirements for the data delivery platform (DDP). The DDP is an architecture for developing business intelligence (BI) systems in which data consumers (such as reports and spreadsheets) are decoupled from data stores (such as data warehouses, data marts, and staging areas). The primary goal of this decoupling is greater flexibility. In the article The Definition of the Data Delivery Platform, published at BeyeNETWORK.com, the following definition for the DDP was presented:
The data delivery platform is a business intelligence architecture that delivers data and metadata to data consumers in support of decision making, reporting, and data retrieval, whereby data and metadata stores are decoupled from the data consumers through a metadata-driven layer to increase flexibility and whereby data and metadata are presented in a subject-oriented, integrated, time-variant, and reproducible style.
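As an illustration only, the decoupling named in this definition can be sketched in a few lines of Python. Everything in the sketch is an assumption made for the example and not part of the DDP definition: the names `Mapping`, `DeliveryLayer`, `fetch`, `remap`, and `customer_report` are hypothetical, and the data stores are simulated with in-memory dictionaries. The point is that the data consumer asks for a logical table name, and only the metadata knows which data store serves it.

```python
from dataclasses import dataclass
from typing import Dict, List

Row = Dict[str, object]


@dataclass(frozen=True)
class Mapping:
    """Hypothetical metadata entry: which store and table serve a logical name."""
    store: str
    physical_table: str


class DeliveryLayer:
    """Hypothetical metadata-driven layer between data consumers and data stores."""

    def __init__(self, stores: Dict[str, Dict[str, List[Row]]],
                 metadata: Dict[str, Mapping]) -> None:
        self._stores = stores
        self._metadata = dict(metadata)

    def remap(self, logical_table: str, mapping: Mapping) -> None:
        # Changing the metadata is the only step needed to switch data stores.
        self._metadata[logical_table] = mapping

    def fetch(self, logical_table: str) -> List[Row]:
        # Resolve the logical name through the metadata, then read the store.
        mapping = self._metadata[logical_table]
        return list(self._stores[mapping.store][mapping.physical_table])


def customer_report(layer: DeliveryLayer) -> List[str]:
    # A data consumer: it knows only the logical name "customer".
    return [str(row["name"]) for row in layer.fetch("customer")]


# Two simulated data stores holding the same customer data.
stores = {
    "warehouse": {"dw_customer": [{"id": 1, "name": "Acme"}]},
    "mart": {"dm_customer": [{"id": 1, "name": "Acme"}]},
}
layer = DeliveryLayer(stores, {"customer": Mapping("warehouse", "dw_customer")})

before = customer_report(layer)
layer.remap("customer", Mapping("mart", "dm_customer"))
after = customer_report(layer)
```

In this sketch the report code is identical before and after the remapping, which is the kind of flexibility the decoupling is meant to provide. A real DDP-based system would of course involve far more than a lookup table.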
To describe a particular concept as accurately as possible, a definition is required. A definition also helps to explain what that concept is. A clear definition can also prevent endless discussions, because different people might have different opinions about what that specific concept is.
However, even if the definition of a concept is perfect, it's hard to include all the requirements of that concept in its definition. For example, this is Bill Inmon's popular definition for the term data warehouse:
A data warehouse is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management's decision-making process.
Although this definition is widely used, and although most specialists have a good understanding of what is meant by all the terms used in the definition, it still leaves many questions unresolved. For example, should the tables in a data warehouse be normalized or not? Should all the tables be stored physically, or is a more virtual solution acceptable too? Should all the data be organized in tables at all? Does the term data include unstructured data? The definition doesn't answer those questions, which could lead to different interpretations, misunderstandings, and confusion.
Inmon is aware of this, and therefore in some of his other articles and books he does describe some additional requirements. For example, in his article What a Data Warehouse is Not, he writes: "A much more rational way to build the data warehouse is to use the relational model." This means that a requirement for a data warehouse is that its table structures are normalized and are not designed as stars or snowflakes. In other words, a database containing tables that are not properly normalized cannot be called a data warehouse. More requirements can be found in other books and articles.
Thus, it is not uncommon to supplement a definition with additional requirements in order to describe a concept in more detail. The main reason they are needed is that it's very hard to come up with a definition that describes exactly what a particular concept is, and that is still readable and not a full page long. This is definitely true for non-mathematical concepts such as a data warehouse. In a way, it's unfortunate that in our field we don't have more precise definitions, such as E = mc², but alas.
To avoid confusion and misconceptions with respect to the data delivery platform, this article lists the minimum requirements of a DDP-based system. The list should also give readers a more detailed understanding of what the DDP is and what it isn't. In addition, an organization that wants to develop its own DDP-based business intelligence system can use the list to see which minimum requirements that system has to meet.
The Requirements of the Data Delivery Platform
To bring some structure to the list of requirements, they have been classified into eight categories:
Non-Functional Requirements of the Data Delivery Platform
The list of requirements above does not include non-functional requirements such as those related to performance, availability, scalability, and concurrency. The reason is that in general these requirements will not determine whether something adheres to a particular definition. For example, whatever the definition of the word "car" is, even if my car is incredibly slow, it's still a car; in fact, even if it can't be driven anymore, it's still a car. Or, if my computer has crashed, it would still be called a computer because it would still adhere to the definition of computer and to the additional functional requirements. Similarly, if the query performance of a specific data warehouse is bad, it's still a data warehouse; a data mart with availability problems is still a data mart; and likewise, a DDP-based system is still a DDP-based system even if problems occur with aspects such as performance, availability, scalability, and concurrency. To summarize, in most cases non-functional requirements do not determine whether something conforms to a definition.
In most cases, non-functional requirements apply to specific solutions and are subjective. The car I drive is fast enough for me, but may be too slow for other drivers. The same applies to a DDP-based business intelligence system developed for a specific customer. Its query performance might be fast enough for them, but maybe it's too slow for another customer. It all depends on what the customer wants and needs. But whatever the performance is and whatever the customer thinks of it, it's still a DDP-based business intelligence system. It will be the responsibility of the vendors and the developers to build and deliver solutions that conform to the non-functional requirements demanded by customers and users.
Comments
I welcome any comments, remarks, and suggestions that help to improve and complete this list of requirements of the data delivery platform.
Copyright 2004 — 2020. Powell Media, LLC. All rights reserved.
BeyeNETWORK™ is a trademark of Powell Media, LLC