
A Classification Scheme for Data

Originally published April 1, 2010

Rick van der Lans introduces a classification scheme that would be suitable for indicating what types of data are stored in which data sources.


Posted April 12, 2010 by Karien Verhagen

You are right, Rick. Your study is a brave start, but it needs additions.

A complication of your classification is that the difference between factual and descriptive data is not that obvious. Some years ago, an article called "A fact is a Fact when stated Fact" explained that, for instance, a policy can be a fact (for intermediaries and policy vendors) or a dimension (for a damage claim). I can look it up for people who are interested. Mail me at forbisscholing@cs.com.

Secondly, there is a large body of "composed" data. Fact data usually have attributes like start time or end time. The completion time is very important for evaluating whether transport is effective, but it is recorded nowhere. On the spot you could quickly calculate the completion time, but how about "overall profit" or "margin" over the first three months, which involve all kinds of often conditional calculations over billions of small transactions? In other words, the constructions that accountants are so interested in these days for compliance and transparency reasons? In your description of where to find what, I see no place to store this kind of data.

You are right that there should be a well-guarded place to store budgets, plans, and forecasts in a better, more reliable way. How can BI facilitate the evaluation of the effectiveness of measures taken if the rules against which the facts are measured are not safely stored in a controlled way, with their own change history? Measuring against constantly changing or adaptable spreadsheets? I think not!


Posted April 7, 2010 by


Great article! It nicely integrates different views on data. I think the classification scheme is of significant added value when designing an enterprise architecture. Common EA frameworks (like TOGAF and Capgemini's IAF) distinguish four domains: business, information, application and technology architecture. The information architecture typically models the information and data that is used by business services (i.e. the business architecture) and provided by applications (i.e. the application architecture). At the information architecture level, the classification scheme can be used to provide a logical grouping of data. This will enhance the understanding of the kind of data and information an enterprise needs.

Another important question is how each type of data is to be supported by applications. The proposed classification scheme introduces a new way of defining applications, i.e. applications that specifically address certain types of data. At the application level, it is my belief that applications should be divided into 'data-oriented' and 'user-oriented' applications. Data-oriented applications provide data services to other applications (data-oriented or user-oriented). MDM systems are typically data-oriented applications. A data warehouse, when seen as an application, can also be regarded as a data-oriented application. The classification scheme provides additional criteria for defining data-oriented applications in particular, e.g. a predictive data application containing sales forecasts that are provided to other applications. I need to think this one over, but the idea of using your scheme for this purpose is challenging! I think we should think about guiding principles that determine when a particular type of application containing a particular type of data (according to your scheme) should be defined in the application architecture.

Finally, guidelines or principles are required that prescribe when the classification of data changes. From actual to historical is rather straightforward, but when does predictive data become actual, e.g. when it's baselined?

Good stuff!

Wouter van Aerle
