
Blog: Ronald Damhof

Ronald Damhof

I have been a BI/DW practitioner for more than 15 years. In the last few years, I have become increasingly annoyed - even frustrated - by the lack of (scientific) rigor in the field of data warehousing and business intelligence. It is not uncommon for the knowledge worker to be disillusioned by the promise of business intelligence and data warehousing because vendors and consulting organizations create their "own" frameworks, definitions, super-duper tools etc.

What the field needs is more connectedness (grounding and objectivity) to the scientific community. The scientific community needs to realize the importance of increasing their level of relevance to the practice of technology.

For the next few years, I have decided to attempt to build a solid bridge between science and technology practitioners. As a dissertation student at the University of Groningen in the Netherlands, I hope to discover ways to accomplish this. With this blog I hope to share some of the things I learn in my search and begin discussions on this topic within the international community.

Your feedback is important to me. Please let me know what you think. My email address is Ronald.damhof@prudenza.nl.

About the author >

Ronald Damhof is an information management practitioner with more than 15 years of international experience in the field.

His areas of focus include:

  1. Data management, including data quality, data governance and data warehousing;
  2. Enterprise architectural principles;
  3. Exploiting data to its maximum potential for decision support.
Ronald is an Information Quality Certified Professional (International Association for Information and Data Quality; one of the first 20 to pass this prestigious exam), a Certified Data Vault Grandmaster (the only person in the world to have this level of certification), and a Certified Scrum Master. He is a strong advocate of agile and lean principles and practices (e.g., Scrum). You can reach him at +31 6 269 671 84, through his website at http://www.prudenza.nl/ or via email at ronald.damhof@prudenza.nl.

I have written numerous times about efficient and effective ways of deploying data in organizations. How to architect, design, execute, govern and manage data in an organization is hard and requires discipline, stamina and courage by all those involved. The challenge increases exponentially when scale and/or complexity increases.

Usually a lot of people are involved, and a common understanding is vital but, unfortunately, often lacking. As a consultant it is a huge challenge to get management and execution on the same page, while respecting the differences in responsibilities and in the level of expertise regarding the field we are operating in.

An abstraction functions as a means of communication across the organization: an abstraction to which the majority can relate and that can be used to architect, design, execute, govern and manage the field that is at stake.

For data deployment I came up with the so-called '4 Quadrant model' (4QM).

Posted August 16, 2013 5:53 AM

W. Edwards Deming once said: "The customer is the most important part of the production line."

In terms of engineering information assets (getting relevant data to the end user and doing valuable stuff with it) this still holds great value, but... there is a misconception I would like to point out: the misconception that users in an organization are 'customers'. I hear that quite a lot and I find it disturbing and dangerous.

Please, please, please... do not... do not ever treat the recipients or users of information assets within your organization as customers. These so-called customers are paid by the same people as you, whether you are an engineer, designer, architect, manager or the freakin coffee machine. You and this so-called customer are part of an organisation....

According to Wikipedia, an organisation is a social entity that has a collective goal and is linked to an external environment.

... joined by collective goals - that is why you are organised in a single entity called 'organisation'. Customers are those people or entities in the external environment - outside the organisation. Got it?

These users, who are sometimes referred to as 'customers', are, as Deming intended, a vital part of the production process of information assets. But with users there is more.... They need to be involved and are as accountable as anyone in the organisation involved in making relevant information assets with scarce resources.

These users cannot say 'I am a customer, thou have to listen to me' or 'the data sucks, you gotta fix it' or 'the data does not reconcile, I will never use this again' or 'I know we have tool X, but I just acquired tool Y'. It is a joint effort with joint accountabilities between engineer and user, where both are joined in the common goals of the organization.

So, if you are a BICC manager, an ETL developer, a data modeler, a BI consultant or whatever, and someone comes up to you saying 'I am a customer and I want X' - you know what to say.

Stop treating internal users of information assets as customers. 


Posted June 23, 2013 4:21 AM

This blog post has been inspired by Martijn Evers' blog post, by Barry Devlin's work on Business Integrated Insight and by some very forward-thinking customers. The topic has been bugging me for some time now.

Back in 2005 I gave masterclasses on 'Data Warehousing in Depth' at the knowledge institute CIBIT in the Netherlands. One part of the class was pondering the future of data warehousing. It was my favorite part, and I remember that I always reminded the class of the root reason for a data warehouse: a physical construct to overcome deficiencies like performance, history, auditability, data quality, usability, ...

And I remember drawing two circles: one for data in the operational environment and one for data in the informational environment. Both represented physical environments; in other words, the data needed to physically move from the operational to the informational environment.

What would be necessary to blend the two environments, I then asked:

  • "Sources need to maintain history and adhere to strict auditability rules"
  • "More and faster hardware and parallelization options in both hard- and software"
  • "A common vocabulary of data, reflected in the operational systems"
  • "Enterprise-wide data like master- and reference data needs to be maintained pro-active (upstream) and centrally"
  • ...

These are all still very valid points, but I would add a more profound one:

Data Virtualization gurus preach to us the Information Hiding¹ pattern, a pattern perfectly suited for decoupling information systems and their data. These Data Virtualization guys and gals say that the software that supports this data virtualization is the new pinnacle for decoupling operational and informational environments and makes the Data Warehouse² (eventually) obsolete.

My opinion: 'they' are right and they are wrong. Yes, Information Hiding is a pattern (there are more) that enables decoupling of the informational environment and the operational environment. But this distinction is somewhat flawed - it is a distinction that originated in the early 90's and reflected the deficiencies of using registered data for decision support activities.

We might wanna make an effort in trying to get rid of these deficiencies....

In the last 20-30 years, the data-model and its instantiations (the data) were directly based on the information systems. The data-model and the data were by-products. The data-model fitted the information system. The informational environment was born to overcome the deficiencies that came with this approach.

I want to make a plea for shifting this process-oriented thinking and designing to data-oriented thinking and designing. Make the data smarter instead of making countless little/big information systems with their own 'data-store'. It is not that odd; information systems and the business processes they support are so much more susceptible to change than the data is. Data is - in its very nature - extremely stable over time.

This plea is not new: start with an Information Model of your business, construct a conceptual model and slowly design your way down to the logical model, the physical model and/or the canonical model³. Information systems are now made to fit the data architecture and not vice versa. These information systems are somewhat decoupled from the data architecture. These information systems need to use the Information Hiding pattern.

Now, the principle of Information Hiding and the technology that can handle this pattern can flourish. Data Virtualization technology can be used to its full potential, but the same applies to BPM technology or technology which is based on service oriented architectures.
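To make the Information Hiding pattern a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the names `CustomerData`, `RowStore`, `ColumnStore` and `greeting` are hypothetical and do not come from any data virtualization product. The point it tries to show is that a consumer depends on a small interface, while the way the data is physically stored stays hidden and can change.

```python
from abc import ABC, abstractmethod


class CustomerData(ABC):
    """Hypothetical interface: the only thing consumers may depend on."""

    @abstractmethod
    def name_of(self, customer_id: int) -> str:
        """Return the customer's name, or raise KeyError if unknown."""


class RowStore(CustomerData):
    """Hides a row-oriented layout (a list of records)."""

    def __init__(self) -> None:
        self._rows = [{"id": 1, "name": "Acme"}, {"id": 2, "name": "Globex"}]

    def name_of(self, customer_id: int) -> str:
        for row in self._rows:
            if row["id"] == customer_id:
                return row["name"]
        raise KeyError(customer_id)


class ColumnStore(CustomerData):
    """Hides a column-oriented layout (parallel lists)."""

    def __init__(self) -> None:
        self._ids = [1, 2]
        self._names = ["Acme", "Globex"]

    def name_of(self, customer_id: int) -> str:
        try:
            return self._names[self._ids.index(customer_id)]
        except ValueError:
            raise KeyError(customer_id) from None


def greeting(data: CustomerData, customer_id: int) -> str:
    """A consumer (a report, say) that knows only the interface."""
    return f"Hello, {data.name_of(customer_id)}"


if __name__ == "__main__":
    # Swapping the storage layout does not change the consumer at all.
    for store in (RowStore(), ColumnStore()):
        print(greeting(store, 2))
```

Both stores give the consumer the same answer, so the storage design can be replaced without touching `greeting`. That is the decoupling the pattern is after, whichever technology is used to implement it.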

So, now we have the following list of requirements that are necessary to blend the operational and informational environments:

1. A data-oriented design and architecture of information systems, as opposed to a process-oriented design and architecture

2. "Sources need to maintain history and adhere to strict auditability rules"

3. "More and faster hardware and parallelization options in both hard- and software"

4. "A common vocabulary of data, reflected in the operational systems"

5. "Enterprise-wide data like master- and reference data needs to be maintained pro-active (upstream) and centrally"


And I will add another three:


6. Centralized design and maintenance of business rules

7. Data security and privacy law and regulations are enforced on the data-level.

8. An organizing framework for establishing strategy, objectives, and policies for (corporate) data⁴


It is impossible to be complete in this list; there are indeed many more, but this is a blog... not a book ;-)

The criteria mentioned can be mapped to a discipline that is about to reach a critical mass in terms of body of knowledge, technology support, rigor in science and relevance in practice: Data Management and Data Governance.

Back in 2005, in my masterclasses, I mused over the future of data warehousing. The future is now, the journey will not be easy, but the rewards are substantial. It truly is the future of the Sense and Respond⁵ organization.



1 David L. Parnas, 1972, On the Criteria To Be Used in Decomposing Systems into Modules

2 Data Virtualization vendors often claim to make the Enterprise Data Warehouse obsolete (referring to the Boulder BI Brain Trust meeting at the beginning of this year, where a vendor made this claim). They confuse a technology (data virtualization software) with an architectural construct.

3 A canonical model is a design pattern used to communicate between different data formats. It needs to be based on the logical model of the organization.

4 Jill Dyche and Evan Levy. Note from the blog post author: I adapted the quote by adding brackets around 'corporate'.

5  Stephen H. Haeckel, 1999, Adaptive Enterprise: Creating and Leading 

Posted January 22, 2013 4:53 AM

There is something going on for some time now, decades even. It all started with the arrival of the Internet where people voluntarily contributed data to, well, everyone who was interested. Data about themselves, their relationships, their adventures, their careers. This data was shared with consent of the owner of the data - although not everyone knew what the data was used for. So one might say that there was consent, but not informed consent.

Let's take it a step further and imagine....

What if our data - generated by others - was given back to us and we could consent, in an informed manner, to sharing this data for the greater good? For example: my tax information is my data; it is about me and I want to decide whether or not others can use this data. Or suppose I can get hold of my location/GPS data, showing all my movements. Or my point-of-sale data from the grocery store, showing my eating patterns. Suppose I can even get hold of the data of the last MRI I took, my genome data, the data of my last blood test or even the data of a clinical test I was in?


What if I could decide to contribute this data (consensually) for the public good, where my privacy was still being honoured? What if dozens of people would decide that? What if millions of people would decide that? Clinical research would never be the same again. We would be able to scan for patterns in seas of data consisting of environmental data and healthcare data. No more clinical trials with just 2000 people and ever smarter statistics. In this setting the healthcare specialists, the quants, the sociologists and the behavioural scientists would have an unprecedented test bed of data. Is there a correlation (or even causality) between aspects of travelling, career, eating patterns, social status and cancer? Suppose even several generations would contribute their data; what would that mean for clinical research? Mindblowing....

In the above I discussed data that was about myself, so I should be the one who decides whether or not to share it. But what about data that is ours? The government heavily sponsors research in many countries: research on biology, behavioural science, economic science, climate science, etc. Shouldn't the data generated by this research be public domain? I think it should...

What about data created by government - which is us? Data about import and export movements, data regarding employment, schooling, law enforcement, crime, etc.


What would this democratization of data mean with regard to innovation? I think it would truly ignite a burst of possibilities and a huge potential for our general wellbeing. And no - I am not referring to the challenge of marketing handbags to middle-aged ladies (quote somewhat paraphrased from Neil Raden).

No, set the data free to go for the real challenges we face; decreasing poverty, climate control, improving healthcare, scarcity of resources, economic stability and decreasing crime.

This blog post is hugely inspired by John Wilbanks (google the guy!), by all the Open Data initiatives of the world where governmental agencies free up their data, and by the technological possibilities of data storage, data deployment, data enrichment, data visualization and advanced analytics. And finally... this blog post is inspired by a deeply felt wish and conviction that our field of knowledge (data management and data utilisation) can make a contribution to a better place for us to live in.

Posted January 4, 2013 8:50 AM

I have always been fascinated by the true origins of modern-day phrases or trends in my domain - Information Management, data management in particular. It is like a challenge I give to myself, a puzzle waiting to be solved. Why, you say? Well, Aristotle said it already:

'If you would understand anything, observe its beginning and its development'.

I tend to first collect the modern-day writings about it, mostly by practitioners. Then I go to the online science libraries and browse through ACM journals, MIS Quarterly journals, European Journal of Information Systems, IBM Systems Journal, Decision Support Systems, Journal of Management Information Systems and lately the journal of Data and Information Quality Research. And I am forgetting a whole lot. But, since the field of information management is a relatively young science, I tend to eventually end up in the more or less classic science domains: psychology, mathematics, engineering, etc. If I took it any further I would probably end up with philosophy and theology ;-) and discover the meaning of life....

Being on such a quest is like opening up an unprecedented series of presents given to me by brilliant men and women. There is so much out there that can easily be applied to other domains, for example, the information management domain.

With 'Data Quality' the same applied. I started with the books of Thomas Redman (aka the data doc) and, of course, Larry English, Danette McGilvray, David Loshin and Jack Olson; Arkady Maydanchik cannot be missed either. And one cannot overlook the books written by Yang Lee, Richard Wang and Leo Pipino. The majority of these books, however (with the exception of Lee, Wang and Pipino), lack scientific rigor, the kind of Design Research approach introduced by Alan Hevner in 2004 (published in MIS Quarterly). And although this type of research is relatively young, there are many scientifically based papers out there that more or less adhere to several of the Design Research prerequisites that aim for scientific rigor and relevance in practice.

Since 2004 many papers on data quality have been published that are really precious to me, but that just was not good enough for me. I had not reached the true origins yet, so I felt. So I broadened the scope to 'Quality' in general. Quality in a manufacturing/engineering/services context pointed me in the direction of Shewhart, Deming, Juran, Crosby, Feigenbaum, Ishikawa, and also Peter Drucker. Boy - did I enjoy the writing of these guys (sorry, they were all men).

However, I slowly digressed into various domains that opened up Pandora's box: the domain of coping with change, management theory, decision theory, group processes, system theory, system dynamics and much more. And although I studied economics at a university, this was all new to me.

Still not sure whether I was not paying attention back in college or my university just sucked.

In between I entered the field of Quality Software Management. Not that odd, I would say; on an abstract level one might argue that it is the sum of the above combined with software engineering and my own professional domain and the projects I undertook. Back then I felt (and I still do) that Gerald (Jerry) Weinberg seemed to have captured the soul of all these quality people combined with system theory, system dynamics, software engineering, a profound human perspective and a keen view on leadership and management (and why many current management models are simply dysfunctional).

If anyone wants to really go on a quest regarding 'agile software development': do not bother, start by reading the books of Jerry Weinberg. You will not find the word 'agile', but you will recognize it.

These books (and he wrote a whole lot) put me on a roller-coaster (which I am still on) that included exploratory testing, self-organizing teams, leadership, Kanban/Scrum/XP, CMMI, Six Sigma, etc...

I have so many books now, so many papers, so many subjects, so many loose ends.....it is ridiculous.

And it all started with 'data quality'....

Am I done yet?

Hell no

Will I ever be done?

Hell no

Is it fun?

Hell yes

I need a second life, and a third...

Posted December 15, 2012 3:50 AM