Originally published November 28, 2011
“Data ownership” is a relatively new term. I do not think it was much in vogue before 2000, though no doubt it was used somewhere prior to that. My friend Michael Scofield had written about data ownership in the late 1990s, but may have been ahead of his time.
Trying to get a definition of data ownership is not easy, and in most cases (but not all) the term is misleading. This stems from the basic concept of ownership, which means to have legal title and full property rights to something. If we accept this as the correct definition of ownership, then data ownership must mean having legal title to one or more specific items of data. However, this cannot be what is generally meant. If it were, then anyone assigned as a data owner could take the data they were told they owned and sell it. That would probably result in the individual being fired, or possibly even facing civil charges or jail time.
It is pretty self-evident, then, that data ownership does not have a literal meaning – at least not usually. So why are we using the term? I think the most probable reason is that data ownership is an analogy rather than a defined term.
We expect that if someone owns something they will take care of it – more so than a renter, for instance. However, this is not always true. If a person acquires a piece of property at no cost – say as a gift or by inheritance – they may not always take as much care of it as someone who had to work hard to afford to buy it. The person who receives free property often does not appreciate it as much as the person who had to work to get it. Hence, ownership has limits as an analogy because ownership does not always imply a high degree of responsibility.
There is another unspoken assumption in the analogy: Owners always know how to take care of their property. Yet this does not correspond to reality. For instance, first-time homeowners are often unaware of the full scope and detail of property management, even though they are eager to fulfill their responsibilities. Similarly, nobody would expect a child who inherited a house to be able to care for it adequately, no matter how much the child wished to do so.
Thus, ownership seems to be a poor analogy if the point in common between holding legal title (true ownership) and data ownership is an expectation of effective exercise of responsibility for something. Yet, it does seem to be this aspect of responsibility that is at the core of the analogy of data ownership.
A further problem with the analogy is that it is not uncommon to hear about people “owning” problems. This seems to imply both responsibility and accountability. In true ownership, it is difficult to see how an owner can be accountable for not looking after their property properly (unless it causes harm to someone else or damages their property). Furthermore, it is very difficult to think of problems as property, because a problem is something we want to get rid of. So this is yet another poor analogy, but it may affect the way we think about data ownership if we are focusing only on the problems in data. In other words, data ownership might be conceived as assigning responsibility and accountability for preventing and fixing problems in a particular set of data. Such a concept might be too narrow for successful data governance.
Interestingly, it is not always the owners who are claiming ownership. In my experience, those who are responsible for data governance “discover” data owners or assign data ownership. The so-called owners may never have really thought of themselves as such until they were told they were.
Here we come to a fairly significant problem. Suppose the individuals responsible for data governance have only a very nebulous idea of what data governance is or cannot conceive of a concrete set of governance tasks for every distinct class of data and for every mode of usage of data. Such individuals may be tempted to simply avoid having to identify and define a whole range of data governance tasks by telling other people they are data owners. This places the burden of figuring out what data governance is on the unfortunate data owners.
It is not as if we do not have the means to express precision about data governance tasks. RACI (Responsible, Accountable, Consulted, Informed) matrices are a very good tool to capture the details of who is involved in what role for each governance task. Furthermore, RACI matrices can easily be understood by business users. Those responsible for data governance would do better to generate detailed RACI matrices rather than simply identifying people as data owners.
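To make the contrast with a bare "data owner" label concrete, here is a minimal sketch of a RACI matrix held as a data structure. Everything in it is illustrative: the class name RaciMatrix, the task descriptions, and the job titles are hypothetical, and the completeness rule it checks (exactly one Accountable party and at least one Responsible party per task) is a common RACI convention assumed here, not something the article prescribes.

```python
# Illustrative sketch only: names, tasks, and parties are hypothetical.
ROLES = {"R", "A", "C", "I"}  # Responsible, Accountable, Consulted, Informed


class RaciMatrix:
    """Maps each governance task to the parties involved and their RACI role."""

    def __init__(self):
        self._cells = {}  # task -> {party: role}

    def assign(self, task, party, role):
        """Record that `party` holds `role` for `task`."""
        if role not in ROLES:
            raise ValueError(f"unknown RACI role: {role!r}")
        self._cells.setdefault(task, {})[party] = role

    def parties(self, task, role):
        """Return the sorted list of parties holding `role` for `task`."""
        cells = self._cells.get(task, {})
        return sorted(p for p, r in cells.items() if r == role)

    def problems(self):
        """List tasks that break the assumed convention:
        exactly one Accountable and at least one Responsible party."""
        issues = []
        for task in self._cells:
            accountable = self.parties(task, "A")
            if len(accountable) != 1:
                issues.append(
                    f"{task}: expected exactly one Accountable, "
                    f"found {len(accountable)}"
                )
            if not self.parties(task, "R"):
                issues.append(f"{task}: no Responsible party")
        return issues


# Hypothetical governance tasks for two classes of data.
define_codes = "Define allowed values for country code"
approve_redistribution = "Approve redistribution of vendor pricing data"

matrix = RaciMatrix()
matrix.assign(define_codes, "Reference Data Steward", "R")
matrix.assign(define_codes, "Head of Operations", "A")
matrix.assign(define_codes, "Application Support", "I")
matrix.assign(approve_redistribution, "Legal", "C")

for issue in matrix.problems():
    print(issue)
```

In this sketch the second task has been named but nobody has been made Responsible or Accountable for it, so problems() reports it. That is the kind of gap that stays hidden when someone is simply told they are the data owner.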
Yet there is one area where data ownership is real and has tremendous implications. This is in the area of data that is purchased as a service from data vendors. Perhaps this is most commonly seen in the financial services industry, where data on financial instruments, indexes, benchmarks, prices, and corporate actions has to be purchased from data vendors. Other classes of data can be sourced from vendors too – e.g., corporate structures. The licensing of such data has many issues around intellectual property. Basically, the data vendors regard the data as their property and limit redistribution and derivation in the licensing agreements.
Such restrictions present enormous data management challenges. It is not easy to prevent redistribution – after all, how do you know what downstream applications are doing with the data? The same is true of using vendor-supplied data to derive or compute other data. Yet the vendors do enforce their contracts and are always on the lookout for violations, some of which can result in hefty settlements.
The problem here is determining to what extent data is the property of a supplier, particularly if the data can be found in the public realm (which it often can be). It is true that the vendor is collecting the data, standardizing it, attaching metadata, and so on, but do these actions really make the data the property of the vendor? At the moment, it would seem that the vendors have the upper hand, but the costs to their clients are quite considerable and there is pressure being applied to the vendors. We will have to wait to see how this plays out.
This brief overview of data ownership shows that the term is rarely meant literally and is actually a misleading analogy. It can too easily provide cover for not thinking clearly and in a sufficiently detailed way about data governance. However, there is an important – though not extensive – area where true data ownership is a reality that presents a number of important issues that seem to be difficult to resolve. Hopefully, as data governance matures, the poor analogies of data ownership will be replaced with more detail and more clarity.
SOURCE: What is Data Ownership?
Recent articles by Malcolm Chisholm