My friend Bill recently wrote an article on the Time-Value of Information, in which he declared that the value of information decreases exponentially over time. I have no argument there when it comes to data sets that are not metadata. Where metadata is involved, however, I believe that the value of some metadata actually increases with utilization. Conversely, metadata that is not utilized drops in value like a stone in water. In this entry we dive into attribute-based valuation, and begin exploring a hard method for finding and assigning value.
If you examine the trends in SOA and Master Data Management, you might quickly discover that master metadata is fairly useless without definition: how to use it, when to use it, where to use it, what it means, whether it was aggregated, how it was aggregated, and where it came from. Each of these helps to define the element from a corporate perspective. You may also quickly discover that while corporate perspectives are nice and necessary, they don't necessarily fit the bill for your division, line of business, or subject area of expertise.
At that point, you're left wondering about the data itself and its relevance to the piece of the business that you manage. Now, if there's no metadata at your level, will the master data be of value to you and your decision making, even if it is "fresh or current and quality cleansed"? It might, depending on how intuitive the data set is, and how widely the data is utilized and defined as a standard. But if the metadata is missing, or your understanding of the data set is missing, then the value of the data could very well be zero, because you can't apply data to aggregations, predictions and answer sets unless you understand how to use it and what it is defined as.
For instance, take a dollar figure defined as gross sales (nothing more). Gross sales for your line of business or your unit may be what you're looking for, but if there's no additional metadata, you may not know if this is gross sales for the corporation or gross sales from a financial perspective, or gross sales from a "prospect only" perspective. Without additional metadata, the gross-sales figure cannot be utilized in your report, therefore it is useless (zero value) to you.
I would argue that at this point, metadata is the most valuable piece of information to you, and that metadata which is kept current with the business is the single most valuable asset that the business has, and keeps the "straight line" of valuation that Bill has written about. I would even argue that the metadata valuation is 2x that of the data, because with centralization, common services for data distribution, and web services (SOA), metadata really becomes the keys to the kingdom for everyone who touches or accesses that information.
I believe that Master Metadata (enterprise metadata for those of you out there) has a straight line time-valuation (again as long as it is kept current with the enterprise definitions). Master Data is mostly valueless without metadata definitions and understanding - even if the metadata is implied within someone's head - and they understand how the business runs.
The next set of questions around Master Metadata and Master Data includes this one: how do you actually perform a valuation of your data set?
Here are a couple of items that give their authors' opinions on the matter; then I'll share mine with you.
Oil and Gas Exploration - Value of Information: In one of their slides they say that the value of their information is the difference between the project value with the information and the project value without the information. Well, that doesn't go a long way toward telling you HOW to calculate the value of information.
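To make that slide's definition concrete, here is a minimal sketch in Python. The function name and the dollar figures are my own assumptions for illustration; the slide itself only gives the "with minus without" idea, and estimating the two project values is still the hard part.

```python
def value_of_information(project_value_with_info: float,
                         project_value_without_info: float) -> float:
    """Value of information as the difference between the two project values."""
    return project_value_with_info - project_value_without_info


# Hypothetical project values, for illustration only.
voi = value_of_information(project_value_with_info=1_200_000.0,
                           project_value_without_info=950_000.0)
print(f"Value of information: ${voi:,.2f}")
```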
The Value of Information: In this article, written in 2002, they discuss how information gets a valuation in the first place. I would say that there are ways in which particular pieces of information can be assigned an estimated and assessed value based on importance to the organization, which of course is driven by utilization and metadata definition.
Surprisingly, quality of the data plays a tertiary role in the value of the asset. In my mind, poor quality data only means a poor quality decision, or the wrong decision - which could in theory cost you your job, or land you in court. Quality of the data is a goal, an intangible value that can only be measured after the outcome of the decision is known.
Here, a well-written article discusses at length why you want to measure ROI and the value of information, but it doesn't provide the how except at a 50k-foot level.
Now, here's one way to "measure value" of the data. I've blogged on this before (here), but even then I didn't go into detail.
Here's ONE way of looking at the hard value of information, but again, it relies on knowing and understanding the use of the data (master metadata). We cannot overlook the importance of master metadata, and the fact that without it, this case would make NO SENSE ($$) at all.
Suppose you are a credit card company, and your goal right now is to stimulate existing customers into spending more money. Your method of choice is direct, personalized marketing: a direct mail campaign. Now, what data is important (of value) to you in reaching your prospective customers? Let's say that, at a minimum, address, city, state and zip are most valuable - knowing where to send the direct mail.
Next on the list might be their name and gender - especially if you are personalizing. Followed by how much money they spent on their cards last year - while you might not print this information for direct mail, it might be utilized to decide which personalized package is sent to the customer. Ok, so we have the basic elements (or so we think).
What if, by accident, you don't consider AGE of an individual, what if you don't know the age of an individual? What happens if you accidentally send a huge incentive package to your best customer's daughter who's only 11 years old? What if your best customer becomes irate, and switches credit card companies - how does that affect your revenue stream? What's the value of that single piece of information in your decision process?
Ok - so what I'm getting at is this: Information valuation is only applicable based on the TIME at which the question is being asked, and the QUESTION that in fact is being answered by the data. In this case, a direct marketing campaign with highly specialized packages of data being sent out. Two key elements / attributes (among all that are mentioned) are the association of the child to the parent, and the age. Another key element is the address.
There could be a direct cost of not having this information. Let's say you don't have the address, and you send an offer to your best customer's best friend across the street, but not to your best customer. What happens when your best customer finds out they didn't get the offer? Worse yet, what if they wanted the offer, and were just about to spend a couple million dollars on their (your) credit card? What's the COST of doing business without that address? You've just found the value of the CELL of information.
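One hedged way to put a number on that cell is to treat its value as the revenue you expect to lose without it. The sketch below is my own illustration, not an established formula: the function name, the 25% chance of losing the spend, and the $2 million figure are all assumptions.

```python
def missing_cell_cost(revenue_at_risk: float, probability_of_loss: float) -> float:
    """Expected cost of deciding without one cell of information.

    revenue_at_risk: spend that depends on getting this cell right (assumed figure).
    probability_of_loss: estimated chance that spend is lost without the cell.
    """
    if not 0.0 <= probability_of_loss <= 1.0:
        raise ValueError("probability_of_loss must be between 0 and 1")
    return revenue_at_risk * probability_of_loss


# Hypothetical: the best customer was about to spend $2M, and we guess a 25%
# chance of losing that spend because the offer never reached them.
address_cell_value = missing_cell_cost(revenue_at_risk=2_000_000.0,
                                       probability_of_loss=0.25)
print(f"Value of the address cell: ${address_cell_value:,.2f}")
```

The same cell would get a different value for a different question, because both inputs depend on what the data is being used to decide.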
I think valuation fluctuates by row, and by cell (column) - based on the surrounding data set. I would say that you could compute an overall average cost for a missing element of information, and compound that cost through multipliers by putting your customers into market-basket analysis. The market baskets would change depending on the question. Again, an overall value (average) exists to all information, and then there is a specific valuation based on segmentation of data sets (even beyond customer).
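As a rough sketch of that idea, the Python below starts from an average cost per missing column and scales it by a multiplier that depends on the customer's basket and on the question being asked. Every name and number here is a made-up assumption for illustration; real averages and multipliers would have to come from your own campaign history.

```python
# Average cost of a missing value, per column (hypothetical dollar figures).
AVERAGE_MISSING_COST = {"address": 40.0, "age": 25.0, "gender": 5.0}

# Multipliers per question and per customer basket (hypothetical).
BASKET_MULTIPLIER = {
    "direct_mail": {"top_spender": 10.0, "occasional": 1.0},
    "fraud_review": {"top_spender": 2.0, "occasional": 1.5},
}


def cell_cost(column: str, basket: str, question: str) -> float:
    """Cost of one missing cell: column average times the basket multiplier."""
    return AVERAGE_MISSING_COST[column] * BASKET_MULTIPLIER[question][basket]


def row_missing_cost(row: dict, basket: str, question: str) -> float:
    """Sum the cost of every valued column that is missing (None) in the row."""
    return sum(
        cell_cost(column, basket, question)
        for column in AVERAGE_MISSING_COST
        if row.get(column) is None
    )


customer = {"address": "12 Elm St", "age": None, "gender": None}
print(row_missing_cost(customer, basket="top_spender", question="direct_mail"))
print(row_missing_cost(customer, basket="occasional", question="direct_mail"))
```

The same row costs more or less to leave incomplete depending on the basket and the question, which is the fluctuation by row and by cell described above.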
More to come on this. If you're interested, drop your questions into the comments below, and I'll do my best to answer them going forward.
CTO, Myers-Holum, Inc
Posted January 14, 2007 4:54 AM