Business Intelligence Network business intelligence resources

Blog: David Loshin

January 24, 2008

Costly Bundler Blunder

I got a postcard from Verizon today. It said:

"We recently sent you a letter in which we advertised a Verizon bundle package of Verizon FiOS Internet and Verizon FiOS TV service. This letter was mailed by mistake and the services described in the letter have never been offered by Verizon under those terms.

We apologize for this error.

Verizon Consumer Marketing"

OK, seriously, I am finding it hard to get my head around this. The offer came in one of those pseudo-overnight envelopes that marketers often use to make their letter seem more credible - you know, cardboard weight with a zip-pull - not cheap. So this company:

- Drafts a marketing letter,
- Prints tens of thousands of copies,
- Custom prints tens of thousands of fancy cardboard envelopes,
- Puts the letters into those envelopes, and
- Mails them.

Actually, I am guessing about the number - it could be orders of magnitude greater, for all I know.

I find it hard to believe that internal governance and control over marketing would not have stopped the process after the letter was drafted if it contained erroneous information, so I am curious as to what really happened. After all, by sending the earlier letter, the company actually did offer the services under those terms; perhaps, due to some error, it was simply not prepared to honor that offer.

In any event, my guess would be that there were some significant negative business impacts related to this bundle blunder - actual hard costs for materials and postage, as well as softer costs relating to organizational trust.

October 10, 2006

Data Quality in the News - 60 Minutes

This past week, 60 Minutes had a story on the (lack of) quality of the data on the national "No-Fly" list. Apparently (just as all of you thoughtful readers should have already expected), the list is rife with names of dead people (including 14 of the 19 9/11 hijackers) and people unlikely to be traveling (e.g., Saddam Hussein and convicted and jailed terrorist Zacarias Moussaoui). In addition, the limited identifying information on the list causes increased scrutiny of those whose identities are falsely matched to names on the list.

Actually, the deficiencies in the quality of the data managed by the Terrorist Screening Center had already been discussed in a report issued by the Department of Justice over a year ago, in which a "major Quality Assurance effort" was put underway to "ensure that records of highest priority for correction are addressed by a record-by-record search."
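
To make the false-match problem concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the record layouts, the field names, and the matching rules are hypothetical and are not taken from the 60 Minutes story or the Department of Justice report. It contrasts matching on name alone with matching on name plus a second identifying attribute (a birth date).

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ListEntry:
    """A hypothetical watch-list record; the real list's layout is not known here."""
    name: str
    birth_date: Optional[date] = None  # frequently absent when the list is sparse


@dataclass(frozen=True)
class Traveler:
    """A hypothetical traveler record supplied at check-in."""
    name: str
    birth_date: date


def normalize(name: str) -> str:
    """Lower-case and collapse whitespace so trivial spelling differences compare equal."""
    return " ".join(name.lower().split())


def name_only_match(traveler: Traveler, entry: ListEntry) -> bool:
    """Match on the name alone: anyone sharing the name is flagged."""
    return normalize(traveler.name) == normalize(entry.name)


def name_and_birth_date_match(traveler: Traveler, entry: ListEntry) -> bool:
    """Match on name, then use the birth date (when the list has one) to rule people out."""
    if not name_only_match(traveler, entry):
        return False
    if entry.birth_date is None:
        # Nothing but the name to go on, so the match cannot be ruled out.
        return True
    return traveler.birth_date == entry.birth_date
```

With a sparse entry (name only), a traveler who merely shares the name is still flagged by both functions; only when the list carries the extra attribute can the second function clear that traveler.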

July 21, 2006

Data Quality, Prescriptions, and Impact

Yet another article on impacts of poor data quality in healthcare. Apparently, medication errors (e.g., incorrectly transcribed prescriptions) kill 7,000 people and conservatively incur costs of $3.5 billion per year.

June 13, 2006

Sunopsis and Trillium: Another Data Quality and Data Integration Partnering

I was just emailed a press release telling me that Sunopsis, a data integration tools vendor, is now partnering with Trillium to provide data quality tools integrated with their integration suite. This is a nice development considering that Sunopsis had entered into an agreement with Similarity Systems not too long before Similarity was acquired by Informatica, effectively quashing the Sunopsis deal.

May 4, 2006

Reaction to Gartner on Data Quality

Last week Gartner released its "Magic Quadrant" for Data Quality Tools vendors, as reported in this news item. I wonder, though, with all the consolidation going on, and the focus on value-added applications that need to embed data quality technology (e.g., MDM, CDI, CRM, SCM, and other three-letter acronyms), whether the concept of "data quality" tools may soon be outdated? If data quality is imperative to the success of any data-oriented business application, then quality concepts must be architected into the fabric of the application development environment.

My prediction: infrastructure companies (RDBMS, Enterprise Architecture, Data Modeling tools, Application Development, Metadata Management Repositories, etc.) will soon be incorporating parsing, standardization, and linkage as part of their offerings.

April 18, 2006

Surprise! Poor Data Quality Costs a Lot of Money!

A recent study suggests the cost of poor data quality to Dutch business exceeds €400 million yearly. According to the article, the results of a survey of 20,000 Dutch organizations found that "the total amount of €400 million consists of costs that are calculated based on directly quantifiable aspects, such as wrongly addressed invoices and product deliveries which do not arrive at the right addresses."

February 8, 2006

Yet Another DQ Acquisition...

As I noted just days ago, in the shadow of Informatica's acquisition of Similarity Systems, it is reported that Firstlogic, one of the few remaining independent data quality tools vendors, is to be acquired by Business Objects. The $69 million cash transaction certainly seems to be much more appealing than the suitor in their previous engagement.

January 30, 2006

Informatica To Acquire Similarity Systems

As might have been predicted, last week Informatica announced that it was acquiring UK-based Similarity Systems, a provider of data quality tools. Similarity, itself recently in the acquisition business, must have long appeared to be a suitable candidate for Informatica, which watched IBM swallow up DQ and ETL tools vendor Ascential (hear more about that one in my interview with Scott McNabb!) early in 2005. As I discussed back when Firstlogic announced their (later aborted) sale to Pitney-Bowes, it has become fashionable for information movement companies to embed a data quality solution.

The announcement is good news for both Similarity and for Informatica. Similarity Systems is probably the current thought leader among European Data Quality tools vendors, having adopted a more process-oriented, full-cycle approach to information quality improvement. Informatica, long able to provide a full data quality solution, now acquires some of the pieces missing from its repertoire, along with additional focused expertise and potential for some degree of market expansion.

The next question: how will the acquisition affect Informatica's relationships with Firstlogic and Trillium?

January 23, 2006

Minnesota, Medicare, Data Quality, and Master Data

More controversy swirls around the new Medicare regulations. According to this Grand Forks Herald article, bad data associated with Medicare customer systems has resulted in a cost of over $2,000,000.00 to the state of Minnesota. According to the story, "In many cases, the massive and nationwide array of computer problems, bad data (emphasis mine) and overwhelmed Medicare and drug-plan phone lines prevented pharmacists from verifying that a customer was eligible for the deep subsidy - and sometimes unable to find out if that customer even was enrolled in a drug plan."

Two problems reported: bad data and inability to find out if that customer was enrolled.

The first problem is left unspecified, as if it is clear what makes the data bad. The second indicates a less-than-reliable master customer repository, which suggests that quality expectations were not well-specified before the law changed.

What is interesting, though, is that this is a good example of cost impacts that are both quantifiable and attributable to poor data quality (whether it be the nebulous "bad data" or the more precise master data management failure). The simple costs are the gaps in coverage, such as the $2.2M that MN is paying the pharmacists for the 38,000 claims (which works out to a little less than $60.00 per claim). The more important, yet more difficult to quantify cost involves the loss of confidence in the ability to provide low-cost medication to the people who need it most.

Moreover, the confusion doesn't really end there. In this article from the Seattle Times, there is talk of failure in communication and information exchange regarding covered medications, enrollment, administration, and workflow bottlenecks.

So here is my last comment, which is intended to reflect on those who constantly ask me for examples of ROI models for data quality. Had everything gone smoothly, with no data issues, there would not have been many incurred costs other than those to ensure high-quality data, which paradoxically implies that there is no measurable return on that investment. Perhaps we should stop trying to use ROI models and consider the fact that good planning and vigilance might provide ample, yet unremarkable, rewards?

December 13, 2005

Somebody Actually Reads This Stuff!

Hey - I was quoted in a recent article about finding value in "dirty data." Apparently the author, Hannah Smalltree, read my blog entry back in August on "Dirty Data and Embedded Knowledge," and decided to follow up on the concept with others in the DQ field, including Ted Friedman from Gartner, and Ramesh Menon from Identity Systems. Let me know what you think!

December 12, 2005

A Dream Case for DQ ROI

As a data quality practitioner who always preaches the value of determining return on investment, I relish the opportunity when a simple data quality problem has significant impact. Well, last week, my dream came true. As reported in December 9th's Washington Post, erroneously placed sell orders will end up costing a financial services firm approximately $225 million (yes, I said million).

Apparently, the company "mistakenly sold 610,000 shares of J-Com Co. at 1 yen (less than 1 cent) per share, instead of fulfilling a client's request to sell just one share at 610,000 yen ($5,080)." The problem occurred because of a data entry mistake that was not caught at the time the order was placed.

The impacts of this mistake bubbled through the Japanese market, sending the Nikkei average down 1.95% (wow).

Actually, the Tokyo Stock Exchange is admitting some responsibility as well. This will probably result in some government directives to improve their business processes to enable the determination of suspect transactions such as this one.
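
As a rough idea of what "determination of suspect transactions" could look like at order entry, here is a minimal Python sketch. It is not the exchange's or the brokerage's actual control. The `SellOrder` type, the `suspect_reasons` function, the 50% deviation threshold, and the reference price and shares-outstanding inputs are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SellOrder:
    """A hypothetical sell order; prices are in yen."""
    symbol: str
    shares: int
    price_per_share: float


def suspect_reasons(
    order: SellOrder,
    reference_price: float,      # assumed input, e.g. a recent traded price
    shares_outstanding: int,     # assumed input, supplied by the caller
    max_deviation: float = 0.5,  # hypothetical threshold: 50% away from reference
) -> List[str]:
    """Return the reasons an order looks suspect; an empty list means it passes."""
    reasons = []
    if reference_price > 0:
        deviation = abs(order.price_per_share - reference_price) / reference_price
        if deviation > max_deviation:
            reasons.append("price deviates from reference")
    if order.shares > shares_outstanding:
        reasons.append("quantity exceeds shares outstanding")
    return reasons


# The two orders from the quote above; the reference price and the
# shares-outstanding figure passed below are placeholders, not real market data.
intended = SellOrder("J-Com", shares=1, price_per_share=610_000)
mistaken = SellOrder("J-Com", shares=610_000, price_per_share=1)

print(suspect_reasons(intended, reference_price=610_000, shares_outstanding=10_000))
print(suspect_reasons(mistaken, reference_price=610_000, shares_outstanding=10_000))
```

Under these placeholder inputs the intended order passes and the transposed one is flagged on both counts, which is the point: a simple reasonableness rule applied when the order is keyed in is cheap compared with the cost of the error.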

November 22, 2005

Simple Overloading

I recently came across a curious overloaded use of a database table attribute: one column, called "Verification Status Code," contained a code indicating the result of a process of verifying the connection between a customer identification number and a supplied customer name. The attribute took on some values such as:

"The customer identifier and name were correctly verified as identical to our records"
"A corrected identifier was provided for the supplied customer name"
"The customer identifier and name were matched using the ALPHA process"
"The customer identifier and name were matched using the BETA process"
"The customer identifier and name could not be verified"

Apparently, the codes used indicate two pieces of information. The first is whether the name and identifier were correctly verified within the system or not, and the second is the process used to verify the data. This suggests an embedded business rule associated with the application, in that it first checks to see whether the code is one that indicates verified data, and then it performs different actions based on which process was used.
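
One way to surface that embedded rule is to decode the single overloaded column into the two facts it carries. The Python sketch below is only an illustration: the one-letter codes ("V", "C", "A", "B", "U") are hypothetical stand-ins for whatever values the real column holds, and treating "a corrected identifier was provided" as a verified outcome is my assumption.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Method(Enum):
    """How the identifier/name pair came to be verified."""
    EXACT = "identical to our records"
    CORRECTED = "corrected identifier provided"
    ALPHA = "ALPHA process"
    BETA = "BETA process"


class VerificationStatus(NamedTuple):
    """The two separate facts packed into the one overloaded code."""
    verified: bool
    method: Optional[Method]  # None when nothing was verified


# Hypothetical code values; the real column's codes are not given in the text.
DECODE = {
    "V": VerificationStatus(True, Method.EXACT),
    "C": VerificationStatus(True, Method.CORRECTED),  # assumption: counts as verified
    "A": VerificationStatus(True, Method.ALPHA),
    "B": VerificationStatus(True, Method.BETA),
    "U": VerificationStatus(False, None),
}


def decode_status(code: str) -> VerificationStatus:
    """Split an overloaded status code into (verified, method)."""
    try:
        return DECODE[code.strip().upper()]
    except KeyError:
        raise ValueError(f"unknown verification status code: {code!r}") from None
```

Once the column is decoded this way, the application's two-step logic (first "is it verified?", then "by which process?") reads directly off the two fields instead of being buried in a list of magic code values.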

Anyone have any other experiences with this kind of overloading? Let me know - I will add this as a rule class to my business rule-based data quality techniques. Email me (loshin@knowledge-integrity.com)

November 14, 2005

FirstLogic/Pitney Merger Off

Well, apparently the FTC's request for additional information on the Pitney Bowes acquisition of Firstlogic was concerning enough to scuttle the takeover.

September 5, 2005

Data Quality in Korea

I just got back from a whirlwind trip to Seoul, where fellow B-Eye-Network expert Bill Inmon and I had been invited to present tutorials at the 2005 Korea Database Grand Conference. Although this was the third year in a row that I had been invited to present as a guest speaker at this conference, this was the first year I was able to attend. My original speaking proposal was about data profiling, but apparently the conference organizers, the Korea Database Promotion Center (if you can read Korean), expressed an interest in my presenting a high-level tutorial on building a data quality program. I also prepared a case study on developing a data standards program.

It was very encouraging to see a large number of attendees at the conference, and especially so to see how many people had an interest in learning more about data quality. Despite the communications gap, the simultaneous translation between English and Korean (and back again) worked well, and the questions asked were quite good ones.

Another interesting thing was the fact that a number of the sponsoring vendors focused on data quality and business intelligence solutions, and many had a desire to learn more about penetrating the US markets for products and services.

Overall, I give the conference organizers high marks for putting together a top-notch program!

September 1, 2005

Data Quality Tools Consolidation Continues

I just read a press release that said that Pitney Bowes "has signed a definitive agreement to acquire all of the remaining outstanding shares of Firstlogic for approximately $50.3 million." Firstlogic, one of the few remaining independent data quality tools vendors, will now become a wholly-owned subsidiary of Pitney Bowes, which acquired data quality tools vendor Group 1 a year ago April. It will be interesting to see how, if at all, the two tool suites will be combined. Note that prior to Group 1's acquisition, Group 1 itself had acquired ETL vendor Sagent. Is Pitney Bowes looking to get into the ring with IBM on the ETL/Data Quality Front?

What does this bode for the DQ Tools industry?

Continue reading "Data Quality Tools Consolidation Continues" »

August 23, 2005

Dirty Data and Embedded Knowledge

Is it better to clean data on intake or after it has been processed?

Let's say you have a data entry process in which names and addresses are input into a system. At some point within your processing, that same data (name and address) will be forwarded to an application performing a business process, such as printing a shipping label. However, it is not necessarily guaranteed that the individual whose name and address was input will ever be sent anything.

You desire to maintain clean data, and you are now faced with two options: cleanse the data at intake or cleanse it when it is used. There are arguments for doing both of these options...
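
The two options can be sketched side by side. This is a toy Python illustration, not a recommendation: the `cleanse` function (which only trims and collapses whitespace), both store classes, and the `cleanse_calls` counters are hypothetical names introduced here to show where the cleansing work lands in each approach.

```python
from typing import Dict, List

Record = Dict[str, str]


def cleanse(record: Record) -> Record:
    """Toy standardization: trim and collapse whitespace in every field."""
    return {field: " ".join(value.split()) for field, value in record.items()}


class IntakeCleansingStore:
    """Option 1: cleanse every record as it is entered."""

    def __init__(self) -> None:
        self._records: List[Record] = []
        self.cleanse_calls = 0

    def add(self, record: Record) -> None:
        self.cleanse_calls += 1
        self._records.append(cleanse(record))  # the raw input is not retained

    def shipping_label(self, index: int) -> str:
        record = self._records[index]
        return f"{record['name']}\n{record['address']}"


class OnUseCleansingStore:
    """Option 2: keep the data as entered and cleanse only when it is used."""

    def __init__(self) -> None:
        self._records: List[Record] = []
        self.cleanse_calls = 0

    def add(self, record: Record) -> None:
        self._records.append(dict(record))  # stored exactly as entered

    def shipping_label(self, index: int) -> str:
        self.cleanse_calls += 1
        record = cleanse(self._records[index])
        return f"{record['name']}\n{record['address']}"
```

If three records are entered and only one label is ever printed, the intake store has cleansed three records and the on-use store just one, though the on-use store repeats the work each time the same record is used. The on-use store also still holds the original input, while the intake store has discarded it.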

Continue reading "Dirty Data and Embedded Knowledge" »