
Blog: David Loshin

David Loshin

Welcome to my BeyeNETWORK Blog. This is going to be the place for us to exchange thoughts, ideas and opinions on all aspects of the information quality and data integration world. I intend this to be a forum for discussing changes in the industry, as well as how external forces influence the way we treat our information asset. The value of the blog will be greatly enhanced by your participation! I intend to introduce controversial topics here, and I fully expect that reader input will "spice it up." Here we will share ideas, vendor and client updates, problems, questions and, most importantly, your reactions. So keep coming back each week to see what is new on our Blog!

About the author >

David is the President of Knowledge Integrity, Inc., a consulting and development company focusing on customized information management solutions including information quality solutions consulting, information quality training and business rules solutions. Loshin is the author of The Practitioner's Guide to Data Quality Improvement, Master Data Management, Enterprise Knowledge Management: The Data Quality Approach and Business Intelligence: The Savvy Manager's Guide. He is a frequent speaker on maximizing the value of information. David can be reached at loshin@knowledge-integrity.com or at (301) 754-6350.

Editor's Note: More articles and resources are available in David's BeyeNETWORK Expert Channel. Be sure to visit today!

Recently in Business Intelligence Category

They say that data integration accounts for 80% of the effort of a data warehousing project (or of a variety of other enterprise application efforts). But who are "they"? I know that the figure is often presented as the typical resource and time investment for data integration activities, but I have not tracked down a source for it. I seem to recall seeing it in some data warehousing book, but do not remember which one.


Nonetheless, there is no reason for data integration to consume that amount of effort if the right steps are taken ahead of time to reduce the confusion and complexity of ambiguous semantics and structure. I will discuss these issues in a webinar this Thursday, August 12 - hope you can make it!

Posted August 10, 2010 6:05 AM
Permalink | No Comments |

About a year ago I came across a very good book by David Parmenter called Key Performance Indicators that provided a nice breakdown of the concepts and processes associated with articulating performance measures in relation to business objectives. One nice feature was a well-organized taxonomy of measures.

Well, I recently got my hands on the newly revised version of the book, and I am definitely looking forward to reading through it. If you get a chance to read it, please share your thoughts!

Posted April 8, 2010 1:52 PM
Permalink | No Comments |

This Forbes interview with Chevron CIO Denise Coyne suggests that a 2010 focus for the oil and gas giant is data quality, although the code words employed scream Master Data Management:

"... we're going to create a pilot enterprise project to consolidate all of that information in one place."

"We have lots of data about people in one organization, another database about people in another organization. Consolidating that information to have one source of the truth, to be able to make faster, more competitive decisions more quickly, is a really important focus in 2010."

It is great that Denise Coyne has recognized the potential business value of improved data quality. 

I do hope that this sentiment is not being driven by vendors/consultants pushing the purchase of a product (first), a long implementation (second), a realization that data requirements gathering is a necessity (third), and then a need for data governance practices (last, but it really should be first).

As the CIO, she, of all people, should be aware of the potential complexity of migrating a federated, distributed organization with many organically-developed business applications (and probably thousands, if not tens of thousands of desktop data assets such as spreadsheets and databases) into "one source of truth." The "truth" is that it is highly unlikely that there is one source of truth. Rather, a reasonable focus on data governance and master data management would begin with understanding what business decisions are dependent on consolidated data, who in the organization is hampered by delays in serving reports based on consolidated data, and what steps can be taken to alleviate the negative business impacts. We have seen a number of initiatives focused on "single source of truth" evolve into data governance and data quality management programs when delivery on the promises of the MDM tool vendors is impeded by the inability to simultaneously transform the organization via good information management practices.
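To make the consolidation problem concrete, here is a minimal, purely illustrative sketch (the records, names, and matching rule are all invented, not Chevron's data) of why merging "people" records from two departmental databases is harder than buying a tool: even after normalizing formatting, the data alone cannot tell you whether variant records are one person or two.

```python
def normalize(record):
    """Reduce a record to a comparable key: simplified lowercase name, digits-only phone."""
    name = record["name"].lower().replace(".", "").strip()
    phone = "".join(ch for ch in record.get("phone", "") if ch.isdigit())
    return (name, phone)

# Two hypothetical departmental databases holding "the same" people.
hr_db = [{"name": "J. Smith", "phone": "301-555-0100"}]
ops_db = [
    {"name": "j smith", "phone": "(301) 555 0100"},    # formatting variant
    {"name": "Jane Smith", "phone": "301-555-0100"},   # same phone, fuller name
]

# Naive exact-key matching catches the formatting variant but not the
# fuller name -- is "Jane Smith" the same person as "J. Smith"? The data
# cannot say; that judgment needs governance and requirements, not software.
hr_keys = {normalize(r) for r in hr_db}
matches = [r for r in ops_db if normalize(r) in hr_keys]
print(len(matches))
```

The point of the sketch is the gap between the two unmatched records, which is exactly where requirements gathering and data governance have to come before the product purchase.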

Best of luck!

Meanwhile, my book on Master Data Management is now (Jan 20) on sale at 51% off the cover price, and I hope someone at Chevron buys one for Denise Coyne!

Posted January 21, 2010 7:01 AM
Permalink | 1 Comment |

Apparently, the same issues that plagued competing US intelligence agencies immediately after the 9/11 attacks have not yet been resolved. According to this Time Magazine article, President Obama summarized the failure to prevent terrorism suspect Umar Farouk Abdulmutallab from boarding a Detroit-bound plane this way: "The U.S. government had sufficient information to have uncovered this plot and potentially disrupt the Christmas Day attack, but our intelligence community failed to connect those dots."

Yet again, we see that despite being flooded with data, there was a failure to turn that data into actionable knowledge. According to the article, intelligence agencies knew that the suspected bomber Abdulmutallab had traveled to Yemen, a spot of brewing anti-US terrorism plots, and that his father had contacted the US embassy in Nigeria to warn them of his son's activities; yet no one asked whether Abdulmutallab had a US visa or whether he should have been added to the no-fly list. The fact that he purchased a one-way ticket and checked no luggage might have raised some concern as well.

Any of these events should have triggered some action, but the fact that they didn't raises a different question: how often do we miss events that should trigger a security response? I suspect it happens a lot more frequently than we'd like to believe, and that might raise your level of anxiety.

And that raises yet another question: what is the probability/risk that a missed event is a critical one like the Dec 25th situation? Of course, a low probability might alleviate some of the anxiety.

However, from a data perspective, the issue is a matter of data sharing and integration - protocols for capturing the key semantic aspects of logged events could be published to a common repository that could be continuously monitored, mined and evaluated to determine when some proactive action should take place. Is MDM the answer? Maybe, or perhaps a master repository published to a cloud environment with layered data services for rapid identity resolution...

Oh, check out this interview to understand a little more about national security.


Posted January 6, 2010 6:57 AM
Permalink | 2 Comments |

Before my current career in data management, I had a previous life as a software developer, designing and implementing compilers for Fortran and C for massively parallel processing (MPP) computers. While I have been working on data quality and BI for the past 13 years or so, I still have a great interest in the high performance computing space. Recently I have had the opportunity to indulge that interest by learning about the distributed/parallel programming model that Google has championed called MapReduce, and its relationship to the use of analytical database management systems.

There are some similarities, some differences, and ultimately, the two paradigms are complementary when it comes to supporting end-user business needs. If you are interested in the thought processes, check out this analysis paper, funded by Vertica, which compares and contrasts both high performance approaches.
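For readers who haven't seen the MapReduce model, here is a minimal single-process illustration of its three phases (no Hadoop or Google infrastructure, just the standard library): a map phase emits (key, 1) pairs, a shuffle groups pairs by key, and a reduce phase sums each group. The classic word-count example is used here.

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word occurrence."""
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: collapse each key's values into a single result."""
    return {key: sum(values) for key, values in groups.items()}

docs = ["data quality matters", "master data management"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts["data"])  # "data" appears in both documents
```

An analytical DBMS would express the same computation declaratively, roughly as `SELECT word, COUNT(*) FROM words GROUP BY word`, which is one way to see why the two paradigms end up complementary rather than competing.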

Posted October 5, 2009 5:45 AM
Permalink | No Comments |

