Blog: David Loshin

May 23, 2006
Data Provenance, Security, and Control

Widely reported today was the fact that a large amount of private data from the Veterans Administration had been stolen. A quick glance at the articles suggested some wrongdoing on the part of the worker who brought the data to his home (which is wrong, to be sure), but in fact the data was then stolen (inadvertently) when the VA employee's home was burglarized. The thief or thieves may not even be aware that they stole the data, although judging by the attention the press is giving this, they might have realized it by now.

While there are significant constraints placed on government employees to safeguard personal information, this just goes to show that the articulation point in the level of trust lies squarely with the individual. Does that mean, however, that there are no ways of tracking information as it moves (perhaps inappropriately) from place to place? I have had some conversations about data provenance, a means for effectively tattooing data with metadata specifying when it was copied, who copied it, where it came from, etc. Having this information available would at least help in tracing the path along which stolen data moved when it resurfaces on a credit card at some distant location. Anybody have any ideas or experience in this area?

May 22, 2006
Overloading Signals...

I had heard about this overloaded use of technology a few years back, but a few conversations last week at TDWI reminded me of an interesting exploitation of one technology to provide a service completely unrelated to the technology's original intent. This article from last October describes how one can use cell phone monitoring to track traffic patterns along heavily traveled routes.

Basically, the way mobile phones work is that they transmit and receive signals from particular antenna towers scattered across the region.
As you are traveling, the mobile phone sends a message to register its presence within the nearest tower's range. As your phone leaves that tower's area, it will connect with another tower. As you can guess, one can calculate relative speeds by looking at the timestamps at which the same phone registers itself with a series of towers. When the reference space includes towers along an interstate highway, averaging those durations over a number of cell phones allows one to get an idea of how fast traffic is moving along different sections of the highway. That information can be routed back to subscribers (individuals or even news sites) to help in relieving congestion.

According to the article, a number of localities have deployed, or are interested in deploying, these kinds of systems. Is this a public service piggy-backed on harmless data collection, or a potential invasion of privacy? Let me know what you think.

May 8, 2006
Playing by the Rules

While I was doing some random web searching, I came across an interesting web page that provides some training on finding MP3s using Google. Not that I am suggesting that search engines be used for unacceptable behavior, but my curiosity is piqued by the more general concept of "getting around the rules," and how that concept relates to the more piquant topic of compliance.

There are two approaches to compliance: the first is doing what you need to do to comply; the second is seeing how much you can do to avoid being compliant. Here is a quick, although probably dated, example: During the 1980s and 1990s, police would set up speed traps employing radar systems to determine how fast cars were traveling. As the goal was to identify (and punish) drivers exceeding the posted speed limit, this reflects a simple model of compliance. Drivers who were inclined to speed could react in one of two ways. The first (for the "compliers") was to drive slower (become compliant).
The second (for the "avoiders") was to purchase some technology (a radar detector) that would notify the driver when radar monitoring was taking place, allowing the driver to slow down during the monitoring phase and then resume the noncompliant behavior when there was limited risk of being caught.

Do organizations opt for one or the other of these approaches? What is the risk/reward model? To look at our example, those who became compliant were penalized to some extent by having to reduce their speed and get to where they were going more slowly. There was some monetary investment on the part of the avoiders (the cost of the radar detector), but otherwise they were rewarded for their noncompliance, since they still got to their destinations more quickly, albeit with some limited risk of getting caught.

Is it better to be a complier or an avoider? How does an organization determine its approach, and then communicate that approach to the individuals within the organization? And lastly, I wonder whether there is some middle ground between these two options. Any comments?

May 5, 2006
New TDWI Report on Data Quality

Philip Russom's new report on Data Quality was released recently by the Data Warehousing Institute, and it is definitely worth a look.

May 4, 2006
Reaction to Gartner on Data Quality

Last week Gartner released its "Magic Quadrant" for Data Quality Tools vendors, as reported in this news item. I wonder, though, with all the consolidation going on, and the focus on value-added applications that need to embed data quality technology (e.g., MDM, CDI, CRM, SCM, and other three-letter acronyms), whether the concept of "data quality" tools may soon be outdated? If data quality is imperative to the success of any data-oriented business application, then quality concepts must be architected into the fabric of the application development environment.
My prediction: infrastructure companies (RDBMS, Enterprise Architecture, Data Modeling tools, Application Development, Metadata Management Repositories, etc.) will soon be incorporating parsing, standardization, and linkage as part of their offerings.
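
For readers less familiar with those three terms, here is a rough sketch of what parsing, standardization, and linkage might look like at their very simplest. Everything in it is an assumption made for illustration: the "Name, Address" input layout, the tiny abbreviation table, the function names, and the 0.85 match threshold are all hypothetical and not drawn from any vendor's product. It uses only Python's standard library.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table; a real tool would ship a far larger one.
ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road"}


def parse(raw):
    """Parsing: split an assumed 'Name, Address' string into named fields."""
    name, _, address = raw.partition(",")
    return {"name": name.strip(), "address": address.strip()}


def _tokens(text):
    """Lowercase the text, replace punctuation with spaces, split into words."""
    return re.sub(r"[^\w\s]", " ", text.lower()).split()


def standardize(record):
    """Standardization: normalize case and punctuation, expand address abbreviations."""
    name = " ".join(_tokens(record["name"]))
    address = " ".join(ABBREVIATIONS.get(t, t) for t in _tokens(record["address"]))
    return {"name": name, "address": address}


def similarity(a, b):
    """Average per-field string similarity, between 0.0 and 1.0."""
    scores = [SequenceMatcher(None, a[f], b[f]).ratio() for f in a]
    return sum(scores) / len(scores)


def link(records, threshold=0.85):
    """Linkage: return index pairs of records that likely describe the same entity."""
    std = [standardize(parse(r)) for r in records]
    return [
        (i, j)
        for i in range(len(std))
        for j in range(i + 1, len(std))
        if similarity(std[i], std[j]) >= threshold
    ]
```

Calling `link(["John Smith, 12 Main St.", "john smith, 12 main street", "Mary Jones, 4 Oak Ave"])` pairs up the first two records, since they become identical once standardized, and leaves the third unmatched. The all-pairs comparison is only workable for small inputs; it is here to show the idea, not to suggest how a production tool would do it.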