

Blog: Barry Devlin

Barry Devlin

As one of the founders of data warehousing back in the mid-1980s, a question I increasingly ask myself over 25 years later is: Are our prior architectural and design decisions still relevant in the light of today's business needs and technological advances? I'll pose this and related questions in this blog as I see industry announcements and changes in the way businesses make decisions. I'd love to hear your answers and, indeed, questions in the same vein.

About the author

Dr. Barry Devlin is among the foremost authorities in the world on business insight and data warehousing. He was responsible for the definition of IBM's data warehouse architecture in the mid '80s and authored the first paper on the topic in the IBM Systems Journal in 1988. He is a widely respected consultant and lecturer on this and related topics, and author of the comprehensive book Data Warehouse: From Architecture to Implementation.

Barry's interest today covers the wider field of a fully integrated business, covering informational, operational and collaborative environments and, in particular, how to present the end user with an holistic experience of the business through IT. These aims, and a growing conviction that the original data warehouse architecture struggles to meet modern business needs for near real-time business intelligence (BI) and support for big data, drove Barry’s latest book, Business unIntelligence: Insight and Innovation Beyond Analytics, now available in print and eBook editions.

Barry has worked in the IT industry for more than 30 years, mainly as a Distinguished Engineer for IBM in Dublin, Ireland. He is now founder and principal of 9sight Consulting, specializing in the human, organizational and IT implications and design of deep business insight solutions.

Editor's Note: Find more articles and resources in Barry's BeyeNETWORK Expert Channel and blog. Be sure to visit today!

January 2013 Archives

As we begin a new year, we are promised a move from a focus on the meaning and technology of big data to the useful and worthwhile business applications it may offer.  A timely move indeed.  Hopefully, we'll begin to hear less about analyzing Twitter streams to optimize advertising spend and more about applications with the potential to improve people's lives or the environment.  And even more hopefully, people may begin to consider the risks they run when revealing or gathering personal data on our deeply interconnected Web.

With all of the synchronicity that is the Internet, I came across two articles from the New York Times published last week. The first, by Peter Jaret on January 14, describes how patient records, transcribed and digitized from scrawled (why do they write so poorly?) doctors' notes, anonymized and stored on the Web, can be statistically mined to discover previously unknown side-effects of and interactions between prescribed drugs.  Clearly useful and valuable work.  The second article, three days later by Gina Kolata, revealed how easily a genetics researcher was able to identify five individuals and their extended families by combining publicly-available information from the anonymized 1000 Genomes Project database, a commercial genealogy Web site and Google.  Kolata quotes Amy L. McGuire, a lawyer and ethicist at Baylor College of Medicine in Houston:  "To have the illusion you can fully protect privacy or make data anonymous is no longer a sustainable position".  The underlying genetic data is used in medical research to good effect, of course, but what are the possible consequences for the individuals thus identified if insurance companies, governments or other interested parties make potentially negative assessments based on their once private genomes?

Such occurrences--and there are many of them--should be deeply disturbing to those of us involved in the business of big data and analytics.  Here are doctors, scientists and lawyers--with training in logic, ethics and law--who see the power of analytics to improve the human condition, but who seem to gloss over the wider privacy and security implications of making personal information widely available on the Web.  After all, the limits of data anonymization on the web were being discussed openly as long ago as May 2011 by Pete Warden on the O'Reilly Radar blog.  And as far back as 1997, Prof. Latanya Sweeney, now Director of the Data Privacy Lab at Harvard, could show that the combination of gender, ZIP code and birthdate was unique for 87% of the U.S. population.  
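For readers who want to see why a handful of innocuous fields can defeat anonymization, here is a minimal sketch of the kind of uniqueness check behind figures like the one above. It is not Prof. Sweeney's method or code; the function name `unique_fraction`, the field names and the five toy records are all hypothetical, invented purely for illustration.

```python
from collections import Counter

def unique_fraction(records, quasi_identifiers):
    """Fraction of records whose combination of quasi-identifier values
    appears exactly once in the data set (hypothetical helper)."""
    if not records:
        return 0.0
    # Build one key per record from the chosen fields.
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    # A record is "unique" if nobody else shares its key.
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)

# Toy, made-up records with names already stripped ("anonymized").
records = [
    {"gender": "F", "zip": "02138", "birthdate": "1961-07-15"},
    {"gender": "M", "zip": "02138", "birthdate": "1961-07-15"},
    {"gender": "F", "zip": "02139", "birthdate": "1975-03-02"},
    {"gender": "F", "zip": "02139", "birthdate": "1975-03-02"},
    {"gender": "M", "zip": "10001", "birthdate": "1980-11-30"},
]

print(unique_fraction(records, ["gender"]))
print(unique_fraction(records, ["zip"]))
print(unique_fraction(records, ["gender", "zip", "birthdate"]))
```

In this toy data no record is unique on gender alone, yet most become unique once the three fields are combined. Any record that is unique in this sense can, in principle, be matched against an outside source that carries the same fields alongside a name.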

Eben Moglen, professor of law and legal history at Columbia University and Chairman of the Software Freedom Law Center, warned at re:publica Berlin in May 2012 that "media that spies on and data-mines the public is destroying freedom of thought and only this generation, the last to grow up remembering the 'old way', is positioned to save this, humanity's most precious freedom".  With media and medicine, government and retail, telecommunications and finance all gathering hoards of information about us, each for their own allegedly good purpose, the reality is now that the abuse of big data (as opposed to its use) is not only possible, but proceeding apace, even in largely democratic, Western states.

So, given that big data anonymity is "no longer a sustainable position", it should be clear that the analytics possible on today's high-powered computers is a double-edged sword; it serves us poorly to focus only on one, single, razor-sharp edge.  As we evaluate and build useful and worthwhile business analytics applications in this coming year, let us step back, even occasionally, to contemplate whether the profits to be earned or the discoveries to be made are worth the price of human freedom.

Posted January 23, 2013 9:06 AM