Blog: Barry Devlin

Barry Devlin

As one of the founders of data warehousing back in the mid-1980s, a question I increasingly ask myself over 25 years later is: Are our prior architectural and design decisions still relevant in the light of today's business needs and technological advances? I'll pose this and related questions in this blog as I see industry announcements and changes in the way businesses make decisions. I'd love to hear your answers and, indeed, questions in the same vein.

About the author

Dr. Barry Devlin is among the foremost authorities in the world on business insight and data warehousing. He was responsible for the definition of IBM's data warehouse architecture in the mid '80s and authored the first paper on the topic in the IBM Systems Journal in 1988. He is a widely respected consultant and lecturer on this and related topics, and author of the comprehensive book Data Warehouse: From Architecture to Implementation.

Barry's interest today covers the wider field of a fully integrated business, spanning informational, operational and collaborative environments and, in particular, how to present the end user with a holistic experience of the business through IT. These aims, and a growing conviction that the original data warehouse architecture struggles to meet modern business needs for near real-time business intelligence (BI) and support for big data, drove Barry’s latest book, Business unIntelligence: Insight and Innovation Beyond Analytics, now available in print and eBook editions.

Barry has worked in the IT industry for more than 30 years, mainly as a Distinguished Engineer for IBM in Dublin, Ireland. He is now founder and principal of 9sight Consulting, specializing in the human, organizational and IT implications and design of deep business insight solutions.

Editor's Note: Find more articles and resources in Barry's BeyeNETWORK Expert Channel and blog. Be sure to visit today!

"Social networks already know who you know", "recommendation engines get much smarter", "early detection mitigates catastrophes".  Three of ten ways big data is creating the science fiction future.  These types of headlines appeal to the geek optimists in many of us.  We think that mitigating a catastrophe is certainly a good thing.  That smarter recommendations about whom we should connect with and what we might be interested in buying could probably save us time, that most precious of commodities.  Most of us have grown up with a belief system that science and, by extension, technology and computers, are a sine qua non in today's world.  In truth, the world we live in today could not exist without them.

But, at what cost?

Three further headlines from the same blog: "surveillance gets really Orwellian", "doctors can make sense of your genome--and so can insurers", "dating sites that can predict when you're lying".  Perhaps these items give pause for thought.  Security cameras lurk in every corridor and public place.  And, as of last August, the NYPD has been monitoring Facebook and Twitter.  Even in our bedrooms, smart phones can be turned on remotely to monitor our most intimate indiscretions.  It's open season on our actions and communications.  Our genomes are fast becoming public property, ostensibly for our better health management; but, clearly, for better risk management--read profit--for insurance companies.  Even our thinking is being analyzed.

We're fast reaching 1984 some 30 years later than George Orwell imagined.  At least in our ability to monitor the actions, communications, genetic makeup and thoughts of an ever-increasing swathe of humanity.  As BI experts and data scientists, we celebrate our ability to gather and analyze ever more data with ever more sophistication and ever deeper granularity.  For marketeers, Utopia is a segment of one whose buying behavior is predictable with certainty.  As traders on the commodities or currency markets, our algorithms gamble on the Brownian motion of microscopic movements in prices.  For insurers, statistical averaging of risk across populations gives way to cherry picking the low-risk individuals for discounted premiums.

Am I overly pessimistic or even paranoid in imagining that big data brings risks at least as large as the benefits it promises?  Are the petabytes and exabytes of information we're gathering, storing and analyzing open to misuse?  We celebrate the role of social networking in pro-democracy movements around the world, imagining that tweets and texts are unassailable weapons for freedom, forgetting that the networks that carry them are run by big businesses whose bottom line is profit.  We reveal the secrets of our lives in dribs and drabs, in recordable phone conversations and even through the GPS tracking of our smart phones, oblivious to the fact that the technology exists to piece all the clues together, Sherlock Holmes-like, given sufficient time and money.

In my last post, I challenged us to take a step back and apply human insight to the results of big data analysis rather than take the results from statistical analyses at face value, to question the sources and play with other possible explanations before jumping to conclusions.  Now, knowing how fallible your own interpretation of big data may be, please give some consideration to the possibility that others, particularly those in positions of power, such as governments and businesses, can accidentally or deliberately misinterpret or misuse the big data resource.

But what can we do as an industry?  As individual analysts, consultants, data administrators and more?  At the very least, we can revisit the privacy and security controls we build into our systems.  Take a look at "Why you can't really anonymize your data" by Pete Warden and begin pressing the industry and academia to search for new solutions.  Look again at your business processes and evaluate if and how the use of big data subverts the intentions or ethics of how you work.  And, finally, reread George Orwell's "1984".


Posted February 17, 2012 2:47 AM

5 Comments

Barry,

I share your sentiment 100%. Here are some suggestions:

1. As commentators, consultants and even developers of some of the tools and techniques to do this, let's adopt a code of conduct and ethics between us. CPAs (Chartered Accountants to you guys) do, which says they won't assist in Ponzi schemes, tax evasion, fraud, etc. I know this borders on unrealistic and lacks teeth, but it could get some airtime.

2. We should do our best, when speaking and writing about big data, to always remind our audience that the real gold in big data is in science, medicine, the environment, fighting poverty, war, disease. Selling handbags to fashionable ladies is not the whole story. Almost every "use case" I hear about big data is how to get in the head of customers to unload more junk on them.

3. Sadly, young people don't seem to care about this as much as we do. I suspect they will at some point. Let's make an attempt to sensitize them to the dangers of losing (what's left of) their privacy.

Keep up the good fight.

-Neil Raden

It's an undeniable fact that any technological advancement made with good intentions to improve human life carries the risk of being misused. The Internet, too, has seen bad days and is still under constant attack. I am sure the tera/peta/exa increase in bytes will bring manifold challenges as well. At the same time, we would all agree that the Internet has brought positive changes to societies across the globe while it was going through its ups and downs.

We need to be concerned but not really worried. I am not being ignorant when I say that. And as Neil mentioned, there are positive forces at work always. We just need to be committed to the cause and to each other.

There has been a lot of talk in the Indian Media circles as well. Robert Sovereign-Smith, Deepak Ajwani at Think Digit and many others too have written quite often on this topic to keep, as Neil mentioned, the innocent (or call them ignorant) informed about exactly what quantum of their life is at risk.

Nice post. And don't forget the syndicated data marketplace is growing to share all this information - errors and all - across the economy. And RFID chips and readers are increasingly tracking the physical world.

It's becoming a real jungle out there. But loss of privacy isn't new; privacy has been compromised ever since your state DMV began selling your driver license records & financial institutions began sharing your SS #s. Big Data just embeds invasion of privacy more intrinsically. Responsible enterprises in some regulated sectors may remain vigilant in protecting customer confidentiality, but at best they will be islands of privacy in a sea of proliferating sensory & social data that will reveal who and where you are.

Thanks to Neil, Rakesh, William and Tony for your thoughtful comments. And special thanks to Neil for your suggestions on how we might address the problem.

Unfortunately, it's not going to be easy. As all of the comments show, the genie is well and truly out of the bottle. And given the monetary gain expected by most commercial organizations, not to mention the addiction to tracking citizen behavior exhibited by most governments, be they dictatorships or democracies, getting the genie to return to its bottle will be no small feat.

In contrast to Rakesh, I believe we should be very worried indeed. Yes, the Internet has radically changed the way we work, communicate and live in the developed world; not everybody considers all these changes to be positive. Big data introduces another, larger wave of changes to the Internet and beyond. We in the data community must be the responsible ones around what is done with big data. I'll discuss this further in a later blog post.
