
Blog: Barry Devlin

Barry Devlin

As one of the founders of data warehousing back in the mid-1980s, a question I increasingly ask myself over 25 years later is: Are our prior architectural and design decisions still relevant in the light of today's business needs and technological advances? I'll pose this and related questions in this blog as I see industry announcements and changes in the way businesses make decisions. I'd love to hear your answers and, indeed, questions in the same vein.

About the author >

Dr. Barry Devlin is among the foremost authorities in the world on business insight and data warehousing. He was responsible for the definition of IBM's data warehouse architecture in the mid '80s and authored the first paper on the topic in the IBM Systems Journal in 1988. He is a widely respected consultant and lecturer on this and related topics, and author of the comprehensive book Data Warehouse: From Architecture to Implementation.

Barry's interest today covers the wider field of a fully integrated business, covering informational, operational and collaborative environments and, in particular, how to present the end user with an holistic experience of the business through IT. These aims, and a growing conviction that the original data warehouse architecture struggles to meet modern business needs for near real-time business intelligence (BI) and support for big data, drove Barry’s latest book, Business unIntelligence: Insight and Innovation Beyond Analytics, now available in print and eBook editions.

Barry has worked in the IT industry for more than 30 years, mainly as a Distinguished Engineer for IBM in Dublin, Ireland. He is now founder and principal of 9sight Consulting, specializing in the human, organizational and IT implications and design of deep business insight solutions.

Editor's Note: Find more articles and resources in Barry's BeyeNETWORK Expert Channel and blog. Be sure to visit today!

There's an elephant in the room. No, not the friendly, yellow, Hadoop variety.

The large, smelly, excrement-dumping kind that no one wants to notice, never mind talk about. Allow me to name it: the combination of big data and the Internet of Things (IoT) is reinventing the nature of work. Much of employment as we know it is disappearing. And that signals the end of the consumer society we've painstakingly built over the past half century or so.

Am I being alarmist? Let's think about where big data and the IoT are going and how they are intersecting with the world of people earning and spending money. Include all the automation, connectivity and analytics that we are doing and considering. And you'll soon see what I mean. In part 1 of this short series, I look at the current state of play for big data and the Internet of Things.

Let's start with the physical. The Things of the IoT are acquiring ever more sensing abilities, as well as an increasing level of automation capabilities. Think piece-part robots talking via the Internet and you wouldn't be far wrong. Look also at the developments in robotics in the past few years. In a recent O'Reilly Radar article, Rodney Brooks, former Panasonic Professor of Robotics at MIT and founder of Rethink Robotics, says "We're at the point with production robots where we were with mobile robots in the late 1980s... The advances are accelerating dramatically." The Rethink Robotics videos show some agonizingly slow-motion action, but it doesn't need Clayton Christensen to recognize a potential disruptive innovation here. The process about to be disrupted is the manual labor involved in a whole variety of repetitive but loosely bounded activities on assembly, packaging and similar production lines. Their manufacturers promote the idea that these new robots can work safely alongside humans, but the cost equation speaks more to replacement of manual labor than coexistence. Even at this early stage of development, Rethink Robotics' Baxter sells from $25,000. The impact on employment seems fairly obvious.

Brooks, quoted in the same Radar article by Glen Martin, also targets robots for health care, especially care of the elderly, suggesting that demand for such workers is outstripping supply. "Again, the basic issue is dignity. Robots can free people from the more menial and onerous aspects of elder care, and they can deliver an extremely high level of service, providing better quality of life for seniors." And even less of the desperately needed social interaction that alleviates loneliness and disconnection among the elderly, given the more likely staff replacement vs supporting robotic role. Although you shouldn't discount YDreams Robotics' forthcoming desk lamp, which can sense your mood and call a friend or relative via smartphone and tell them you're in need of cheering up. The scenarios get creepier, right up to and including Spike Jonze's new movie, her. But from the viewpoint of this blog, the bottom line is another set of jobs under threat.

Information-based jobs are no safer, also under threat from multiple directions. Simpler tasks are automated. Sports scores and stock market prices can be spun into passable stories already by automated processes, once the jobs of cub journalists. More complex tasks are dissected into component parts, with the simpler parts automated and the human role often restricted to repetitive or supervisory activities. This is much the same process that took place in industrial factories in the 19th century. Craft workers who, skillfully and lovingly, made complete finished articles were displaced by production workers whose activities were increasingly optimized to produce simple piece parts in the cheapest and most soulless manner. Pattern recognition and machine learning have seen enormous advances as training data sets become ever larger. Human translators are being displaced by Google Translate and its ilk. According to Frey and Osborne's estimate in their 2013 The Future of Employment, "47 percent of total US employment is in the high risk category, meaning that associated occupations are potentially automatable over some unspecified number of years, perhaps a decade or two. It shall be noted that the probability axis can be seen as a rough timeline, where high probability occupations are likely to be substituted by computer capital relatively soon." The probabilities produced by their model run from 99% for telemarketers, freight agents, tax preparers and more to 0.3% for emergency management directors and first line supervisors. Frey and Osborne cover both physical and information-based jobs, but it's interesting to note how many information-based jobs are in the high probability range. It's also worth noting how much of the "services industry", that darling of developed Western economies, falls in the higher ranges of probability.

So, there is a problem here. Current technological trends will very clearly impact employment and society. The IT industry is confident that the net effect will be positive. Many economists seem to be fence-sitting... but that's fairly normal. However, we'll take a closer look in Parts 2 and 3.

For a broader and deeper view of the business and technological aspects of this topic, please take a look at my new book: Business unIntelligence: Insight and Innovation Beyond Analytics and Big Data.  

Posted February 4, 2014 12:05 PM
Business unIntelligence emerged from my questioning of the fundamental assumptions underpinning BI in all its forms, from the enterprise data warehouse to big data analytics. The belief I'm questioning in this post is that the target audience of BI is everyone in the business. This springs from the very reasonable premise that BI should be used more widely throughout the organization than it currently is. But, somewhere along the way, I don't know when, the idea emerged that BI must be used by everybody in the enterprise, if we are to gain full business benefit. BI vendors thus lament low penetration of their tools, agonizing over how to make them simpler, more appealing. Data marts, echoing the bright and breezy WalMarts and Kmarts of retail, were introduced in the mid-1990s as an alternative to the dark and dismal data warehouse. Today, self-service BI, self-service analytics, self-service everything will solve world BI hunger.

I'm sorry, for me, that dog don't hunt.

This post was triggered by Paxata's mission statement, which aims to "Empower EVERY PERSON in the enterprise to find and prepare analytical information..." In truth, Paxata is building a very impressive and powerful adaptive data preparation tool. But, their mission statement almost derailed my interest. Really, each and every person in the organization should be finding and preparing analytical information? From the janitor to the CEO? I'll return to Paxata later in this post, but first, let me check if anybody else feels the same level of discomfort with this concept as I do.

Let's start with self-service on a personal level. I'm pretty comfortable with self-service when I visit a computer retailer. Put me in a perfume store, and I need an assistant--quickly. My wife has the opposite experience. I conclude from this (and many other examples) that self-service works only if the self-server has (i) sufficient understanding of what she's trying to do and (ii) considerable knowledge of what is available, its characteristics and where it is in the store. My experience in BI is that business users typically satisfy the former condition but fail regularly on the latter. In my original 1988 data warehousing paper (or contact me if you want a copy), I distinguished between dependent and independent users. It was only the second group who satisfied both conditions above. At the simplest level, we may divide business users into two such groups, as I did back then. In reality, of course, it's more subtle. Some business users understand statistics, many don't. Some are more attuned to using information; others rely (often correctly) on their intuitions. In chapter 9 of Business unIntelligence: Insight and Innovation Beyond Analytics and Big Data I discuss many of the ways in which decisions are influenced by many things other than information. So, it's horses for courses at a personal level: self-service BI only works for some of the people some of the time.

Organizationally, the idea that everyone is involved in decision making that demands analytical support is contrary to all concepts of division of labor and responsibility. The above-mentioned janitor has no incentive to analyze his cleaning performance. There still exist a myriad of tasks in every organization that are menial, performed by rote. Managers of such areas have, since the time of Frederick Winslow Taylor in the early 1900s, been analyzing such work and finding ways to streamline it. While we recognize today that production line workers do have knowledge to contribute to process improvement, such knowledge is tacit, and unlikely, in my view, to be quantified by the workers themselves. Rather, that elucidation depends on another type of skill, that of independent users, power users or, as they like to be called today, data scientists. These are people whose skills and interests bridge the business/IT divide.

Which brings me back to Paxata and the concept of the biz-tech ecosystem, the symbiotic relationship between business and IT demanded by the speed and span of modern business. Finding and preparing data has long been the principal bottleneck of all BI. It is a type of work that really requires a balanced mix of business acumen and IT skill. Its exploratory and ad hoc nature defies the old development approach of business requirements statements thrown over the fence to IT.

What's required is an exploratory environment for diverse data types that seamlessly blends business acumen and IT skills. This is precisely what Paxata does. Based more on the data content than on the metadata (field or column names and types), business analysts explore the actual contents of data sources and their inter-relationships in a highly visual manner that uses color and other cues to direct attention to aspects of interest identified by heuristics within the tool itself. Data can be simply cleansed and transformed, split and joined. The interface is deliberately spreadsheet-like, another comfort zone for the business analyst. But, unlike a spreadsheet, all actions are recorded and tracked; they can be rolled back and they can be repeated elsewhere, vital aspects of the level of data governance needed to make this solution capable of being put into production. It's hard to describe in words; you really need to try it or see the demo. See further descriptions from Joseph A. di Paolantonio and Jon Reed.
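For readers who think better in code, the general idea of recorded, reversible and repeatable preparation steps can be sketched in a few lines of Python. This is purely my own illustration of the concept and says nothing about how Paxata actually implements it; the `PreparationLog` class, the `replay` function and the sample rows are all hypothetical.

```python
from typing import Callable

Rows = list[dict]
Step = tuple[str, Callable[[Rows], Rows]]


def replay(steps: list[Step], rows: Rows) -> Rows:
    """Run every recorded step, in order, against any compatible data set."""
    for _name, fn in steps:
        rows = fn(rows)
    return rows


class PreparationLog:
    """Hypothetical log of preparation steps; the source data is never modified."""

    def __init__(self, source: Rows) -> None:
        self._source = source
        self.steps: list[Step] = []

    def apply(self, name: str, fn: Callable[[Rows], Rows]) -> None:
        # Record the step rather than changing the data in place.
        self.steps.append((name, fn))

    def undo(self) -> None:
        # Roll back the most recent step, if there is one.
        if self.steps:
            self.steps.pop()

    def result(self) -> Rows:
        # The prepared data is always derived by replaying the log.
        return replay(self.steps, self._source)


log = PreparationLog([{"name": " Ann ", "age": "41"}, {"name": "Bob", "age": ""}])
log.apply("trim names", lambda rows: [{**r, "name": r["name"].strip()} for r in rows])
log.apply("drop rows without age", lambda rows: [r for r in rows if r["age"]])
log.apply("drop everything", lambda rows: [])  # a mistake...
log.undo()                                     # ...rolled back

prepared = log.result()
history = [name for name, _fn in log.steps]
# The same recorded steps, repeated on a different source.
elsewhere = replay(log.steps, [{"name": " Cy ", "age": "7"}])
```

Because the data is only ever derived from the log, the list of step names doubles as an audit trail, which is the governance point made above.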

It's tools like this that make the biz-tech ecosystem real, that blend business and IT data skills and knowledge for easier application to real business needs. They enable people from the business side of the enterprise to enhance their IT abilities, and vice versa, removing the barriers to data exploration and preparation that stand between the information and full business value. They make it easier for more people to become independent or power users, data scientists, or whatever they choose to call themselves, making their jobs easier, faster and more productive. That is the vital and visionary work that Paxata (and other vendors like them) are doing in this era of exploding data varieties and information volumes.

But I still believe that this role and these tools will never be for everyone. What do you think?

News flash: Business unIntelligence is now available as an ebook on Kindle and on Safari.

Posted January 23, 2014 2:29 AM
More years ago than I care to remember, I took a strong interest in predicting behavior. Besotted by a woman who blew hot and cold, I turned to astrology seeking to know the future of the relationship. The prognosis was far from positive... and ultimately correct.

Modern thinking is highly skeptical of such esoteric arts. Correctly so in the case of many practitioners. But it has far less reason, in my opinion, to dismiss much of the underlying rationale and value of these approaches. That, however, is a topic for another forum. My point here is that I did not believe the prediction; I didn't want to. And that's part of a rather gory truth of any predictive tool.

As we flip the calendar to 2014, it is clear that the audience for predicting people's future behavior has grown far beyond the ragged ranks of lonely lovers. Marketers and advertisers, once content with psychological and sociological generalizations, have become increasingly besotted by big data and the promise it offers of knowing the customer so intimately that her individual and specific future behaviors can be predicted. And, probably of more value, influenced. This, of course, is the power and the glory of predictive analytics. But, as already noted, this blog's title actually says "gory". And here, in detail, is why.

While writing in Business unIntelligence: Insight and Innovation Beyond Analytics and Big Data about some of the uses of big data way back in February, 2012, I came across Charles Duhigg's New York Times piece on Target's ability to predict not only the pregnancy of female (obviously) customers but also their likely delivery dates. Over the New Year break, I finally found time to read Duhigg's book, The Power of Habit, in which that story first appeared. My point in my own book, and in a related blog Death by a thousand analytics, was that using big data in this manner is both a clear invasion of privacy and likely to creep people out. But Duhigg's book deals with a much broader topic--human habitual behavior and unconscious thought patterns--that has far deeper implications for predictive analytics and decision support in general.

The message of The Power of Habit is fairly easily summarized. A very significant proportion of our thinking, decision-making and action at personal, organizational and societal levels is entirely habitual. Given a particular cue (which can be anything from a thought to an action), a routine set of behaviors is immediately and automatically engaged, which provides an anticipated reward. Simply put: cue => routine => reward. This mechanism operates, both in its set up and execution, in one of the most primitive areas of the brain, the basal ganglia, a structure that derives in evolutionary terms from the earliest chordates. It is, in fact, a basic survival mechanism that enables us to undertake a wide range of necessary life-preserving activities, from preparing food to getting home, with minimal expenditure of energy and attention. This, of course, initially freed the brain to deal with novel, life-threatening situations and more recently for higher-brain functions such as empathy and reasoning. However, as anyone who has ever tried to give up biting their nails or mid-afternoon snacking can testify, changing habitual patterns can be very difficult indeed, even those which are counterproductive or even downright dangerous.

This very primitive mental operation of cue-routine-reward has two (at least) important consequences for predictive analytics.

For marketing, the drive will be to move increasingly to an understanding of specific, individual habits and their cues, in order to create opportunities to influence behavior. This is more difficult than it sounds, as such cues are often difficult even for the individuals themselves to identify. Similarly, rewards are often far more subtle and diverse than might be imagined. The story of how Pepsodent toothpaste was sold to an American public famed for poor oral hygiene in the early 1900s has become a classic marketing exemplar of how to create a habit and a sales success. But, according to Duhigg, the actual cue and reward involved were long mistaken. Furthermore, although most Western people are by now inured to mass manipulation by advertising, the much more personal and private knowledge used for such individual influencing may raise resistance to such approaches. From the point of view of privacy, I am of the opinion that it most certainly should.

In the case of personal and organizational decision making support, habitual and other unconscious patterns are a serious impediment to efforts to become "data-driven". My unfortunate lovelorn experience of disbelieving a predicted but undesired outcome is a basic behavior pattern well-recognized in psychology but seldom mentioned in business intelligence. Habitual beliefs and behaviors at an organizational level are so deeply embedded and invisible to participants that they stymie efforts even to recognize problems, never mind take effective action.

The bottom line is that no matter how much data you gather or how elaborate the models you generate, Business unIntelligence operates finally, in full glory or gory fullness, in the minds of your customers and business users.

Posted January 6, 2014 3:40 AM
In this last blog of 2013, I want to make some serious points that should be front and center (he said coyly) for all analytic, BI and big data professionals next year. In fact, they should be already. And if not, I urge you to use some reflection time over the coming holiday to make them so.

Let's start with sex... toys. Consider the value of deep analytic big data monitoring to enforce the following two verified laws in the USA: (i) you may not have more than two dildos in the same house in Arizona and (ii) it's illegal to own more than six dildos in Texas. (I refuse to speculate on why Texans may be less susceptible to moral degeneracy than citizens of Arizona.) I suspect most of us are mildly amused by such legislation. Many have probably long taken comfort in the belief that it is unenforceable; after all, who can imagine the local cops breaking into your home on suspicion that you are hoarding sex toys? But wait. Haven't the authorities in the US and other countries been rummaging through your digital drawers for years now, and without any just cause for suspicion? Hasn't at least one supermarket chain collated and analyzed data to determine if its (female) customers have recently become pregnant? Haven't two technology companies applied for patents to spy on (sorry, gather behavioral data to enable better targeted advertising to) you via your living room TV? How far will government and business go in mining our personal lives, our sexuality, our bodies, our social circles to (allegedly) protect us, cure us or sell us more (unnecessary) stuff?

The sad truth is that we have lost most of our privacy already, having entered into a Faustian pact to share, both knowingly and unwittingly, the details of our daily lives. That knowing part--the Facebook likes and Google +1s--may be said to represent a conscious tradeoff by the person sharing between a loss of privacy and a perceived increase in social capital or useful contextual information. Even the acceptance that our smartphones report our location minute by minute is driven by a consensual belief that we may be offered a coupon for a nearby coffee shop at any moment. The payoff for ultimate traceability. Apple iBeacon allows newer iPhones or Android phones with Bluetooth Low Energy devices to track their--and your--position in space with centimeter precision. Which aisle in the supermarket are you in? What about some very specific retail therapy recommendations? These, and other soon to emerge toys, have the addictive quality of sex to many of the current generation of CMOs and proponents of big analytics.

However, the smartphone is but the pioneer species of the internet of things, the ultimate in small toys for big data boys. As sensors become ever smaller, cheaper and ever more powerful, the utopian vision is of systems that respond instantly to our needs, that anticipate our very expectations. We are promised houses that know we're home and adjust lighting and heating accordingly. Wrist bands that sense when we awake in the morning, so that the coffee can be brewed, or know that our elderly aunt has not moved for the past hour and may have slipped and broken her hip. Software to the rescue, hardware to alleviate yet another chore. According to Computerworld, Gartner predicts that half of all BI implementations will incorporate machine data from the internet of things by 2017. Research conducted by EMA and 9sight during the summer of 2013 suggested that the future is already here; machine-generated data overtook human-sourced information as big data sources among our respondents. And the more details of our mundane activities that become available in the internet of things for analysis and correlation, the more specific and identifiable our individual patterns of behavior become and the more difficult it is to retain any degree of anonymity. IBM's 5 for 5 this year includes "a digital guardian that will protect you online" within 5 years. But it's mostly about security rather than privacy, and already years too late.

Realists may like to consider how useful such tools would have been to the Stasi in the German Democratic Republic (East Germany) or the Soviet Union's KGB in the second half of the last century. Would the world be as we know it if they had? The more dystopian among us conjure up visions of Franz Kafka's The Trial or George Orwell's 1984 only 30 years late in arrival. But, it is not my intention to propose neo-Luddism. The big data jinni is already well out of the bottle. Despite the silly-bugger games we are currently playing with it, big data does hold the possibility to understand and address many of the most intractable environmental, climatic and social issues we face today. Although, as I've mentioned repeatedly in "Business unIntelligence: Insight and Innovation Beyond Analytics and Big Data", our success in acting upon, never mind solving, them will depend far more on our very human intention towards what we desire than on all the data, software and hardware we apply.

So, as we look to 2014, I urge all of us to keep three resolutions in our minds, a subtle matrix underpinning every business need we evaluate and every design decision we make:

  1. Understand and account for the relationship between big data and the traditional core business information long created and processed in existing operational and informational systems; each complements and contextualizes the other

  2. In gathering and analyzing big data, consider how its use can impact personal privacy, especially the ways in which such data can be combined with other big data sets and compromise anonymity

  3. Perhaps most importantly, consider the universal/global impact of the project: how does it contribute to or mitigate the real-world issues of environmental degradation, over-production and consumption, economic instability, and more? In short, how does it support the best in humanity and the world?

These, especially the last, may sound like utopian dreams. But, consider the almost unimaginable power unleashed by the unprecedented growth and interconnection of information currently in process. We are unleashing a potential for good or ill far greater than that created by a small team of physicists in Los Alamos during the Manhattan Project.

I wish you all a Happy Christmas and a Peaceful and Prosperous New Year, within whatever tradition you choose to celebrate this time of year.

Posted December 18, 2013 9:15 AM
Traditionally, BI has been a process-free zone. Decision makers are such free thinkers that suggesting their methods of working can be defined by some stodgy process is generally met with sneers of derision. Or worse. BI vendors and developers have largely acquiesced; the only place you see process mentioned is in data integration, where activity flow diagrams abound to define the steps needed to populate the data warehouse and marts.

I, on the other hand, have long held - since the turn of the millennium, in fact - that all decision making follows a process, albeit a very flexible and adaptive one. The early proof emerges in operational BI (or decision management, as it's also called) where decision making steps are embedded in fairly traditional operational processes. As predictive and operational analytics has become increasingly popular, this intermingling of informational and operational is such that these once distinctly different business behaviors are becoming indistinguishable. A relatively easy thought experiment then leads to the conclusion that all decision making has an underlying process.

I was also fairly sure at an early stage that only a Service Oriented Architecture (SOA) approach could provide the flexible and adaptive activities and workflows required. I further saw that SOA could (and would need to) be a foundation for data integration as the demand for near real-time decision making grew. As a result, I have been discussing all this at seminars and conferences for many years now. But every time I'd mention SOA, the sound of discontent would rumble around the room. Too complex. Tried it and failed. And, more recently, isn't that all old hat now with cloud and mobile?

All of this is by way of introduction to a very interesting briefing I received this week from Pat Pruchnickyj, Director of Product Marketing at Talend, who restored my faith in SOA as an overall approach and in its practical application! Although perhaps best known for its open source ETL (extract, transform and load) and data integration tooling it first introduced in 2006, Talend today takes a broader view and offers data-focused solutions, such as ETL and data quality, as well as open source application integration solutions, such as enterprise service bus (ESB) and message queuing. These various approaches are united by common metadata, typically created and managed through a graphical, workflow-oriented tool, Talend Open Studio.

So, why is this important? If you follow the history of BI, you'll know that many well-established implementations are characterized by complex and often long-running batch processes that gather, consolidate and cleanse data from multiple internal operational sources into a data warehouse and then to marts. This is a model that scales poorly in an era where vast volumes of data are coming from external sources (a substantial part of big data) and analysis is increasingly demanding near real-time data. File-based data integration becomes a challenge in these circumstances. The simplest approach may be to move towards ever smaller files running in micro-batches. However, the ultimate requirement is to enable message-based communication between source and target applications/databases. This requires a fundamental change in thinking for most BI developers. So a starting point of ETL and an end point of messaging, both under a common ETL-like workflow, makes for easier growth. Developers can begin to see that a data transfer/cleansing service is conceptually similar to any business activity also offered as a service. And the possibility of creating workflows combining operational and informational processes emerges naturally to support operational BI.
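To make the point concrete, here is a minimal sketch in plain Python of one cleansing step being driven first as a batch and then message by message. It is my own illustration and has nothing to do with Talend's actual tooling; the `cleanse`, `run_batch` and `run_messages` functions and the sample records are hypothetical, and an in-memory `queue.Queue` stands in for a real message broker.

```python
from queue import Queue
from typing import Callable, Iterable, Iterator

Record = dict


def cleanse(record: Record) -> Record:
    """Hypothetical cleansing service: trim whitespace, normalize country codes."""
    return {
        "customer": record["customer"].strip(),
        "country": record["country"].strip().upper(),
    }


def run_batch(records: Iterable[Record], step: Callable[[Record], Record]) -> list[Record]:
    """File-style integration: the whole batch (or micro-batch) is processed at once."""
    return [step(r) for r in records]


def run_messages(inbox: Queue, step: Callable[[Record], Record]) -> Iterator[Record]:
    """Message-style integration: each record is processed as it is taken off the queue."""
    while not inbox.empty():
        yield step(inbox.get())


rows = [
    {"customer": " Acme ", "country": "ie"},
    {"customer": "Globex", "country": " us "},
]

# Batch: hand the cleansing service the full set of rows.
batch_result = run_batch(rows, cleanse)

# Messaging: the same service, fed one record at a time.
inbox: Queue = Queue()
for row in rows:
    inbox.put(row)
message_result = list(run_messages(inbox, cleanse))
```

The cleansing step is identical in both cases; only the way it is invoked changes, which is why a developer who starts with ETL can grow into messaging without rewriting the service itself.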

Is this to say that ETL tools are a dying species? Certainly not. For some types and sizes of data integration, a file-based approach will continue to offer higher performance or more extensive integration and cleansing function. The key is to ensure common, shared metadata (or as I prefer to call it, context-setting information, CSI) between all the different flavors of data and application integration.

Process, including both business and IT aspects, is the subject of Chapter 7 of "Business unIntelligence: Insight and Innovation Beyond Analytics and Big Data".

Sunset Over Architecture (SOA) image: http://vorris.blogspot.com/2012/07/mr-cameron-you-are-darn-right-start.html

Posted December 12, 2013 3:46 AM

