Blog: Barry Devlin As one of the founders of data warehousing back in the mid-1980s, a question I increasingly ask myself over 25 years later is: Are our prior architectural and design decisions still relevant in the light of today's business needs and technological advances? I'll pose this and related questions in this blog as I see industry announcements and changes in the way businesses make decisions. I'd love to hear your answers and, indeed, questions in the same vein. Copyright 2014 Mon, 17 Mar 2014 05:33:30 -0700 Automating the Data Warehouse (and beyond) In an era of "big data this" and "Internet of Things that", it's refreshing to step back to some of the basic principles of defining, building and maintaining data stores that support the process of decision making... or data warehousing, as we old-fashioned folks call it. Kalido did an excellent job last Friday of reminding the BBBT just what is needed to automate the process of data warehouse management. But, before the denizens of the data lake swim away with a bored flick of their tails, let me point out that this matters for big data too--maybe even more so. I'll return to this towards the end of this post.

In the first flush of considering a BI or analytics opportunity in the business and conceiving a solution that delivers exactly the right data needed to address that pesky problem, it's easy to forget the often rocky road of design and development ahead. More often forgotten, or sometimes ignored, is the ongoing drama of maintenance. Kalido, with their origins as an internal IT team solving a real problem for the real business of Royal Dutch Shell in the late '90s, have kept these challenges front and center.

All IT projects begin with business requirements, but data warehouses have a second, equally important, starting point: existing data sources. These twin origins typically lead to two largely disconnected processes. First, there is the requirements activity often called data modeling, but more correctly seen as the elucidation of a business model, consisting of the functions required by the business and the data needed to support them. Second, there is the ETL-centric process of finding and understanding the existing sources of this data, figuring out how to prepare and condition it, and designing the physical database elements needed to support the function required.

Most data warehouse practitioners recognize that the disconnect between these two development processes is the origin of much of the cost and time expended in delivering a data warehouse. And they figure out a way through it. Unfortunately, they often fail to recognize that each time a new set of data must be added or an existing set updated, they have to work around the problem yet again. So, not only is initial development impacted, but future maintenance remains an expensive and time-consuming task. An ideal approach is to create an integrated environment that automates the entire set of tasks from business requirements documentation, through the definition and execution of data preparation, all the way to database design and tuning. Kalido is one of a small number of vendors who have taken this all-inclusive approach. They report build effort reductions of 60-85% in data warehouse development.

Conceptually, we move from focusing on the detailed steps (ETL) of preparing data to managing the metadata that relates the business model to the physical database design. The repetitive and error-prone donkey-work of ETL, job management and administration is automated. The skills required in IT change from programming-like to modeling-like. This has none of the sexiness of predictive analytics or self-service BI. Rather, it's about real IT productivity. Arguably, good IT shops always create some or all of this process- and metadata-management infrastructure themselves around their chosen modeling, ETL and database tools. Kalido is "just" a rather complete administrative environment for these processes.

Which brings me finally back to the shores of the data lake. As described, the data lake consists of a Hadoop-based store of all the data a business could ever need, in its original structure and form, and into which any business user can dip a bucket and retrieve the data required without IT blocking the way. However, whether IT is involved or not, the process of understanding the business need and getting the data from the lake into a form that is useful and usable for a decision-making requirement is identical to that described in my third paragraph above. The same problems apply. Trust me, similar solutions will be required.


Data warehouse Mon, 17 Mar 2014 05:33:30 -0700
Big Data, the Internet of Things and the Death of Capitalism? Part 5 Parts 1, 2, 3, 4 and 4A of this series explored the problem as I see it. Now, finally, I consider what we might do if my titular question actually makes sense.

To start, let's review my basic thesis. Mass production and competition, facilitated by ever improving technology, have been delivering better and cheaper products and improving many people's lives (at least in the developed world) for nearly two centuries. Capital, in the form of technology, and people--labor--work together in today's system to produce goods that people purchase using earnings from their labor. As technology grows exponentially better, an ever greater range of jobs is open to displacement. When technology displaces some yet to be determined percentage of labor, this system swings out of balance; there are simply not enough people with sufficient money to buy the products made, no matter how cheaply. We have not yet reached this tipping point because, throughout most of this period, the new jobs created by technology have largely offset the losses. However, employment trends in the past 10-15 years in the Western world suggest that this effect is no longer operating to the extent that it was, if at all.

In brief, the problem is that although technology produces greater wealth (as all economists agree), without its transfer to the masses through wages paid for labor, the number of consumers becomes insufficient to justify further production. The owners of the capital assets accumulate more wealth--and we see this happening in the increasing inequality in society--but they cannot match the consumption of the masses. Capitalism, or perhaps more precisely, the free market then collapses.

Let's first look at the production side of the above equation. What can be done to prevent job losses outpacing job creation as a result of technological advances? Can we prevent or put a damper on the great hollowing out of middle-income jobs that is creating a dumbbell-shaped distribution of a few highly-paid experts at one end and a growing swathe of lower-paid, less-skilled workers at the other? Can (or should) we move to address the growing imbalance of power and wealth between capital (especially technologically based) and labor? Let's be clear at the start, however: turning off automation is not an option I consider.

My suggestions, emerging from the thinking discussed earlier, are mainly economic and social in nature. An obvious approach is to use the levers of taxation--as is done in many other areas--to drive a desired social outcome. We could, for example, reform taxation and social charges on labor to reduce the cost difference between using people and automating a process. In a similar vein, shifting taxation from labor to capital could also be tried. I can already hear the Tea Party screaming to protect the free market from the damn socialist. But, if my analysis is correct, the free market will undergo, at best, a radical change if employment drops below some critical level. Pulling these levers soon and fairly dramatically is probably necessary, although this approach can only delay the inevitable. Another approach is for industry itself to take steps to protect employment. Mark Bonchek, writing in a recent Harvard Business Review blog, describes a few "job entrepreneurs" who maximize jobs instead of profits (but still make profits as well), including one in the Detroit area aimed at creating jobs for unemployed auto workers.

Moving from the producer's side to the consumer's view, profit aside, why did we set off down the road of the Industrial Revolution? To improve people's daily lives, to lessen the load of hard labor, to alleviate drudgery. The early path was not clear. Seven-day labor on the farm was replaced by seven-day labor in the factory. But, by the middle of the last century, working hours were being reduced in the workplace and in the home, food was cheaper and more plentiful; money and time were available for leisure. In theory, the result should have been an improvement in the human condition. In practice, the improvement was subverted by the mass producers. They needed to sell ever more of the goods they could produce so cheaply that profit came mainly through volume sales. Economist Victor Lebow's 1955 proclamation of "The Real Meaning of Consumer Demand" sums it up: "Our enormously productive economy demands that we make consumption our way of life... that we seek our spiritual satisfaction and our ego satisfaction in consumption... We need things consumed, burned up, worn out, replaced and discarded at an ever-increasing rate". Of course, some part of this is human nature, but it has been driven inexorably by advertising. We've ended up in the classic race to the bottom, even to the extent of products being produced with ever shorter lifespans to drive earlier replacement. Such consumption is becoming increasingly unsustainable as the world population grows, finite resources run out and the energy consumed in both production and use drives increasing climate change. As the president of Uruguay, Jose Mujica, asked of the Rio+20 Summit in 2012, "Does this planet have enough resources so seven or eight billion can have the same level of consumption and waste that today is seen in rich societies?"

My counter-intuitive suggestion here, and one I have not seen raised by economists (surprisingly?), is to ramp down consumerism, mainly through a reinvention of the purposes and practices of advertising. Reducing over-competition and over-consumption would probably drive interesting changes in the production side of the equation, including reduced demand for further automation, lower energy consumption, product quality being favored over quantity, higher savings rates by (non-)consumers, and more. Turning down the engine of consumption could also enable changes for the better in the financial markets, reducing the focus on quarterly results in favor of strategically sounder investment. Input from economists would be much appreciated.

But, let's wrap up. The title of this series asked: will automation through big data and the Internet of Things drive the death of capitalism? Although some readers may have assumed that this was my preferred outcome, I am more of the opinion that capitalism and the free market need to evolve rather quickly if they are to survive and, preferably, thrive. But, this would mean some radical changes. For example, a French think-tank, LH Forum, suggests the development of a positive economy that "reorients capitalism towards long-term challenges. Altruism toward future generations is a much more powerful incentive than [the] selfishness which is supposed to steer the market economy". Other fundamental rethinking comes from British historian Niall Ferguson, who takes a wider view of "The Great Degeneration" of Western civilization. In short, this is a topic that requires broad, deep and urgent thought.

For my more IT-oriented readers, I suspect this blog series has taken you far from familiar ground. For this, I do not apologize. As described in Chapter 2 of "Business unIntelligence", I believe that the future of business and IT is to be joined at the hip. The biz-tech ecosystem declares that technology is at the heart of all business development. Business must understand IT. IT must be involved in the business. I suggest that understanding the impact of automation on business and society is a task for IT strategists and architects as much as, if not more than, it is for economists and business planners.

All elephant photos in the earlier posts are my own work!

Internet of Things Tue, 11 Mar 2014 06:47:55 -0700
Big Data, the Internet of Things and the Death of Capitalism? Part 4A Parts 1, 2, 3, and 4 of this series explored the problem as I see it and examined the views of some economists and other authors. This was supposed to be the final part, where I answered the question of the title. But, instead I have a bonus blog... inspired by Brynjolfsson and McAfee's Second Machine Age.

"The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies" by Erik Brynjolfsson and Andrew McAfee, published as recently as last January, has been lying half-read on my Kindle desk for some weeks, mainly because its early chapters overlap with much of the technological material I'd referenced already. But a couple of tweets in response to Part 4 (thanks to Michael @mjcavaretta and Patrick @itworks) sent me back to reexamine the content. There I found enough interesting thinking to fill a new post. I have to say that I was also hoping that the authors might come to my rescue as I searched for answers to my question above.

The central discussion of the book about the economic effect of advances in technology is couched in terms of bounty and spread. The former is the overall benefit accruing to the economy and the people whose lives are affected. The authors believe this bounty is enormous and growing, although they do admit that traditional economic measures, such as GDP, are no longer adequate. Nonetheless, my gut feel is that technology has, by many measures, improved the lot of humanity, or has the potential to do so. Spread is perhaps more easily quantified: it is the gap in wealth, income, mobility and more between people at the top and bottom of the ladder. And it has been demonstrably growing wider in recent years--which is socially and economically bad news. The authors' question then is whether the growth in bounty can counteract that in spread. Will the rising tide raise all boats, even though the super-yachts may be raised considerably further than the simple rowboats or even the rafts made of society's junk?

The news is bad. The authors report that "between 1983 and 2009, Americans became vastly wealthier overall as the total value of their assets increased. However... the bottom 80 percent of the income distribution actually saw a net decrease in their wealth." Combining this with a second observation that income distribution is moving from a normal (bell) curve towards a power law curve with a long tail of lower incomes, my conclusion is that the negative effect of spread is outpacing the positive lift of bounty. Note finally, that the above discussion is focused on the developed economies. A brief visit to one of the emerging economies should suffice to convince that the inequality there is far greater. The old saw that the rich get richer and the poor get poorer appears increasingly appropriate.
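The bounty-versus-spread tension can be made concrete with a toy calculation. The quintile figures below are invented purely for illustration--they are not Brynjolfsson and McAfee's data--but they show how total wealth can rise even while the bottom 80 percent's absolute holdings fall:

```python
# Toy economy in two snapshots, wealth by quintile (poorest to richest).
# All figures are invented for illustration only.
before = [5, 10, 15, 25, 45]   # arbitrary wealth units
after = [4, 8, 13, 22, 78]     # total grows, but gains concentrate at the top

total_before, total_after = sum(before), sum(after)
bottom80_before, bottom80_after = sum(before[:4]), sum(after[:4])

print(total_after > total_before)        # bounty: overall wealth grew
print(bottom80_after < bottom80_before)  # spread: bottom 80% hold less
```

Both statements print `True`: the pie grows by a quarter while four-fifths of the population end up with less of it, and less in absolute terms.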

And Brynjolfsson and McAfee do end up agreeing with me in Chapter 11 as they sort through various arguments on the relative importance of bounty and spread. "Instead of being confident that the bounty from technology will more than compensate for the spread it generates, we are instead concerned about something close to the reverse: that the spread could actually reduce the bounty in years to come."

Turning their attention to the main underlying cause, the advance of technology and the possibility of "technological unemployment", they come desperately close to my argument: "there is a floor on how low wages for human labor can go. In turn, that floor can lead to unemployment: people who want to work, but are unable to find jobs. If neither the worker nor any entrepreneur can think of a profitable task that requires that worker's skills and capabilities, then that worker will go unemployed indefinitely. Over history, this has happened to many other inputs to production that were once valuable, from whale oil to horse labor. They are no longer needed in today's economy even at zero price. In other words, just as technology can create inequality, it can also create unemployment. And in theory, this can affect a large number of people, even a majority of the population, and even if the overall economic pie is growing."

But, they instantly--and, in my view, unjustifiably--veer back to the new economic orthodoxy that machines are more likely to complement humans than displace them and then concentrate on that scenario, suggesting that we must focus on equipping humans for this role through better and different education. While accepting that there are certainly instances where this is true, I see no evidence in recent years that this is the primary scenario we need to address. Only in Chapter 14 do they finally make some "Long-Term Recommendations" that suggest how changes in taxation might be required to rebalance the inequality in income and unemployment that they see eventually emerging. In this they echo both Martin Ford and Tyler Cowen, as we've seen earlier.

But I cannot help but feel this is far too little and far too late. So, what would I suggest? I believe we need some rather counter-intuitive thinking. And that is the long-promised topic of the real Part 5.

Internet of Things Tue, 04 Mar 2014 05:46:35 -0700
Big Data, the Internet of Things and the Death of Capitalism? Part 4 As seen in Parts 1, 2 and 3, mainstream economists completely disagree with my thinking. Am I alone? It turns out that I'm not...

Surely I'm not the first to think that today's technological advances have the potential to seriously disrupt the current market economy? I eventually found that Martin Ford, founder of an unnamed software development company in Silicon Valley, wrote a book in 2009: "The Lights in the Tunnel: Automation, Accelerating Technology and the Economy of the Future". It turns out that his analysis of the problem is exactly the same as mine... not to mention more than four years ahead of me! Ford explains it very simply and compellingly: "the free market economy, as we understand it today, simply cannot work without a viable labor market. Jobs are the primary mechanism through which income-- and, therefore, purchasing power-- is distributed to the people who consume everything the economy produces. If at some point, machines are likely to permanently take over a great deal of the work now performed by human beings, then that will be a threat to the very foundation of our economic system."

For the more visually minded, Ford graphs capability to perform routine jobs against historical time for both humans and computer technology. I reproduce this simple graph here. Ford posits that there was a spurt in human capability after the Industrial Revolution as people learned to operate machines, but that this has now largely leveled off. As a general principle, this seems reasonable. Computer technology, on the other hand, is widely accepted (via Moore's Law) to be on a geometric growth path in terms of its general capability. Unless one or both trends change dramatically, the cross-over of these two lines is inevitable. When technology becomes more efficient than a human at any particular job, competitive pressure in the market will ensure that the former replaces the latter. At some stage, the percentage of human jobs and their associated income that is automated away will be enough to disrupt the consumption side of the free market. Even increased uncertainty about income is often sufficient to cause consumers to increase savings and reduce discretionary spending. This behavior occurs in every recession and affects production in a predictable manner; production is cut back, often involving further job losses. A positive feedback cycle develops: reduced jobs drive reduced spending, which drives further job losses. In today's cyclical economy, the trend is eventually reversed, sometimes through governmental action, or at times by war--World War II is credited by some for the end of the Great Depression. However, the graph above shows no such cyclical behavior: this is a one-way, one-time transition.
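Ford's graph can be sketched numerically. Everything in the sketch below is an illustrative assumption of mine--a logistic curve for human capability, an 18-month doubling for technology, and arbitrary normalization constants--not Ford's actual figures. The point is only that, whatever the parameters, an exponential must eventually cross a leveling curve:

```python
import math

def human_capability(year):
    # Logistic curve: rapid gains after the Industrial Revolution,
    # leveling off today. Parameters are illustrative, not Ford's.
    return 100 / (1 + math.exp(-(year - 1900) / 40))

def machine_capability(year):
    # Exponential growth, doubling every 18 months, from an
    # arbitrarily small base in 1950.
    return 1e-12 * 2 ** ((year - 1950) / 1.5)

# First year in which the technology curve overtakes the human one.
crossover = next(y for y in range(1950, 2100)
                 if machine_capability(y) >= human_capability(y))
print(crossover)
```

With these invented parameters the curves happen to cross around 2020; steeper or shallower assumptions merely shift the date, never the fact of the crossing.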

Of course, the 64 million dollar questions (assuming you agree with the reasoning above) are: where are we on this timeline, and how steep is the geometric rise in technological capability? It is likely that both aspects differ depending on the job involved. For some jobs, we are far from the inflection point in the technology curve, while others are much closer. For information-based jobs, the rate of growth in capability of computers may be very close to the Moore's Law formulation: a doubling in capacity every 18 months. Physical automation may grow more slowly. But the outcome is assured: the lines will cross. Ford felt that in some areas we were getting close to the inflection point in 2009. The presumed approximate quadrupling of technological ability since then has not yet, however, tipped us over the edge of the job cliff, although few would dispute the extent of the technological advances in the interim. Of course, Ford--and I--may be wrong.

It would seem that the hypothesis put forward by Ford should be amenable to mathematical modeling, but I have found only one attempt to do so, in an academic paper "Economic growth given machine intelligence", published in 1999 by Robin Hanson. Given the title, I hoped that this paper might provide some usable mathematical models capable of answering my questions. Unfortunately, I was disappointed. My mathematical skills are no longer up to the equations involved! More importantly, however, Hanson's framing assumptions seem strong on historical precedent (surely favoring continuation of the current situation) and fail to address the fundamental issue (in my view) that consumption is directly tied to income and its distribution among the population. Furthermore, Hanson has taken a largely dismissive attitude to Ford's thesis, as demonstrated by Hanson's advice to Ford in an online exchange: "he needs to learn some basic economics".

So, the hypothesis that I and, previously, Ford have put forward so far remains both intuitively reasonable and formally unproven. For now, I ask: can any data scientist or economics major take on the task of producing a useful model of our basic hypothesis?

In the final part of the series, I look at what a collapse of the current economic order might look like and ask what, if anything, might be done to avert it.

For a broader and deeper view of the business and technological aspects of this topic, please take a look at my new book: Business unIntelligence: Insight and Innovation Beyond Analytics and Big Data.  

A bonus Part 4A follows!

Business unIntelligence Tue, 25 Feb 2014 03:34:43 -0700
Big Data, the Internet of Things and the Death of Capitalism? Part 3 Part 1 introduced the elephant in the room: a few of the ways that current technological advances affect jobs and society.  Part 2 explored the economics. Here I dissect some further recent economic thinking.

As we saw in my previous blog, the topic of the impact of technology on jobs has generated increased interest in the past few months. Tyler Cowen's Sept. 2013 "Average is Over" was one of the earlier books (of this current phase) that addressed this area. The pity is that, in the long run, he avoids the core of the problem as I see it: if technology is replacing current jobs and failing to create new ones in at least equal numbers, who will be the consumers of the products of the new technology?

In the early part of his book, Cowen builds the hypothesis that technology has been impacting employment for some decades now, creating a bifurcation in the job market. As the jobs that can be partially or fully automated expand in scope and number, people with strong skills that support, complement or expand technology will find relatively rich pickings. Those with more average technologically oriented skills do less well. If their skills are relatively immune to technological replacement--from building laborers to masseurs--they will continue to be in some demand, although increased competition will likely push down wages. However, the key trend is that middle-income jobs susceptible to physical or informational automation have been disappearing and continue to do so. 60% of US job losses in the 2007-2009 recession (choose your dates and name) were in the mid-income categories, while three-quarters of the jobs added since then have been low-income. The switch to low-paying jobs dates back to 1999; it's a well-established trend.

The largely unmentioned category in Cowen's book, beyond a nod to the figures on page one, is the growing numbers of young, long-term unemployed or underemployed, even among college graduates. I suggest we are actually seeing a trifurcation in the jobs market, with the percentages of this third category growing in the developed world and the actual numbers--due to population growth--exploding in the emerging economies. I don't have easy access to the statistics. However, given that the income of the first two categories, whether directly or indirectly, must support the third (as well as the young, the elderly and the ill), parts of the developed world seem pretty close already to the relative percentages at which the system could break down. Current ongoing automation of information-based jobs may well tip the balance.

Cowen's prescription for the future is profoundly depressing, even in the US context, at which the book is squarely aimed. To paraphrase, tax the wealthy and capital more. Just a little. Reduce benefits to the poor. Maybe a little more. And let both the poorly paid and unemployed move to cheaper accommodation far from expensive real estate. " a few parts of... the warmer states... [w]e would build some 'tiny homes'... about 400 square feet and cost in the range of $20,000 to $40,000. ... very modest dwellings, as we used to build in the 1920s. We also would build some makeshift structures, similar to the better dwellings you might find in a Rio de Janeiro favela. The quality of the water and electrical infrastructure might be low by American standards, though we could supplement the neighborhood with free municipal wireless (the future version of Marie Antoinette's famous alleged phrase will be 'Let them watch internet!')."

If your response is "look what happened to Marie Antoinette", Cowen declares that, given the age demographic of the US, revolution is an unlikely outcome. Consider, however, that this is a US view. In much of the developing world, the great transition from Agriculture to Manufacturing is only now underway. Some of that is supported by off-shoring from the West, some by local economic and population growth. However, robotic automation is already on the increase even in those factories. Foxconn, long infamous as Apple's iPhone hardware and assembly "sweatshop", announced as far back as 2011 that it planned on installing up to 1 million robots in its factories by this year. A year ago this week, they announced: "We have canceled hiring entry-level workers, a decision that is partly associated with our efforts in production automation."

Last week, the Wall Street Journal reported that Foxconn is working with Google with the aim of bringing Google's latest robotics acquisitions to the assembly lines to reduce labor costs. Google also benefits: it gains a test bed for its purported new robotic operating system as well as access to Foxconn's mechanical engineering skills. According to PCWorld, Foxconn's chairman, Terry Gou, told investors in June last year: "We have over 1 million workers. In the future we will add 1 million robotic workers. Our [human] workers will then become technicians and engineers." The question of how many of the million human workers would become technicians and engineers seemingly went unaddressed. Echoing comments in Part 2 of this series, Foxconn is also reported to be looking to next-shore automated manufacturing to the US. To complete the picture, Bloomberg BusinessWeek meanwhile reported that Foxconn was also off-shoring jobs from China to Indonesia. A complex story, but the underlying driver is obvious: reduce labor, and then distribution, costs by any and every means possible.

Cowen's comments above about favelas, Marie Antoinette and revolutions should thus be seen in a broader view. The Industrial era middle class boom of the West may pass quickly through the emerging economies or perhaps bypass the later arrivals entirely. Shanty towns are already widespread in emerging economies; they house the displaced agricultural workers who cannot find or have already lost employment in manufacturing.  The age demographic in these countries is highly compatible with revolution and migration / invasion. For reasons of geography, the US may be less susceptible to invasion, but Europe's borders are under increasing siege. The Roman Empire's latter-day bread and circuses were very quickly overrun by the Vandals and the Visigoths.

The numbers and percentages of new jobs vs. unemployed worldwide are still up for debate among the economists. But their belief, echoed by the larger multinationals, that the consumer boom in the West of the 20th century will be replicated in the emerging economies stands perhaps on shaky ground.

For a broader and deeper view of the business and technological aspects of this topic, please take a look at my new book: Business unIntelligence: Insight and Innovation Beyond Analytics and Big Data

Part 4 follows.

Internet of Things Tue, 18 Feb 2014 07:40:07 -0700
Big Data, the Internet of Things and the Death of Capitalism? Part 2 Part 1 introduced the elephant in the room: a few of the ways that current technological advances affect jobs and society. Here, I probe deeper into the economics.

We in the world of IT and data tend to focus on the positive outcomes of technology. Diagnoses of illness will be improved. Wired reports that, in tests, IBM Watson's successful diagnosis rate for lung cancer is 90%, compared to 50% for human doctors, according to WellPoint's Samuel Nussbaum. We celebrate our ability to avoid traffic jams based on smartphone data collection. We imagine how we can drive targeted advertising. A trial (since discontinued) in London in mid-2013 of recycling bins that monitored passing smartphone WiFi addresses to eventually push product is but one of the more amusing examples. Big data can drive sustainable business, reducing resource usage and carbon footprint both within the enterprise and to the ends of its supply chains. Using a combination of big data, robotics and the IoT, Google already has prototype driverless cars on the road. Their recent acquisitions of Boston Dynamics (robotics), Nest (IoT sensors), and DeepMind Technologies (artificial intelligence), to name but a few, indicate the direction of their thinking. The list goes on. Mostly, we see the progress but are largely blind to the downsides.

The elephant in the room seems particularly perplexing to politicians, economists and others charged with taking a macro-view of where human society is going. Perhaps they feel some immunity to the excrement pile that is surely coming. But an important question must be: what will happen in economic and societal terms when significant numbers of workers in a wide range of industries are displaced fully or partially by technology, both physical and information-based? To this non-economist, the equation seems obvious: fewer workers means fewer consumers means less production means fewer workers. A positive feedback loop of highly negative factors for today's business models. Sounds like the demise of capitalism. Am I being too simplistic?
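That admittedly simplistic loop can be written down as a toy simulation. Every coefficient below is invented for illustration--this is a sketch of the mechanism, not an economic forecast: wages fund spending, spending justifies production, and automation steadily trims the labor needed per unit of output.

```python
def simulate(rounds, auto_rate=0.05, propensity=0.9, floor=10.0):
    # Toy model, not a forecast: jobs fund spending, spending sets
    # production, and automation cuts the labor needed per unit of
    # output each round. All coefficients are invented.
    jobs = 100.0
    for _ in range(rounds):
        spending = propensity * jobs + floor   # wage-funded demand plus a floor
        jobs = spending * (1 - auto_rate)      # labor needed for that output
    return jobs

print(round(simulate(10), 1))   # employment falls round after round
```

In this damped version employment settles at a lower equilibrium rather than collapsing; raise `auto_rate` and the floor keeps dropping. Whether the real economy damps or spirals is exactly the open question.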

A recent article in the Economist, "The Future of Jobs: the Onrushing Wave" explores the widely held belief among economists that technological advances drive higher living standards for the population at large. The belief is based on historical precedent, most particularly the rise of the middle classes in Europe and the US during the 20th century. Anyone who opposes this consensus risks being labeled a Luddite, after the craft workers in the 19th century English textile industry who attacked the machines that were destroying their livelihoods. And perhaps that is why the Economist, after exploring the historical pattern at some length and touching on many of the aspects I've mentioned earlier, concludes rather lamely, in my opinion: "[Keynes's] worry about technological unemployment was mainly a worry about a 'temporary phase of maladjustment' as society and the economy adjusted to ever greater levels of productivity. So it could well prove. However, society may find itself sorely tested if, as seems possible, growth and innovation deliver handsome gains to the skilled, while the rest cling to dwindling employment opportunities at stagnant wages."

"Sorely tested" sounds like a serious understatement to me. The Industrial Revolution saw a mass movement of jobs from Agriculture to Manufacturing; the growth of the latter largely offset the shrinkage in Agricultural employment due to mechanization. In the Western world, the trend in employment since the 1960s has been from Manufacturing to Services. Services has compensated for the loss of Manufacturing jobs to both off-shoring and automation. But, as Larry Summers, former US Treasury Secretary, mentioned in the debate on "Rethinking Technology and Employment" at the World Economic Forum in Davos in January, the percentage of 25-54 year old males not working in the US will have risen from 5% in 1965 to an estimated near 15% in 2017. This trend suggests strongly that the shrinkage in Manufacturing is not being effectively taken up elsewhere. The Davos debate itself lumbered to a soporific draw on the motion that technological innovation is driving jobless growth. Prof. Erik Brynjolfsson, speaking with Summers in favor of the motion, offered that "off-shoring is just a way-station on the road to automation", a theme echoed by the January 2014 McKinsey Quarterly "Next-shoring: A CEO's guide". Meanwhile, Brynjolfsson's latest book, with Andrew McAfee, seems to limit its focus to the quality of work in the "Second Machine Age" rather than its actual quantity.
As in the tale of the blind men and the elephant, it seems that we are individually focusing only on small parts of this beast.

For a broader and deeper view of the business and technological aspects of this topic, please take a look at my new book: Business unIntelligence: Insight and Innovation Beyond Analytics and Big Data.  

Part 3 follows.
Internet of Things Mon, 10 Feb 2014 11:11:36 -0700
Big Data, the Internet of Things and the Death of Capitalism? Part 1

There's an elephant in the room. No, not the friendly, yellow, Hadoop variety.

The large, smelly, excrement-dumping kind that no one wants to notice, never mind talk about. Allow me to name it: the combination of big data and the Internet of Things (IoT) is reinventing the nature of work. Much of employment as we know it is disappearing. And that signals the end of the consumer society we've painstakingly built over the past half century or so.

Am I being alarmist? Let's think about where big data and the IoT are going and how they are intersecting with the world of people earning and spending money. Include all the automation, connectivity and analytics that we are doing and considering. And you'll soon see what I mean. In part 1 of this short series, I look at the current state of play for big data and the Internet of Things.

Let's start with the physical. The Things of the IoT are acquiring ever more sensing abilities, as well as an increasing level of automation capabilities. Think piece-part robots talking via the Internet and you wouldn't be far wrong. Look also at the developments in robotics in the past few years. In a recent O'Reilly Radar article, Rodney Brooks, former Panasonic Professor of Robotics at MIT and founder of Rethink Robotics, says "We're at the point with production robots where we were with mobile robots in the late 1980s... The advances are accelerating dramatically." The Rethink Robotics videos show some agonizingly slow-motion action, but it doesn't need Clayton Christensen to recognize a potential disruptive innovation here. The process about to be disrupted is the manual labor involved in a whole variety of repetitive but loosely bounded activities on assembly, packaging and similar production lines. Their manufacturers promote the idea that these new robots can work safely alongside humans, but the cost equation speaks more to replacement of manual labor than coexistence. Even at this early stage of development, Rethink Robotics' Baxter sells from $25,000. The impact on employment seems fairly obvious.

Brooks, quoted in the same Radar article by Glen Martin, also targets robots for health care, especially care of the elderly, suggesting that demand for such workers is outstripping supply. "Again, the basic issue is dignity. Robots can free people from the more menial and onerous aspects of elder care, and they can deliver an extremely high level of service, providing better quality of life for seniors." Yet, given that replacing staff is more likely than a merely supporting robotic role, the result may be even less of the desperately needed social interaction that alleviates loneliness and disconnection among the elderly. Although you shouldn't discount YDreams Robotics' forthcoming desk lamp, which can sense your mood and call a friend or relative via smartphone and tell them you're in need of cheering up. The scenarios get creepier, right up to and including Spike Jonze's new movie, her. But from the viewpoint of this blog, the bottom line is another set of jobs under threat.

Information-based jobs are no safer, also under threat from multiple directions. Simpler tasks are automated. Sports scores and stock market prices can be spun into passable stories already by automated processes, once the jobs of cub journalists. More complex tasks are dissected into component parts, with the simpler parts automated and the human role often restricted to repetitive or supervisory activities. This is much the same process that took place in industrial factories in the 19th century. Craft workers who, skillfully and lovingly, made complete finished articles were displaced by production workers whose activities were increasingly optimized to produce simple piece parts in the cheapest and most soulless manner. Pattern recognition and machine learning have seen enormous advances as training data sets become ever larger. Human translators are being displaced by Google Translate and its ilk. According to Frey and Osborne's estimate in their 2013 paper, The Future of Employment, "47 percent of total US employment is in the high risk category, meaning that associated occupations are potentially automatable over some unspecified number of years, perhaps a decade or two. It shall be noted that the probability axis can be seen as a rough timeline, where high probability occupations are likely to be substituted by computer capital relatively soon." The probabilities produced by their model run from 99% for telemarketers, freight agents, tax preparers and more to 0.3% for emergency management directors and first line supervisors. Frey and Osborne cover both physical and information-based jobs, but it's interesting to note how many information-based jobs are in the high probability range. It's also worth noting how much of the "services industry", that darling of developed Western economies, falls in the higher ranges of probability.

So, there is a problem here. Current technological trends will very clearly impact employment and society. The IT industry is confident that the net effect will be positive. Many economists seem to be fence-sitting... but that's fairly normal. However, we'll take a closer look in Parts 2 and 3.

For a broader and deeper view of the business and technological aspects of this topic, please take a look at my new book: Business unIntelligence: Insight and Innovation Beyond Analytics and Big Data.

Internet of Things Tue, 04 Feb 2014 12:05:52 -0700
BI for Everyone?

Business unIntelligence emerged from my questioning of the fundamental assumptions underpinning BI in all its forms, from the enterprise data warehouse to big data analytics. The belief I'm questioning in this post is that the target audience of BI is everyone in the business. This springs from the very reasonable premise that BI should be used more widely throughout the organization than it currently is. But, somewhere along the way, I don't know when, the idea emerged that BI must be used by everybody in the enterprise, if we are to gain full business benefit. BI vendors thus lament low penetration of their tools, agonizing over how to make them simpler, more appealing. Data marts, echoing the bright and breezy WalMarts and Kmarts of retail, were introduced in the mid-1990s as an alternative to the dark and dismal data warehouse. Today, self-service BI, self-service analytics, self-service everything will solve world BI hunger.

I'm sorry, for me, that dog don't hunt.

This post was triggered by Paxata's mission statement, which aims to "Empower EVERY PERSON in the enterprise to find and prepare analytical information..." In truth, Paxata is building a very impressive and powerful adaptive data preparation tool. But, their mission statement almost derailed my interest. Really, each and every person in the organization should be finding and preparing analytical information? From the janitor to the CEO? I'll return to Paxata later in this post, but first, let me check if anybody else feels the same level of discomfort with this concept as I do.

Let's start with self-service on a personal level. I'm pretty comfortable with self-service when I visit a computer retailer. Put me in a perfume store, and I need an assistant--quickly. My wife has the opposite experience. I conclude from this (and many other examples) that self-service works only if the self-server has (i) sufficient understanding of what she's trying to do and (ii) considerable knowledge of what is available, its characteristics and where it is in the store. My experience in BI is that business users typically satisfy the former condition but fail regularly on the latter. In my original 1988 data warehousing paper (or contact me if you want a copy), I distinguished between dependent and independent users. It was only the second group who satisfied both conditions above. At the simplest level, we may divide business users into two such groups, as I did back then. In reality, of course, it's more subtle. Some business users understand statistics, many don't. Some are more attuned to using information; others rely (often correctly) on their intuitions. In chapter 9 of Business unIntelligence: Insight and Innovation Beyond Analytics and Big Data I discuss many of the ways in which decisions are influenced by many things other than information. So, it's horses for courses at a personal level: self-service BI only works for some of the people some of the time.

Organizationally, the idea that everyone is involved in decision making that demands analytical support is contrary to all concepts of division of labor and responsibility. The above-mentioned janitor has no incentive to analyze his cleaning performance. There still exists a myriad of tasks in every organization that are menial, performed by rote. Managers of such areas have, since the time of Frederick Winslow Taylor in the early 1900s, been analyzing such work and finding ways to streamline it. While we recognize today that production line workers do have knowledge to contribute to process improvement, such knowledge is tacit, and unlikely, in my view, to be quantified by the workers themselves. Rather, that elucidation depends on another type of skill, that of independent users, power users or, as they like to be called today, data scientists. These are people whose skills and interests bridge the business/IT divide.

Which brings me back to Paxata and the concept of the biz-tech ecosystem, the symbiotic relationship between business and IT demanded by the speed and span of modern business. Finding and preparing data has long been the principal bottleneck of all BI. It is a type of work that really requires a balanced mix of business acumen and IT skill. Its exploratory and ad hoc nature defies the old development approach of business requirements statements thrown over the fence to IT.

What's required is an exploratory environment for diverse data types that seamlessly blends business acumen and IT skills. This is precisely what Paxata does. Based more on the data content than on the metadata (field or column names and types), business analysts explore the actual contents of data sources and their inter-relationships in a highly visual manner that uses color and other cues to direct attention to aspects of interest identified by heuristics within the tool itself. Data can be simply cleansed and transformed, split and joined. The interface is deliberately spreadsheet-like, another comfort zone for the business analyst. But, unlike a spreadsheet, all actions are recorded and tracked; they can be rolled back and they can be repeated elsewhere, vital aspects of the level of data governance needed to make this solution capable of being put into production. It's hard to describe in words; you really need to try it or see the demo. See further descriptions from Joseph A. di Paolantonio and Jon Reed.
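The governance point, that recorded, replayable actions are what separate such a tool from a spreadsheet, can be illustrated with a small sketch. This is purely illustrative Python of my own; the class and method names are invented and bear no relation to Paxata's actual product. Each preparation step is logged, can be rolled back, and can be repeated on a different data set:

```python
# Illustrative only: a data-prep session whose actions are recorded,
# reversible, and replayable (the governance pattern described above).

class PrepSession:
    def __init__(self, rows):
        self.history = [list(rows)]   # snapshot of the data after each step
        self.log = []                 # recorded actions, repeatable elsewhere

    def apply(self, name, fn):
        """Record an action and apply it to every row."""
        self.log.append((name, fn))
        self.history.append([fn(r) for r in self.history[-1]])

    def rollback(self):
        """Undo the most recent recorded action."""
        if len(self.history) > 1:
            self.history.pop()
            self.log.pop()

    def replay(self, rows):
        """Repeat the recorded steps on a different data set."""
        for _, fn in self.log:
            rows = [fn(r) for r in rows]
        return rows

    @property
    def current(self):
        return self.history[-1]

if __name__ == "__main__":
    s = PrepSession([{"city": " Dublin "}, {"city": "CORK"}])
    s.apply("trim", lambda r: {k: v.strip() for k, v in r.items()})
    s.apply("lower", lambda r: {k: v.lower() for k, v in r.items()})
    assert s.current == [{"city": "dublin"}, {"city": "cork"}]
    s.rollback()                      # undo the lower-casing step
    assert s.current == [{"city": "Dublin"}, {"city": "CORK"}]
    # the remaining recorded step can be repeated on other data
    assert s.replay([{"city": "  Athlone "}]) == [{"city": "Athlone"}]
```

A spreadsheet gives you only the final cell values; keeping the action log as a first-class object is what makes the preparation auditable and production-ready.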

It's tools like this that make the biz-tech ecosystem real, that blend business and IT data skills and knowledge for easier application to real business needs. They enable people from the business side of the enterprise to enhance their IT abilities, and vice versa, removing the barriers to data exploration and preparation that stand between the information and full business value. They make it easier for more people to become independent or power users, data scientists, or whatever they choose to call themselves, making their jobs easier, faster and more productive. That is the vital and visionary work that Paxata (and other vendors like them) are doing in this era of exploding data varieties and information volumes.

But I still believe that this role and these tools will never be for everyone. What do you think?

News flash: Business unIntelligence is now available as an ebook on Kindle and on Safari.

Business unIntelligence Thu, 23 Jan 2014 02:29:05 -0700
Predictive Analytics - the Power and the Gory

More years ago than I care to remember, I took a strong interest in predicting behavior. Besotted by a woman who blew hot and cold, I turned to astrology seeking to know the future of the relationship. The prognosis was far from positive... and ultimately correct.

Modern thinking is highly skeptical of such esoteric arts. Correctly so in the case of many practitioners. But it has far less reason, in my opinion, to dismiss much of the underlying rationale and value of these approaches. That, however, is a topic for another forum. My point here is that I did not believe the prediction; I didn't want to. And that's part of a rather gory truth of any predictive tool.

As we flip the calendar to 2014, it is clear that the audience for predicting people's future behavior has grown far beyond the ragged ranks of lonely lovers. Marketers and advertisers, once content with psychological and sociological generalizations, have become increasingly besotted by big data and the promise it offers of knowing the customer so intimately that her individual and specific future behaviors can be predicted. And, probably of more value, influenced. This, of course, is the power and the glory of predictive analytics. But, as already noted, this blog's title actually says "gory". And here, in detail, is why.

While writing in Business unIntelligence: Insight and Innovation Beyond Analytics and Big Data about some of the uses of big data way back in February, 2012, I came across Charles Duhigg's New York Times piece on Target's ability to predict not only the pregnancy of female (obviously) customers but also their likely delivery dates. Over the New Year break, I finally found time to read Duhigg's book, The Power of Habit, in which that story first appeared. My point in my own book, and in a related blog Death by a thousand analytics, was that using big data in this manner is both a clear invasion of privacy and likely to creep people out. But Duhigg's book deals with a much broader topic--human habitual behavior and unconscious thought patterns--that has far deeper implications for predictive analytics and decision support in general.

The message of The Power of Habit is fairly easily summarized. A very significant proportion of our thinking, decision-making and action at personal, organizational and societal levels is entirely habitual. Given a particular cue (which can be anything from a thought to an action), a routine set of behaviors is immediately and automatically engaged, which provides an anticipated reward. Simply put: cue => routine => reward. This mechanism operates, both in its set up and execution, in one of the most primitive areas of the brain, the basal ganglia, a structure that derives in evolutionary terms from the earliest chordates. It is, in fact, a basic survival mechanism that enables us to undertake a wide range of necessary life-preserving activities, from preparing food to getting home, with minimal expenditure of energy and attention. This, of course, initially freed the brain to deal with novel, life-threatening situations and more recently for higher-brain functions such as empathy and reasoning. However, as anyone who has ever tried to give up biting their nails or mid-afternoon snacking can testify, changing habitual patterns can be very difficult indeed, even those which are counterproductive or even downright dangerous.

This very primitive mental operation of cue-routine-reward has two (at least) important consequences for predictive analytics.

For marketing, the drive will be to move increasingly to an understanding of specific, individual habits and their cues, in order to create opportunities to influence behavior. This is more difficult than it sounds, as such cues are often difficult even for the individuals themselves to identify. Similarly, rewards are often far more subtle and diverse than might be imagined. The story of how Pepsodent toothpaste was sold to an American public famed for poor oral hygiene in the early 1900s has become a classic marketing exemplar of how to create a habit and a sales success. But, according to Duhigg, the actual cue and reward involved were long mistaken. Furthermore, although most Western people are by now inured to mass manipulation by advertising, the much more personal and private knowledge used for such individual influencing may raise resistance to such approaches. From the point of view of privacy, I am of the opinion that it most certainly should.

In the case of personal and organizational decision making support, habitual and other unconscious patterns are a serious impediment to efforts to become "data-driven". My unfortunate lovelorn experience of disbelieving a predicted but undesired outcome is a basic behavior pattern well-recognized in psychology but seldom mentioned in business intelligence. Habitual beliefs and behaviors at an organizational level are so deeply embedded and invisible to participants that they stymie efforts even to recognize problems, never mind take effective action.

The bottom line is that no matter how much data you gather or how elaborate the models you generate, Business unIntelligence operates finally, in full glory or gory fullness, in the minds of your customers and business users.

Business unIntelligence Mon, 06 Jan 2014 03:40:50 -0700
Sex, Toys, and Big Data

In this last blog of 2013, I want to make some serious points that should be front and center (he said coyly) for all analytic, BI and big data professionals next year. In fact, they should be already. And if not, I urge you to use some reflection time over the coming holiday to make them so.

Let's start with sex... toys. Consider the value of deep analytic big data monitoring to enforce the following two verified laws in the USA: (i) you may not have more than two dildos in the same house in Arizona and (ii) it's illegal to own more than six dildos in Texas. (I refuse to speculate on why Texans may be less susceptible to moral degeneracy than citizens of Arizona.) I suspect most of us are mildly amused by such legislation. Many have probably long taken comfort in the belief that it is unenforceable; after all, who can imagine the local cops breaking into your home on suspicion that you are hoarding sex toys? But wait. Haven't the authorities in the US and other countries been rummaging through your digital drawers for years now, and without any just cause for suspicion? Hasn't at least one supermarket chain collated and analyzed data to determine if its (female) customers have recently become pregnant? Haven't two technology companies applied for patents to spy on (sorry, gather behavioral data to enable better targeted advertising to) you via your living room TV? How far will government and business go in mining our personal lives, our sexuality, our bodies, our social circles to (allegedly) protect us, cure us or sell us more (unnecessary) stuff?

The sad truth is that we have lost most of our privacy already, having entered into a Faustian pact to share, both knowingly and unwittingly, the details of our daily lives. That knowing part--the Facebook likes and Google +1s--may be said to represent a conscious tradeoff by the person sharing between a loss of privacy and a perceived increase in social capital or useful contextual information. Even the acceptance that our smartphones report our location minute by minute is driven by a consensual belief that we may be offered a coupon for a nearby coffee shop at any moment. The payoff for ultimate traceability. Apple's iBeacon allows newer iPhones, or Android phones with Bluetooth Low Energy, to track their--and your--position in space with centimeter precision. Which aisle in the supermarket are you in? What about some very specific retail therapy recommendations? These, and other soon-to-emerge toys, have the addictive quality of sex to many of the current generation of CMOs and proponents of big analytics.

However, the smartphone is but the pioneer species of the internet of things, the ultimate in small toys for big data boys. As sensors become ever smaller, cheaper and ever more powerful, the utopian vision is of systems that respond instantly to our needs, that anticipate our very expectations. We are promised houses that know we're home and adjust lighting and heating accordingly. Wrist bands that sense when we awake in the morning, so that the coffee can be brewed, or know that our elderly aunt has not moved for the past hour and may have slipped and broken her hip. Software to the rescue, hardware to alleviate yet another chore. According to Computerworld, Gartner predicts that half of all BI implementations will incorporate machine data from the internet of things by 2017. Research conducted by EMA and 9sight during the summer of 2013, suggested that the future is already here; machine-generated data overtook human-sourced information as big data sources among our respondents. And the more details of our mundane activities that become available in the internet of things for analysis and correlation, the more specific and identifiable our individual patterns of behavior become and the more difficult it is to retain any degree of anonymity. IBM's 5 in 5 this year includes "a digital guardian that will protect you online" within 5 years. But it's mostly about security rather than privacy, and already years too late.

Realists may like to consider how useful such tools would have been to the Stasi in the German Democratic Republic (East Germany) or the Soviet Union's KGB in the second half of the last century. Would the world be as we know it if they had? The more dystopian among us conjure up visions of Franz Kafka's The Trial or George Orwell's 1984, only 30 years late in arrival. But, it is not my intention to propose neo-Luddism. The big data jinni is already well out of the bottle. Despite the silly-bugger games we are currently playing with it, big data does hold the possibility to understand and address many of the most intractable environmental, climatic and social issues we face today. Although, as I've mentioned repeatedly in "Business unIntelligence: Insight and Innovation Beyond Analytics and Big Data", our success in acting upon, never mind solving, them will depend far more on our very human intention towards what we desire than on all the data, software and hardware we apply.

So, as we look to 2014, I urge all of us to keep three resolutions in our minds, a subtle matrix underpinning every business need we evaluate and every design decision we make:

  1. Understand and account for the relationship between big data and the traditional core business information long created and processed in existing operational and informational systems; each complements and contextualizes the other

  2. In gathering and analyzing big data, consider how its use can impact personal privacy, especially the ways in which such data can be combined with other big data sets and compromise anonymity

  3. Perhaps most importantly, consider the universal/global impact of the project: how does it contribute to or mitigate the real-world issues of environmental degradation, over-production and consumption, economic instability, and more. In short, how does it support the best in humanity and the world?

These, especially the last, may sound like utopian dreams. But, consider the almost unimaginable power unleashed by the unprecedented growth and interconnection of information currently in process. We are unleashing a potential for good or ill far greater than that created by a small team of physicists in Los Alamos during the Manhattan Project.

I wish you all a Happy Christmas and a Peaceful and Prosperous New Year, within whatever tradition you choose to celebrate this time of year.

Business unIntelligence Wed, 18 Dec 2013 09:15:06 -0700
SOA for Process and Data Integration

Traditionally, BI has been a process-free zone. Decision makers are such free thinkers that suggesting their methods of working can be defined by some stodgy process is generally met with sneers of derision. Or worse. BI vendors and developers have largely acquiesced; the only place you see process mentioned is in data integration, where activity flow diagrams abound to define the steps needed to populate the data warehouse and marts.

I, on the other hand, have long held - since the turn of the millennium, in fact - that all decision making follows a process, albeit a very flexible and adaptive one. The early proof emerges in operational BI (or decision management, as it's also called) where decision making steps are embedded in fairly traditional operational processes. As predictive and operational analytics has become increasingly popular, this intermingling of informational and operational is such that these once distinctly different business behaviors are becoming indistinguishable. A relatively easy thought experiment then leads to the conclusion that all decision making has an underlying process.

I was also fairly sure at an early stage that only a Service Oriented Architecture (SOA) approach could provide the flexible and adaptive activities and workflows required. I further saw that SOA could (and would need to) be a foundation for data integration as the demand for near real-time decision making grew. As a result, I have been discussing all this at seminars and conferences for many years now. But every time I'd mention SOA, the sound of discontent would rumble around the room. Too complex. Tried it and failed. And, more recently, isn't that all old hat now with cloud and mobile?

All of this is by way of introduction to a very interesting briefing I received this week from Pat Pruchnickyj, Director of Product Marketing at Talend, who restored my faith in SOA as an overall approach and in its practical application! Although perhaps best known for the open source ETL (extract, transform and load) and data integration tooling it first introduced in 2006, Talend today takes a broader view, offering data-focused solutions, such as ETL and data quality, as well as open source application integration solutions, such as enterprise service bus (ESB) and message queuing. These various approaches are united by common metadata, typically created and managed through a graphical, workflow-oriented tool, Talend Open Studio.

So, why is this important? If you follow the history of BI, you'll know that many well-established implementations are characterized by complex and often long-running batch processes that gather, consolidate and cleanse data from multiple internal operational sources into a data warehouse and then to marts. This is a model that scales poorly in an era where vast volumes of data are coming from external sources (a substantial part of big data) and analysis is increasingly demanding near real-time data. File-based data integration becomes a challenge in these circumstances. The simplest approach may be to move towards ever smaller files running in micro-batches. However, the ultimate requirement is to enable message-based communication between source and target applications/databases. This requires a fundamental change in thinking for most BI developers. So a starting point of ETL and an end point of messaging, both under a common ETL-like workflow, makes for easier growth. Developers can begin to see that a data transfer/cleansing service is conceptually similar to any business activity also offered as a service. And the possibility of creating workflows combining operational and informational processes emerges naturally to support operational BI.
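To make the point concrete, here is a minimal sketch in Python (my own invented names, nothing to do with Talend's actual tooling) of the idea that a single transform/cleanse step can serve both a file-based batch flow and a message-based flow, the common workflow described above:

```python
# Illustrative sketch: one shared cleanse step, usable from both a
# batch (file-based) ETL path and a message-based path.

import csv
import io
import queue

def cleanse(record: dict) -> dict:
    """Shared transform/cleanse step: trim fields and normalize case."""
    return {k: v.strip().lower() for k, v in record.items()}

def batch_etl(csv_text: str) -> list:
    """Batch path: read a whole file, transform every row, return the load set."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [cleanse(row) for row in reader]

def message_etl(q: "queue.Queue", out: list) -> None:
    """Message path: apply the same transform to each arriving message."""
    while True:
        msg = q.get()
        if msg is None:          # sentinel: end of stream
            break
        out.append(cleanse(msg))

if __name__ == "__main__":
    # The two paths produce identical results from equivalent inputs.
    rows = batch_etl("name,city\n Alice , Dublin \n")
    q, out = queue.Queue(), []
    q.put({"name": " Alice ", "city": " Dublin "})
    q.put(None)
    message_etl(q, out)
    assert rows == out == [{"name": "alice", "city": "dublin"}]
```

Once developers see the transform as a service shared by both paths, shrinking the batch toward micro-batches and finally toward per-message processing becomes a configuration choice rather than a rewrite.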

Is this to say that ETL tools are a dying species? Certainly not. For some types and sizes of data integration, a file-based approach will continue to offer higher performance or more extensive integration and cleansing functionality. The key is to ensure common, shared metadata (or as I prefer to call it, context-setting information, CSI) between all the different flavors of data and application integration.

Process, including both business and IT aspects, is the subject of Chapter 7 of "Business unIntelligence: Insight and Innovation Beyond Analytics and Big Data".

Sunset Over Architecture (SOA) image:
Data integration Thu, 12 Dec 2013 03:46:17 -0700
Squaring the Circle of Human Analytics

The ninth in a series of posts introducing the concepts and messages of my new book, "Business unIntelligence--Insight and Innovation Beyond Analytics and Big Data", now available!

I took a few days off last week and read the first science fiction book in maybe twenty years...and the first computer science fiction I've ever read. The book was briefly recommended by an attendee at my presentation in DW2013 in Zurich a few weeks ago, a thought inspired (I imagine) by some of my comments about various possibilities that social collaboration could offer to innovation in business. The book in question is "The Circle" by Dave Eggers, which explores the logical and apparently inevitable conclusion to current developments in social networking and search in the world at large. Set in a post-Facebook, post-Google world, the Circle is the all-encompassing company that delivers and controls all aspects of social interaction on the Web. And it is moving towards a vision encapsulated by the slogan "Secrets are Lies. Sharing is Caring. Privacy is Theft" crafted by the somewhat clueless heroine, Mae, who goes Transparent (recording everything she sees, hears and says on a live Web feed) but seems totally unaware of the delicious irony of using "bathroom breaks" to conceal some activities strictly unrelated to the toilet.

While Mae's unquestioning embrace of all things social may seem a tad naive to many of us old-timers, there is little doubt that many people and organizations believe implicitly in one or more aspects of what the Circle proposes. One example is the explosion of video surveillance, both overt and covert, that is purported to improve citizens' behavior on the basis that increasing the risk of getting caught decreases our propensity to misbehave. Dan Ariely in "The (Honest) Truth about Dishonesty" begs to differ. According to his research, the Simple Model of Rational Crime, devised by University of Chicago economist and Nobel laureate Gary Becker, which proposes that people commit crimes based on a rational analysis of each situation, is too simplistic. Rather, he believes that our behavior is driven by two conflicting motivations: the first is to see ourselves as honest, honorable people, and the second is to benefit from cheating and gain as much as possible. However, our cognitive flexibility allows us to rationalize that if we cheat by only a little bit, we can benefit from cheating and still view ourselves as honest. The extent of that "little bit" differs from person to person and determines how we actually behave. Personally, I suspect that this model of human motivation is also too simplistic, but, not being a professor at either Duke or MIT, I don't have the resources to test that hypothesis... My fundamental concern is that the erosion of privacy inherent in ongoing surveillance is being ignored in the misguided belief that society is benefiting from improved behavior.
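For the technically minded, the contrast between Becker's rational-crime model and Ariely's two-motivation model can be sketched as a toy calculation. To be clear, the function and the numbers below are my own illustrative inventions, not anything from either researcher:

```python
# Toy sketch of Ariely's "fudge factor" idea: people inflate their claims,
# but only by a personal amount small enough to preserve their self-image
# as honest. All names and numbers here are hypothetical illustrations.

def claimed_score(true_score: int, max_score: int, fudge_factor: int) -> int:
    """Return the score a person reports: inflated, but capped by the
    small 'fudge factor' that still lets them feel honest."""
    return min(true_score + fudge_factor, max_score)

# The Simple Model of Rational Crime would predict claiming max_score
# whenever the detection risk is zero; the fudge-factor model predicts
# only a small, self-justifiable inflation instead.
print(claimed_score(true_score=4, max_score=20, fudge_factor=2))   # → 6, not 20
print(claimed_score(true_score=19, max_score=20, fudge_factor=5))  # → 20 (capped)
```

The point of the sketch is simply that the predicted behavior depends on an internal, per-person parameter rather than on any external calculus of risk and reward, which is why surveillance alone may not deliver the promised improvement.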

In my last post, I discussed similar privacy issues that seem to go unnoticed when advances in technology allow monitoring our behavior while watching TV, in the interest of targeting advertising or improving monetization of pay-per-view programming.

Of course, exactly the same ethical issues arise as we consider the use of informal information within the enterprise to improve insight on what motivates decision makers and increases their ability to innovate. In my book, I define informal information as information generated as part of every process, both in formal activities (project kick-off and plan review meetings, shareholder and board meetings, court proceedings, etc.) and in informal activities (chats at the water cooler, ad hoc meetings, phone calls, conferences attended, and so on), that is increasingly captured and stored digitally. A brief look at the two lists of activities above will show just how many are or could easily be recorded. My belief is that such records can be analyzed to understand how innovative thinking emerges and is often suppressed in teams working together on any creative project. In video-conferenced meetings, for example, recording and analysis of facial micro-expressions could show when, how and by which team members particular ideas are approved or otherwise, and how this affects their proponents. Combine this with spoken and written comments, project success or failure and other information, and we can build up, over time, an understanding of how to construct innovative teams by blending skills and attitudes in the optimal proportions to encourage invention and balance it with the constructive criticism and enthusiastic support needed to convert it to innovation.

As a proponent of decision support software, you can probably see the benefits of all this, particularly if you really want to look beyond traditional BI. The above thinking is the logical conclusion of analytics, using yet further sets of big data. We get to see not just how participants behave but also gain insight into their subconscious reactions and motivations. And if you think this is science fiction, a recent New York Times article, "When Algorithms Grow Accustomed to Your Face", will convince you otherwise. But be careful how your expression may reveal your reaction, especially if a Webcam is pointing at you!

In terms of ethics and privacy, we are now treading on very dangerous ground. Increasingly, the technology is allowing us to record and analyze human motivation and intention. It can be used for good or ill. We must apply some of the innovative thinking that created this technology to figuring out how to control and manage its use.

I will be further exploring the themes and messages of "Business unIntelligence: Insight and Innovation Beyond Analytics and Big Data" over the coming weeks.  You can order the book at the link above; it is now available. Also, check out my presentation at the BrightTALK Business Intelligence and Big Data Analytics Summit, recorded Sept. 11 and "Beyond BI is... Business unIntelligence" recorded Sept. 26. Read my interview with Lindy Ryan in Rediscovering BI.

Montage inspired by ExtremeTech article on Google Glass. 
Business unIntelligence Wed, 04 Dec 2013 02:37:42 -0700
Intention in decision making The eighth in a series of posts introducing the concepts and messages of my new book, "Business unIntelligence--Insight and Innovation Beyond Analytics and Big Data", now available!

Decision making seldom, if ever, occurs in a vacuum. Even in Gravity, all of Sandra Bullock's life-or-death decisions happened in tiny and delicate bubbles of air. All business decisions are likewise made in bubbles. One type of bubble is of the information variety, where the information available limits the view of what's possible in terms of action or outcome. The insidious effect of such bubbles, with particular emphasis on our personal lives, is well described by Eli Pariser in The Filter Bubble: What the Internet Is Hiding from You, Penguin Press (2011). However, my interest here is in another bubble--an intention bubble that hides the broader consequences of decisions from their makers.

As I point out in Business unIntelligence, much of what drives our decisions--whether in business or in personal matters--is far from the fully rational, data-driven approach long espoused and gaining even more traction in this big data era. Modern psychology and neurobiology show us a growing set of internal, physical and mental processes that contribute significantly to all decision making. These include emotional state, energy level, psychological integration, intuition and more. In order of application, perhaps the first of these "non-rational" factors is the intentionality of the decision maker. A recent news story illustrates the danger of the intention bubble, particularly in the hyper-connected, big data world of modern marketing.

A search for LG Smart TV at the moment shows a story of snooping filling most of the first page. The story was first described by developer, tweaker and Linux enthusiast DoctorBeet on his eponymous blog: he was a little curious about the choice of advertisements his new TV was serving up on its landing screen. A packet analysis of the data sent to LG unsurprisingly showed lots of information about channel choices. More concerning was that the names of personal files viewed from a USB device were also transmitted, all unencrypted. A link to an LG site, which has been "currently under maintenance" every time I've checked since, revealed LG's intention: "LG Smart Ad analyses users favourite programs, online behaviour, search keywords and other information to offer relevant ads to target audiences. For example, LG Smart Ad can feature sharp suits to men, or alluring cosmetics and fragrances to women. Furthermore, LG Smart Ad offers useful and various advertising performance reports. That live broadcasting ads cannot. To accurately identify actual advertising effectiveness," according to DoctorBeet. Following the online uproar, I imagine that LG may fall back on "coding error" or some such reason for the scope of data being collected and other excesses. However, few in the advertising industry or among proponents of big data will quibble with the intention. But, perhaps we should...

Some links in TechDirt's coverage of the story led to even creepier behavior-monitoring ideas, this time in patents applied for by Microsoft and Verizon. Microsoft's "consumer detector" could enable "the system [to] charge for the television show or movie based on the number of viewers in the room. Or, if the number of viewers exceeds the limits laid out by a particular content license, the system would halt playback unless additional viewing rights were purchased." Verizon was contemplating analyzing "ambient action" captured by the front camera now on most smart TV devices. According to the patent, "ambient action may include the user eating, exercising, laughing, reading, sleeping, talking, singing, humming, cleaning, playing a musical instrument, performing any other suitable action, and/or engaging in any other physical activity during the presentation of the media content." Should we assume that excluded from such "suitable actions" are those of a sexual nature that immediately spring to mind as often occurring in front of a TV?
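The enforcement logic the Microsoft patent describes is chillingly simple to express. The sketch below is my own hypothetical rendering of the behavior as reported; the names and structures are invented for illustration and come from neither the patent nor any shipping product:

```python
# Hedged sketch of the "consumer detector" behavior described in the
# Microsoft patent: count the viewers the camera detects and halt playback
# when the count exceeds what the content license allows. All identifiers
# here are hypothetical illustrations, not an actual API.

from dataclasses import dataclass

@dataclass
class ContentLicense:
    title: str
    max_viewers: int  # viewing rights purchased for this title

def playback_allowed(license: ContentLicense, detected_viewers: int) -> bool:
    """Per the patent's description: playback continues only while the
    number of detected viewers stays within the licensed limit; otherwise
    it would halt pending purchase of additional rights."""
    return detected_viewers <= license.max_viewers

movie = ContentLicense(title="Some Film", max_viewers=4)
print(playback_allowed(movie, 3))  # True: within the license
print(playback_allowed(movie, 6))  # False: the patent envisages halting here
```

That a handful of lines suffices is rather the point: the hard part is not the business rule but the camera-based people counting, and it is the always-on counting that raises the privacy questions discussed here.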

At issue here is how carefully we must attend to the intention of decision makers about what big data to collect, how it could be used, what could happen if it were combined with other available data, and so on. Further, we need to consider the intention of anybody who has legitimate access to such data. Let's be clear: the temptation for misuse extends far beyond NSA and other spy agency employees. Is anybody considering where a line could or should be drawn on how far businesses can go in behavioral analysis?

It was my intention to address innovation in this article. I decided the above was more topical, and at least equally important. I'll come back to innovative decision making in a later post.

Business unIntelligence Mon, 25 Nov 2013 07:37:17 -0700
NoSQL, NewSQL and NuoDB It seems to me that much of the drive behind NoSQL (whether No SQL or Not Only SQL) arose from a rather narrow view of the relational model and technology by web-oriented developers whose experience was constrained by the strengths and limitations of MySQL. Many of their criticisms of relational databases had actually been overcome by commercial products like DB2, Oracle and Teradata, to varying extents and under certain circumstances, although, of course, open source licensing and commodity hardware pricing also continue to drive uptake.

A similar pattern can be seen with NewSQL in its original definition by Matt Aslett of the 451 Group, back in April 2011. So, when it comes to products clamoring for inclusion in either category, I tend to be somewhat jaundiced. A class defined by what it is not (NoSQL) presents some logical difficulties, and one labeled "new", when today's new is tomorrow's obsolete, is not much better. I prefer to look at products in a more holistic sense. With that in mind, let's get to NuoDB, which announced version 2 in mid-October. With my travel schedule I didn't find time to blog then, but now that I'm back on terra firma in Cape Town, the time has come!

Back in October 2012, I encountered NuoDB prior to their initial launch, and their then positioning as part of the NewSQL wave. I also had a bit of a rant then about the NoSQL/NewSQL nomenclature (although no one listened then either), and commented on the technical innovation in the product, which quite impressed me, saying "NuoDB takes a highly innovative, object-oriented, transaction/messaging-system approach to the underlying database processing, eliminating the concept of a single control process responsible for all aspects of database integrity and organization. [T]he approach is described as elastically scalable - cashing in on the cloud and big data.  It also touts emergent behavior, a concept central to the theory of complex systems. Together with an in-memory model for data storage, NuoDB appears very well positioned to take advantage of the two key technological advances of recent years... extensive memory and multi-core processors."

The concept of emergent behavior (the idea that the database could be anything anybody wanted it to be, with SQL simply as the first model) was interesting technically but challenging in positioning the product. Version 2 is more focused, with a tagline of distributed database and an emphasis on scale-out and geo-distribution within the relational paradigm. This makes more sense in marketing terms, and the use case in a global VoIP support environment shows how the product can be used to reduce latency and improve data consistency. No need to harp on about "NewSQL" then...

Sales aside, the underlying novel technical architecture continues to interest me. A reading of the NuoDB Technical Whitepaper (registration required) revealed some additional gems. One, in particular, resonates with my thinking on the ongoing breakdown of one of the longest-standing postulates of decision support: the belief that operational and informational processes demand separate databases to support them, as discussed in Chapter 5 of my book. While there continue to be valid business reasons to build and maintain a separate store of core, historical information, real-time decision needs also demand the ability to support both operational and informational workloads on the primary data store. NuoDB's Transaction Engine architecture and its use of Multi-Version Concurrency Control (MVCC) together enable good performance for both the read/write transactions and the longer-running read-only operations seen in operational BI applications.
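MVCC itself is a general technique, long predating NuoDB. A minimal sketch (my own generic illustration, emphatically not NuoDB's implementation) shows why append-only versioning lets a long-running analytical read keep a stable snapshot while operational writes proceed:

```python
# Minimal generic sketch of Multi-Version Concurrency Control (MVCC).
# Writers append new versions instead of overwriting in place, so a
# long-running read-only transaction keeps a consistent snapshot while
# concurrent read/write transactions continue. Illustration only.

import itertools

class MVCCStore:
    def __init__(self):
        self._versions = {}                   # key -> list of (txn_id, value)
        self._txn_counter = itertools.count(1)

    def begin(self) -> int:
        """Start a transaction; its id doubles as its snapshot timestamp."""
        return next(self._txn_counter)

    def write(self, txn_id: int, key, value) -> None:
        # Append a new version; never overwrite, so readers are not blocked.
        self._versions.setdefault(key, []).append((txn_id, value))

    def read(self, txn_id: int, key):
        # See only versions written at or before this transaction's snapshot,
        # ignoring later writes: a stable view for long analytical queries.
        visible = [v for t, v in self._versions.get(key, []) if t <= txn_id]
        return visible[-1] if visible else None

store = MVCCStore()
t1 = store.begin()
store.write(t1, "balance", 100)
reporter = store.begin()                 # long-running read-only "BI" query
t3 = store.begin()
store.write(t3, "balance", 250)          # operational update after the snapshot
print(store.read(reporter, "balance"))   # 100: the snapshot is unaffected
print(store.read(t3, "balance"))         # 250: the writer sees its own update
```

The design choice to trade extra storage (multiple versions) for non-blocking reads is precisely what makes mixed operational/informational workloads on a single store plausible.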

I will return to exploring the themes and messages of "Business unIntelligence: Insight and Innovation Beyond Analytics and Big Data" over the coming weeks. You can order the book at the link above; it is now available. Also, check out my presentation at the BrightTALK Business Intelligence and Big Data Analytics Summit, recorded Sept. 11 and Beyond BI is... Business unIntelligence, recorded Sept. 26. Read my interview with Lindy Ryan in Rediscovering BI.

Susurration image: Database Wed, 20 Nov 2013 01:55:27 -0700
Information does not equal insight The seventh in a series of posts introducing the concepts and messages of my new book, "Business unIntelligence--Insight and Innovation Beyond Analytics and Big Data", now available!

It has long been assumed in the BI community that more information is "a good thing" when it comes to making better decisions. Except when there is too much information...when we encounter information overload. Some qualify the thinking by requiring information to be relevant, although that raises questions of how to determine relevance, especially in entirely novel or poorly understood situations. Despite any such reservations, the BI industry generally focuses entirely on information-related issues, from preparation to presentation, and leaves any thinking about the process of how humans make decisions to some unidentified other party.

So, how does insight arise in the human process of decision making? Of course, information does play an important role, but the information we focus on in BI is but a very thin sliver of the actual information that the mind takes into account. Today, neurobiology and psychology tell us that our minds are absorbing information from the earliest days of life, and that such pre-verbal information has a significant and unconscious impact on all of the decisions we make during our lives. Such early information conditions our social behavior and expectations of the world. In a similar, though probably somewhat more conscious, manner, later life experiences create an informational background that colors our thinking in each and every decision we take.

BI purists may argue that such information is unquantifiable and therefore should be discounted. However, its impact can be large and may be inferred from the ongoing behavior of people involved in the decision making process. When we work in teams, we automatically notice this information and adjust accordingly. Joe always brings up such-and-such a problem. Maria regularly took issue with the previous CFO, but supports the current one, even though the policies and practices are unchanged. This informal information could be gathered electronically and mined for patterns even today to create an entirely new view of how irrational most decisions really are.

This leads us to look beyond information as the sole, or even majority, basis for decision making. Rational choice theory has long held sway as the foundation of thinking about business decision making. In recent years, the roles of intuition, gut-feeling, emotional state and intention have slowly been coming to the fore as possible contributors. The BI community has yet to catch up with such thinking. And, indeed, its integration into decision support (to recall that old phrase) requires software and methods that are far from those found in traditional and current BI.

Collaborative and social tools come closest to the foundation of what is required, and I'll address this as we look at innovation in the next article.


Upcoming conference appearances where I can share more are in Los Angeles at SPARK! on Oct. 15-16 and in Rome at a dedicated two-day seminar on October 30-31.
Business unIntelligence Fri, 11 Oct 2013 11:36:32 -0700