Blog: Neil Raden

I hope that you will engage with me with your comments as we explore the future of business intelligence (BI), particularly its expanding role in the actual process of making decisions and running an organization. BI is poised for a great leap forward, but that will leave a lot of people and solutions behind, so expect a bumpy ride. I also expect there will be a flurry of advice and methodologies for moving BI into a more active role, one that will widen the audience as BI meets more needs. But a lot of that advice will be thin and gratuitous, so hold on while we put it under the microscope. You can reach me directly if you prefer.

Copyright 2012

Tue, 01 Dec 2009 20:59:05 -0700

Is Pivot Synonymous with OLAP?

It's pretty clear that the term "OLAP" is losing its mojo. In a series of exchanges on Twitter, it was suggested that pivoting is the same as OLAP. I don't agree. Pivot tables were first introduced by Microsoft in Excel over ten years ago (I'm sure someone else used the term prior to that, probably that guy from IBM who invented Business Intelligence in the fifties) :-). Other spreadsheet vendors have followed suit. Oracle introduced PIVOT to their non-standard brand of SQL, SQL*Plus, too, but more on that in a minute.


Essentially, a pivot in a spreadsheet is little more than a crosstab - transposing columns and rows, with some smarts added in to figure out the unique elements of each identifying "dimension" and performing some rudimentary calculations such as aggregation. It is essentially a linked report, in that changes to the original data are reflected in the pivot table, and this can go both ways.
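To make the crosstab idea concrete, here is a minimal sketch in Python. It is not how Excel or any other product implements pivot tables; the sales records and the `pivot` helper are made up for illustration. It finds the unique members of two "dimensions" and sums a value for each row/column pair.

```python
from collections import defaultdict

# Hypothetical sales records: (region, product, amount).
rows = [
    ("East", "Widgets", 100),
    ("East", "Gadgets", 50),
    ("West", "Widgets", 70),
    ("East", "Widgets", 30),
    ("West", "Gadgets", 20),
]


def pivot(records, row_key, col_key, value_key):
    """Cross-tabulate records: collect unique labels and sum the values."""
    cells = defaultdict(int)
    row_labels, col_labels = set(), set()
    for rec in records:
        r, c, v = rec[row_key], rec[col_key], rec[value_key]
        row_labels.add(r)
        col_labels.add(c)
        cells[(r, c)] += v  # rudimentary calculation: aggregation by sum
    return sorted(row_labels), sorted(col_labels), dict(cells)


# Regions down the side, products across the top.
row_labels, col_labels, cells = pivot(rows, 0, 1, 2)

# "Pivoting" is just swapping which field supplies rows and which columns.
t_rows, t_cols, t_cells = pivot(rows, 1, 0, 2)
```

Swapping the two keys is all the "pivot" amounts to here, which is rather the point: the rearrangement is the easy part, and everything else on the OLAP list is extra.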


OLAP, on the other hand, is quite a bit different. It includes:


Multidimensional model at its core: dimensions, hierarchies and attributes

Pivot: the ability to rearrange the display of same without code or script

Drill: at least to drill into detail and drill back up if not across dimensions

Cross-dimensional calculations


Collapsible browsing

Flexible definitions of time

Sparse matrix

Read & WRITE, though many OLAP tools cannot do the latter

Query generation, no coding
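A rough sketch of two items on that list, a dimension with a hierarchy and drilling up and down it, might look like the following. Everything in it is an assumption for illustration (the fact rows, the `TIME_LEVELS` mapping, the `roll_up` function), and a real OLAP engine does far more, including the sparse storage and cross-dimensional calculations listed above.

```python
from collections import defaultdict
from datetime import date

# Hypothetical fact rows: (sale date, product, units).
facts = [
    (date(2009, 11, 3), "Widgets", 5),
    (date(2009, 11, 17), "Widgets", 7),
    (date(2009, 12, 1), "Widgets", 4),
    (date(2009, 12, 1), "Gadgets", 9),
]

# One hierarchy on the time dimension: each level maps a date to a member.
TIME_LEVELS = {
    "year": lambda d: str(d.year),
    "month": lambda d: f"{d.year}-{d.month:02d}",
    "day": lambda d: d.isoformat(),
}


def roll_up(rows, level):
    """Aggregate units by (time member at the chosen level, product)."""
    totals = defaultdict(int)
    for day, product, units in rows:
        totals[(TIME_LEVELS[level](day), product)] += units
    return dict(totals)


by_year = roll_up(facts, "year")    # drilled all the way up
by_month = roll_up(facts, "month")  # drilled down one level
```

Drilling down is a request for a finer level of the same hierarchy, and drilling up is a request for a coarser one. The user never writes the aggregation, which is the "no coding" part.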



This is not the clear definition of pivot tables or "analytics," but Oracle's SQL pivot fails on almost every criterion, especially on no code and fast. That isn't to say it isn't a useful feature in a database, but it's not OLAP and it isn't really pivot, either, which I assume is an interactive process. SQL is not, as far as I'm concerned, interactive.


OLAP as a term may be a little old-fashioned, but it is not the same as pivot tables. However, pivot tables do seem to be on a path to provide all or most of OLAP's functionality.


Interestingly, the energetic wars of ROLAP versus MOLAP of the '90s may have subsided, but OLAP has clearly settled into a cold peace between the two. The true MOLAPs (Essbase, Microsoft and TM1, for example) have settled in for the long haul, while the ROLAPs of MicroStrategy, SAP Business Objects and Oracle BI have their own adherents and are still viable. Not many people actually use OLAP, perhaps fewer than 10% of the workforce in organizations that own OLAP tools, but it would be reckless to say that OLAP is no longer relevant. If anything, it's ascendant thanks to the acquisition of BI vendors by enterprise software companies.


One last thing to consider. Predictive analytics are the topic du jour, but before an analyst builds a predictive model, he/she spends a lot of time profiling the data. A great deal of this work is done on OLAP tools. In addition, the output from OLAP manipulations is not terribly different from applying models - analyzing data to derive some understanding. And lastly, after predictive models are run, there is a continuing need to examine the results. Fertile ground for OLAP.



Tue, 01 Dec 2009 20:59:05 -0700
When Is a Trend a Trend?

I enjoyed reading Nenshad Bardoliwalla's blog The Top 10 Trends for 2010 in Analytics, Business Intelligence, and Performance Management. I have to agree with most of the points, but I am a little skeptical of a few of them. For instance, 2010 is only a few weeks away. Some things here feel like they are, at best, 3-5 years away. Developments aren't like a tsunami that happens all at once. We may already see evidence for some of these things, but when will they reach florescence? At what point are they a "trend?" For example:


We will witness the emergence of packaged strategy-driven execution applications


Is an emergence a trend? Nevertheless, how different is that from packaged analytical apps? Strategic planning is still an oxymoron in most organizations. Granted, some people may have expressed a strategy, and it's circulated in beautiful PowerPoint slides, but the nitty-gritty of putting numbers to a strategy is still a joke, an agonizing iteration of best guess forecasts combined with mandated goals. I'm not sure how this can connect to anything. So to imply that we are on the cusp of a smooth strategy-to-execution through packaged software products is, at best, a little optimistic. The following quote appeared in Harvard Business Review in an article entitled "Who Needs Budgets". It reinforces the need for fundamental changes to planning and budgeting processes to address business challenges: "So long as the budget dominates business planning, a self motivated workforce is a fantasy, however many cutting edge techniques a company embraces."


Nenshad goes on to cite as an example that Oracle's Fusion technology "clearly portend(s) the increasing fusion of analytic and transactional capability in the context of business processes and this will only increase." There has been a lot of portent in this area for years, but I still can't see that 2010 is the year it will become a trend.



The holy grail of the predictive, real-time enterprise will start to deliver on its promises


Again, is a "start" a trend? I don't think so. Besides, predictive real-time organizations already exist, and have for some time, particularly in financial services and customer service applications. Business rules engines have been around for more than a decade and are usually primed with scored data from predictive models, but this is a niche. It represents a tiny proportion of operational decisions in an enterprise.


There is also danger in predictive models. Suppose a predictive model (PM) indicates that only the top 20% of your customers are profitable and the rest lose money. The "real-time enterprise" might close the accounts of the 80%. Suppose, however, that the PM was unable to understand WHY they were unprofitable, but it turned out to be excessive waste and poor quality caused by you, and customer profitability was incorrectly measured? Quantitative methods are only as good as the data, methodologies employed and skill of the modelers. You'd have to be crazy to run your company on algorithms.


There is a lot of talk about complex event processing (CEP), but keep in mind that the domain of CEP is exceedingly narrow and driven by the discernible and codified rules that drive it. It doesn't run a company, just a few decisions. Likewise, in decision management, which I wrote about in Smart (Enough) Systems with James Taylor, we were extremely careful to point out that decision management as a technique only applied to a very small subset of operational decision types, which is why we called it smart "enough." Though lots of small decisions add up, making some mistakes is acceptable, such as denying credit to an otherwise creditworthy consumer. But in those cases where even a single mistake can have severe consequences, decision automation approaches, whether decision management, CEP or other point solutions, are clearly not acceptable. There are no solutions yet for "sensing and responding" approaches for those kinds of decisions.


This leads me to the next point. I believe that the distinction between exploratory BI (OLAP, reporting, visualization, etc.) and predictive analysis is rather artificial. To the greatest extent, they are used to understand things, not predict them. The predictive process is a very small part of the use of statistics in businesses. At the end of a statistical model is often the same process of BI - discussing the findings and deciding what to do. They just represent different methods. The exception is scoring models, a pretty widely used approach where large volumes of data are scored by a model such as a neural net and programmatic decisions are made without humans, such as mailing lists, next-best-offer, etc. But it's a real stretch to characterize this as a predictive, real-time enterprise.
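For readers who haven't seen one, a scoring model of the kind described above can be sketched in a few lines. This is an illustration only: a simple logistic score stands in for the neural net, and the weights, threshold and customer fields are hypothetical, not fitted to anything.

```python
import math

# Hypothetical coefficients, standing in for a model fitted offline.
WEIGHTS = {"recent_purchases": 0.8, "months_inactive": -0.5}
BIAS = -1.0
THRESHOLD = 0.5  # assumed cut-off for inclusion in the mailing


def score(customer):
    """Return a response propensity between 0 and 1 (logistic function)."""
    z = BIAS + sum(WEIGHTS[name] * customer[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def mailing_list(customers):
    """Programmatic decision: mail everyone at or above the threshold."""
    return [c["id"] for c in customers if score(c) >= THRESHOLD]


customers = [
    {"id": "A", "recent_purchases": 3, "months_inactive": 0},
    {"id": "B", "recent_purchases": 0, "months_inactive": 4},
    {"id": "C", "recent_purchases": 2, "months_inactive": 1},
]

selected = mailing_list(customers)
```

No human looks at any individual decision here. That is useful for high-volume, low-stakes choices like a mailing, and it is also exactly why it is a stretch to call it running an enterprise.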


SaaS / Cloud BI Tools will steal significant revenue from on-premise vendors but also fight for limited oxygen amongst themselves.


This is already a trend. Well, maybe not if you use the word "significant." It is not clear to me that large enterprises are about to adopt SaaS / Cloud BI. Customers of are way ahead in this regard, but only for applications derived from data which is already in the cloud, so to speak. It's also not clear how the SMBs, whatever they are, are going to adopt this. Sure, cost is a major issue, but so is staff, attention, priorities, etc.


Open Source offerings will continue to make in-roads against on-premise offerings.


That's like saying I'll cut some calories from my diet or I'll work more diligently at blogging. How many? How much? You can't deny the claim; the question is, how significant is it?


So, with these small exceptions, I am in agreement with these 10 points.

Tue, 01 Dec 2009 15:05:28 -0700
A Case of Don't Kill the Messenger

I don't want to get in the middle of the argument going on between Stephen Few and Boris Evelson of Forrester, but Steve raised some very important points that deserve airing, independent of this particular skirmish. The large analysis firms serve up boatloads of information (most of it for sale at premium prices), but one might question how useful it is. Steve writes, "Far too often when relying on these services, however, we get advice from people whose range of topics is too broad to manage knowledgeably."


This is a problem with the business models of these firms. Analysts are very highly paid, but in order to produce enough revenue, they have to cover areas too broad and too quickly to master the nuance of each and report on it authentically. Don't blame the analysts themselves. In Boris' case, here is an extremely conscientious and articulate professional, who is very well-informed in his areas of primary focus, but he can't escape the pressure of the modern analyst firm to jump on the next big thing lest they fall behind the competition.


I don't provide "research" on products anymore, but when I did, I only published about those I had field experience with in real-life situations. An analyst does not have this opportunity and in fact, many of them never have, which rendered them, in my opinion, not much more credible than reporters. To make matters worse, it came to my attention that one of these firms said something favorable about one of my clients in a recent "research" report, only a paragraph, and included a note saying the client was free to use the quote for an entire year...for $15,000! Now how in the world can you provide objective analysis when your words are for sale?


Boris makes a suggestion that Few and Forrester "collaborate" as a means to improve the quality of the research. Few's response is predictable, and I would respond the same way. Basically, when you're an "indie" you have only your reputation, whereas a firm like Forrester has an entire infrastructure to advance its business. How could an arrangement like that ever work out?


I also seem to be the only person who agrees with Steve. Many argue that his words are too inflammatory and his tone too blunt or insulting. I don't find it so, but I believe these claims just avoid the real issue: Is he right? If you can't argue the case, attack the means by which it is delivered. A classic rhetorical technique. Lord knows I've been on the receiving end of that myself. But I for one find the large analyst firms' grip on decision-making in IT to be detrimental to our industry because it isn't sound. The business model disrupts it. It keeps the focus on the industry and product features to the detriment of good sound decisions. And the unholy relationship between the firms and the providers would be called racketeering in any other industry. Or at least collusion.


So I'll probably join with Steve in getting a chorus of boos and raspberries over this, but isn't the point of a blog, of social media in general, to raise the level of discourse to meaningful things, not just a recitation of the obvious? If Boris was offended by the way Steve dissected him in public, I don't blame him. I would cringe if someone did the same to me. In that sense, it was unfair, but in reality, Steve was pointing out a larger problem, and that needs airing.

And just in case you aren't sure about the messenger, I was referring to Steve Few.
Mon, 23 Nov 2009 16:22:26 -0700
My Post Mortem of LucidEra

First of all, I haven't spoken to Ken Rudin (CEO) or Darren Cunningham (VP Marketing) yet (and I won't unless and until they decide to discuss it with me), so I can't offer a description of the events that led to the closure of LucidEra. There are some things I do know, though.

First, Ken and Darren are extremely competent and experienced software people. I've known both of them for years and have worked with them. Whatever happened, you can be sure it wasn't a result of poor management. In fact, knowing Ken the way I do, I'm quite sure he was a good steward of his investors' money and didn't buy a $47 million jet to complement his two yachts the way some other BI company CEO recently did.

However, unless I'm mistaken, both of them have deep resumes, but almost, if not entirely, in the enterprise software business, working for companies like Business Objects, Siebel, and Oracle. It was a bold, and perhaps misguided, attempt to aim a startup at the notoriously hard to identify SMB (small-medium business) segment, especially when the principals lacked the deep connections to partners and VARs needed to reach this market. Again, I'm speculating, but this seems to me to be sort of obvious. Selling into the SMB market requires a completely different message, vocabulary and approach. In my consulting practice with software companies, this is a common problem that I also hasten to point out. Even if you know THEM (the SMBs), the real question is, do they know you, and what are you willing to spend to get hooked up with them?

I also wonder if it was too risky to come out with a product that was so narrow. LucidEra was not a BI product, it was a BI application. When you combine the tough economic hurdle they ran into at their most vulnerable time with an application designed to address one small part of a firm's application portfolio and a presumably tough learning curve in the SMB market, the risk (in retrospect) looks too great to bear.

I liked the LucidEra product. My wife was about to add it on to her system for her business. Without any prodding from me, she found the easy start-up, simplified licensing and almost transparent integration with just what the doctor ordered. I suspect someone will pick it up and it will live on. 

Tue, 23 Jun 2009 18:11:59 -0700
Freaky Models = Freaky Conclusions

Freakonomics by Levitt and Dubner, but there are some issues related to the book that should concern us. Everyone cheats, there is a relationship between the Ku Klux Klan and real estate agents, why drug dealers live with their mothers or how abortion lowered crime - these make for entertaining reading. But when it came to the negligible effect of good parenting, my radar went up.
At the end of the book, there is a short section about Two Paths to Harvard. Why is it always Harvard, by the way? Why is that the exemplar for academic achievement? It's as if going to any of the other hundreds of great schools is like being the runner-up bitch at the Westminster Dog Show. But I digress. To demonstrate their thesis that Freakonomics shows that parenting isn't the strongest determining factor in achievement, they compare two boys. One is an African-American boy born in Daytona Beach, Florida, whose mother deserted him when he was two and whose abusive, alcoholic father was finally locked up when he was twelve, leaving him on his own to join a gang and sell drugs. The other was born in an upper-middle class suburb of Chicago to loving, well-educated parents. It turns out they both went to Harvard, the former now a professor there. The boy from Chicago did OK at Harvard, but things eventually went sideways. His name is Ted Kaczynski, a.k.a. The Unabomber.
So what is the point of this? Two observations hardly make for rigorous analysis. Besides, Ted Kaczynski is schizophrenic. It has nothing to do with the variables observed and analyzed. And that's the whole problem with "economic" models - you can never know if you are looking at the right variables or truly understand cause and effect. This is the whole idiotic idea behind Cheerios protecting your heart, an extrapolation of a tentative conclusion of a slew of flawed "medical" (statistical not physical) studies. So if I want to know whether to stock more blue or more red shirts in my store, fine, let's try to understand our market. But if we want to understand the economic impact of wretched people having children, we don't need a model. If we're a government regulating false advertising, we don't need a model to inform us that Cheerios will not protect us from heart disease, especially the implication that that is the ONLY thing we need to protect our heart.
I love statistics; I built statistical models myself at many points in my career. But for heaven's sake, let's not delude ourselves that we can use models for real life or death issues like public health or the sociology of child rearing. Models can give us insight (or mislead us), but every counterintuitive result has to be examined rigorously and not taken at face value.
Freakonomics was interesting, but not very useful.
Wed, 20 May 2009 15:18:21 -0700
Why You Should Be Thinking About Operational BI
There was an excellent webinar on BeyeNetwork today, sponsored by GoldenGate Software and delivered by Claudia Imhoff on this topic (an archived version should be available there by the time this blog is posted). Claudia explained the various options that exist (and the drawbacks of each), such as going directly against operational systems, using ODSs, re-architecting data warehouses to provide fresher data and more pliant data models and, presumably the most attractive solution, using various flavors and types of EII (Enterprise Information Integration) to dip into the data warehouse and the operational systems by creating some abstraction to allow data to appear as if it were in a single location.

Why is Operational Reporting important? One reason is that the rise of business process modeling and execution systems will inevitably make business processes more dynamic and the data warehouse/BI structure will not be able to keep up. Pulling data directly from operational databases (or replicates or logs or even on-the-fly from message queues) avoids the steps of transformation and loading to a data warehouse, eliminating a lot of latency. In addition, data warehouse models are not operationally oriented, so it's conceivable that in the transformation process, a lot of the semantics of the data at the operational level are lost. As a result, data created from Operational BI could not easily be written back to the operational systems.

Another reason is that BI has not been able to cross that "actionable" gap because its architecture is geared toward informing people, not operational systems. An Operational "reporting" system has to do more than just report, it has to react. That reaction can be driven by a human being (picking up the phone and sending the store manager to investigate a problem) or it can be driven by decision services as James Taylor and I described in Smart (Enough) Systems: How to Deliver Competitive Advantage by Automating Hidden Decisions.

Now that it is possible for business people to interactively modify running business processes with process automation tools, they will need help assessing what they've done. There won't be time to study the problem, devise some data models to support investigations and build data integration applets. EII allows for a sort of "canonical" map between the systems and analytical/reporting model so that these constant changes can be handled at a higher level of abstraction, almost instantaneously. By aligning the substantial resources of the BI industry to this effort, I'm pretty confident we'll make a lot of progress quickly.

Fri, 10 Apr 2009 05:55:53 -0700
You Cannot Fix a Broken Company by Measuring How Broken It Is

You can't fix a broken company by measuring how broken it is. You can't even improve a functioning company by measuring where it could do better.

The BI industry has sort of casually sent the message that BI makes companies better. I've seen this in presentations, webinars, seminars, books and blogs from vendors, practitioners and analysts. But the question is, once you expose something that needs attention, what next? As a consultant and implementer of data warehousing and BI for many years, I never really came up with a good answer. I always referred to it as the Jordan River problem: we could take them through the desert and to the east bank of the Jordan, but like Moses, we couldn't get them across. Even when we did everything we set out to do, when approaching management about the next phase of the operation, to help the client start addressing the problems with employee morale, high turnover, inventory snafus, poor customer service, etc., the response was usually something like, "Neil, aren't you the data warehouse guy? Shouldn't we get McKinsey in here to work on that?"

In short, if you're involved with BI, you may have as good or even better insight into what is going on in the company, but you clearly lack the portfolio to do anything about it.

Part of the problem is that we feel successful if we can just get people to USE what we've built, so maybe it's time to stretch a little. In our proposals to clients, we should measure our success by how effectively people use what we built, and develop the programs and consulting expertise to see to it that they do. The first step in doing that may be to address the issue of decision-making in addition to just informing decisions. I'll be generating some original research this year on decision-making, but it's only the beginning. If we can deliver BI with a better understanding of how decisions are made (or avoided), how good/bad decisions are made, we can more or less stay in our patch but offer a much better service.

This practice could include things like a decision audit, workshops for helping people understand how they really are deciding things (versus their mental model which may be incorrect) and, finally, tracking actual decisions and evaluating how well they were made. Now THAT would be Business Intelligence. All of this can happen if we can ride BI into the areas of the company that need attention. If we perpetuate the BI-is-an-IT-project mentality, or that BI is for go-to-guys, we won't have the wormhole that moves us past the McKinsey barrier. But if we do, those decision makers that can change the way the company works will be our clients. 

Anyone want to join me in building this competency?
Decisions Thu, 26 Mar 2009 12:36:22 -0700
Time to Reexamine the "Virtual" Data Warehouse

Bill Inmon posted an article where he discussed the drawbacks of a "virtual data warehouse." Now the idea of a virtual data warehouse has been around for years (I remember IBI positioning their EDASQL technology as a virtual data warehouse fifteen years ago). That was a terrible idea then because it provided only a method to access data, with no proper cleansing, integration or persistent storage of it. It was, frankly, not a data warehouse at all. Using that definition, anyone would agree that a "virtual data warehouse" was not a workable solution.

The problem is that Inmon defines any method to augment the data warehouse with access to source data directly as a virtual data warehouse, and in the same terms as the IBI concept from long ago, citing its shortcomings, casting the virtual data warehouse as not only a bad idea, but one that has been historically bad and discredited. In fact, the term "virtual data warehouse" has been used for years by detractors for any idea that augmented or supplemented the data warehouse architecture, many of which were truly misguided ideas. In today's world, though, it is not only possible, it is necessary to supplement the data warehouse. This "virtual data warehouse" is a new thing, something not possible before.

Inmon's primary argument about virtual data warehousing is that it skips the crucial step of integrating data:

The great appeal of the virtual data warehouse is that you do not have to face the problem of integration. 

This is the central premise of his argument, that a virtual data warehouse skips data integration.  What actually happens in today's version of a virtual data warehouse (and it shouldn't really be called that) is that a central data warehouse endures, with the kind of integrated data that Inmon would approve of. However, since there are so many data sources now, they change more frequently and the volumes are extreme, the batch load data warehouse is burdened with latency and often impractical for 100% of the analytical requirements. Because system architecture has changed so radically over 15 years, it is possible to read logs, queues, perform changed data capture and even go directly into the operational systems without resorting to extraction into staging areas or degrading their performance. The semantics of these source systems are considerably more transparent than they used to be and the speed of processing is so much faster it is possible to cleanse and integrate data on the fly. In those cases where it is not feasible, or where for a myriad of reasons it makes sense to warehouse the data, that process is preserved. So the virtual data warehouse is actually more of a surround strategy than an alternative. 

Why then is the virtual data warehouse such a supremely bad idea? There are actually lots of reasons for the vacuity of virtue manifested by the virtual data warehouse. 

I won't repeat all of his reasons, but they all seem to be related to the old "it will bring the system to its knees" claims about the inefficiency of federated queries and the effect on systems and network performance. I call this reasoning "managing from scarcity," except there is no scarcity anymore. Again, not every query is federated; many will be satisfied by the data warehouse. We can build systems thousands of times larger and millions of times faster than we could two decades ago as data warehousing was catching on and isolation of data and processing was absolutely necessary. Today's applications and business processes call for faster, fresher data than a data warehouse can provide, in many cases. Rather than ignore these requirements, a virtual data warehouse (with adequate semantic rationalization) gives the developers a chance to position the processing to the logical location, and the ability to use abstraction (or indirection if you prefer that word) to address the data, providing the flexibility to change that configuration whenever necessary.
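The indirection I have in mind can be sketched roughly as follows. This is a toy, not any vendor's EII product: the `Catalog` class, the two reader functions and the dataset names are all hypothetical, and the readers return labels where a real system would return data.

```python
# Hypothetical stand-ins for the physical sources.
def read_warehouse(name):
    return {"source": "warehouse", "dataset": name}


def read_operational(name):
    return {"source": "operational", "dataset": name}


class Catalog:
    """Maps logical dataset names to whichever reader currently serves them."""

    def __init__(self):
        self._routes = {}

    def route(self, name, reader):
        # Re-routing a name changes where the work happens, not who asks.
        self._routes[name] = reader

    def fetch(self, name):
        return self._routes[name](name)


catalog = Catalog()
catalog.route("sales_history", read_warehouse)   # integrated, historical
catalog.route("open_orders", read_operational)   # needs to be fresh

before = catalog.fetch("open_orders")

# Later, the configuration changes; callers still ask for "open_orders".
catalog.route("open_orders", read_warehouse)
after = catalog.fetch("open_orders")
```

The analyst's query names a logical dataset and nothing else. Whether it is satisfied by the warehouse or by a live source is a configuration decision that can be revisited whenever volumes, latency needs or source systems change.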

The ugly truth is that the corporate analyst MUST do integration whether he or she wants to or not. There is no getting around it. Wishful thinking just does not cut it here.

Who wouldn't agree with that? The question is, where and when? The bulk of the difficult integration work is manual - people gathering information and getting buy-in that they have the right mapping. Where that process gets implemented is not the issue. There is no rule that says it must all go to a data warehouse. Historically, that was the practice because of the constraint of resource limitations that are now mostly relaxed.

So the problems with the virtual data warehouse are legion. There is poor performance or no performance at all. There is an enormous system overhead. There is only a limited amount of historical data. There is the work of integration that each analyst must do, one way or the other. There is no reconcilability of data. There is no single version of the truth for the corporation.

All of these things are true in the virtual data warehouse the way Inmon defines it, a big mess of federated queries directly to source systems with no integration, no history, no real context. No one would suggest this is a good idea. But a data warehouse today should be a participant in a constellation of source data, so long as the access methods to data beyond the warehouse's control can be relied upon to be correct, timely and not a burden on other systems. This kind of arrangement is not a virtual data warehouse, it's an analytical framework. We should discuss the merits of the idea as such, not dismiss them because we name them with a long-discredited concept. 

Federated Data Warehouse Mon, 23 Mar 2009 14:24:22 -0700