Blog: Steve Dine

If you're looking for a change from analysts, thought leaders and industry gurus, you've come to the right place.  Don't get me wrong, many of the aforementioned are my colleagues and friends who provide highly intelligent insight into our industry.  However, there is nothing like a view from the trenches.  I often find that there is what I like to call the "conference hangover."  It is the headache incurred after trying to implement the "best" practices preached to your boss at a recent conference.  It is the gap between how business intelligence (BI) projects, programs, architectures and toolsets should work in an ideal world versus the realities on the ground.  It's that space between relational and dimensional, or ETL and ELT.  This blog is dedicated to sharing experiences, insights and ideas from inside BI projects and programs: what works, what doesn't and what could be done better.  I welcome your feedback on what you observe and experience, as well as topics that you would like to see covered.  If you have a specific question, please email me at 

Copyright 2016

Why BI Projects Fail (2 of 5): The Crumbling Foundation

For Business Intelligence (BI) projects to be successful, they must meet user requirements, be designed with strong user involvement, include a manageable scope, provide fast query response times, deliver the required reporting and analytic capabilities and provide value.  Also, as nearly all experienced BI practitioners will attest, for BI programs to be successful, they must be built iteratively.  In the trenches, that means sourcing, integrating and developing front-end capabilities that target a subset of the enterprise data in each project.  Sometimes the scope of an iteration is limited to a specific organizational process, such as invoicing, and other times by system, such as order-to-cash.
Either way, each subsequent project adds to the scope and volume of data that is integrated and available for analysis.  For initial BI projects and subsequent iterations to be successful, the data architecture must be scalable, flexible, understandable and maintainable.  However, most organizations put more thought and effort into their selection of BI tools than into their data architecture.  The end result is that they implement a data architecture that crumbles under the growing requirements of their users and data.

What is a BI Data Architecture?

A data architecture, as defined by Wikipedia, is the "design of data for use in defining the target state and the subsequent planning needed to achieve the target state".  For BI, this means the design of enterprise data to facilitate the ability of the business to efficiently transform data into information.  The two key words in this definition are "facilitate" and "efficiently".  The vast majority of BI projects are initiated because of the inability of the business to efficiently access and consume enterprise data.  They must navigate user-unfriendly application databases, manually integrate data from disparate systems, and/or fight to find and gain access to the data they require.  If BI is to be successful, it must deliver at the most basic level...the data.  If it takes three weeks to add five attributes to the data warehouse or twenty minutes to return a simple query, users will go it alone.

To be more specific, the BI data architecture includes:

  • Data store architecture (database (RDBMS), unstructured data store (NoSQL), in-memory, OLAP)
  • The data warehouse process design (i.e. staging, ODS, base, presentation layers)
  • The data warehouse data model (for each process layer)
  • Metadata data model
  • MDM process design
  • MDM data model
  • BI tool semantic layer models
  • OLAP cube models
  • Data virtualization models
  • Data archiving strategy
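As a rough illustration of the process-layer design above, here is a minimal sketch of data moving from a staging layer through a base layer to a presentation layer.  The table names, columns and the use of SQLite as the engine are purely illustrative assumptions, not part of any specific architecture.

```python
import sqlite3

# Hypothetical example: three process layers (staging -> base -> presentation)
# modeled as tables in one SQLite database. Names are illustrative only.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Staging layer: raw extract, source form preserved (everything as text)
cur.execute("CREATE TABLE stg_orders (order_id TEXT, cust TEXT, amt TEXT)")
cur.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)",
                [("1001", "ACME", "250.00"), ("1002", "ACME", "99.50")])

# Base layer: typed and cleansed
cur.execute("CREATE TABLE base_orders (order_id INTEGER, customer TEXT, amount REAL)")
cur.execute("""INSERT INTO base_orders
               SELECT CAST(order_id AS INTEGER), cust, CAST(amt AS REAL)
               FROM stg_orders""")

# Presentation layer: aggregated for user consumption
cur.execute("""CREATE TABLE pres_customer_sales AS
               SELECT customer, SUM(amount) AS total_sales
               FROM base_orders GROUP BY customer""")

print(cur.execute("SELECT * FROM pres_customer_sales").fetchall())
# -> [('ACME', 349.5)]
```

Each layer has a distinct job: staging isolates the DW from source volatility, the base layer carries the integrated model, and the presentation layer is what users actually query.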

What are the Signs of a Poor BI Data Architecture?

Sometimes the signs of a poor BI data architecture are obvious before the first project is rolled out, other times it's further downstream.  These include, but are not limited to:

  • Inability to meet DW load windows (batch, mini-batch, streaming, aggregates, etc)
  • Persistent slow query response
  • Non-integrated data
  • Takes too long to add new attributes to data architecture
  • Users don't understand presentation layers (DW, semantic layers, etc)
  • Poor data quality
  • Doesn't meet compliance standards
  • Inability to adapt to business rule changes
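Some of these signs can be checked mechanically.  Below is a sketch of a simple query-latency monitor for spotting persistent slow response; the threshold, query list and SQLite connection are made-up examples, not a prescription.

```python
import sqlite3
import time

# Illustrative assumption: anything over 2 seconds counts as "slow".
SLOW_THRESHOLD_SEC = 2.0

def flag_slow_queries(conn, queries):
    """Time each monitored query and return those exceeding the threshold.

    queries: dict of name -> SQL text (hypothetical monitoring set).
    """
    slow = []
    for name, sql in queries.items():
        start = time.perf_counter()
        conn.execute(sql).fetchall()          # run the query to completion
        elapsed = time.perf_counter() - start
        if elapsed > SLOW_THRESHOLD_SEC:
            slow.append((name, round(elapsed, 2)))
    return slow

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
print(flag_slow_queries(conn, {"simple_count": "SELECT COUNT(*) FROM t"}))
# -> []
```

Run on a schedule against the real warehouse, a report like this turns "persistent slow query response" from anecdote into evidence.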

Why is Designing a Solid BI Data Architecture so Challenging?

The main challenge with data architectures is that there is no one right architecture for every organization.  There are differences in data volumes, data latency requirements, number of source systems, types of data, network capacity, number of users, analytical needs, and many other variables.  We must also design not only for the requirements of the first project, but also for future iterations.  This requires highly experienced data architects who have a background in designing and modeling for BI, are flexible in their design approach, can communicate effectively with the business, and possess strong business acumen and a technical understanding of data and databases.  These resources are usually expensive and can be difficult to find, especially in more specialized verticals.

It also takes time to design and model a scalable, flexible, understandable and maintainable data architecture, and that time must be allocated for in the project plan, whether waterfall or agile.  The business must be involved to communicate requirements, data definitions and business rules, especially during the design of the logical data model.  Many data architects skip the logical modeling process and proceed directly to creating a physical model based on reverse engineering the source systems and/or their understanding of those systems.  Other times, companies look to pre-packaged solutions (see The Allure of the Silver Bullet) but later find that their architecture is both non-scalable and inflexible.

What's the Solution?

This issue reminds me of a home improvement program that aired on TV a few months ago.  The homeowners wanted to add a second story to their house but were concerned because they had noticed cracks in their walls and uneven floors.  After inspecting the foundation, they discovered that it wasn't designed correctly and would crumble under the additional load.  Their only choice was to redesign and rebuild the foundation.  I find that many organizations discover the same issues with their data architectures.  The data architecture is the foundation of the BI program, and very few programs succeed with a poor one.  At some point it will crumble under the load.  For new programs, the solution is to understand the challenge and ensure that you're putting the right amount of focus and the proper resource(s) on the task.  For existing programs, look for the signs mentioned above and assess your existing architecture.  You may still have time to recover and avoid a failed BI initiative.

Business Intelligence Sun, 18 Sep 2011 16:37:59 -0700
Why BI Projects Fail (1 of 5): The Allure of the Silver Bullet

The high failure rate of BI projects and programs has been well documented throughout the years.  The reasons why they fail have also been well documented, which leads to the obvious question of "why do they continue to fail at such a high rate?"  Our company is often brought in to assess existing BI programs and provide recommendations to help organizations turn around their BI programs.  In a series of blogs, I'll address five of the most common reasons why we see BI projects and programs fail.  These reasons are in no particular order, as it's usually a combination of issues that plays a part in the failure.

The first reason I'm going to tackle is what I like to call the allure of the silver bullet.  In most organizations, BI projects emerge out of an immediate need to report on and analyze data that is not easily accessed.  The data could reside in a source system with a complex schema, be spread across multiple systems that require integration, be too voluminous to easily analyze, or present one of many other data access challenges.  There also may, or may not, be tools available to facilitate the reporting and analytic process beyond MS Excel.  Whatever the reason, experienced BI practitioners will usually estimate the solution at 3 to 6 months on average if no BI infrastructure exists, or 30 to 90 days if one does.  In the face of this daunting estimate, the first instinct of many decision makers is to search for a less costly solution, in terms of both time and money.

Fortunately for these decision makers, the market has recognized this opportunity and sprouted vendors that have developed pre-packaged reporting and analytic solutions to meet this need.  They are often created for a specific source system, subject area and/or industry vertical.  In some cases, they utilize custom interfaces, but more often they leverage existing BI front-end software.  If they provide ETL capabilities, it's usually delivered as pre-developed mappings through an off-the-shelf tool.  The sales team provides a well-polished demo and promises a solution in less than a week.  It sounds pretty convincing.  So, how can these pre-packaged analytic solutions lead to so many failed BI projects and programs?  The concept in itself isn't the issue; in fact, in many cases it can help reduce implementation time and cost, as well as provide value.  The issue is that these solutions aren't a silver bullet and are usually evaluated tactically rather than as part of the framework of the overall BI strategy.

BI project success and failure is measured by most companies by whether projects are delivered on-time and on-budget.  In fewer cases, they are measured on return on investment (ROI).  The challenge with pre-packaged analytic solutions is that they are sold as a quick-hit, low-cost alternative to the traditional BI project.  What the vendors usually fail to mention is that this is only possible where an organization has not customized its ERP system, has a high level of data quality, has only moderate data volumes, only requires sourcing data from one system and has 100% overlap in reporting requirements.  Unfortunately, this doesn't describe the state of most data environments.  What is more common is that these projects require more time to implement, at a cost that often approaches, if not exceeds, a traditional solution.  If these factors aren't accounted for in the project timeline, budget and ROI calculation, the projects are seen as having failed.
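To see how the ROI calculation can swing when these factors are missed, consider a back-of-the-envelope sketch; all of the dollar figures below are hypothetical, invented purely for illustration.

```python
# A minimal ROI sketch comparing the vendor pitch against the cost after
# customization, integration and data-quality work. Figures are made up.
def roi(benefit, cost):
    """Return ROI as a percentage: (benefit - cost) / cost * 100."""
    return round((benefit - cost) / cost * 100, 1)

# As pitched: quick-hit cost against the expected annual benefit
print(roi(benefit=500_000, cost=200_000))   # -> 150.0

# As delivered: cost approaching that of a traditional build
print(roi(benefit=500_000, cost=450_000))   # -> 11.1
```

The same benefit with a realistic cost turns a headline ROI into a marginal one, which is exactly why the hidden factors belong in the budget up front.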

The challenge that pre-packaged analytic solutions present to the long-term success of BI programs is that their data model is often difficult to scale and, once customized, difficult to upgrade, especially when ETL and semantic models are included.  BI architectures that are difficult to change, given the constant change in business requirements, are a recipe for failure.  These solutions also often don't track changes to the data or provide a flexible architecture, such as staging layers that facilitate the addition of new source systems.  In some instances, they tie organizations to a data integration (DI) tool that doesn't scale, or add another tool to manage in their environment.  At one customer we assessed, the DI tool used by the pre-packaged analytic solution was the same one they were already using, but a different version, so they had to maintain two versions of the same software.

As I mentioned, the issue isn't with pre-packaged analytic solutions per se; it's that they are often purchased for the wrong reasons and not factored into the long-term BI strategy.  Organizations can derive a great deal of value from these solutions, but don't get lulled into believing that they are a silver bullet.


Business Intelligence Wed, 16 Feb 2011 15:18:07 -0700
Advanced Analytics for the Masses: Is it Viable?
The question is whether making the tools easier to use will drive a larger population of users and derive more value from the data.  The challenge is that it may lead to a greater misuse of statistics and less confidence in the results by the business.  It takes a deep understanding of statistical methods and experience to not only create proper models, but also to interpret the results.  As eloquently described in Wikipedia: "Misuse of statistics can produce subtle, but serious errors in description and interpretation -- subtle in the sense that even experienced professionals make such errors, and serious in the sense that they can lead to devastating decision errors. Even when statistical techniques are correctly applied, the results can be difficult to interpret for those lacking expertise."

I applaud the BI vendors for addressing this market, since I believe that we need organizations to be more analytically driven.  However, I don't believe that it's simply a matter of making the tools easier to use.  My hope is that they add instructive features to their applications with regard to the proper use of statistical methods, how to select the right method for the analysis and how to interpret the results.  I also believe it's necessary to provide and promote analytic training, whether virtual, classroom or both, as part of their offering.  The result is that we end up with a larger pool of trained analysts, which is good for both the vendors and their customers.  In addition, organizations looking to achieve more value from their BI investment win as well.
Advanced Analytics Mon, 31 Jan 2011 17:59:38 -0700
2010 Year in Review...From the Trenches
1)  More existing BI programs are maturing beyond managed reports, ad-hoc queries, dashboards and OLAP than in years past.  Companies are increasingly looking to squeeze more value out of their BI programs via advanced visualization, predictive analytics, spatial analysis, operational BI and collaboration.  This is driving the rise of analytic databases (ADBs) and new analytic applications and features.

2)  The number of failed BI projects continues to remain high.  While the industry has learned, and documented, the reasons why projects fail, it hasn't done much to stem the tide of failed projects.  From my perspective, and I'm sure most would concur, the overarching reason is that implementing successful BI projects is hard.  It requires a balance of strong business involvement, thorough data analysis, scalable system & data architectures, comprehensive program and data governance, high quality data, established standards & processes, excellent communication & project management, and experience.  I don't necessarily see things changing in 2011 unless companies:

  • institute and enforce enterprise data management practices
  • ensure high levels of business involvement for BI projects
  • institute measurable, value driven, metrics for each BI project
  • realize that offshoring, except for simple staging ETL and simple BI reports, doesn't work well for BI

3)  Much of the old is new again.  Master Data Management (MDM), dashboarding and predictive analytics are not new concepts, but they did see a strong reemergence in 2010.  One of the challenges with implementing integrated dashboards and predictive analytics was that they were often missing from the enterprise BI software suites.  It seems that the major BI vendors finally started listening to their customers, and new capabilities started rolling out.  MDM was simply repackaged to include more than just the data warehouse and found traction in organizations struggling to define their master data and keep it consistent across their disparate systems.

4)  BI wants to be agile.  We've always recognized the high cost and long lead times for implementing BI, but it seems that the customer has finally said 'enough' and BI teams are listening.  They are looking for new ways to implement BI and finding that many Agile practices, such as smaller, focused iterations, daily scrum meetings, prototyping and integrated testing, help speed up the velocity of BI and augment communication between the business and IT.  In 2011, I expect to see BI practitioners better understand which Agile practices work with BI and which ones don't translate as well.

5)  There was increasing interest in BI in the Cloud, but fewer takers than expected.  Infrastructure as a service is still a bit too technical for many BI programs to implement, and concerns over data security remain high.  There are also concerns over performance, both hardware and network.  BI software-as-a-service products are continuing to mature, but the biggest hindrance to user adoption in 2010 appeared to be poor ETL capabilities and non-extensible identity & access management.

I'm always interested in your reactions and views.  Please feel free to post your comments and your take on 2010.

Advanced Analytics Mon, 10 Jan 2011 11:31:20 -0700
Defining the Proof in a Proof-of-Concept
There are many reasons why a customer will request a POC, but it usually comes down to ensuring that the software will actually be able to deliver the promises made by the vendor's sales reps and marketing materials.  In most cases, the terms of the POC are dictated by the vendor while the requirements are defined by the customer.

The primary mistake that I often find when evaluating the results of a POC is that the customer defines their requirements based entirely on the issues they are looking for the software to resolve, rather than on all the requirements they need the software to deliver.  For example, in a recent POC that I evaluated, the customer was considering the purchase of a DW appliance.  They were having challenges loading their data within their available load window and had a number of queries that were taking more than 30 minutes to return.  They set up a POC with a major industry vendor, which was able to cut the total load time by 20 hours and bring the longest running query down to under 3 minutes.

By all accounts, the POC was a major success, and the customer purchased the appliance.  When they began migrating their existing ETL jobs, they found that the data loads were extremely slow.  Instead of insisting that a few of their ETL jobs be included in the POC, they let the vendor steer them into using only the bulk loader.  They also discovered that their front-end BI tool issued SQL that wasn't compatible with the database appliance, and that their data modeling tool wasn't compatible either.  Those areas weren't considered in the POC.  Many reading this blog entry are probably thinking that this couldn't happen in their environment, but unfortunately it happens more than you'd think.

The challenge with POCs is that vendors know the strengths and weaknesses of their software and are savvy at steering the POC in a direction that demonstrates the strengths and avoids the weaknesses.  However, we can't blame the vendors, since this is simply sales 101.  Customers not only need to dictate both the terms and the requirements of the POC, but also dedicate the time and resources necessary to ensure a proper evaluation.  For example, if you can't provide hardware resources that are comparable to your existing or planned environment for the POC, then you may not be proving what you set out to achieve.  Customers also need to include a broad range of requirements, not just the issues they're looking to solve.  You may solve your issues, but introduce many more.
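One way to keep a POC honest about the full range of requirements is a simple weighted scorecard.  The sketch below is hypothetical: the requirement names, weights and observed scores are invented for illustration, not drawn from the POC described above.

```python
# Hypothetical POC scorecard: weight every requirement, not just the pain
# points that triggered the evaluation. Weights and scores are illustrative.
def score_poc(results, weights):
    """Weighted average of observed scores (0-10) across all requirements."""
    total_weight = sum(weights.values())
    weighted = sum(results[r] * w for r, w in weights.items())
    return round(weighted / total_weight, 2)

weights = {"load_window": 5, "query_speed": 5, "etl_tool_compat": 4,
           "bi_tool_sql_compat": 4, "modeling_tool_compat": 2}

# Observed scores: the original pain points look great, the overlooked
# compatibility requirements drag the overall result down.
results = {"load_window": 9, "query_speed": 10, "etl_tool_compat": 3,
           "bi_tool_sql_compat": 4, "modeling_tool_compat": 2}

print(score_poc(results, weights))
# -> 6.35
```

A POC that only scored the load window and query speed would have looked like a 9.5; scoring all the requirements tells a very different story.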

The next time you look to bring in software for a POC, remember that the 'P' stands for Proof that the software will meet all your expectations, not just the issues you're looking to resolve.
Business Intelligence Sun, 10 Oct 2010 16:30:41 -0700
Will Performance Continue to be a Top Concern for Moving BI to the Cloud?
It won't be long before purpose-built Clouds start to hit the market, designed specifically to meet the high compute and I/O demands of BI.  Kognitio is already providing a SaaS-based solution for hosted data warehouses and front-end BI applications.  From my standpoint, it won't be long before performance drops off the list of top BI in the Cloud concerns and BI practitioners wonder why they ever implemented and managed onsite BI architectures.

Cloud Tue, 03 Aug 2010 17:38:51 -0700
Thoughts from TDWI Chicago 2010
This year, the topics that seemed to generate the most buzz were:

  • Agile BI
  • Data Visualization
  • Statistical Analytics
I spoke with a number of attendees that had mature BI programs and were interested in new ways to add value.  They had conquered reporting, OLAP analysis, events & notifications, dashboards and scorecards and felt as though their programs weren't keeping up with the current needs of their users.  Their users were becoming more sophisticated and looking for new ways to analyze their data.  They were also looking for ways to increase the velocity of their project iterations as users have started to look to software-as-a-service (SaaS) to address data and functionality gaps. 

The attendees that I spoke with that are new to BI were concerned primarily with the well publicized rate of failed BI projects.  Some worked for companies that had past failures with BI and were interested in how Agile project management might help.  They were also interested in technologies that might facilitate Agile projects.

Only time will tell whether Agile BI, data visualization and advanced analytics will become this year's major trends in BI.  However, it's exciting to see strong interest in areas that have the potential to change the way we deliver BI and the value of what we deliver.  What's interesting is that these aren't new concepts in the world of project management and data analysis, it's just that we may have matured to a point where they become front and center in our focus.

I'm interested in what you think will be the major trends in BI in 2010.  You can leave your comments below or feel free to email me at 
Data Warehousing Sun, 16 May 2010 16:08:13 -0700
Are you Delivering Consolidated Data Rather than Integrated Data?

In the never-ending quest to determine exactly why so many BI projects and programs still fail, regardless of everything we've learned over the years, another trend has hit my top 10 list.  It appears that in the quest to reduce cost and time on BI projects, there are more and more cases where data is being consolidated rather than integrated.  What's the difference?  Data consolidation is the process of bringing together entities and attributes into a data warehouse, each retaining the form, value and technical characteristics they have in the source.  Data integration is the process of combining entities and attributes such that they have common form, meaning and technical characteristics.  (See graphic)
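A toy example may make the distinction concrete.  In the sketch below, the two source status codes and the cross-reference mapping are invented for illustration: consolidation lands both rows as-is, while integration conforms them to one common value.

```python
# Illustrative contrast between consolidation and integration.
# The codes and the cross-reference mapping are made-up examples.
source_a = [{"cust": "Acme", "status": "A"}]    # 'A'  = active in system A
source_b = [{"cust": "Acme", "status": "01"}]   # '01' = active in system B

# Consolidation: rows land in the warehouse together but keep their
# source-specific form -- users must figure out that 'A' and '01' match.
consolidated = source_a + source_b

# Integration: values are conformed to a single common meaning via a
# cross-reference, so every row speaks the same language.
status_xref = {"A": "ACTIVE", "01": "ACTIVE"}
integrated = [{**row, "status": status_xref[row["status"]]}
              for row in consolidated]

print({row["status"] for row in integrated})
# -> {'ACTIVE'}
```

In a real warehouse the cross-reference work happens in the ETL layer and can also preserve the original source codes alongside the conformed value, but the principle is the same.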


Why does data consolidation present a challenge in the data warehouse and/or data mart?  It reduces the ability of users to understand and analyze the data.  Users are left to try to determine which attributes and values mean the same thing, and often attach different meanings than their colleagues do.  It adds time to every analysis and often reduces the value of the result.  I usually find that companies try to compensate by building the complex integration rules into the semantic layers of the front-end tools.  Aside from the additional time required to create the model, this can significantly slow down response time, especially if the rules are being processed by the BI tool instead of the database.

Why do companies end up with data consolidation rather than data integration?  In my experience, there are a number of reasons.  One cause is an inexperienced DW data modeler.  Data modelers often have more experience in the OLTP world than the DW world.  They reverse engineer the sources and try to preserve all the attributes that meet reporting requirements.  Another reason is that it takes more time to integrate data.  It requires data profiling, data analysis and interaction with subject matter experts.  There can also be political factors involved in reaching common meaning.  Integrating data is also simply difficult.  Fortunately, we've learned techniques over the years for handling different situations, like preserving source values while still providing integrated data.  Lastly, a DW may start out with only a single source, and the data model is not created to accommodate integrated data.

If your BI program isn't delivering the value that was promised, maybe the issue is with what you're delivering.

You can also follow me on Twitter @steve_dine

Business Intelligence Tue, 30 Mar 2010 08:46:47 -0700
Before you Purchase that Data Warehouse Appliance...

Chances are that if you have data that's taking too long to load, monster queries that are taking too long to return and/or plans to add complex analytics to the mix, you are considering purchasing a data warehouse appliance.  DW appliances leverage integrated hardware with a high-speed backplane, an embedded operating system, scalable storage and analytic database technology to deliver a compelling price-to-performance solution.  For existing BI programs, a proof-of-concept is usually requested so the vendor can prove that the DW appliance will solve their biggest pain points.  It's difficult not to get excited when that 3 hour query returns in 3 minutes.  However, before you sign that contract, you may also want to ensure that the DW appliance:

  1. Is certified to work with your existing, or planned, front-end BI tools.  Your BI tools may not generate SQL that is recognized by the appliance, be able to leverage the built-in database functions or support the same data types.
  2. Has native drivers that work with your existing, or planned, ETL tools.  Many appliances require the use of their bulk loaders to move large volumes of data in and out of the database and are drastically slower with ETL tools.  (note: a fast driver may not be necessary if the appliance supports push-down optimization, aka ELT)
  3. Supports correlated subqueries.  While appliance vendors will correctly argue that correlated subqueries are inherently slow, due to the fact that they require nested loops, chances are that your BI tool will produce them.
  4. Is compatible with your data modeling tool.  Many DW appliances are built upon open source databases, such as Postgres.  Make sure your data modeling tool can reverse engineer and produce the DDL that these databases can consume.
  5. Automatically handles the addition of new nodes.  Since most appliances utilize a massively parallel processing (MPP) architecture, data is distributed across the nodes to ensure balanced processing.  However, this can cause challenges when adding new nodes and often requires the unloading and reloading of the data.
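To illustrate item 3, here is the kind of correlated subquery a BI tool might generate, run against SQLite as a stand-in engine; the schema and data are invented for the example.  The inner query is re-evaluated for each outer row, which is the nested-loop behavior the appliance vendors warn about.

```python
import sqlite3

# Illustrative sketch: a per-row comparison against a group average,
# expressed as a correlated subquery. Schema and data are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("East", 100.0), ("East", 300.0), ("West", 50.0)])

# The subquery references s.region from the outer query, so it must be
# re-run for every outer row -- fine on 3 rows, painful on 3 billion.
rows = conn.execute("""
    SELECT region, amount
    FROM sales s
    WHERE amount > (SELECT AVG(amount) FROM sales WHERE region = s.region)
""").fetchall()
print(rows)
# -> [('East', 300.0)]
```

If your BI tool emits SQL shaped like this, the appliance needs to either support it outright or rewrite it into a join; a POC that never exercises such queries proves nothing about them.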

The best approach is to include these items, or those that are critical to your requirements, in the proof-of-concept.  While you can't mitigate all your risks, adding these items to your list of requirements will help to ensure a more successful implementation for you and your appliance vendor. 

Vendors Sun, 21 Mar 2010 21:03:35 -0700
Is Collaboration the Key to Pervasive BI?

I was recently invited to the corporate offices of Lyzasoft for an inside look at their new 2.0 release.  Aside from the impressive advances in charting and visualizations, I was especially blown away by the collaboration features added to their shared portal, called Lyza Commons.  They provide a web environment that allows authorized users to search, bookmark, combine, create, tag, share, comment, rate, relate, and interact directly with intelligence content in the form of blogs, microblogs, charts, tables, dashboards, and collections.  More simply put, they enable the development and growth of analytic communities.

The enterprise BI vendors have essentially ignored adding the collaborative capabilities that have permeated nearly every organization via wikis, blogs, forums, instant messaging and podcasts.  Why is collaboration in BI so compelling?  Up to this point, reporting and analysis has essentially been a single-threaded activity.  We've been unable to share our findings in a way that allows others to contribute their knowledge and insight to the overall analysis.  If we begin to treat the development of a report or analysis as a starting point rather than an end point, we have the ability to actually share information through interaction instead of delivery.  At the very least, it brings BI to the masses through interfaces they are already used to using.

Is collaboration the next big thing in BI?  I usually don't make predictions but I do think that the way we deliver BI will change drastically in the next 12 to 24 months.  Instead of concentrating on dumbing down BI through flash based dashboards and bursted reports, we'll actually start to focus on the intelligence in BI through intuitive interfaces and shared insights. 

Sun, 28 Feb 2010 20:44:17 -0700
Hands-on SaaS BI Review: Bime

I recently decided to check out the SaaS BI offering called Bime.  According to their website, they are focused on delivering a simple-to-use business intelligence product on top of the latest groundbreaking innovations in data visualization and cloud computing.  There are now in excess of 25 vendors offering SaaS BI solutions, from dashboards to predictive analytics.  They offer the prospect of a BI solution that can be faster to implement, externally managed and lower in upfront cost than traditional enterprise BI suites.

Getting started was surprisingly easy.  I created an account by providing a login (my email address), a password and a self-chosen domain.  They offer a free account that limits the user to 2 connections, 2 dashboards and 10MB of storage or an enterprise account that allows 20 connections, 20 dashboards and 10GB of storage.  For testing purposes, the free account was fine.  Once my account was created, I was off and running.  They have short video tutorials that make it easy to get started. 

Like most front-end BI applications, development is a three-step process: create a connection to a data source, define a semantic layer and create reports/analyses.  However, Bime makes it easy by creating an initial, dimensional semantic layer based on the structure of your data.  You can source data from a relational database, an MS Excel file, a Google spreadsheet, Amazon SimpleDB, XML, Lighthouse, Salesforce or Composite.  However, to create a connection to a relational database or Excel spreadsheet, the desktop version of Bime is required, which is an Adobe AIR application that runs the same code as the web version.  You can then synchronize it to push it to the web-based version.

When the data is uploaded, you have the choice of loading it into Deja Vu, a distributed cache stored in the Amazon cloud.  The cache can be updated on a scheduled basis and ensures that each user has access to the same data.  It also improves performance by not requiring the data to be uploaded each time you access the dashboard or analysis.  Once the data is uploaded, the response is very quick.

The strengths of Bime are its low barrier to entry, ease of use, advanced visualizations, multi-tenant caching mechanism, web 2.0 interface, collaboration features and low cost.  They make it easy for business users to upload data, perform an analysis, save it to a dashboard and share it with others.  The data is secure and you can choose who has access.  They also integrate Google Maps into their application to create interactive geospatial charts, a feature that should appeal to business users.

The weaknesses of Bime are its lack of maturity and its 10GB data limit.  It lacks ETL capabilities, which can make uploading complex data a challenge.  To upload data from a relational database, you either need to select a table or a view, or write custom SQL.  Once the connection is defined, the attributes in the semantic layer are ordered alphanumerically and cannot be custom ordered.  Also, if you make any changes to your database, the schema is blown away and any changes to the semantic layer are lost.  In addition, I ran into a number of bugs in the application that made it difficult to create meaningful analyses.  For example, the custom SQL feature didn't seem to work, and aliases do not import correctly.  I also received error messages that the data source had changed and the schema needed to be recreated.

My overall opinion of Bime is good.  It makes it easy to create meaningful analyses and dashboards at a low cost.  It is visually appealing, and the collaboration features are a step in the right direction.  It is a great option for the mid-market but will likely require time to mature before appealing to larger companies.  Features like LDAP integration, ETL capabilities, group-based security, data lineage, web service APIs and drill-through will likely be required before large companies jump on board.

Wed, 20 Jan 2010 13:47:09 -0700
Addressing the Cloud Critics

In recent months, I've read numerous blog postings, tweets, articles and interviews from BI pundits denouncing the viability of BI in the cloud.  They point to data volumes, security concerns, lack of control, network reliability and bandwidth as factors that deem the concept unfathomable.  While each of these issues can be a showstopper for a particular BI project, they are not roadblocks for all BI projects.  What's most baffling to me is that I've reached out to many of these critics, and I've yet to find one who has any actual experience working in the cloud.

As a proponent of BI in the cloud, and one who has implemented a BI project in the cloud, I can say that it is fathomable.  I've been pleasantly surprised by the speed, reliability and flexibility that the cloud provides.  While not every BI project is suitable for the cloud, a large number of projects qualify.  For those projects where the cloud isn't a viable alternative for the entire architecture, one can leverage it for specific uses, such as development environments, software testing and upgrades, redundancy and extra CPU capacity.

If you're interested in more information about the cloud, I would recommend going to Amazon Web Services, Rackspace or GoGrid, signing up for an account and launching your first instance.

If you're interested in information specific to BI in the cloud, feel free to send me an email and I'll forward you a recent presentation.

Sat, 09 Jan 2010 13:35:51 -0700
Are we Any Closer to Pervasive BI?

One of the topics that still seems to garner a great deal of attention is "pervasive BI."  Just like Henry Ford's vision of a car in every driveway, the BI vendors envision a license for every employee.  Don't get me wrong, I'm at the front of the parade when it comes to increasing the BI footprint in today's organizations.  However, I think we still have a long way to go, in large part because the more things change, the more they stay the same.  For example, we learned a long time ago that data without context is just data.  Yet it seems that most of today's BI applications lack all but the most rudimentary metadata capabilities.  It is a challenge to import business and technical metadata, maintain it and expose it to users.  How will BI become pervasive if we can't enable users to serve themselves?

From my perspective, another roadblock to pervasive BI is the restrictive licensing models of today's BI applications.  I'm not advocating that vendors give away their software for free, but if we want to make BI pervasive, then we need to be able to embed it in our processes, intranets, extranets, web services and everywhere else we display and share information.  In most organizations, we are forced to choose who is worthy of a BI software license, since the cost per user is based on software access, not usage.  The open source vendors are the furthest along in easing the licensing restrictions on their components, recognizing that the more pervasive BI becomes, the larger the ultimate market for their software.  Hopefully the cloud will also play a role in changing software licensing models.

These are only two of many reasons why I don't think that we are any closer to pervasive BI than we were 10 years ago.  I am working on an article that will discuss this topic in more detail.  In the meantime, please feel free to provide comments from the trenches on what you see as being the greatest challenges to pervasive BI.


Sun, 13 Dec 2009 17:08:23 -0700
Is the Cloud a Viable Option for BI?

A few weeks ago I presented a full-day course on BI in the cloud at the TDWI conference in Orlando.  I was pleasantly surprised by the number of attendees and the overall interest in the cloud from both vendors and BI practitioners.  Having spent the better part of the past year working in the Amazon cloud, I had taken for granted the general confusion that still exists over how the cloud is defined and how it relates to BI.  I believe this is due in large part to the fact that the cloud is still taking shape, especially when it comes to BI.  As an industry, we look at the expectation of large data volumes, a seemingly endless desire for faster response times and concerns over data security, and wonder how BI infrastructures could possibly exist in the cloud.  Given that we are at the early adopter phase of the adoption lifecycle, these concerns are expected.  However, these concerns may not be reality for as many BI programs as some might think.

I recently met with a data warehouse developer who raised the same concerns and questioned whether BI in the cloud would ever be viable.  I asked about the data volumes in their program and the type of data they were integrating.  They were warehousing third-party prescription data, had approximately 100GB of data and indexes across their architecture, and were growing by less than 20GB per year.  While there may be some compliance issues to consider, based on my experience they would likely be a great candidate for moving some or all of their architecture to the cloud.

While not all BI programs are candidates for implementing Software-as-a-Service (SaaS) or Infrastructure-as-a-Service (IaaS) for their BI architectures, it is a viable option for many.  However, cloud vendors will need to do a better job of educating the public about their physical infrastructures, service level agreements, pricing, standards and security before more BI programs will consider the cloud a viable option.

Tue, 17 Nov 2009 21:51:04 -0700
Tips to Avoid Purchasing Shelfware

A colleague of mine recently asked me to chime in on a discussion thread about ETL tools.  Like many discussion board threads, it had drifted significantly from the original topic.  As a general rule, I tend to spend little time on these boards, not because I don't think they have value, but because between Twitter, blogs and instant messaging, I can't seem to find much time for additional online interaction.  I read through the thread and was immediately struck by the number of comments that were phrased as facts but were based on outdated information, innuendo or rumor.

Only a few days after my discussion board experience, a client of ours asked me to read an evaluation of ETL tools they had purchased.  I read through the report, and many of the scores didn't seem to match my experience with the tools.  I found the likely reason in the fine print.  The publisher had been kind enough to explain that the scores were derived not from a hands-on evaluation but from meetings with vendors and consulting partners.  I was perplexed, but not surprised.  The customer had purchased the report because it was relatively inexpensive compared to other, more reputable sources.

I'm often asked how companies end up with shelfware (software purchased but not used).  In my experience, the tool selection process, or lack thereof, is often to blame.  BI software is usually the second most expensive capital cost of a BI project (often after consulting), so it pays to utilize a selection process that consists of:

  * Internal requirements analysis
  * Thorough vendor research
  * Formation of an internal selection team
  * Pre-RFI vendor interviews
  * A request for information (RFI) sent to the vendors
  * Vendor demos (onsite or offsite)
  * A comprehensive scoring spreadsheet (including weightings)
  * Vendor reference calls
  * A proof-of-concept (POC)
  * A formal selection meeting

All vendors should be provided with the same information, and if you're using a consulting company to help with the selection process, you should ask about their vendor partnerships and relationships.  If time is constrained, you can limit the number of vendor demos, parallelize many of the activities and leverage some of the vendor-neutral software analysis that is available.  Just remember to read the fine print.
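The comprehensive scoring spreadsheet in the list above reduces to simple arithmetic: multiply each criterion's score by its weight and sum. A minimal sketch, where the criteria, weights and vendor scores are all invented for illustration:

```python
def weighted_score(weights, scores):
    """Combine per-criterion scores (0-10) into one weighted total.
    Weights express how much each criterion matters and must sum to 1."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(weights[c] * scores[c] for c in weights)

# Hypothetical weightings agreed on by the internal selection team.
weights = {"connectivity": 0.30, "usability": 0.25, "cost": 0.25, "support": 0.20}

# Hypothetical scores gathered from demos, reference calls and the POC.
vendors = {
    "Vendor A": {"connectivity": 8, "usability": 6, "cost": 5, "support": 7},
    "Vendor B": {"connectivity": 6, "usability": 9, "cost": 8, "support": 6},
}

# Rank vendors by total weighted score, highest first.
ranking = sorted(vendors, key=lambda v: weighted_score(weights, vendors[v]),
                 reverse=True)
```

The value of the weightings is that they force the selection team to argue about priorities before the demos, rather than letting the most polished demo set them afterward.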

Wed, 30 Sep 2009 11:43:51 -0700