Blog: Steve Dine

Steve Dine

If you're looking for a change from analysts, thought leaders and industry gurus, you've come to the right place. Don't get me wrong: many of them are my colleagues and friends, and they provide highly intelligent insight into our industry. However, there is nothing like a view from the trenches. I often find that there is what I like to call the "conference hangover." It is the headache incurred after trying to implement the "best" practices preached to your boss at a recent conference. It is the gap between how business intelligence (BI) projects, programs, architectures and toolsets should be in an ideal world and the realities on the ground. It's that space between relational and dimensional or ETL and ELT. This blog is dedicated to sharing experiences, insights and ideas from inside BI projects and programs about what works, what doesn't and what could be done better. I welcome your feedback on what you observe and experience, as well as topics that you would like to see covered. If you have a specific question, please email me at sdine@datasourceconsulting.com.

About the author >

Steve Dine is President and founder of Datasource Consulting, LLC. He has more than 12 years of hands-on experience delivering and managing successful, highly scalable and maintainable data integration and business intelligence (BI) solutions. Steve is a faculty member at The Data Warehousing Institute (TDWI) and a judge for the Annual TDWI Best Practices Awards. He is the former director of global data warehousing for a major durable medical equipment manufacturer and former BI practice director for an established Denver-based consulting company. Steve earned his bachelor's degree from the University of Vermont and an MBA from the University of Colorado at Boulder.

Editor's Note: More articles and resources are available in Steve's BeyeNETWORK Expert Channel. Be sure to visit today!

Recently in Functional Category

For Business Intelligence (BI) projects to be successful, they must meet user requirements, be designed with strong user involvement, include a manageable scope, provide fast query response times, deliver the required reporting and analytic capabilities and provide value. Also, as nearly all experienced BI practitioners will attest, for BI programs to be successful, they must be built iteratively. In the trenches, that means sourcing, integrating, and developing front-end capabilities that target a subset of the enterprise data in each project. Sometimes the scope of an iteration is limited to a specific organizational process, such as invoicing, and other times to a system, such as order-to-cash. Either way, each subsequent project adds to the scope and volume of data that is integrated and available for analysis. For initial BI projects and subsequent iterations to be successful, the data architecture must be scalable, flexible, understandable and maintainable. However, most organizations put more thought and effort into their selection of BI tools than into their data architecture. The end result is that they implement a data architecture that crumbles under the growing requirements of their users and data.

What is a BI Data Architecture?

A data architecture, as defined by Wikipedia, is the "design of data for use in defining the target state and the subsequent planning needed to achieve the target state". For BI, this means the design of enterprise data to facilitate the ability of the business to efficiently transform data into information. The two key words in this definition are "facilitate" and "efficiently". The vast majority of BI projects are initiated because of the inability of the business to efficiently access and consume enterprise data. Users must navigate application databases that are not user friendly, manually integrate data from disparate systems, and/or fight to find and gain access to the data they require. If BI is to be successful, it must deliver at the most basic level: the data. If it takes three weeks to add five attributes to the data warehouse or twenty minutes to return a simple query, users will go it alone.

To be more specific, the BI data architecture includes:

  • Data store architecture (database (RDBMS), unstructured data store (NoSQL), in-memory, OLAP)
  • The data warehouse process design (i.e. staging, ODS, base, presentation layers)
  • The data warehouse data model (for each process layer)
  • Metadata data model
  • MDM process design
  • MDM data model
  • BI tool semantic layer models
  • OLAP cube models
  • Data virtualization models
  • Data archiving strategy

What are the Signs of a Poor BI Data Architecture?

Sometimes the signs of a poor BI data architecture are obvious before the first project is rolled out; other times they appear further downstream. The signs include, but are not limited to:

  • Inability to meet DW load windows (batch, mini-batch, streaming, aggregates, etc.)
  • Persistent slow query response
  • Non-integrated data (see: http://www.b-eye-network.com/blogs/dine/archives/2010/03/are_you_deliver.php)
  • Takes too long to add new attributes to the data architecture
  • Users don't understand presentation layers (DW, semantic layers, etc.)
  • Poor data quality
  • Doesn't meet compliance standards
  • Inability to adapt to business rule changes

Why is Designing a Solid BI Data Architecture so Challenging?

The main challenge with data architectures is that there is no one right architecture for every organization. There are differences in data volumes, data latency requirements, number of source systems, types of data, network capacity, number of users, analytical needs, and many other variables. We must also design not only for the requirements of the first project, but also for future iterations. This requires highly experienced data architects who have a background in designing and modeling for BI, are flexible in their design approach, can communicate effectively with the business, and have strong business acumen and a technical understanding of data and databases. These resources are usually expensive and can be difficult to find, especially for more specialized verticals.

It also takes time to design and model a scalable, flexible, understandable and maintainable data architecture, and that time must be allocated in the project plan, whether waterfall or agile. The business must be involved to communicate requirements, data definitions and business rules, especially during the design of the logical data model. Many data architects skip the logical modeling process and proceed directly to creating a physical model based on reverse engineering the source systems and/or their own understanding of the systems. Other times, companies look to pre-packaged solutions (see The Allure of the Silver Bullet) but later find that their architecture is both non-scalable and inflexible.

What's the Solution?

This issue reminds me of a home improvement program that was aired on TV a few months ago.  The people wanted to add a second story on their house but were concerned since they had noticed cracks in their walls and uneven floors.  After inspecting the foundation, they discovered that it wasn't designed correctly and would crumble under the additional load.  Their only choice was to redesign and rebuild their foundation.  I find that many organizations discover the same issues with their data architectures.  The data architecture is the foundation of the BI program, and very few succeed with a poor one.  At some point it will crumble under the load.  For new programs, the solution is to understand the challenge and ensure that you're putting the right amount of focus and proper resource(s) on the task.  For existing programs, look for the signs mentioned above and assess your existing architecture.  You may still have time to recover and avoid a failed BI initiative. 


Posted September 18, 2011 4:37 PM

The high failure rate of BI projects and programs has been well documented throughout the years. The reasons why they fail have also been well documented, which leads to the obvious question: "why do they continue to fail at such a high rate?" Our company is often brought in to assess existing BI programs and provide recommendations to help organizations turn them around. In a series of blogs, I'll address 5 of the most common reasons why we see BI projects and programs fail. These reasons are in no particular order, as it's usually a combination of issues that plays a part in the failure.

The first reason I'm going to tackle is what I like to call the allure of the silver bullet. In most organizations, BI projects emerge out of an immediate need to report and analyze data that is not easily accessed. The data could be in a source system that has a complex schema, be spread across multiple systems and require integration, be too voluminous to easily analyze, or suffer from one of many other data access related issues. There also may, or may not, be tools available to facilitate the reporting and analytic process beyond MS Excel. Whatever the reason, experienced BI practitioners will usually estimate the solution at 3 to 6 months on average if no BI infrastructure exists, or 30 to 90 days if one does. In the face of this daunting estimate, the first instinct of many decision makers is to search for a less costly solution, in terms of both time and money.

Fortunately for these decision makers, the market has recognized this opportunity and sprouted vendors that have developed pre-packaged reporting and analytic solutions to meet this need. They are often created for a specific source system, subject area and/or industry vertical. In some cases, they utilize custom interfaces, but more often they leverage existing BI front-end software. If they provide ETL capabilities, then these are usually delivered as pre-developed mappings through an off-the-shelf tool. The sales team provides a well-polished demo and promises a solution in less than a week. It sounds pretty convincing. So, how can these pre-packaged analytic solutions lead to so many failed BI projects and programs? The concept in itself isn't the issue. In fact, in many cases it can help reduce the implementation time and cost, as well as provide value. The issue is that these solutions aren't a silver bullet and are usually evaluated tactically rather than as part of the framework of the overall BI strategy.

Most companies measure BI project success and failure by whether the projects are delivered on time and on budget. In fewer cases, they are measured on the return on investment (ROI). The challenge with pre-packaged analytic solutions is that they are sold as a quick-hit, low-cost alternative to the traditional BI project. What vendors usually fail to mention is that this is only possible where an organization has not customized its ERP system, has a high level of data quality, has only moderate data volumes, only requires sourcing data from one system and has 100% overlap in reporting requirements. Unfortunately, this doesn't describe the state of most data environments. What is more common is that these projects require more time to implement, at a cost that often approaches, if not exceeds, that of a traditional solution. If these factors aren't accounted for in the project timeline, budget and ROI calculation, the projects are seen as having failed.

The challenge that pre-packaged analytic solutions present to the long-term success of BI programs is that their data model is often difficult to scale and, once customized, difficult to upgrade, especially when ETL and semantic models are included. BI architectures that are difficult to change, given the constant change in the business and in its requirements of BI, are a recipe for failure. These solutions also often don't track changes to the data or provide a flexible architecture, such as staging layers that facilitate the addition of new source systems. In some instances, they tie organizations to a data integration (DI) tool that doesn't scale, or add another tool to manage in their environment. At one customer we assessed, the DI tool used by the pre-packaged analytic solution was the same one they were already using, but a different version, so they had to maintain two versions of the same software.

As I mentioned, the issue isn't with pre-packaged analytic solutions per se. It's that they are often purchased for the wrong reasons and not factored into the organization's long-term BI strategy. Organizations can derive a great deal of value from these solutions, but don't get lulled into believing that they are a silver bullet.

 


Posted February 16, 2011 3:18 PM
In the world of software sales, a proof of concept, or POC, is a formal initiative to prove the viability of the software to meet a customer's defined requirements.  The initiative may be as short as one day or as long as a month, but usually falls in the 3 to 5 day time frame.  A POC is usually initiated at the request of the customer with the understanding that if the software meets their requirements, then a purchase is likely.

There are many reasons why a customer will request a POC, but it usually comes down to ensuring that the software will actually be able to deliver the promises made by the vendor's sales reps and marketing materials.  In most cases, the terms of the POC are dictated by the vendor while the requirements are defined by the customer.

The primary mistake that I often find when evaluating the results of a POC is that the customer defines their requirements based completely on the issues that they are looking for the software to resolve, rather than on all the requirements that they need the software to deliver. For example, in a recent POC that I evaluated, the customer was considering the purchase of a DW appliance. They were having challenges with loading their data within their available load window and had a number of queries that were taking more than 30 minutes to return. They set up a POC with a major industry vendor, which cut the total load time by 20 hours and brought the longest-running query down to under 3 minutes.

By all accounts, the POC was a major success and the customer purchased the appliance. When they began migrating their existing ETL jobs, they found that the data loads were extremely slow. Instead of insisting that a few of their ETL jobs be included in the POC, they had let the vendor steer them into using only the bulk loader. They also discovered that their front-end BI tool issued SQL that wasn't compatible with the database appliance, and that their data modeling tool wasn't compatible either. Those areas weren't considered for the POC. Many reading this blog entry are probably thinking that this couldn't happen in their environment, but unfortunately it happens more than you'd think.

The challenge with POCs is that vendors know the strengths and weaknesses of their software and are savvy at steering the POC in a direction that demonstrates the strengths and avoids the weaknesses. However, we can't blame the vendors, since this is simply sales 101. Customers need not only to dictate both the terms and requirements of the POC, but also to dedicate the time and resources necessary to ensure a proper evaluation. For example, if you can't provide hardware resources for the POC that are comparable to your existing or planned environment, then you may not be proving what you set out to prove. Customers also need to include a broad range of requirements, not just the issues they're looking to solve. You may solve your issues, but introduce many more.

The next time you look to bring in software for a POC, remember that the 'P' stands for Proof that the software will meet all your expectations, not just the issues you're looking to resolve.
 

Posted October 10, 2010 4:30 PM

A colleague of mine recently asked me to chime in on a discussion thread about ETL tools. Like many discussion board threads, it had drifted significantly from the original topic. As a general rule, I tend to spend little time on these boards, not because I don't think that they have value, but because between Twitter, blogs and instant messaging, I can't seem to find much time left for additional online interaction. I read through the thread and was immediately struck by the number of comments that were phrased as facts but were based on outdated information, innuendo or rumor.

Only a few days after my discussion board experience, a client of ours asked me to read an evaluation of ETL tools that they had purchased. I read through the report, and many of its scores didn't seem to match my experience with the tools. I found the likely reason in the fine print. The report publisher had been kind enough to explain that their scores were derived, not from a hands-on evaluation, but from meetings with vendors and consulting partners. I was perplexed, but not surprised. The customer had purchased the report because it was relatively inexpensive compared to other more reputable sources.

I'm often asked how companies end up with shelfware (software purchased but not used). In my experience, the tool selection process, or lack thereof, is often to blame. BI software is usually the second most expensive capital cost of a BI project (often after consulting). A sound selection process consists of:

  * Internal requirements analysis
  * Thorough vendor research
  * Formation of an internal selection team
  * Pre-RFI vendor interviews
  * A request for information (RFI) sent to the vendors
  * Vendor demos (onsite or offsite)
  * A comprehensive scoring spreadsheet (including weightings)
  * Vendor reference calls
  * A proof-of-concept (POC)
  * A formal selection meeting

All vendors should be provided with the same amount of information, and if you're utilizing a consulting company to help with the selection process, you should ask about their vendor partnerships and relationships. If time is constrained, you can limit the number of vendor demos, parallelize many of the activities and leverage some of the vendor-neutral software analysis that is available. Just remember to read the fine print.
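To make the "comprehensive scoring spreadsheet (including weightings)" step in the list above concrete, here is a minimal sketch in Python. The criteria names, weights, vendor names and scores are all hypothetical and invented for illustration; a weighted average is only one reasonable way to combine scores, and your selection team may prefer a different scheme.

```python
def weighted_score(weights, scores):
    """Return the weighted average of one vendor's scores.

    weights: dict mapping criterion -> relative importance (any positive scale)
    scores:  dict mapping criterion -> that vendor's score on the criterion
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # A missing criterion raises KeyError, so no vendor is scored on partial data.
    return sum(weights[c] * scores[c] for c in weights) / total_weight


def rank_vendors(weights, vendor_scores):
    """Return (vendor, weighted score) pairs, best score first."""
    totals = {
        vendor: weighted_score(weights, scores)
        for vendor, scores in vendor_scores.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical criteria and weights, agreed by the selection team up front.
    weights = {"etl_performance": 3, "ease_of_use": 1}
    # Hypothetical scores on a 1-5 scale, gathered from demos and the POC.
    vendor_scores = {
        "Vendor A": {"etl_performance": 4, "ease_of_use": 2},
        "Vendor B": {"etl_performance": 2, "ease_of_use": 5},
    }
    for vendor, score in rank_vendors(weights, vendor_scores):
        print(f"{vendor}: {score:.2f}")
```

Agreeing on the weights before the demos start makes it harder for a polished presentation to shift what the team says it cares about.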


Posted September 30, 2009 11:43 AM

How many times have you heard a statistic such as "42% of respondents rate their BI program as moderately successful" or "more than 50% of all BI projects fail"? When I read these types of statistics I often wonder what's behind these numbers. I'd love to be able to drill down directly to the respondents and ask them how they defined success or failure when answering the question. I recently met with a company that asked if I could come in and help make their BI program more successful. Naturally, my first question was to ask them how they define success. Each person in the room defined success a bit differently, and the meeting turned into a healthy discussion on what constitutes a successful BI program.

As an industry, I believe that one of our big challenges is that we have left the definition of success relatively ambiguous and, in some cases, in the hands of the tool vendors. We have done a good job of identifying the general contributors to success, such as ROI, better access to data, number of users, increased quality of data, cost savings, etc. The problem is that these contributors in themselves don't necessarily determine the success or failure of a BI program. For example, how many BI programs are considered a failure because they were 3 months late delivering the initial data warehouse or have less than 5% active users in their company?

As a starting point for discussion, I am going to make a definitive, and likely controversial, statement of what defines success and throw some concrete numbers into the mix. In my experience, there are two components that determine the success of a BI program: reality and perception. Reality can drive perception, but perception cannot drive reality. From a reality perspective, the success or failure of a BI program can only be measured by its ROI. Why? Because companies are in business to make money and if they don't, they will ultimately fail. A BI program is an investment that a company makes in hopes of achieving a positive return. With that in mind, I contend that a BI program that achieves a negative ROI over time is considered a failure. One that achieves a 0 to 24% ROI is considered a moderate success, and one that achieves a 25% or greater ongoing ROI is considered highly successful. Ongoing ROI is calculated each year and takes into account all increased revenues and decreased costs, as well as capital and operational investments.
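The thresholds above can be sketched as a small Python calculation. This is an illustration under stated assumptions, not a formula from the post: it assumes the common definition of ROI as net gain divided by total investment, and it treats anything from 0% up to (but not including) 25% as a moderate success, since the post does not say how to classify values between 24% and 25%. The function and parameter names are hypothetical.

```python
def ongoing_roi(increased_revenue, decreased_costs,
                capital_investment, operational_investment):
    """Return one year's ROI as a fraction (0.25 means 25%).

    Assumption: ROI = (gain - investment) / investment, where gain is
    increased revenue plus decreased costs, and investment is capital
    plus operational spend for the same year.
    """
    investment = capital_investment + operational_investment
    if investment <= 0:
        raise ValueError("total investment must be positive")
    gain = increased_revenue + decreased_costs
    return (gain - investment) / investment


def classify_bi_program(roi):
    """Map an ROI fraction to the post's three outcome bands."""
    if roi < 0:
        return "failure"
    if roi < 0.25:  # assumption: the 24%-25% gap counts as moderate
        return "moderate success"
    return "highly successful"


if __name__ == "__main__":
    # Hypothetical yearly figures, in thousands of dollars.
    roi = ongoing_roi(increased_revenue=150, decreased_costs=50,
                      capital_investment=100, operational_investment=60)
    print(f"ROI: {roi:.0%} -> {classify_bi_program(roi)}")
```

Because the post describes ongoing ROI as a yearly calculation, the same functions would be run on each year's figures rather than once at project close.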

A quick story from the trenches. A few years back I was the director of data warehousing for a large manufacturing company, and each month the CIO would ask me how many active users the data warehouse had. He believed that the program was failing because less than 40% of our targeted constituents were using the data warehouse. Then our team embarked on a project sponsored by the VP of international sales to identify high-growth customers and create a set of targeted analyses and reports for his field reps. Ultimately, he attributed 3% of that year's growth in international sales to the work of the DW team. Our CIO stopped asking how many active users we had and started asking how much ROI we were generating from the BI program.

Some readers who have attended my Lean BI course might be scratching their heads, thinking that this conflicts with the first principle, "Focus on Customer Value". In my next blog, I'll discuss how customer value drives ROI, why both reality and perception are important to the success of a BI program, and how reality can drive perception. In the meantime, I welcome your feedback.


Posted October 13, 2008 8:37 AM