Blog: Steve Dine

Steve Dine

If you're looking for a change from analysts, thought leaders and industry gurus, you've come to the right place. Don't get me wrong, many of the aforementioned are colleagues and friends who provide highly intelligent insight into our industry. However, there is nothing like a view from the trenches. I often find that there is what I like to call the "conference hangover." It is the headache incurred after trying to implement the "best" practices preached to your boss at a recent conference. It is the gap between how business intelligence (BI) projects, programs, architectures and toolsets should work in an ideal world and the realities on the ground. It's that space between relational and dimensional, or ETL and ELT. This blog is dedicated to sharing experiences, insights and ideas from inside BI projects and programs: what works, what doesn't and what could be done better. I welcome your feedback on what you observe and experience, as well as topics that you would like to see covered. If you have a specific question, please email me at sdine@datasourceconsulting.com.

About the author

Steve Dine is President and founder of Datasource Consulting, LLC. He has more than 12 years of hands-on experience delivering and managing successful, highly scalable and maintainable data integration and business intelligence (BI) solutions. Steve is a faculty member at The Data Warehousing Institute (TDWI) and a judge for the Annual TDWI Best Practices Awards. He is the former director of global data warehousing for a major durable medical equipment manufacturer and former BI practice director for an established Denver-based consulting company. Steve earned his bachelor's degree from the University of Vermont and an MBA from the University of Colorado at Boulder.

Editor's Note: More articles and resources are available in Steve's BeyeNETWORK Expert Channel. Be sure to visit today!

For business intelligence (BI) projects to be successful, they must meet user requirements, be designed with strong user involvement, include a manageable scope, provide fast query response times, deliver the required reporting and analytic capabilities and provide value.  Also, as nearly all experienced BI practitioners will attest, for BI programs to be successful, they must be built iteratively.  In the trenches, that means sourcing, integrating, and developing front-end capabilities that target a subset of the enterprise data in each project.  Sometimes the scope of an iteration is limited to a specific organizational process, such as invoicing, and other times by system, such as order-to-cash.  Either way, each subsequent project adds to the scope and volume of data that is integrated and available for analysis.  For initial BI projects and subsequent iterations to be successful, the data architecture must be scalable, flexible, understandable and maintainable.  However, most organizations put more thought and effort into their selection of BI tools than into their data architecture.  The end result is that they implement a data architecture that crumbles under the growing requirements of their users and data.

What is a BI Data Architecture?

A data architecture, as defined by Wikipedia, is the "design of data for use in defining the target state and the subsequent planning needed to achieve the target state".  For BI, this means the design of enterprise data to facilitate the business's ability to efficiently transform data into information.  The two key words in this definition are "facilitate" and "efficiently".  The vast majority of BI projects are initiated because of the inability of the business to efficiently access and consume enterprise data.  Users must navigate unfriendly application databases, manually integrate data from disparate systems, and/or fight to find and gain access to the data they require.  If BI is to be successful, it must deliver at the most basic level...the data.  If it takes three weeks to add five attributes to the data warehouse or twenty minutes to return a simple query, users will go it alone.

To be more specific, the BI data architecture includes:

  • Data store architecture (database (RDBMS), unstructured data store (NoSQL), in-memory, OLAP)
  • The data warehouse process design (e.g., staging, ODS, base and presentation layers; see the sketch after this list)
  • The data warehouse data model (for each process layer)
  • Metadata data model
  • MDM process design
  • MDM data model
  • BI tool semantic layer models
  • OLAP cube models
  • Data virtualization models
  • Data archiving strategy
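
To make the process-design layers concrete, here is a minimal sketch of a staging, base and presentation flow.  It's my own illustration using Python and SQLite, not a reference implementation, and the table and column names are hypothetical.

```python
# A minimal, hypothetical sketch of a layered DW process design:
# source data lands as-is in a staging table, is typed and conformed
# into a base table, and is exposed to users through a presentation view.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Staging layer: land the source extract untouched (all text, no business rules)
cur.execute("CREATE TABLE stg_orders (order_id TEXT, cust_id TEXT, amount TEXT, order_dt TEXT)")
cur.executemany(
    "INSERT INTO stg_orders VALUES (?, ?, ?, ?)",
    [("1001", "C01", "250.00", "2011-09-01"), ("1002", "C02", "75.50", "2011-09-02")],
)

# Base layer: typed, integrated structures that downstream layers build on
cur.execute("""CREATE TABLE base_orders (
    order_key INTEGER PRIMARY KEY,
    order_id  TEXT,
    cust_id   TEXT,
    amount    REAL,
    order_dt  TEXT)""")
cur.execute("""INSERT INTO base_orders (order_id, cust_id, amount, order_dt)
               SELECT order_id, cust_id, CAST(amount AS REAL), order_dt FROM stg_orders""")

# Presentation layer: a business-friendly view the BI semantic layer can sit on
cur.execute("""CREATE VIEW v_sales_by_customer AS
               SELECT cust_id, SUM(amount) AS total_sales, COUNT(*) AS order_count
               FROM base_orders GROUP BY cust_id""")

print(cur.execute("SELECT * FROM v_sales_by_customer ORDER BY cust_id").fetchall())
```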

What are the Signs of a Poor BI Data Architecture?

Sometimes the signs of a poor BI data architecture are obvious before the first project is rolled out; other times they don't appear until further downstream.  These include, but are not limited to:

  • Inability to meet DW load windows (batch, mini-batch, streaming, aggregates, etc.; see the monitoring sketch after this list)
  • Persistent slow query response
  • Non-integrated data (see: http://www.b-eye-network.com/blogs/dine/archives/2010/03/are_you_deliver.php)
  • Takes too long to add new attributes to the data architecture
  • Users don't understand the presentation layers (DW, semantic layers, etc.)
  • Poor data quality
  • Doesn't meet compliance standards
  • Inability to adapt to business rule changes

Why is Designing a Solid BI Data Architecture so Challenging?

The main challenge with data architectures is that there is no one right architecture for every organization.  There are differences in data volumes, data latency requirements, number of source systems, types of data, network capacity, number of users, analytical needs, and many other variables.  We must also design not only for the requirements of the first project, but also for future iterations.  This requires highly experienced data architects who have a background in designing and modeling for BI, are flexible in their design approach, can communicate effectively with the business, and bring strong business acumen and a technical understanding of data and databases.  These resources are usually expensive and can be difficult to find, especially for more specialized verticals.

It also takes time to design and model a scalable, flexible, understandable and maintainable data architecture, and that time must be allocated for in the project, whether waterfall or agile.  The business must be involved to communicate requirements, data definitions and business rules, especially during the design of the logical data model.  Many data architects skip the logical modeling process and proceed directly to creating a physical model based on reverse engineering the source systems and/or their understanding of those systems.  Other times, companies look to pre-packaged solutions (see The Allure of the Silver Bullet) but later find that their architecture is neither scalable nor flexible.

What's the Solution?

This issue reminds me of a home improvement program that aired on TV a few months ago.  The homeowners wanted to add a second story to their house but were concerned because they had noticed cracks in their walls and uneven floors.  After inspecting the foundation, they discovered that it wasn't designed correctly and would crumble under the additional load.  Their only choice was to redesign and rebuild the foundation.  I find that many organizations discover the same issues with their data architectures.  The data architecture is the foundation of the BI program, and very few programs succeed with a poor one.  At some point it will crumble under the load.  For new programs, the solution is to understand the challenge and ensure that you're putting the right amount of focus and the proper resource(s) on the task.  For existing programs, look for the signs mentioned above and assess your existing architecture.  You may still have time to recover and avoid a failed BI initiative.


Posted September 18, 2011 4:37 PM
Permalink | 1 Comment |

The high failure rate of BI projects and programs has been well documented throughout the years.  The reasons why they fail have also been well documented, which leads to the obvious question of "why do they continue to fail at such a high rate?"  Our company is often brought in to assess existing BI programs and provide recommendations to help organizations turn around their BI programs.  In a series of blogs, I'll address five of the most common reasons why we see BI projects and programs fail.  These reasons are in no particular order, as it's usually a combination of issues that plays a part in the failure.

The first reason I'm going to tackle is what I like to call the allure of the silver bullet.  In most organizations, BI projects emerge out of an immediate need to report on and analyze data that is not easily accessed.  The data could reside in a source system with a complex schema, be spread across multiple systems that require integration, be too voluminous to analyze easily, or present one of many other data access challenges.  There may or may not be tools available to facilitate the reporting and analytic process beyond MS Excel.  Whatever the reason, experienced BI practitioners will usually estimate the solution at 3 to 6 months on average if no BI infrastructure exists, or 30 to 90 days if one does.  In the face of this daunting estimate, the first instinct of many decision makers is to search for a less costly solution, in terms of both time and money.

Fortunately for these decision makers, the market has recognized this opportunity, and vendors have sprouted up with pre-packaged reporting and analytic solutions to meet this need.  These solutions are often created for a specific source system, subject area and/or industry vertical.  In some cases they utilize custom interfaces, but more often they leverage existing BI front-end software.  If they provide ETL capabilities, it's usually delivered as pre-developed mappings through an off-the-shelf tool.  The sales team provides a well-polished demo and promises a solution in less than a week.  It sounds pretty convincing.  So, how can these pre-packaged analytic solutions lead to so many failed BI projects and programs?  The concept in itself isn't the issue; in fact, in many cases it can help reduce implementation time and cost, as well as provide value.  The issue is that these solutions aren't a silver bullet and are usually evaluated tactically rather than as part of the framework of the overall BI strategy.

Most companies measure BI project success or failure by whether the projects are delivered on time and on budget.  In fewer cases, they are measured on return on investment (ROI).  The challenge with pre-packaged analytic solutions is that they are sold as a quick-hit, low-cost alternative to the traditional BI project.  What the vendors usually fail to mention is that this is only possible when an organization has not customized its ERP system, has a high level of data quality, has only moderate data volumes, only requires sourcing data from one system and has 100% overlap in reporting requirements.  Unfortunately, this doesn't describe the state of most data environments.  What is more common is that these projects require more time to implement at a cost that often approaches, if not exceeds, that of a traditional solution.  If these factors aren't accounted for in the project timeline, budget and ROI calculation, the projects are seen as having failed.

The challenge that pre-packaged analytic solutions present to the long-term success of BI programs is that their data model is often difficult to scale and, once customized, difficult to upgrade, especially when ETL and semantic models are included.  A BI architecture that is difficult to change, given the constant change in the business and its requirements on BI, is a recipe for failure.  These solutions also often don't track changes to the data or provide a flexible architecture, such as staging layers that facilitate the addition of new source systems.  In some instances, the solution ties organizations to a data integration (DI) tool that doesn't scale, or adds another tool to manage in their environment.  At one customer we assessed, the DI tool used by the pre-packaged analytic solution was the same one they were already using, but a different version, so they had to maintain two versions of the same software.

As I mentioned, the issue isn't with pre-packaged analytic solutions per se; it's that they are often purchased for the wrong reasons and not factored into the organization's long-term BI strategy.  Organizations can derive a great deal of value from these solutions, but don't get lulled into believing that they are a silver bullet.

 


Posted February 16, 2011 3:18 PM
Permalink | No Comments |
As companies look to derive more value from their data, many are looking to add advanced analytics to their BI capabilities.  One challenge that organizations find with taking that next step is finding individuals with the skills necessary to build, run and interpret analytic models.  In the past, tools like SAS, SPSS and R have been limited to a small subset of users in most organizations.  However, there are a number of new software offerings aimed at providing complex analytic capabilities to the masses.  They promote the ability for users to perform assisted advanced statistical analysis, such as market basket analysis, fraud detection and risk modeling, without necessarily having to understand statistical methods such as regression, chi-squared testing, ANOVA and k-means clustering in order to create and run the models.

The question is whether making the tools easier to use will drive a larger population of users and derive more value from the data.  The challenge is that it may lead to greater misuse of statistics and less confidence in the results by the business.  It takes a deep understanding of statistical methods and experience not only to create proper models, but also to interpret the results.  As eloquently described in Wikipedia: "Misuse of statistics can produce subtle, but serious errors in description and interpretation -- subtle in the sense that even experienced professionals make such errors, and serious in the sense that they can lead to devastating decision errors. Even when statistical techniques are correctly applied, the results can be difficult to interpret for those lacking expertise."
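
To make the "subtle misuse" point concrete, here is a small illustration of my own (not from any vendor tool), using Python with scikit-learn: the same k-means model produces very different customer segments depending on whether the analyst knows to standardize the features first, which is exactly the kind of detail an assisted tool can hide.

```python
# A hypothetical example of subtle statistical misuse: k-means on raw,
# unscaled features lets the large-magnitude column (revenue) dominate the
# distance calculation, hiding the real segmentation in visit frequency.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
# Made-up customer data: annual revenue (large scale) and monthly visits (small scale)
revenue = rng.normal(50_000, 15_000, 300)
visits = np.concatenate([rng.normal(2, 0.5, 150), rng.normal(10, 1.5, 150)])
X = np.column_stack([revenue, visits])

# Naive run: clusters are driven almost entirely by noise in revenue
naive = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Informed run: standardize first so both features contribute to the distance
scaled = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
    StandardScaler().fit_transform(X)
)

# The scaled model recovers the two visit-frequency segments; the naive one rarely does
print("naive cluster sizes: ", np.bincount(naive))
print("scaled cluster sizes:", np.bincount(scaled))
```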

I applaud the BI vendors for addressing this market, since I believe that we need organizations to be more analytically driven.  However, I don't believe that it's simply a matter of making the tools easier to use.  My hope is that they add instructive features to their applications with regard to the proper use of statistical methods, how to select the right method for the analysis and how to interpret the results.  I also believe it's necessary to provide and promote analytic training, whether virtual, classroom or both, as part of their offering.  The result would be a larger pool of trained analysts, which is good for both the vendors and their customers.  In addition, organizations looking to achieve more value from their BI investment win as well.

Posted January 31, 2011 5:59 PM
Permalink | No Comments |
Each year I form my opinion based on observations from five sources: our customers, industry conferences, articles, social media and BI software vendors.  2010 proved to be an interesting year on many fronts.  My observations from 2010 are as follows:

1)  Larger numbers of existing BI programs are maturing beyond managed reports, ad-hoc queries, dashboards and OLAP than in years past.  Companies are increasingly looking to squeeze more value out of their BI programs via advanced visualization, predictive analytics, spatial analysis, operational BI and collaboration.  This is driving the rise of analytic databases (ADBs) and new analytic applications and features.

2)  The number of failed BI projects continues to remain high.  While the industry has learned, and documented, the reasons why projects fail, it hasn't done much to stem the tide of failed projects.  From my perspective, and I'm sure most would concur, the overarching reason is that implementing successful BI projects is hard.  It requires a balance of strong business involvement, thorough data analysis, scalable system & data architectures, comprehensive program and data governance, high-quality data, established standards & processes, excellent communication & project management, and experience.  I don't necessarily see things changing in 2011 unless companies:

  • institute and enforce enterprise data management practices
  • ensure high levels of business involvement in BI projects
  • institute measurable, value-driven metrics for each BI project
  • realize that offshoring BI projects, except for simple staging ETL and simple BI reports, doesn't work well

3)  Much of the old is new again.  Master Data Management (MDM), dashboarding and predictive analytics are not new concepts, but they did see a strong reemergence in 2010.  One of the challenges with implementing integrated dashboards and predictive analytics was that these capabilities were often missing from the enterprise BI software suites.  It seems that the major BI vendors finally started listening to their customers, and new capabilities started rolling out.  MDM was simply repackaged to include more than just the data warehouse and found traction in organizations struggling with defining their master data and keeping it consistent across their disparate systems.

4)  BI wants to be agile.  We've always recognized the high cost and long lead times of implementing BI, but it seems that the customer has finally said 'enough' and BI teams are listening.  They are looking for new ways to implement BI and finding that many Agile practices, such as smaller, focused iterations, daily scrum meetings, prototyping and integrated testing, increase the velocity of BI delivery and improve communication between the business and IT.  In 2011, I expect to see BI practitioners better understand which Agile practices work with BI and which ones don't translate as well.

5)  There was increasing interest in BI in the Cloud, but fewer takers than expected.  Infrastructure as a service is still a bit too technical for many BI programs to implement, and concerns over data security remain high.  There are also concerns over performance, both hardware and network.  BI software-as-a-service products are continuing to mature, but the biggest hindrances to user adoption in 2010 appeared to be poor ETL capabilities and non-extensible identity & access management.

I'm always interested in your reactions and views.  Please feel free to post your comments and your take on 2010.

Posted January 10, 2011 11:31 AM
Permalink | 2 Comments |
In the world of software sales, a proof of concept, or POC, is a formal initiative to prove that the software can meet a customer's defined requirements.  The initiative may be as short as one day or as long as a month, but usually falls in the 3- to 5-day time frame.  A POC is usually initiated at the request of the customer with the understanding that if the software meets their requirements, then a purchase is likely.

There are many reasons why a customer will request a POC, but it usually comes down to ensuring that the software will actually deliver on the promises made by the vendor's sales reps and marketing materials.  In most cases, the terms of the POC are dictated by the vendor while the requirements are defined by the customer.

The primary mistake that I often find when evaluating the results of a POC is that the customer defines their requirements based entirely on the issues they are looking for the software to resolve rather than on all the requirements they need the software to deliver.  For example, in a recent POC that I evaluated, the customer was considering the purchase of a DW appliance.  They were having challenges loading their data within their available load window and had a number of queries that were taking more than 30 minutes to return.  They set up a POC with a major industry vendor, who was able to cut the total load time by 20 hours and bring the longest-running query down to under 3 minutes.

By all accounts, the POC was a major success, and the customer purchased the appliance.  When they began migrating their existing ETL jobs, they found that the data loads were extremely slow.  Instead of insisting that a few of their ETL jobs be included in the POC, they let the vendor steer them into using only the bulk loader.  They also discovered that their front-end BI tool issued SQL that wasn't compatible with the database appliance and that their data modeling tool wasn't compatible either.  Those areas weren't considered in the POC.  Many reading this blog entry are probably thinking that this couldn't happen in their environment, but unfortunately it happens more than you'd think.

The challenge with POCs is that vendors know the strengths and weaknesses of their software and are savvy about steering the POC in a direction that demonstrates the strengths and avoids the weaknesses.  However, we can't blame the vendors, since this is simply sales 101.  Customers not only need to dictate both the terms and requirements of the POC, but also dedicate the time and resources necessary to ensure a proper evaluation.  For example, if you can't provide hardware resources that are comparable to your existing or planned environment for the POC, then you may not be proving what you set out to achieve.  Customers also need to include a broad range of requirements, not just the issues they're looking to solve.  They may solve those issues, but introduce many more.
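
One practical way to broaden a POC beyond the headline issues is to run your own representative workload against the candidate platform and record the results.  The sketch below is a hypothetical harness of mine; it assumes the appliance ships a Python DB-API driver, and the connection call and queries are placeholders to be swapped for your own.

```python
# A hypothetical POC harness: time the customer's own representative queries
# (including BI-tool-generated SQL), not just the vendor's showcase queries.
# The SQL strings and the connection call are placeholders for the real POC.
import csv
import time

REPRESENTATIVE_QUERIES = {
    "daily_sales_rollup": "SELECT ...",    # placeholder: typical aggregate query
    "customer_churn_join": "SELECT ...",   # placeholder: large multi-table join
    "bi_tool_drilldown": "SELECT ...",     # placeholder: SQL captured from the BI tool
}

def time_queries(conn, queries):
    """Run each query once and return (name, seconds, row_count) tuples."""
    results = []
    for name, sql in queries.items():
        cur = conn.cursor()
        start = time.perf_counter()
        cur.execute(sql)
        rows = cur.fetchall()
        results.append((name, round(time.perf_counter() - start, 2), len(rows)))
    return results

def write_results(results, path="poc_timings.csv"):
    """Write the timings out so they can be compared against the POC success criteria."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["query", "seconds", "rows"])
        writer.writerows(results)

# During the POC (placeholder connection; any DB-API-compliant driver would work):
# conn = appliance_driver.connect(...)  # hypothetical driver supplied by the vendor
# write_results(time_queries(conn, REPRESENTATIVE_QUERIES))
```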

The next time you look to bring in software for a POC, remember that the 'P' stands for Proof that the software will meet all your expectations, not just the issues you're looking to resolve.
 

Posted October 10, 2010 4:30 PM
Permalink | 2 Comments |