Blog: Steve Dine

Steve Dine

If you're looking for a change from analysts, thought leaders and industry gurus, you've come to the right place. Don't get me wrong: many of them are colleagues and friends who provide highly intelligent insight into our industry. However, there is nothing like a view from the trenches. I often encounter what I like to call the "conference hangover." It is the headache incurred after trying to implement the "best" practices preached to your boss at a recent conference. It is the gap between how business intelligence (BI) projects, programs, architectures and toolsets should be in an ideal world and the realities on the ground. It's that space between relational and dimensional, or ETL and ELT. This blog is dedicated to sharing experiences, insights and ideas from inside BI projects and programs about what works, what doesn't and what could be done better. I welcome your feedback on what you observe and experience, as well as topics that you would like to see covered. If you have a specific question, please email me at sdine@datasourceconsulting.com.

About the author

Steve Dine is President and founder of Datasource Consulting, LLC. He has more than 12 years of hands-on experience delivering and managing successful, highly scalable and maintainable data integration and business intelligence (BI) solutions. Steve is a faculty member at The Data Warehousing Institute (TDWI) and a judge for the Annual TDWI Best Practices Awards. He is the former director of global data warehousing for a major durable medical equipment manufacturer and former BI practice director for an established Denver-based consulting company. Steve earned his bachelor's degree from the University of Vermont and an MBA from the University of Colorado at Boulder.

Editor's Note: More articles and resources are available in Steve's BeyeNETWORK Expert Channel. Be sure to visit today!

Recently in Vendors Category

As companies look to derive more value from their data, many are looking to add advanced analytics to their BI capabilities. One challenge in taking that next step is finding individuals with the skills necessary to build, run and interpret analytic models. In the past, tools like SAS, SPSS and R have been limited to a small subset of users in most organizations. However, a number of new software offerings aim to provide complex analytic capabilities to the masses. They promise assisted advanced statistical analysis, such as market basket analysis, fraud detection and risk modeling, without requiring users to understand the underlying statistical methods, such as regression, chi-squared testing, ANOVA and k-means clustering, in order to create and run the models.

The question is whether making the tools easier to use will attract a larger population of users and derive more value from the data. The risk is that it may lead to greater misuse of statistics and less confidence in the results by the business. It takes a deep understanding of statistical methods, and experience, not only to create proper models but also to interpret the results. As Wikipedia eloquently describes it: "Misuse of statistics can produce subtle, but serious errors in description and interpretation -- subtle in the sense that even experienced professionals make such errors, and serious in the sense that they can lead to devastating decision errors. Even when statistical techniques are correctly applied, the results can be difficult to interpret for those lacking expertise."
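To make the interpretation risk concrete, here is a minimal sketch of one common pitfall: running many comparisons and treating every "significant" result as real. It is my own illustration, not something taken from any vendor's tool. The group sizes, the 1.96 cutoff (roughly a 5% significance level) and the function names are all assumptions chosen for the example. Both groups are drawn from the same distribution, so every flagged difference is a false alarm.

```python
import math
import random


def mean_and_variance(values):
    """Return the sample mean and the unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, variance


def looks_significant(group_a, group_b, z_cutoff=1.96):
    """Two-sample z-test on the difference in means (about a 5% level)."""
    mean_a, var_a = mean_and_variance(group_a)
    mean_b, var_b = mean_and_variance(group_b)
    std_err = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return abs((mean_a - mean_b) / std_err) > z_cutoff


def count_false_positives(n_tests=500, group_size=100, seed=42):
    """Count 'significant' results when there is no real effect at all."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_tests):
        # Hypothetical data: both groups share the same mean and spread.
        group_a = [rng.gauss(100.0, 15.0) for _ in range(group_size)]
        group_b = [rng.gauss(100.0, 15.0) for _ in range(group_size)]
        if looks_significant(group_a, group_b):
            hits += 1
    return hits


if __name__ == "__main__":
    hits = count_false_positives()
    print(f"{hits} of 500 comparisons looked significant with no real effect")
```

A user who doesn't know to expect a steady trickle of false alarms at a 5% threshold could easily report several of these as findings, which is exactly the kind of error an easy-to-use tool won't prevent on its own.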

I applaud the BI vendors for addressing this market since I believe that we need organizations to be more analytically driven. However, I don't believe that it's simply a matter of making the tools easier to use. My hope is that they add instructive features to their applications covering the proper use of statistical methods, how to select the right method for the analysis and how to interpret the results. I also believe it's necessary to provide and promote analytic training, either virtual, classroom or both, as part of their offering. The result would be a larger pool of trained analysts, which is good for both the vendors and their customers. In addition, organizations looking to achieve more value from their BI investment win as well.

Posted January 31, 2011 5:59 PM

In the world of software sales, a proof of concept, or POC, is a formal initiative to prove the viability of the software to meet a customer's defined requirements. The initiative may be as short as one day or as long as a month, but usually falls in the 3-to-5-day range. A POC is usually initiated at the request of the customer with the understanding that if the software meets their requirements, then a purchase is likely.

There are many reasons why a customer will request a POC, but it usually comes down to ensuring that the software will actually be able to deliver the promises made by the vendor's sales reps and marketing materials.  In most cases, the terms of the POC are dictated by the vendor while the requirements are defined by the customer.

The primary mistake that I find when evaluating the results of a POC is that the customer defines their requirements based entirely on the issues that they want the software to resolve, rather than on all the requirements that they need the software to deliver. For example, in a recent POC that I evaluated, the customer was considering the purchase of a DW appliance. They were having trouble loading their data within their available load window and had a number of queries that were taking more than 30 minutes to return. They set up a POC with a major industry vendor and were able to cut the total load time by 20 hours and bring the longest-running query down to under 3 minutes.

By all accounts, the POC was a major success and the customer purchased the appliance. When they began migrating their existing ETL jobs, however, they found that the data loads were extremely slow. Instead of insisting that a few of their ETL jobs be included in the POC, they had let the vendor steer them into using only the bulk loader. They also discovered that their front-end BI tool issued SQL that wasn't compatible with the database appliance, and that their data modeling tool wasn't compatible either. Those areas weren't considered for the POC. Many reading this blog entry are probably thinking that this couldn't happen in their environment, but unfortunately it happens more than you'd think.

The challenge with POCs is that the vendors know the strengths and weaknesses of their software and are savvy at steering the POC in a direction that demonstrates the strengths and avoids the weaknesses. However, we can't blame the vendors since this is simply sales 101. Customers need not only to dictate both the terms and the requirements of the POC, but also to dedicate the time and resources necessary to ensure a proper evaluation. For example, if you can't provide hardware resources for the POC that are comparable to your existing or planned environment, then you may not be proving what you set out to prove. You also need to include a broad range of requirements, not just the issues that you're looking to solve. You may solve your issues, but introduce many more.

The next time you look to bring in software for a POC, remember that the 'P' stands for Proof that the software will meet all your expectations, not just the issues you're looking to resolve.
 

Posted October 10, 2010 4:30 PM

When you mention BI in the Cloud amongst database vendors and BI practitioners, chances are you'll land on the topic of performance within the first minute or two of the conversation. Until recently, there was little to debate, as constrained image sizes, slow storage and shared networks limited the performance of Hardware-as-a-Service (HaaS) based solutions. Recently, however, Amazon announced the availability of Cluster Compute Instances for Amazon Elastic Compute Cloud (EC2). Cluster Compute Instances provide more CPU than any other Amazon EC2 instance. Customers can also group Cluster Compute Instances into clusters, allowing applications to get the low-latency network performance required for tightly coupled, node-to-node communication (typical of many BI architectures). Cluster Compute Instances also provide significantly increased network throughput, making them well suited for customer applications that need to perform network-intensive operations.

It won't be long before purpose-built clouds start to hit the market, designed specifically to meet the high compute and I/O demands of BI. Kognitio is already providing a SaaS-based solution for hosted data warehouses and front-end BI applications. From my standpoint, performance will soon drop off the list of top BI in the Cloud concerns, and BI practitioners will wonder why they ever implemented and managed onsite BI architectures.

Posted August 3, 2010 5:38 PM

Chances are that if you have data that's taking too long to load, monster queries that are taking too long to return, or plans to add complex analytics to the mix, you are considering purchasing a data warehouse appliance. DW appliances combine integrated hardware with a high-speed backplane, an embedded operating system, scalable storage and analytic database technology to deliver a compelling price-to-performance solution. For existing BI programs, a proof-of-concept is usually requested to have the vendor prove that their DW appliance will solve the customer's biggest pain points. It's difficult not to get excited when that 3-hour query returns in 3 minutes. However, before you sign that contract, you may also want to ensure that the DW appliance:

  1. Is certified to work with your existing, or planned, front-end BI tools.  Your BI tools may not generate SQL that is recognized by the appliance, be able to leverage the built-in database functions or support the same data types.
  2. Has native drivers that work with your existing, or planned, ETL tools.  Many appliances require the use of their bulk loaders to move large volumes of data in and out of the database and are drastically slower with ETL tools.  (Note: a fast driver may not be necessary if the appliance supports push-down optimization, aka ELT.)
  3. Supports correlated subqueries.  While appliance vendors will correctly argue that correlated subqueries are inherently slow because they require nested loops, chances are that your BI tool will produce them.
  4. Is compatible with your data modeling tool.  Many DW appliances are built upon open source databases, such as Postgres.  Make sure your data modeling tool can reverse engineer these databases and produce DDL that they can consume.
  5. Automatically handles the addition of new nodes.  Since most appliances utilize a massively parallel processing (MPP) architecture, data is distributed across the nodes to ensure balanced processing.  However, this can cause challenges when adding new nodes and often requires unloading and reloading the data.
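To see why adding a node (item 5) can force a reload, consider a simplified sketch of hash distribution. This is only an assumption about how rows might be placed; real appliances use their own distribution schemes, and the function names here are hypothetical. If each row's node is chosen as a hash of its key modulo the node count, then changing the node count changes the answer for most rows.

```python
import hashlib


def node_for(key, node_count):
    """Pick a node by hashing the row key and taking the remainder."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % node_count


def fraction_relocated(keys, old_nodes, new_nodes):
    """Fraction of keys that land on a different node after resizing."""
    keys = list(keys)
    moved = sum(
        1 for key in keys
        if node_for(key, old_nodes) != node_for(key, new_nodes)
    )
    return moved / len(keys)


if __name__ == "__main__":
    # Hypothetical table of 100,000 rows keyed by a surrogate integer.
    share = fraction_relocated(range(100_000), old_nodes=4, new_nodes=5)
    print(f"{share:.0%} of rows change nodes when growing from 4 to 5 nodes")
```

Under this naive scheme, roughly four out of five rows would have to move when going from 4 to 5 nodes, which is effectively an unload and reload. It's worth asking the vendor how their appliance redistributes data and how long it takes at your volumes.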

The best approach is to include these items, or those that are critical to your requirements, in the proof-of-concept.  While you can't mitigate all your risks, adding these items to your list of requirements will help to ensure a more successful implementation for you and your appliance vendor. 
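As one example of turning the correlated subquery check into a POC test, the sketch below runs a correlated subquery alongside an equivalent join rewrite and compares the results. It uses Python's built-in sqlite3 module purely as a stand-in; in a real POC you would point the same statements at the appliance through its own driver. The table, columns and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, rep TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('East', 'Ann', 500), ('East', 'Bob', 300),
        ('West', 'Cy', 200), ('West', 'Dee', 400), ('West', 'Eve', 150);
""")

# Correlated form, the kind a BI tool may generate: the inner query
# refers to the outer row, so it is logically evaluated once per row.
correlated = """
    SELECT s.rep FROM sales s
    WHERE s.amount > (SELECT AVG(i.amount) FROM sales i
                      WHERE i.region = s.region)
    ORDER BY s.rep
"""

# Join form: the regional averages are computed once and then joined.
joined = """
    SELECT s.rep FROM sales s
    JOIN (SELECT region, AVG(amount) AS avg_amount
          FROM sales GROUP BY region) a
      ON a.region = s.region
    WHERE s.amount > a.avg_amount
    ORDER BY s.rep
"""

correlated_rows = conn.execute(correlated).fetchall()
joined_rows = conn.execute(joined).fetchall()
print("correlated:", correlated_rows)
print("joined:    ", joined_rows)
```

If the appliance rejects the first statement, or runs it far slower than the second, you have learned something the vendor's bulk-load demo would never have shown you.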


Posted March 21, 2010 9:03 PM

I recently led a BI tool selection project for one of our customers in the small and mid-sized business (SMB) market. A business with fewer than 100 employees is generally considered small, while one with 100-999 employees is considered mid-sized. (By this definition, our customer would be mid-sized.) Having recently heard a number of the enterprise BI vendors state that they are now targeting this market, I figured that they would be prepared for the different sales process that these customers require. Unfortunately, in most cases I didn't find this to be true.

SMB customers tend to differ from larger, "enterprise" customers in a number of ways. First, they generally have fewer IT employees to support the organization, and if a BI competency center exists, it may consist of only one or two people. Second, the user base is obviously smaller, along with the budget for new software. Third, the projects tend to be shorter, including the tool selection process, so the sales cycle is generally faster. At least one of these areas seemed to cause problems for every vendor on our list, while a few of the vendors seemed unprepared for all of these differences.

The selection team's first challenge was in getting some of the sales reps to participate. While a few of the vendors had specific sales reps assigned to the SMB market, my guess is that they primarily focus on the upper end of the market. A sales rep for one of the top enterprise BI vendors, who I won't mention by name, declined to participate unless we agreed to purchase their ETL tool along with their front-end BI tool set. In one case, we discovered that the vendor doesn't sell directly to SMB customers but instead relies entirely on consulting partners for this area of the market. This works well unless there is no opportunity for the partner to sell consulting. Three of the sales reps wouldn't respond to the RFI until they were promised multiple fact-gathering meetings. While most sales reps would argue that this is sales 101, unfortunately a faster sales cycle means fewer chances to meet prior to completion of the RFI. Maybe that is taught in sales 102.

Another area the selection team found challenging was pricing. Most of the vendors claim to have special pricing for SMB customers, but only with regard to the discount. As far as the selection team could tell, the list prices were the same; the only difference was the size of the discount that the sales rep could offer. This made it difficult to determine whether a tool set was within the customer's budget until well into the process. The other challenge we found was the cost of non-production licenses (development and test environments). In one case, the non-production licenses would have cost more than the production licenses. This was because the non-production licenses were priced on an assumption of 100 users, with no ability to vary it.
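The arithmetic behind that last surprise is easy to sketch. The numbers below are entirely hypothetical (the actual prices and user counts from the selection are not given here); the point is only that a fixed 100-seat assumption can outweigh even a heavily discounted per-user price.

```python
def license_cost(users, price_per_user, minimum_users=0):
    """Cost when the vendor bills for at least `minimum_users` seats."""
    billable_users = max(users, minimum_users)
    return billable_users * price_per_user


# All figures are hypothetical, chosen only to illustrate the effect.
PRODUCTION_PRICE = 1_000      # assumed per-user price for production
NON_PRODUCTION_PRICE = 500    # assumed 50% discount for dev and test

production = license_cost(users=40, price_per_user=PRODUCTION_PRICE)
non_production = license_cost(
    users=5,                  # only a handful of developers and testers
    price_per_user=NON_PRODUCTION_PRICE,
    minimum_users=100,        # the fixed 100-user pricing assumption
)

print(f"production:     {production:,}")
print(f"non-production: {non_production:,}")
```

With these assumed figures, 40 production users cost less than the development and test environments used by five people, simply because the seat count can't be adjusted.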

I realize that not all sales reps are created equal and in some cases, it's the actions of the rep that reflect poorly on the vendor as a whole. However, for the enterprise BI vendors to be successful in this market, they will need a different approach to selling and an updated compensation model for their sales force. It is not enough to simply focus on pricing; they must also focus on the sales process. Handing the opportunity off to an inside sales team or a consulting partner isn't sufficient. It only signals to SMB customers how they will likely be treated once they buy the product. Sending in a sales rep who has no training in selling to a smaller customer isn't sufficient either. Chances are that they will overlook opportunities because they expect the process to be the same as selling to larger customers. The truth is that many of these companies will grow out of the SMB market and become bigger fish in the BI pond.

Steve Dine
sdine@datasourceconsulting.com


Posted December 21, 2008 2:04 PM