Blog: Steve Dine Subscribe to this blog's RSS feed!

Steve Dine

If you're looking for a change from analysts, thought leaders and industry gurus, you've come to the right place. Don't get me wrong, many of these aforementioned are my colleagues and friends that provide highly intelligent insight into our industry. However, there is nothing like a view from the trenches. I often find that there is what I like to call the "conference hangover." It is the headache that is incurred after trying to implement the "best" practices preached to your boss at a recent conference. It is the gap between how business intelligence (BI) projects, programs, architectures and toolsets should be in an ideal world versus the realities on the ground. It's that space between relational and dimensional or ETL and ELT. This blog is dedicated to sharing experiences, insights and ideas from inside BI projects and programs of what works, what doesn't and what could be done better. I welcome your feedback of what you observe and experience as well as topics that you would like to see covered. If you have a specific question, please email me at sdine@datasourceconsulting.com.

About the author >

Steve Dine is President and founder of Datasource Consulting, LLC. He has more than 12 years of hands-on experience delivering and managing successful, highly scalable and maintainable data integration and business intelligence (BI) solutions. Steve is a faculty member at The Data Warehousing Institute (TDWI) and a judge for the Annual TDWI Best Practices Awards. He is the former director of global data warehousing for a major durable medical equipment manufacturer and former BI practice director for an established Denver based consulting company. Steve earned his bachelor's degree from the University of Vermont and a MBA from the University of Colorado at Boulder.

Editor's Note: More articles and resources are available in Steve's BeyeNETWORK Expert Channel. Be sure to visit today!

When you mention BI in the Cloud amongst database vendors and BI practitioners, chances are you'll land on the topic of performance within the first minute or two of the conversation.  Until recently, there was little to debate as limited image sizes, slow storage and shared networks limited the performance of Hardware-as-a-Service (HaaS) based solutions. However, recently Amazon announced the availability of Cluster Compute Instances for Amazon's Elastic Computing Cloud (EC2).  Cluster Compute Instances provide more CPU than any other Amazon EC2 instance. Customers can also group Cluster Compute Instances into clusters allowing applications to get the low-latency network performance required for tightly coupled, node-to-node communication (typical of many BI architectures). Cluster Compute Instances also provide significantly increased network throughput making them well suited for customer applications that need to perform network-intensive operations.

It won't be long before purpose built Clouds start to hit the market, designed specifically to meet the high compute and i/o demands of BI.  Kognitio is already providing a SaaS based solution for hosted data warehouses and front-end BI applications.  From my standpoint, it won't be long before performance will drop off the list of top BI in the Cloud concerns and BI practitioners will wonder why they ever implemented and managed onsite BI architectures.

Posted August 3, 2010 5:38 PM
Permalink | No Comments |
I recently returned from the TDWI spring conference in Chicago, where I spent 3 days meeting attendees, sitting in on parts of some classes and also presented a course on BI in the Cloud.  I am always interesting assessing the trends in BI by listening to what the attendees have to say and gauging course interest.  While it's unscientific at best, it has been a fairly good predictor over the years of where the focus on BI has gone.

This year, the topics that seem to generate the most buzz were:

  • Agile BI
  • Data Visualization
  • Statistical Analytics
I spoke with a number of attendees that had mature BI programs and were interested in new ways to add value.  They had conquered reporting, OLAP analysis, events & notifications, dashboards and scorecards and felt as though their programs weren't keeping up with the current needs of their users.  Their users were becoming more sophisticated and looking for new ways to analyze their data.  They were also looking for ways to increase the velocity of their project iterations as users have started to look to software-as-a-service (SaaS) to address data and functionality gaps. 

The attendees that I spoke with that are new to BI were concerned primarily with the well publicized rate of failed BI projects.  Some worked for companies that had past failures with BI and were interested in how Agile project management might help.  They were also interested in technologies that might facilitate Agile projects.

Only time will tell whether Agile BI, data visualization and advanced analytics will become this year's major trends in BI.  However, it's exciting to see strong interest in areas that have the potential to change the way we deliver BI and the value of what we deliver.  What's interesting is that these aren't new concepts in the world of project management and data analysis, it's just that we may have matured to a point where they become front and center in our focus.

I'm interested in your thoughts on what you think will be the major trends in BI in 2010.  You can leave your comments below or feel free to email me at sdine@datasourceconsulting.com. 

Posted May 16, 2010 4:08 PM
Permalink | No Comments |

In the never ending quest to determine exactly why so many BI projects and programs still fail, regardless of everything we've learned over the years, another trend has hit my top 10 list.  It appears that in the quest to reduce cost and time on BI projects, there seems to be more cases where data is being consolidated rather than integrated.  What's the difference?  Data consolidation is the process of bringing together entities and attributes into a data warehouse, each retaining their form, value and technical characteristics as they exist in the source.  Data integration is the process of combining entities and attributes such that they have common form, meaning and technical characteristics.  (See graphic)

Data_Consolidation_Graphic-1.png

Why does data consolidation present a challenge in the data warehouse, and/or data mart?  It reduces the ability of users to understand and analyze the data. Users are left to try and determine which attributes and values mean the same thing, and often attach different meaning than their colleagues.  It adds time to every analysis and often reduces the value of the result.  I usually find that companies try and compensate by building the complex integration rules into the semantic layers of the front end tools.  Aside from the additional time required to create the model, it can significantly slow down response time, especially if the rules are being processed by the BI tool instead of the database.

Why do companies end up with data consolidations rather than data integration?  In my experience, there are a number of reasons.  One cause is an inexperienced DW data modeler.  Often data modelers have more experience in the OLTP world than the DW world.  They reverse engineer the sources and try to preserve all the attributes that meet reporting requirements.  Another reason is that it takes more time to integrate data.  It requires data profiling, data analysis and interaction with the subject matter experts.  There can also be political factors involved with reaching common meaning.  It's also difficult to integrate data.  Fortuntely, we've learned techniques over the years for handling different situations, like preserving source values while still providing integrated data.  Lastly, a DW may start out with only a single source and the data model is not created to accomodate integrated data. 

If your BI program isn't delivering the value that was promised, maybe the issue is with what your delivering. 


You can also follow me on Twitter @steve_dine


Posted March 30, 2010 8:46 AM
Permalink | No Comments |

Chances are that if you have data that's taking too long to load, monster queries that are taking too long to return and/or planning to add complex analytics to the mix, you are considering purchasing a data warehouse appliance.  DW appliances leverage integrated hardware with a high-speed back-plane, embedded operating system, scalable storage and analytic database technology to deliver a compelling price to performance solution.  For existing BI programs, a proof-of-concept is usually requested to have the vendor prove that their DW appliance will solve their biggest pain points.  It's difficult not to get excited when that 3 hour query returns in 3 minutes.  However, before you sign that contract, you may also want to ensure that the DW appliance:

  1. Is certified to work with your existing, or planned, front-end BI tools.  Your BI tools may not generate SQL that is recognized by the appliance, be able to leverage the built-in database functions or support the same data types.
  2. Has native drivers that work with your existing, or planned, ETL tools.  Many appliances require the use of their bulk loaders to move large volumes of data in and out of the database and are drastically slower with ETL tools.  (note: a fast driver may not be necessary if they support push-down optimization, aka ELT)
  3. Supports correlated subqueries.  While appliance vendors will correctly argue that correlated subqueries are inherently slow, due to the fact that they require nested loops, chances are that your BI tool will produce them.
  4. Is compatible with your data modeling tool.  Many DW appliances are built upon open source databases, such as Postgres.  Make sure your data modeling tool can reverse engineer and produce the DDL that these databases can consume.
  5. Automatically handle the addition of new nodes.  Since most appliances utilize a massively parallel processing (MPP) architecture, data is distributed across the nodes to ensure balanced processing.  However, this can cause challenges when adding new nodes and often require the unloading and reloading of the data.

The best approach is to include these items, or those that are critical to your requirements, in the proof-of-concept.  While you can't mitigate all your risks, adding these items to your list of requirements will help to ensure a more successful implementation for you and your appliance vendor. 


Posted March 21, 2010 9:03 PM
Permalink | No Comments |

I was recently invited to the corporate offices of Lyzasoft for an inside look at their new 2.0 release.  Aside from the impressive advances in charting and visualizations, I was especially blown away by their augmentation of collaboration features to their shared portal, called Lyza Commons.  They provide a web environment that allows authorized users to search, bookmark, combine, create, tag, share, comment, rate, relate, and interact directly with intelligence content in the form of blogs, microblogs, charts, tables, dashboards, and collections.  More simply put, they enable the development and growth of analytic communities.

The enterprise BI vendors have essentially ignored adding the collaborative capabilities that have permeated their way into nearly every organization via wikis, blogs, forums, instant messaging and podcasts.  Why is collaboration in BI so compelling?  Up to this point, reporting and analysis has essentially been a single threaded activity.  We've been unable to share our findings in a way that allows others to contribute their knowledge and inisight to the overall analysis.  If we begin to treat the development of a report or analysis as a starting point rather than an end point, we have the ability to actually share information through interaction instead of delivery.  At the very least, it brings BI to the masses through interfaces they are already used to utilizing.

Is collaboration the next big thing in BI?  I usually don't make predictions but I do think that the way we deliver BI will change drastically in the next 12 to 24 months.  Instead of concentrating on dumbing down BI through flash based dashboards and bursted reports, we'll actually start to focus on the intelligence in BI through intuitive interfaces and shared insights. 


Posted February 28, 2010 8:44 PM
Permalink | No Comments |

1 2 NEXT