Blog: Steve Dine Subscribe to this blog's RSS feed!

Steve Dine

If you're looking for a change from analysts, thought leaders and industry gurus, you've come to the right place. Don't get me wrong, many of these aforementioned are my colleagues and friends that provide highly intelligent insight into our industry. However, there is nothing like a view from the trenches. I often find that there is what I like to call the "conference hangover." It is the headache that is incurred after trying to implement the "best" practices preached to your boss at a recent conference. It is the gap between how business intelligence (BI) projects, programs, architectures and toolsets should be in an ideal world versus the realities on the ground. It's that space between relational and dimensional or ETL and ELT. This blog is dedicated to sharing experiences, insights and ideas from inside BI projects and programs of what works, what doesn't and what could be done better. I welcome your feedback of what you observe and experience as well as topics that you would like to see covered. If you have a specific question, please email me at sdine@datasourceconsulting.com.

About the author >

Steve Dine is President and founder of Datasource Consulting, LLC. He has more than 12 years of hands-on experience delivering and managing successful, highly scalable and maintainable data integration and business intelligence (BI) solutions. Steve is a faculty member at The Data Warehousing Institute (TDWI) and a judge for the Annual TDWI Best Practices Awards. He is the former director of global data warehousing for a major durable medical equipment manufacturer and former BI practice director for an established Denver based consulting company. Steve earned his bachelor's degree from the University of Vermont and a MBA from the University of Colorado at Boulder.

Editor's Note: More articles and resources are available in Steve's BeyeNETWORK Expert Channel. Be sure to visit today!

Recently in Hardware Category

Chances are that if you have data that's taking too long to load, monster queries that are taking too long to return and/or planning to add complex analytics to the mix, you are considering purchasing a data warehouse appliance.  DW appliances leverage integrated hardware with a high-speed back-plane, embedded operating system, scalable storage and analytic database technology to deliver a compelling price to performance solution.  For existing BI programs, a proof-of-concept is usually requested to have the vendor prove that their DW appliance will solve their biggest pain points.  It's difficult not to get excited when that 3 hour query returns in 3 minutes.  However, before you sign that contract, you may also want to ensure that the DW appliance:

  1. Is certified to work with your existing, or planned, front-end BI tools.  Your BI tools may not generate SQL that is recognized by the appliance, be able to leverage the built-in database functions or support the same data types.
  2. Has native drivers that work with your existing, or planned, ETL tools.  Many appliances require the use of their bulk loaders to move large volumes of data in and out of the database and are drastically slower with ETL tools.  (note: a fast driver may not be necessary if they support push-down optimization, aka ELT)
  3. Supports correlated subqueries.  While appliance vendors will correctly argue that correlated subqueries are inherently slow, due to the fact that they require nested loops, chances are that your BI tool will produce them.
  4. Is compatible with your data modeling tool.  Many DW appliances are built upon open source databases, such as Postgres.  Make sure your data modeling tool can reverse engineer and produce the DDL that these databases can consume.
  5. Automatically handle the addition of new nodes.  Since most appliances utilize a massively parallel processing (MPP) architecture, data is distributed across the nodes to ensure balanced processing.  However, this can cause challenges when adding new nodes and often require the unloading and reloading of the data.

The best approach is to include these items, or those that are critical to your requirements, in the proof-of-concept.  While you can't mitigate all your risks, adding these items to your list of requirements will help to ensure a more successful implementation for you and your appliance vendor. 


Posted March 21, 2010 9:03 PM
Permalink | No Comments |