Chances are that if you have data that takes too long to load, monster queries that take too long to return, or plans to add complex analytics to the mix, you are considering purchasing a data warehouse (DW) appliance. DW appliances combine integrated hardware with a high-speed backplane, an embedded operating system, scalable storage and analytic database technology to deliver a compelling price-to-performance solution. For existing BI programs, a proof-of-concept is usually requested so that the vendor can prove its DW appliance will solve the program's biggest pain points. It's difficult not to get excited when that 3-hour query returns in 3 minutes. However, before you sign that contract, you may also want to ensure that the DW appliance:
- Is certified to work with your existing, or planned, front-end BI tools. Your BI tools may generate SQL that the appliance does not recognize, may be unable to leverage its built-in database functions, or may not support the same data types.
- Has native drivers that work with your existing, or planned, ETL tools. Many appliances require the use of their own bulk loaders to move large volumes of data in and out of the database, and are drastically slower when loaded through ETL tools. (Note: a fast driver may not be necessary if push-down optimization, aka ELT, is supported.)
- Supports correlated subqueries. While appliance vendors will correctly argue that correlated subqueries are inherently slow because they require nested loops, chances are that your BI tool will produce them.
- Is compatible with your data modeling tool. Many DW appliances are built upon open source databases, such as Postgres. Make sure your data modeling tool can reverse engineer and produce the DDL that these databases can consume.
- Automatically handles the addition of new nodes. Since most appliances utilize a massively parallel processing (MPP) architecture, data is distributed across the nodes to ensure balanced processing. However, this distribution can cause challenges when adding new nodes, and often requires unloading and reloading the data.
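To see why adding a node can force a reload, here is a minimal sketch in Python. It assumes the simplest possible scheme, in which each row is placed on a node by hashing its distribution key modulo the node count. Real appliances may distribute data differently, and the names `node_for` and `fraction_moved` are hypothetical, chosen only for this illustration.

```python
import hashlib


def node_for(key: str, node_count: int) -> int:
    """Assign a distribution key to a node using a stable hash modulo the node count.

    Assumption: plain modulo hashing, used here only to illustrate the problem.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % node_count


def fraction_moved(keys, old_nodes: int, new_nodes: int) -> float:
    """Return the share of keys whose node assignment changes after resizing."""
    moved = sum(
        1 for k in keys if node_for(k, old_nodes) != node_for(k, new_nodes)
    )
    return moved / len(keys)


# Hypothetical distribution keys standing in for rows of a large fact table.
keys = [f"customer-{i}" for i in range(10_000)]
moved = fraction_moved(keys, old_nodes=4, new_nodes=5)
print(f"{moved:.0%} of rows change nodes when growing from 4 to 5 nodes")
```

Under this assumed scheme, most rows land on a different node after the resize, which is why the question to ask during the proof-of-concept is whether the appliance rebalances on its own or expects you to unload and reload.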
The best approach is to include these items, or those that are critical to your requirements, in the proof-of-concept. While you can't mitigate all your risks, adding these items to your list of requirements will help to ensure a more successful implementation for you and your appliance vendor.
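As one example of a proof-of-concept check, the sketch below runs a correlated subquery and an equivalent join rewrite and confirms that both return the same rows. It uses Python's built-in `sqlite3` module purely as a stand-in for the appliance connection, and the `orders` table and its data are made up for illustration. In a real test you would point the same kind of check at the appliance, using SQL captured from your BI tool.

```python
import sqlite3

# Stand-in database; in a real proof-of-concept this would be the appliance.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 1, 30), (3, 2, 5), (4, 2, 5), (5, 2, 20)],
)

# Correlated form: the inner query is evaluated per outer row.
correlated_sql = """
    SELECT o.id FROM orders o
    WHERE o.amount > (SELECT AVG(i.amount) FROM orders i
                      WHERE i.customer_id = o.customer_id)
    ORDER BY o.id
"""

# Join rewrite: compute each customer's average once, then join back.
joined_sql = """
    SELECT o.id FROM orders o
    JOIN (SELECT customer_id, AVG(amount) AS avg_amount
          FROM orders GROUP BY customer_id) a
      ON a.customer_id = o.customer_id
    WHERE o.amount > a.avg_amount
    ORDER BY o.id
"""

correlated_ids = [row[0] for row in conn.execute(correlated_sql)]
joined_ids = [row[0] for row in conn.execute(joined_sql)]
print(correlated_ids == joined_ids)
```

If the appliance rejects the correlated form, or runs it far slower than the rewrite, you will know before the contract is signed rather than after the first dashboard is deployed.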
Posted March 21, 2010 9:03 PM