Business Intelligence Network business intelligence resources

Blog: Dan E. Linstedt


Back to Appliances and Defining the World

Appliance is an overused term. Face it: from a customer's perspective, there is NO clear definition of what an appliance should be or should contain. So I'll take an opinionated stand, because this has become a thorn in my side again. Platforms are what we have today; appliances are something we need to grow into. The EDW/ADW market needs to learn lessons from the "accepted appliances" in the marketplace: firewalls, routers, hubs, and so on.

In a previous entry I attempted to define the EDW/ADW appliance; here I will augment those definitions with some additional comments:

A TRUE Appliance (for EDW/ADW purposes):
* has not yet been developed.
* is NOT just an RDBMS slapped on top of, or coupled with, hardware; that is just another RDBMS engine on top of hardware.
* is NOT just a simple black box with a SQL API.

To me, these are platform combinations. Why? Because I can put a database engine on my laptop, but does that make my laptop an appliance? No. Likewise, I can put an RDBMS engine on a server with specific hardware, but does that make my server/hardware an appliance? NO.

So what is a TRUE appliance?
A TRUE APPLIANCE:
* has a Web-based Thin client GUI administration front-end (full BI capabilities)
* has a full API for reporting, logging, admin, reprogramming, configuration.
* has software embedded at the hardware / firmware levels.
* can do transformation, data mining, loading, and reporting.
* can notify end users and administrators of security breaches and of "bad or suspect incoming data."
* has web-enabled firmware updates.
* is truly: plug and play.
* is NOT part of a cluster, it IS part of a grid.
* is self-contained.
* has nine 9s of uptime, and never quits.
* has (near) linear scalability.
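To put the uptime criterion in numbers, here is a small sketch that reads it as nine nines of availability (99.9999999%). That reading, the 365-day year, and the function name are my assumptions for illustration, not part of any vendor specification.

```python
# Assumption: a non-leap, 365-day year is used as the measurement window.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def allowed_downtime_seconds(nines: int) -> float:
    """Yearly downtime budget for an availability of `nines` nines.

    For example, nines=3 means 99.9% availability.
    """
    unavailability = 10.0 ** (-nines)
    return SECONDS_PER_YEAR * unavailability


# Compare a few availability levels side by side.
for n in (3, 5, 9):
    print(f"{n} nines: {allowed_downtime_seconds(n):.4f} seconds of downtime per year")
```

Under these assumptions, nine nines leaves a budget of roughly 31.5 milliseconds of downtime per year, which is why "never quits" is the practical meaning of the criterion.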

Let's give it some background:
If you think about a router with a firewall built in, it has all these services. Granted, the BI is simplistic, but if an EDW or ADW is to call itself an appliance, these are the criteria by which I would judge it. Think about it: data mining for security breaches is BI at its best, and firewalls HAVE to stop unfriendly intrusions.

Why then is it SO HARD for the "data warehousing appliance market" to get a grip on this?
Because the data is 10x to 100x the volume (or more), it is quite simply a more complex task.

ADW/EDW is more complex, deals with more data (than a firewall), and deals with different kinds of data. Or does it? If we back away, or maybe get close enough to see the blades of grass, the data and the APIs might have to be really simple to make an appliance work. Maybe there are classes of "actions," like usage and configuration. Underneath usage is SQL; underneath SQL are ETL/ELT, data mining, UDFs, data cleansing, data profiling, and REPORTING: all functions that should be driven through SQL ON THE APPLIANCE!! Underneath configuration are intrusion detection (data mining), real-time analysis, alerting, and so on: everything the appliance has to "report".

When we think about the firewall and security, we think about complexity in a different light. The data mining in the firewall must include intrusion detection, virus protection, anti-spam protection, and anti-Trojan-horse protection, but really, much of this is data-driven through configuration. The firewall is an extremely advanced ANALYTIC APPLIANCE that we take for granted. It is an OPERATIONAL SYSTEM: even though it logs a historical trail of the items it catches, it still rolls them off. It provides alerts, reporting, and fraud detection.

The ADW/EDW market should be listening and take a cue, perhaps even partnering with or acquiring firewall device companies. An ADW/EDW company could then adapt the firewall technology into a true "point-collection-ADW-device" by adding storage. The device would keep a rolling history of data for transactional analysis and roll the data into a centralized store. The centralized ADW/EDW would then be connected "live" through a specialized hardware/software combination built for specific tasks, rather than through software alone.
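The point-collection idea can be sketched in a few lines. This is only an illustration of one possible reading: the class name `PointCollector`, the fixed `capacity`, and the plain list standing in for the centralized ADW/EDW are all hypothetical, and I assume events move to the central store at the moment they age out of the local rolling history.

```python
from collections import deque


class PointCollector:
    """Hypothetical edge device that keeps a short rolling history locally."""

    def __init__(self, capacity, central_store):
        self.capacity = capacity            # how many recent events stay on the device
        self.history = deque()              # local rolling history, oldest event first
        self.central_store = central_store  # stand-in for the centralized ADW/EDW

    def record(self, event):
        """Store a new event and roll the oldest ones off to the central store."""
        self.history.append(event)
        while len(self.history) > self.capacity:
            self.central_store.append(self.history.popleft())


# Usage: five events arrive at a device that holds only three locally.
central = []
collector = PointCollector(capacity=3, central_store=central)
for i in range(5):
    collector.record({"id": i})

print("on device:", [e["id"] for e in collector.history])
print("in central store:", [e["id"] for e in central])
```

The device stays operational in the firewall sense: it analyzes a bounded window of recent data, and nothing is lost when that window rolls, because the rolled-off events land in the central store.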

Do you have thoughts on how you define APPLIANCE for ADW/EDW? Please share.

Thanks,
Dan Linstedt
CTO, Myers-Holum, Inc
http://www.MyersHolum.com

Posted by Dan Linstedt on September 7, 2006 8:53 AM
