My good friend Richard Winter just published a document about Oracle, Exadata, and scalability. Don't take this the wrong way, but I believe the findings are lopsided at best. I hold Richard in the highest regard for exercising VLDB systems, but this report is clearly aimed at highlighting what Oracle does best, and it is missing crucial information about very-large-system performance that I've been asking about for years.
You can read it for yourself. First, I have to give kudos and credit to Oracle for finally recognizing that InfiniBand networking is needed for high bandwidth, and that high-speed disk (such as SATA or internal SCSI) is also needed for Oracle to perform. The throughput numbers are impressive. However, the report fails to test the following components:
1) High-performance batch load. Where are the performance numbers for high-volume batch loads, or for parallel loads executing against the device? How many large parallel batch loads can execute at once before the upper limits of the machine and Oracle are reached?
2) Performance of near-real-time transaction feeds. How many feeds can be consumed? What is the maximum throughput rate? What is the upper limit on the number of parallel feeds and the number of transactions per second that can be "added" to the data warehouse?
3) Mixed-workload performance tests. What happens to query performance when one or both of the above loads take place WHILE querying? What is the impact on the system? What happens to the logs and the temp space? Do we end up with CPU-bound operations?
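To make the third point concrete, here is a minimal sketch of the kind of mixed-workload measurement I mean: time a query in isolation, then time the same query while concurrent loaders hammer the store, and compare. Everything here is illustrative (a plain Python dict stands in for the database, and the thread counts are arbitrary); the point is the methodology, not an Oracle benchmark.

```python
import random
import threading
import time

# Hypothetical mixed-workload harness: a shared in-memory "table",
# writer threads simulating batch loads, and a timed reader query.
store = {}
lock = threading.Lock()
stop = threading.Event()

def loader():
    """Simulated batch load: keep inserting rows until told to stop."""
    while not stop.is_set():
        with lock:
            store[len(store)] = random.random()

def timed_query():
    """Simulated query: full scan of the store; return elapsed seconds."""
    start = time.perf_counter()
    with lock:
        total = sum(store.values())
    return time.perf_counter() - start

# Baseline: query latency with no concurrent load running.
store.update((i, random.random()) for i in range(50_000))
baseline = min(timed_query() for _ in range(20))

# Mixed workload: the same query while 4 loader threads run.
threads = [threading.Thread(target=loader) for _ in range(4)]
for t in threads:
    t.start()
under_load = min(timed_query() for _ in range(20))
stop.set()
for t in threads:
    t.join()

print(f"baseline {baseline:.6f}s, under load {under_load:.6f}s")
```

A real test would of course add log and temp-space monitoring and CPU utilization sampling alongside the latency numbers, and ramp the loader count until throughput collapses; this skeleton just shows the before/during comparison that the report omits.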
These are all things that Richard is very familiar with testing. I have a feeling that Oracle didn't sanction these tests, or that somehow they were simply "removed" from the paper. Again, Oracle marketing has stepped forward: the report shows the Exadata appliance in the right light, but it doesn't contain enough information to support sound decision making (in terms of whether to invest in or purchase this appliance).
One more piece I can't understand is the star schema presented at the end of the report. What appear to be "dimensions" are EXTREMELY narrow; they almost look like fact tables. This star does not resemble any star I see on customer sites. The FACT table appears to house data that is not "fact based" and is extremely wide. Of course Oracle will eat this up, as the dimensions can almost be "pinned" in RAM. Where is the "Type 2" nature of the data in the dimensions?
Typically we see at least a customer dimension carrying multiple versions of each customer's address, which then gets applied against millions of customer rows; but no, here the fact table is the only one with billions of rows.
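For readers unfamiliar with the term, a "Type 2" dimension preserves history: when an attribute like a customer's address changes, the current row is closed out and a new version is inserted, so the dimension itself grows with every change. A minimal sketch using SQLite (all table and column names are illustrative, not from the report):

```python
import sqlite3

# Hypothetical Type 2 customer dimension: each address change closes
# the current row and opens a new version, preserving history.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dim_customer (
        customer_sk  INTEGER PRIMARY KEY,  -- surrogate key
        customer_id  TEXT,                 -- natural (business) key
        address      TEXT,
        valid_from   TEXT,
        valid_to     TEXT,                 -- NULL = still current
        is_current   INTEGER
    )""")

def apply_address_change(conn, customer_id, new_address, change_date):
    """Type 2 update: expire the current version, insert a new one."""
    conn.execute(
        "UPDATE dim_customer SET valid_to = ?, is_current = 0 "
        "WHERE customer_id = ? AND is_current = 1",
        (change_date, customer_id))
    conn.execute(
        "INSERT INTO dim_customer "
        "(customer_id, address, valid_from, valid_to, is_current) "
        "VALUES (?, ?, ?, NULL, 1)",
        (customer_id, new_address, change_date))

# Seed one customer, then record an address change.
conn.execute(
    "INSERT INTO dim_customer "
    "(customer_id, address, valid_from, valid_to, is_current) "
    "VALUES ('C001', '12 Oak St', '2008-01-01', NULL, 1)")
apply_address_change(conn, "C001", "98 Elm Ave", "2009-02-15")

rows = conn.execute(
    "SELECT address, is_current FROM dim_customer "
    "WHERE customer_id = 'C001' ORDER BY customer_sk").fetchall()
print(rows)  # [('12 Oak St', 0), ('98 Elm Ave', 1)]
```

Multiply this versioning across millions of customers and years of changes, and the dimensions stop being trivially cache-resident, which is exactly the behavior the report's narrow dimensions never exercise.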
Ok, maybe I'm being too harsh, and if so, my apologies. But I'm just really frustrated with the marketing from all of these companies that say "The world's fastest and largest database appliance/engine..." and then fail to include the whole story.
What did you take away from reading the report? Is it biased? Is it one-sided? Or is it spot-on, providing the full answers?
Posted March 5, 2009 4:16 AM