Blog: Dan E. Linstedt

Dan Linstedt

Bill Inmon has given me this wonderful opportunity to blog on his behalf. I like to cover everything from DW2.0 to integration to data modeling, including ETL/ELT, SOA, Master Data Management, Unstructured Data, DW and BI. Currently I am working on ways to create dynamic data warehouses, push-button architectures, and automated generation of common data models. You can find me at Denver University, where I participate on an academic advisory board for Masters Students in I.T. I can't wait to hear from you in the comments of my blog entries. Thank you, and all the best, Dan Linstedt http://www.COBICC.com, danL@danLinstedt.com

About the author

Cofounder of Genesee Academy, RapidACE, and BetterDataModel.com, Daniel Linstedt is an internationally known expert in data warehousing, business intelligence, analytics, very large data warehousing (VLDW), OLTP, and performance and tuning. He has been the lead technical architect on enterprise-wide data warehouse projects and refinements for many Fortune 500 companies. Linstedt is an instructor of The Data Warehousing Institute and a featured speaker at industry events. He is a Certified DW2.0 Architect. He has worked with companies including IBM, Informatica, Ipedo, X-Aware, Netezza, Microsoft, Oracle, Silver Creek Systems, and Teradata. He is trained in SEI / CMMi Level 5, and is the inventor of The Matrix Methodology and the Data Vault data modeling architecture. He has built expert training courses, trained hundreds of industry professionals, and is the voice of Bill Inmon's blog on http://www.b-eye-network.com/blogs/linstedt/.

Everyone's blogged on Oracle's acquisition of Sunopsis, and quite a few more have offered their two cents on the topic. After reading about all of this, I figured maybe my voice could add a half-cent of value to the noise out there. This is my opinion of what this acquisition means, and what Oracle really needs to do to solve their ailing sales issues in the EDW / ADW space. I recently wrote an article in Bill's newsletter about Oracle and Clustering that explains this core issue in a roundabout manner. This entry is more direct.

Many companies are finding out that volume is a huge deal for them, as are real-time loads into an already overloaded and overwhelmed RDBMS engine. That leaves companies with several key decisions to make:

1. Should we invest in a hardware solution? Is a single big box better than many small boxes? We want to consolidate, lower cost, etc., but big boxes are expensive.
2. We are on RDBMS X today; should we consider Oracle, DB2 UDB (DPF), Netezza, DatAllegro, Sand, Calpont, Teradata, SQLServer2005? What world-class appliance or RDBMS should we move to, and why? Volume is pushing our current system to the brink of self-destruction...
3. Our ETL just doesn't seem to cut it anymore: running transformations in-stream with these volumes has overwhelmed the CPU and RAM resources. Or the other one: running ETL with all these "file staging" intermediate steps has overrun our CPU, RAM, and DISK resources. It simply takes too long. BUT we're caught in a catch-22: when we put the data into the RDBMS and try ELT, our system runs even slower... HELP! WHAT DO WE DO?

Then there are the everlasting questions:
a. We have EAI - our business wants SOA and web-services - should we just run all this stuff over EAI or a message queuing system? OUCH - like it or not, you still suffer the volume problem.

It's all a shell game. Pushing the volume off legacy or mainframe, collecting the volume off the web, and gaining the volume from customer growth are all problems that lead to the same conclusion: upgrade and change the architecture. But which part? What changes? And what do Oracle & Sunopsis have to do with this?

Ok - so that's what I've been hearing in the market space (amongst other things). Oracle has been under fire from Teradata, Netezza, IBM, DatAllegro, Sand, and now Microsoft with SQLServer2005. Oracle in the context of OLTP (transaction processing) is an awesome database, and clustering for OLTP works wonders, especially in large clustered systems - but it drives up the cost of support.

Oracle has had problem after problem with their Oracle Warehouse Builder (so many have discussed this, I don't feel I should give it time) - ranging from metadata management, to user interfaces, to lack of complex transformation ability. But the real killer to using Oracle as an EDW / ADW has been volume. Processing volume in the context of history (with volume already in the database - I'm not talking about loading an empty DB here), coupled with complex transformation requirements, has basically put Oracle out on a limb.

In my opinion, the problem stems from their reliance on clustering rather than an MPP solution. Now, it's been said that they never had a "world-class" data integration / data migration ETL / ELT (ETLT) engine - so what did they do? They bought an up-and-comer who had a strong relationship with Teradata.

Sunopsis metadata leaves a lot to be desired, but being a young engine in the marketplace, they effectively leveraged their ability to generate highly specific Teradata SQL code to run transformations in the database (as ELT) to gain market share and visibility, particularly among the volume crowd in big systems. If the same customer were to put that level of volume on Oracle, then try to run ELT through Sunopsis on it, it would be Oracle's engine that would choke.
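To make the "generate SQL and run the transformation in the database" idea concrete, here is a minimal sketch of the ELT pattern. It is not Sunopsis output and not Teradata SQL: the function build_elt_statement, the tables stg_orders and dw_orders, and their columns are all hypothetical names made up for illustration, and SQLite stands in for the warehouse engine only so the example can run anywhere.

```python
import sqlite3


def build_elt_statement(source: str, target: str, column_exprs: dict[str, str]) -> str:
    """Generate one INSERT ... SELECT so the transform runs inside the database.

    column_exprs maps each target column to the SQL expression that fills it.
    (Hypothetical helper for illustration; not any vendor's API.)
    """
    cols = ", ".join(column_exprs)
    exprs = ", ".join(column_exprs.values())
    return f"INSERT INTO {target} ({cols}) SELECT {exprs} FROM {source}"


# SQLite is only a stand-in engine; the staging data is already loaded ("EL").
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stg_orders (order_id INTEGER, amount_cents INTEGER, country TEXT);
    CREATE TABLE dw_orders  (order_id INTEGER, amount REAL, country TEXT);
    INSERT INTO stg_orders VALUES (1, 1250, 'us'), (2, 800, 'de');
    """
)

# The "T" step: no rows leave the database, the engine does all the work.
sql = build_elt_statement(
    "stg_orders",
    "dw_orders",
    {
        "order_id": "order_id",
        "amount": "amount_cents / 100.0",
        "country": "UPPER(country)",
    },
)
conn.execute(sql)

rows = conn.execute(
    "SELECT order_id, amount, country FROM dw_orders ORDER BY order_id"
).fetchall()
print(sql)
print(rows)
```

The point of the pattern is that the integration tool only emits a statement; how fast that statement runs at volume is entirely up to the database engine underneath it.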

Now keep this in mind: Oracle EDW on a SINGLE BIG IRON BOX (without clustering) hums along just fine at a multi-terabyte level, and probably would meet this need just fine. Also, big iron keeps getting bigger and faster (single SMP, or mainframe with LPAR): we're talking 64 to 128 CPUs with 64 GB to 248 GB of RAM...

I'm not talking about 500 GB of data here; I'm talking about 50 TB or more of historical information, with incoming information at 2 TB a day - something for which ELT is necessary due to volume constraints.
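For a rough sense of scale, the arithmetic behind those numbers can be sketched as follows. The 50 TB and 2 TB a day figures come from the paragraph above; the 4-hour load window is purely an assumption for illustration, as is the use of decimal terabytes (10^12 bytes).

```python
TB = 10**12  # decimal terabyte, an assumption; binary TiB would differ slightly


def required_mb_per_second(daily_bytes: int, window_hours: float) -> float:
    """Sustained rate (MB/s) needed to move daily_bytes within window_hours."""
    return daily_bytes / (window_hours * 3600) / 10**6


daily_load = 2 * TB   # incoming volume per day, from the text
history = 50 * TB     # existing history, from the text

# Spread evenly over a full 24 hours.
full_day_rate = required_mb_per_second(daily_load, 24)

# Squeezed into a hypothetical 4-hour batch window.
batch_window_rate = required_mb_per_second(daily_load, 4)

# Days of incoming data needed to add another 50 TB on top of the history.
days_to_double = history / daily_load

print(f"24-hour trickle: {full_day_rate:.1f} MB/s sustained")
print(f"4-hour window:   {batch_window_rate:.1f} MB/s sustained")
print(f"history doubles in {days_to_double:.0f} days")
```

These are raw ingest rates only. Any transformation that has to join incoming rows against the existing 50 TB of history multiplies the work well beyond them.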

Bottom line: I'm not going to speculate as to why Oracle thought buying Sunopsis would save its EDW market, or gain market share for it... But I will say: Sunopsis is a life-preserver the size of an individual, when the whole ocean liner appears to be sinking.

Oracle needs to wake up! They MUST REWRITE THEIR DATA WAREHOUSE CORE ENGINE to be an MPP-enabled solution and drop this cluster stuff (except for OLTP). They need two separate core engines: one for data warehousing and one for transaction processing. Again, I'm talking about enterprise-class engines, where this makes a difference. Then, and only then, will the engine begin to support the volumes needed (at the performance needed) to make the "T" in ELT work properly. At that point they need to add a host of OLAP functions to operate within SQL (some they already have), and everything needs to run in parallel ALL the time on the enterprise platform.

Once this has happened, they could be a formidable force to be reckoned with - and they just might be able to regain a foothold by producing a single suite of products (like IBM). They need to add metadata foundations to the Sunopsis product, add more MPP-like functionality and parallel processing ability to the core engine, and so on.

But I ramble. I realize this is fairly one-sided, but after years of Very Large Data Warehousing experience on executive-level projects, I feel it had to be said.

I'd love to hear any opposing viewpoints, or other thoughts you might have. If you work for Oracle - and are willing to share, let me know...

Thanks,
Dan L
CTO, Myers-Holum, http://www.MyersHolum.com


Posted October 19, 2006 6:05 AM

2 Comments

Excellent point. The critical issue for Oracle is to change the core capability of its EDW engine. Currently, it still depends on its OLTP engine to do the OLAP. Only by changing this can it compete with its counterparts, like Teradata.

Oracle seems to be competing very well with Teradata right now - in spite of some widespread speculation and opinion to the contrary.

As for the technical solutions to business problems approach .. not even two cents of the wrong stuff is enough.
