Business Intelligence Network
business intelligence resources

Blog: Dan E. Linstedt

« Disk and VLDW: What you need, can use | Main | Database Specifics: VLDW & Switches, making it work »

Databases and VLDW: Petabyte Scalability

If you've been following along, you've noticed that I've been writing about Terabyte (50+) to Petabyte levels. In this entry, it's no different. I'll discuss database engines generically, and their usage as a VLDB component. I'll use another entry to discuss appliances (separately) that contain embedded database engines. Databases all have their quirks and notions, but there are some generalities that simply stick - that you cannot ignore, that most people have ignored over the years... I'll try to dispell these false-hoods, and bring the truth to the table.

What are the issues around BIG DATA and RDBMS engines, especially when it comes to VLDW (Data warehousing)?

1. Most database engines do _not_ support temporal mathematics within their query optimizers. The issue? EDW's contain TEMPORAL data!!
2. Most database engines do _not_ work well with NATURAL BUSINESS KEYS. The issue? Data warehousing SHOULD be all about the business and the business keys.
3. Most database engines perform in-memory caching. The issue? They cannot handle the volume of data that needs to be "compressed/cached/stored/referenced" in RAM when we get into VLDW.
4. Most database engines were WRITTEN (core engine that is) for OLTP (transaction processing), with EDW's as an after-thought, an add-on to meet growing demand. The issue? This _severely_ limits performance of the optimizers, and often lacks parallelism when needed by the queries.
5. Most database engines CANNOT handle FAST/PARALLEL updates. This is a REQUIREMENT of a star-schema, or any "data distribution system", or ODS. (see my comment about deletes, I'm not talking about one single mass-update...)
6. Most database engines CANNOT perform FAST/PARALLEL deletes. I'm not talking about a single grouped delete statement, I'm talking about 100,000 individual row specificiation deletes - there's no "fast interface" in which to QUEUE the data set/keys/where clauses, and run these in parallel.
7. Most database engines DO NOT provide an EASY way to compare column values on a row by row basis IN BULK, FAST LOADING. The don't even address duplicates on the incomming stream... The problem? 90% or better of the incomming data within a VLDW usually HAS duplicates, even if the source system/source CDC engine told us that there's an update.
8. Most database engines DO NOT provide row-based temporal compression/expansion, especially while keeping the data set useful. The problem? Very large data sets can usually be compressed (according to flat-file compression rates I get on incomming sets) anywhere between 60% and 90% for structured data. This could significantly improve performance.
9. Most database engines DO NOT tell you that the DISK I/O performance is too slow.
10. Most DBMS engines DO NOT automatically select "parallel paths" to the disk, nor do they (when the install) profile the system enough to figure out WHERE to put logs/temp and data.

* Appliances answer some of these questions for us... Others are "managing persons" related...

Ok, common operator/installation flaws or mistakes we make...

1. Allowing Default Installation parameters to RULE the instance. If you have 40GB of RAM, and 18 CPU's when you install, the RDBMS engine installer will setup to use 100% of the available resources IF or WHEN pushed to the load limit. The PROBLEM with this, is that there is NO ROOM for the operating system to breathe under load, and when your in a VLDW, you've always got LOAD!! (usually). The solution? Set the database to be 1/2 or possibly 80% of the available resources. ALWAYS leave 20% available for other applications INCLUDING the operating system. THEN tune the database to effectively use all 100% of that "80%" of the hardware when pushed to operating limits.

2. Installing "more than one" instance of a single database engine on an underpowered server. Typically too many people make this mistake in the name of "recovery" or "fault-tolerance." This is a problem, it not only eats up the resources, but does NOT handle fault-tolerance very well. Why? because what usually fails are the hardware/networking layers, or the CPU's quickly become over-loaded and rebooting the machine is necessary. PLEASE PLEASE PLEASE in a VLDW system, Consolodate the "instances" to a SINGLE instance, multiple databases, and manage a core set of services. The exceptions to this rule are: Teradata, and IBM DB2 which is smart enough to know to launch only ONE set of core-operating services.

What happens when I make this mistake?
(when pushed to limits, or operating at 80% or above in capacity) The RDBMS engines:
a. begin to fight with each other (see #1 on the list), thus starting "b" below...
b. can and WILL re-prioritize their CORE services "nice-ness" and WILL take over the machine, pushing even the operating system OUT of execution range. This results in a hard-lock and cold-boot...
c. No-one gets the "most" of the resources available, parallelism is severly damaged and cannot be tuned properly for any instance, the hardware (even a large box like I described above) quickly is over-whelmed.
d. Large data sets require LARGE RAM CACHES and BUFFERING MECHANISMS. The database cannot properly "load-balance" queries, query types, and data retrievals (I/O channels) if too many instances are running.

3. We fail to PROPERLY tune the database for the hardware we're on, and we do not make allowances for network speeds, disk speeds, buffering, network traffic, and so on. The result is: the longer this goes, the more "fragmentation" this leads to in our VLDW data set. Fragmentation of the queries, fragmentation of the data in RAM, non-parallel operations, and non-optimized operations (more blocking I/O operations). Properly tuning requires an intimate knowledge of the hardware platform on which our RDBMS engines sit, along with base-line performance testing.

4. We fail to setup parallelism and partitioning according to time or nodes. In VLDW, the only way to manage data sets effectively is to partition by time (that is "load-time" or "load cycle number"), or even distribution by available MPP nodes. If you are running in an MPP environment, then data proximity and layout across the MPP nodes is critical in accordance with the MOST USED join columns, or better yet - an even distribution across parallel MPP nodes for optimal parallelism in any query. Why? because the join technology can be orchestrated through alternative means (like Join Indexes).

5. We under-power the hardware and expect the RDBMS engine to perform. RDBMS engines NEED LARGE MEMORY CACHES, and BIG BUFFERS to perform decently well in terabyte sized solutions. DON'T put a Windows Dual-Core server, 32 bit with 2 GB RAM in place with Oracle 9i and then try to place 30TB of data into the database and USE it as a VLDW... It just simply won't scale. On the flip side, DON'T count out the Windows 64 bit OS, and SQLServer 2005 64 bit system, they can run SUPER FAST if tuned properly, and if the hardware is there to support it.

6. Ok, we place 32 bit systems in play to do what really requires a 64 bit system to run with.

7. We place the Logs / temp / and data and indexes on the same DISK ARRAY, then we further damage ourselves by using RAID 5. Then we further damage ourselves by adding CLIENT TRAFFIC to the disk device, or even the network under which our database REQUIRES dedicated network access to the disk device. Oh yes, it gets' worse... We sometimes outsource our disk arrays to a "storage farm" and don't hold the outsourcers to an SLA for accurate performance, then they put our critical RDBMS engines on a SHARED DISK with other companies, and the network traffic has to load balance on the way in and out.

The real problem? All of these things limit parallelism... Once parallelism is limited, the performance drops like a rock in water. Factors of 4x, 6x, 12x, all the way to 100x, 200x performance is negatively affected. In VLDW solutions, you CANNOT "absolutely CANNOT" limit parallelism in any way.

What about the settings in the RDBMS?
That's for another entry. There are tons of common mistakes we make when we tweak databases without an understanding of VLDW (50+ Terabytes). Chances are, if you make these changes EVEN TO YOUR 100GB RDBMS you'll see tremendous performance improvements without changing hardware.

Comments? Thoughts? Did I leave something out? Please post your ideas.

Thank-you kindly,
Daniel Linstedt
Come get a Masters of Science in BI and DW at Daniels College of Business, DU.
http://www.COBICC.org

  Posted by Dan Linstedt on September 9, 2007 5:25 AM |

Post a comment