Business Intelligence Network
Business Intelligence Resources

Blog: Dan E. Linstedt

« Databases and VLDW: Petabyte Scalability | Main | Applications In General: VLDW and Machine Play »

Database Specifics: VLDW & Switches, making it work

So just how do you get a particular Database to work for you in a growing data environment? There must be something you can do right? Yes, there are certain switches, and specific knowledge which you need to focus on in order to get the RDBMS engines to scale. This entry is all about those specific databases, and their switches. While unfortunately (due to legal ramifications) I cannot discuss publicly the performance numbers (it's all in the fine-print of the licensing agreements they have), I can discuss the switches that make things fast. Remember, that unless you have the proper hardware, and the right architecture, these switch settings won't make a difference. In fact, they may even slow your system down.

There are TWO fundamental tenants to making ANY system perform:
1. Increase parallelism (or the opportunity for parallelism)
2. Decrease the data set that you are working with or on.

Ok - the third, is to do both of the above. This is where the orders of magnitude GAIN is achieved, and yes - this is AFTER you have already obtained the desired I/O performance from your system.

Yea, man... So what... So how do I get my system to work properly?
In short: hire a very knowledgable DBA, their rare and hard to find (and usually aged to some degree). (just kidding).

Ok, back to the REAL issues:

Oracle 8 & Oracle 9i
1. upgrade to 10g to get the optimizer features (which they did a lot of work on)
2. Alter the default parameters for the ORACLE instance see the following references:
http://www.oracle-base.com/articles/10g/PerformanceTuningEnhancements10g.php
http://www.dbazine.com/olc/olc-articles/ault7

Oracle default parameters on INSTALL look at the "whole machine", but the installer does NOT run any type of machine performance testing. It simply counts the number of CPU's and sets the Number of CPU's available to the MAX number of CPU's on the box. It also sums up the total amount of RAM available, and sets the SGA and total RAM to the total RAM on the box.

What does this mean? It means that when the instance is pushed to 80% or more of utilization, it will take ALL the resources available on the box. What Oracle doesn't publish is that it's core routines change the "niceness" prioritization of it's core routines, and push the Operating System out of the way... This can cause machine hard-lock and re-boot.

I will suggest that the previous post covers pretty much all databases in general, and what you should do with the RAM & CPU and Instance settings for Oracle, and other RDBMS engines.
3. The other parameters you should be tweaking include: # of read-ahead buffers, SGA, Sort Area Size, Bitmap Area Size, Parallelism, Size of Parallel Packets, Number of Log Threads, Number of I/O buffers, Number of I/O threads, where the LOG/TEMP files sit in relation to the Data and Indexing. Think about the partitioning of the tables, max degrees of parallelism for the queries and how these queries work.
For instance: Oracle 32 bit or 64 bit on a 32 bit windows platform (OS) is limited to a maximum parallelism of 4. This is severe in terms of handling large data sets. You need a lot more parallelism from the query set in order to make this work properly within big data. Recommendation: move to LINUX or UNIX with Oracle, and increase the parallelism to 8, 10, or 12 for starters.
4. Move from 32 bit Oracle to 64 bit Oracle. Most definately, take advantage of BIG RAM (buy more RAM for your box).
5. If you can, LOCK the SGA in to RAM, refuse to allow the RAM to be swapped to disk under any circumstance. This can kill performance
6. Put the TEMP space on a RAM DISK (screaming performance), but yet again, you have to buy more RAM.
7. Change the BLOCK SIZE. Enterprise Data Warehouses should use a MINIMUM block size of 32k. EDW's today, should actually ALWAYS be allocated at 64k, however once again, windows & oracle maximum block size is around 8k (due to windows 32 bit limitations - not sure what the limitation (if any) is in 64 bit).
8. use Oracle Bitmap indexing WHENEVER POSSIBLE.
9. Partition your tables, do it, do it, do it... Manage the partitions
10. DE-FRAGMENT, de-frag, de-frag. For any database, this can be a performance killer - especially in big systems. The complaint usually is: "Defragging always takes a long time" Well, if you defrag more often, there is "less" defragging to do, and it will always take "less" time.
11. Turn on Query Re-Write
12. Increase the number of "estimated plans" that Oracle is allowed to use to make it run fast.
13. Read, read, read, there are hundreds of knobs to turn, and even more opinions on which ones to use...
14. RECOMPILE YOUR VIEWS, after every major data change, recompile your views - it takes a fraction of a second, but will help performance tremendously.
15. PRE-EXTEND your file size BEFORE loading more data, this will help establish contiguous portions of your disk space, and will help manage/minimize fragmentation.
16. Put a single instance of Oracle on BIG IRON. Oracle does best when it has at least 10 to 12 CPU's and 16 to 64 GB of RAM to work with. Especially with Big Data, and it SCREAMS on LINUX with super-fast DASD in 64 bit mode.
13. DO NOT CLUSTER!!!
14. use RAW format as a last resort, Raid 0+1 should be good enough
15. SWITCH TO COST BASED OPTIMIZATION!!!!!!
16. Increase the TDU and SDU sizing in the Oracle Client that is doing the loading, but only for the "connection" which uses the Bulk Loaders or ETL engines that do the batch loading.

DB2 UDB - Upgrade to 9 if you're not there.
1. DO NOT USE GRID (YET) FOR HIGH SPEED LOADING, ONLY FOR HIGH SPEED QUERYING. They (IBM) are currently working on resolving some issues with loading in parallel across grid machines, they _are_ being dilligent about their efforts. I expect to hear that they've solved this problem shortly.
2. Super-Fast I/O Disk (400MB to 500MB per second)
3. Use LINUX 64 bit, tune the switches for Linux.
4. Put it on a Z-series or P-Series machine to make it scream
5. Setup huge block sizes
6. setup RAID 0+1, do NOT use Raid 5 (without enough Disk RAM CACHING to acheive proper throughput)
7. Use Bitmap indexing where possible
8. PARTITION, PARTITION, PARTITION
9. Provide enough CPU's to run queries in Parallel (usually 6 to 8 CPU's, high speed, multi-core, 64 bit).
10. IF YOU HAVE A GRID SYSTEM (MPP aka: DPF), then Load to ONE NODE, then use their buffering technology to disperse the data inside the grid. There are special INSERT INTO buffering commands that can only be issued from the control node.

SQLServer 2000: UPGRADE, UPGRADE, UPGRADE!!! 2005 IS A BLAST, and REALLY is fast.
1. You cannot afford to stay on SQLServer2000, the performance gains from 2005 are enough to make your mouth water. SQLServer has finally grown up. They still need some additional engineering work to make partitioning easier (configuring the partitions within a table IS A CHALLENGE), but if you can make it work, it's awesome.
2. NORMALIZE, NORMALIZE, NORMALIZE. Stick with normalization of your tables, take advantage of Clustered index joins across SURROGATE sequence numbers. SCREAMING performance. Denormalizing in SQLServer2005 is the kiss of slow performance. Why? Because SQLServer 32 bit block size is locked in to 8k. This is a severe performance inhibitor. As I said earlier in the Oracle section, all RDBMS engines for Warehousing should operate on no-less than 32k block sizes.
3. move to dual-core CPU's at 3.0 GHZ with high speed internal disk (10,000 RPM a pop), with Raid 0+1, and if you can get it - move to 64 bit, with Windows OS 64 bit - but beware: UPGRADE THE CPU & THE RAM, otherwise your system _will_ slow down.
4. USE FOREIGN KEYS instead of separate indexes for your joins. Foreign keys in SQLServer2005 are much faster. They've improved the optimizer quite a bit.
5. Increase the NETWORK PACKET SIZES on the server, and on DTS, BCP, SSIS, and other client applications. Be-warned: only do this for LOADING mechanisms. Increasing packet sizes for Reporting tools or BI Queries will not only take your database down and will eat memory, but may also take down the client software. Only the bulk-loading mechanisms should request a large packet size. Recommended packet sizes is 32k & 64k
6. Increase the "query reserved for client", but ONLY while loading - once loading, reset it back to 8k. Why? Because _every_ client that connects will receive these chunks of memory, and maybe more, especially if they connect more than once, or run multiple threads.
7. Recompile your views, after every load
8. Use clustered indexes, but ONLY on sequence numbers (identity columns)
9. Use fill-factors of 90% or higher for indexes, or data blocks
10. PRE-Allocate the DISK space before loading, do NOT allow the data files to self-extend, this can cause severe fragmentation.
11. Manage fragmentation
12. Move to COST BASE OPTIMIZATION
13. Get RID of all the hints in the SQL statements (you don't need most of them in SQLServer2005).
14. REMOVE ALL OTHER APPLICATIONS from the SQLServer Machine, leaving only the database on that machine.
15. TURN DOWN the amount of memory that SQLServer is allowed to take, to 75% or slightly more (80%), leaving the OS room to breathe.
16. Turn on USE FIBERS (unless you are "remote linking servers")
17. Do NOT remote link servers, this is a HUGE performance detriment.
18. Turn up "CPU priority"
19. DURING LOAD: Change the Recovery Model to BULK, turn on Select Into/BulkCopy, Turn on "Trunc. Log on Chkpt" then reverse these settings once you're finished loading.

Teradata:
1. UNDERSTAND HOW YOUR DATA MUST BE EVENLY SPREAD OUT ACROSS THE NODES, once even distribution has been achieved, the only things left to tune are:
2. How many nodes? how old/new? how fast are they?
3. PRIORITY of execution by user, and types of queries
4. Logging mechanisms of the table for redundancies.

Sybase is mostly the same as SQLServer

Hope this helps somewhat, have ideas? Please share them.

Dan Linstedt
http://www.COBICC.org

  Posted by Dan Linstedt on September 12, 2007 7:20 PM |

Comments

I've been following your blog for some time now and I find it very insightful and interesting. I've recently decided to try and apply your ideas regarding VLDW in my current enviroment. IBM P-series, AIX 5.3, DB2 DPF v9, DataStage Parallel engine and Microstrategy.

I would like to understand why you prefer Linux on a P-series instead of AIX for instance, any insightful information? Also I have tried to find information regarding bitmap indexes on DB2 but I havn't seen any indications that it exists. Are your sure that you've seen it?

In a previous post regarding OS you said that on Unix we must setup SWAP/TEMP 2x the amount of RAM (except AIX), what do you meen by this?

Hi Stefan,

P-Series and Z-Series mainframes running LPAR and VPAR technology is extremely powerful. The mainframe's LPAR/VPAR can actually run other operating systems along side of Mainframe systems. They can use the computing power to get the job done, besides that - they are on BIG-IRON, and dynamically allocated CPU's and RAM.

AIX is also impressive, if you have the right hardware underneath to support it.

I'll look into Bitmap indexes.... I specifically was referring to "DB2 UDB" universal database, Not "DB2 AS/400". I'll look around to see if I can find links to the information. I may have been suggesting a new feature that IBM needs to add.

Regarding swap/temp 2x the size of RAM except AIX: AIX is self-managing, and in most cases you are not allowed (even as an admin) to change the size of swap/temp space. AIX usually also locks out general users from accessing this space, therefore taking full control by the operating system. Other OS's allow the ADMIN to change these settings.

Cheers,
Dan L

Post a comment