This entry continues my series on columnar databases. Today, I will talk about how columnar databases address modern architectural bottlenecks.
A row-based MPP system is a large cluster that can span thousands of processors. Each node can be SMP, and the cluster can be shared-nothing or shared-disk. The interconnect links the nodes, allowing processing to span nodes. Row-based MPP architectures support some of the largest and most successful data warehouses in the world. And this is not going to change any time soon.
In row-based MPP, one of the most important design choices we make is how to distribute data amongst the nodes. Typically some form of 'randomization' is used, like hash or round-robin. We have also slowly made our block sizes larger over the years, because the block is the unit of I/O (notwithstanding prefetch) and we want to get as much data as possible in a single I/O. I/Os have become the bottleneck in our environments.
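To make the two distribution schemes concrete, here is a minimal sketch in Python. The node count, the key format, and the use of CRC32 as the hash are my assumptions for illustration; real MPP products use their own hashing functions and distribution maps.

```python
import zlib
from collections import Counter

NUM_NODES = 4  # hypothetical cluster size, chosen only for illustration


def hash_node(key: str, num_nodes: int = NUM_NODES) -> int:
    """Hash distribution: the same key always lands on the same node."""
    # CRC32 stands in for whatever hash a real product uses (assumption).
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def round_robin_node(row_number: int, num_nodes: int = NUM_NODES) -> int:
    """Round-robin distribution: rows are dealt to nodes in turn."""
    return row_number % num_nodes


# Hypothetical distribution keys.
keys = [f"customer-{i}" for i in range(10_000)]

hash_counts = Counter(hash_node(k) for k in keys)
rr_counts = Counter(round_robin_node(i) for i in range(len(keys)))

print("hash:       ", dict(sorted(hash_counts.items())))
print("round-robin:", dict(sorted(rr_counts.items())))
```

Round-robin spreads rows perfectly evenly but says nothing about where a given key lives. Hash distribution is only approximately even, but rows with the same key are co-located, which is what lets joins on that key run without moving data across the interconnect.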
While disk density has gone up significantly in the last 20 years or so, packing much more data down into smaller spaces, I/O is still limited by the physical head movement of the arm. Physics simply won't allow such a small component to move much faster without the arm flying right off its handle (which could ultimately be why solid state disk and phase change memory become the norm over time).
Other than the larger block sizes, our designs have not changed much over the years to accommodate this I/O-bound reality. We have tried to make the OLTP database more analytic with specialized indices, OLAP cubes, summary tables and partitioning, but we would need hundreds of drives to keep one CPU truly busy in a robust, complex, utilized data warehouse environment. It's not feasible. Incidentally, because of this bottleneck, random I/O has sped up much more slowly over the years than sequential I/O, which doesn't require nearly as much head movement.
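A rough back-of-the-envelope model shows why both larger blocks and sequential access help. The figures below (8 ms average access time, 100 MB/s transfer rate, 8 KB and 1 MB blocks) are assumptions I picked for illustration, not measurements of any particular drive.

```python
def random_read_mb_per_s(block_kb: float, access_ms: float,
                         transfer_mb_per_s: float) -> float:
    """Effective throughput when every block read pays a head movement.

    Each block costs one positioning delay (seek plus rotation) and then
    the time to transfer the block itself.
    """
    block_mb = block_kb / 1024
    seconds_per_block = access_ms / 1000 + block_mb / transfer_mb_per_s
    return block_mb / seconds_per_block


# Assumed drive characteristics (illustrative only).
ACCESS_MS = 8.0
TRANSFER_MB_PER_S = 100.0

small_block = random_read_mb_per_s(8, ACCESS_MS, TRANSFER_MB_PER_S)
large_block = random_read_mb_per_s(1024, ACCESS_MS, TRANSFER_MB_PER_S)

print(f"random 8 KB blocks: {small_block:.2f} MB/s")
print(f"random 1 MB blocks: {large_block:.2f} MB/s")
print(f"pure sequential:    {TRANSFER_MB_PER_S:.2f} MB/s")
```

Under these assumptions, small random reads spend almost all of their time moving the head, while large blocks amortize that cost over much more data. Either way, the random pattern stays below the sequential transfer rate.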
Naturally, on a project-by-project basis, you just make the best of the systems you have to work with. The reason to discuss this in my columnar series is that columnar is one technique that makes the most of the I/O.
Regardless of row- or column-orientation, the data is stored on disk and there are various gates that store and process data until it gets to the CPU. Each works on ever-decreasing data sizes. Think of it as a pyramid.
In row-based MPP, the entire row goes "up" the pyramid, creating bottlenecks at each level and, depending on the architecture, causing rows to "skip" the screening potential of each level, especially the L2 cache, whereupon the predicates end up being applied directly by the CPU. The processes are the same in columnar, but, as previously described in this series, only the needed columns are passed up the pyramid, creating a clearer path to the CPU. This is one way to work around the I/O bottleneck - ask the I/O to process only the data that the query needs!
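The difference in data volume is easy to estimate. Here is a minimal sketch, assuming a hypothetical orders table with fixed-width columns and a query that touches only two of them. The column names and widths are made up, and the model ignores compression, page overhead and indexes.

```python
# Hypothetical table layout: column name -> bytes per value (assumed).
COLUMN_WIDTHS = {
    "order_id": 8,
    "customer_id": 8,
    "order_date": 4,
    "status": 1,
    "amount": 8,
    "comment": 100,
}


def row_store_bytes(num_rows: int, widths: dict) -> int:
    """A row store reads whole rows, so every column is read."""
    return num_rows * sum(widths.values())


def column_store_bytes(num_rows: int, widths: dict, needed: list) -> int:
    """A column store reads only the columns the query references."""
    return num_rows * sum(widths[c] for c in needed)


NUM_ROWS = 1_000_000
# e.g. SELECT SUM(amount) ... WHERE order_date >= ...
needed_columns = ["order_date", "amount"]

row_bytes = row_store_bytes(NUM_ROWS, COLUMN_WIDTHS)
col_bytes = column_store_bytes(NUM_ROWS, COLUMN_WIDTHS, needed_columns)

print(f"row store reads:    {row_bytes:,} bytes")
print(f"column store reads: {col_bytes:,} bytes")
print(f"reduction factor:   {row_bytes / col_bytes:.2f}x")
```

The wider the table and the fewer columns a query touches, the bigger the gap. That is the whole point of passing only columns up the pyramid.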
In my next entry in this series, I will talk about the strategies columnar databases use to pull it all together and materialize a result set.
Posted May 17, 2010 9:01 PM