Links to the Columnar Database series:
I received several direct questions on my latest blog posting regarding the row-oriented data page. Before I move on to more discussion of columnar, I thought I'd answer those questions here.
1. How does the DBMS know where fields begin that are after variable length fields since it will not be in the same place for every record?
It's correct that any field that follows a variable length field will not be in the same position on each record. The DBMS actually reads the field length of the variable length field in its first 2 bytes and uses that to "jump" to the start of the next field. All subsequent fields, provided they are not variable length also, will begin at fixed offsets from the end of the variable length field. This is why you should put variable length fields at or towards the end of the record and why you should only use variable length fields when the field size really does vary. Variable length fields save space, but keep in mind 2 bytes are added to each field for the length bytes.
2. If a field is null, wouldn't it be 0 or 1 bit in length?
First of all, nothing inside actual fields is a bit in length. Everything in the DBMS is done on byte (8 bit) boundaries. So, there will be at least 1 byte for each field, even if that field were a decimal (1), which could only contain values up to 9 and could be held in 4 bits. For nulls, there is an ADDITIONAL byte pre-pended to the field. If that byte is 'on', the field is null and the DBMS will ignore whatever leftover data may happen to be in the field. If the byte is 'off', the value is not null and the field value is legitimate. So, if a value is set to null, space is not saved! For any nullable field, you actually have to ADD a byte to storage. However, nullability does not make the column variable length (see answer #1 above.) Of course, a variable length column can be nullable.
3. What happens when a record is deleted?
Not much really happens when a record is deleted. There is a bit in the record header that is on/off depending on whether it's a valid record or not. That bit is set to 'off' when a record is deleted. I'll refer to invalid records as holes. Holes just sit there taking up space. Holes are actually linked together within the page! Why? Because if this is the page that the DBMS wants to insert a record, it can use an existing hole or compress holes together to make a bigger hole to use for the new record. If you have a clustering index, the DBMS MUST put the records in order. However, that 'order' is only according to the Row ID map. Physically on the page, they can still be out of order. This is fixed, as well as all holes removed, during reorganization processes, either explicitly called or, in the case of some DBMS, as background processes.
Posted April 30, 2010 8:08 AM
Permalink | No Comments |