Linear hash

From ArticleWorld


The linear hashing is a dynamic hash table algorithm that allows the table to be expanded one slot at a time. The linear hashing algorithm was was invented by Witon Litwin in 1980.

The algorithm

Like all other hash algorithms, the linear hashing algorithm uses a hash function that controls the address calculation. The address calculation is always a power of two in the case of linear hashing, determined using the formula: address(level, key) = hash(key) mod 2^level. The read and the expansion operations are controlled using a split variable.

Reading from a linear hashing table is done using the determined address (if it is greater than or equal) to the split variable. Otherwise, a new address is used, determined using an incremented level (address(level+1, key)).

Programmers can do some optimization by determining how often the expansion operations should be performed. In general, one expansion per insertion research is used, but using a load factor can also be a possibility.

Use

Linear hashing is generally used in interactive applications. Traditional hashing algorithms have the disadvantage taking the whole performance loss at once. Linear hashing balances the performance loss by spreading the heavy decrease of performance to each expansion step.

Nevertheless, linear hashing is a general-purpose hashing algorithm. Many programmers choose it even for non-interactive applications. It is also implemented in the Simple C++ Macro Library.

Disadvantages

There are some disadvantages associated with using linear hashing. The main problem is that the front partition's size is always a power of two. This means that the algorithm's performance can be heavily drawn back by a poorly designed or implemented hash function. Extra care has to be taken with these, so that the algorithm doesn't end up hashing only to a few buckets. This also makes the algorithm rather hard to optimize and/or debug at times.