Notes

2023-07-14

It is folklore that, under some conditions, open-addressed hash tables with quadratic probing do not miss free slots. This note explains why with a proof.

Let me break things down a little. With hash tables, the prime problem to solve when implementing insertion is conflicts: You are handed a key value pair (K₁,V₁) and, after hasing the key K₁, you realize that the table slot designated by the hash value is already filled with a binding (K₂,V₂). Chaining implementations will just cram the binding in the slot anyway, hence creating a bucket with more than one binding in a single table slot. Open-addressing implementations will instead go look for another slot. One obvious strategy for searching is to go look at the next slot, and the next one, and so on, potentially wrapping around when the last slot of the table is visited. Quadratic probing offers an alternative scheme: instead of scanning slot h+i mod N at the i-th step, it scans the slot h+P(i) mod N where P is a polynomial function of degree 2. This strategy helps with the clustering problem that linear probing may create.

The polynomial P of interest to us here is P(X) = (X + X²)/2. The polynomial P outputs so-called triangular numbers (they can be stacked in a triangle shape), and the sequence P(0), P(1), P(2), ... is easily computed:

p_i = 0;
for (i=0; i<N; i++) {
	assert(p_i == (i + i*i)/2);
 	p_i += i;
}

But, more importantly, if the table size is a power of 2, and there is a single free slot, quadratic probing will find it. It is not an obvious fact and I prove it below.

Formally, we want to show that as i ranges over the interval I=[0,2ⁿ), h+P(i) mod 2ⁿ ranges over I as well. Said differently, we want to show that the function mapping x to h+P(x) mod 2ⁿ is a bijection over I. Since I is finite, it is sufficent to show that the map is injective; that is, it maps different values of I to different values.

To prove injectivity we consider i and j in I such that h+P(i) mod 2ⁿ = h+P(j) mod 2ⁿ and show that i must be equal to j. We first reason by equivalence:

   (h + P(i) mod 2ⁿ) = (h + P(j) mod 2ⁿ)
⟺ (h + (i + i²)/2) - (h + (j + j²)/2) = k2ⁿ
⟺ (i + i²) - (j + j²) = k2ⁿ⁺¹
⟺ (i - j)(1 + i + j) = k2ⁿ⁺¹

Breaking the products on both sides we get that there must be three integers k₁, k₂, and m such that:

A := i - j     = k₁2ᵐ           with k = k₁k₂
B := 1 + i + j = k₂2ⁿ⁺¹⁻ᵐ        and 0 ≤ m ≤ n + 1

We now consider three cases:

If m is not 0 nor n+1 then it must be that both A and B are even. That is however impossible because B = 1 + 2j + A, so when A is even, B is odd, and vice versa.
If m is n+1, we must then have i - j = k₁2ⁿ⁺¹, but since i and j are in the interval I, i - j is in (-2ⁿ,2ⁿ), so k₁ can only be zero and thus i equals j.
If m is 0, we again use a size argument. Since i and j are in I, we have that 1 + i + j is in (0,2ⁿ⁺¹-1). So k₂ must be positive and less than 1; these requirements prove that the present case is impossible.

The only possible outcome for the case analysis is thus that i equals j. Unwinding everything, the map h+P(x) mod 2ⁿ is injective on I, so it is also bijective on I, so when i ranges from 0 to 2ⁿ-1, the map ranges over all of I, and thus quadratic probing indeed vists all slots!