16 June 2026

One thought that i should have capitalised on in the "Linear-time KL-optimal frequency normalisation" paper:

The core of Huffman's algorithm is a KL-optimal frequency normaliser from any distribution to the dyadic distribution. once any PMF is turned dyadic (1/2^{-x}), lengths are obvious and thus codes are also obvious via canonical coding.

My paper is a KL-optimal frequency normaliser from (after minor adjustments) any distribution to any distribution. So in principle we can turn the result from my paper into an optimal prefix code as follows:

  • ask the O(n) algorithm for quantisation to 1/2^{-x_i}
  • -\log_2 x_i are integer; computable via __builtin_clz. standard canonical code algorithm provides an onto-mapping from lengths to codes.

so the paper is in fact the unifying theory of huffman (1/2^{-x_i}), ANS (x_i/2^N) and arithmetic coding (x_i/\sum x_i).

< back to journal