6. My explanation of the Hu-Tucker Algorithm

The presentation I gave of the Hu-Tucker algorithm in class differed from Prof. Kleitman's last year. I think my explanation is somewhat simpler, so I'm writing it up and posting it on the class web page.

The Hu-Tucker Algorithm: Shortest Alphabetical Codes.

Suppose that the blocks we are trying to compress have a natural ordering. In the case considered in the notes, they are bit strings of length k, and so are ordered as numbers.

We now ask: how can we construct a code that has minimum total message length, as before, but in which the ordering of the code words is required to be the same as the ordering of their corresponding blocks?

Of course, the blocks also have frequencies, as before, and we want to minimize the length of the total message subject to the restriction that the code words maintain the natural order of the blocks.

There is a neat algorithm for this problem, called the Hu-Tucker algorithm, though it is somewhat strange, and it is not at all easy to prove that it works. (The first few published proofs were wrong.)

Here is how it works:

There are three steps: First you merge vertices together, just as in the Huffman algorithm, except that the rules are slightly different.

Second, you use the resulting merge pattern to determine the length of each code word.

Third, you construct the final tree from these lengths.

How does the merging step differ here?

In the previous problem we merged the two blocks that have the smallest frequencies together.

Now we introduce a notion of compatibility among blocks. Again, we will have original blocks and artificial merged blocks.

The compatibility rule is: you can only merge two blocks if there are no original blocks left between them. Thus if you have three blocks in order with frequencies 2,4,3 you cannot merge the 2 and 3 frequency blocks together unless the 4 frequency block is artificial.
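As a small illustration, here is a minimal Python sketch of the compatibility test. The function name `compatible` and the list-of-flags representation are my own choices for this example, not part of the algorithm's statement: `is_original[k]` is assumed to be True for an original block and False for an artificial one.

```python
def compatible(is_original, i, j):
    """Return True if the blocks at positions i and j may be merged.

    is_original[k] is True for an original block and False for an
    artificial (merged) one. Two blocks are compatible when no original
    block lies strictly between them.
    """
    lo, hi = min(i, j), max(i, j)
    return lo != hi and not any(is_original[lo + 1:hi])


# Frequencies 2, 4, 3 with every block original: the 4 is in the way.
print(compatible([True, True, True], 0, 2))   # False
# Same positions, but the middle block is artificial.
print(compatible([True, False, True], 0, 2))  # True
```

Neighboring blocks are always compatible, since there is nothing between them.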

And here is the rule for merging: if x is the lowest frequency block compatible with y, and y is the lowest frequency block compatible with x, you should merge them.

So here is how the algorithm goes: you merge blocks together according to this rule until they have all merged into one. You keep track, for each original block in order, of the number of merges it takes part in; these numbers will be the lengths of the code words of the blocks.
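The merging phase can be sketched in Python as follows. This is only a sketch under stated assumptions: instead of searching for a mutually-lowest pair, it merges the compatible pair with the smallest sum of frequencies at each step (the variant used in the lemma later in these notes), and it breaks ties by taking the leftmost pair, which is my choice for the example. The function name `merge_depths` is hypothetical, and the double loop favors clarity over speed.

```python
def merge_depths(freqs):
    """Merging phase: return the code word length of each block, in order."""
    n = len(freqs)
    # Each working node is [weight, is_original, indices of blocks beneath it].
    nodes = [[w, True, [k]] for k, w in enumerate(freqs)]
    depth = [0] * n
    while len(nodes) > 1:
        best = None  # (sum, i, j) of the cheapest compatible pair found so far
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                cand = (nodes[i][0] + nodes[j][0], i, j)
                if best is None or cand < best:
                    best = cand
                if nodes[j][1]:
                    break  # an original block shields everything beyond it
        _, i, j = best
        merged = nodes[i][2] + nodes[j][2]
        for k in merged:
            depth[k] += 1  # every block under the new node takes part in a merge
        # The artificial node takes the place of its left constituent.
        nodes[i] = [nodes[i][0] + nodes[j][0], False, merged]
        del nodes[j]
    return depth


print(merge_depths([2, 4, 3]))  # [2, 2, 1]
```

With all three blocks original, the 2 and 3 cannot be merged first, so the 2 and 4 are joined and the 3 ends up one level from the root.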

Then you construct an alphabetic tree having these lengths. There will be only one possible way to do this.

We have yet to describe how to perform this last step. But before doing so let us look at an example.

Suppose our frequencies are, in the order that we want to preserve:

1,2,23,4,3,3,5,19.

At this stage each block is compatible only with its immediate neighbors. The only pairs that obey our condition that x is the lowest compatible with y and vice versa are the first two (the 1 and 2) and the 3,3. We can merge each of these, getting

3, 23, 4, 6, 5, 19.

Now we can merge the 3 and 23, and also the 4 and 5, which are compatible because the 6 between them is artificial. We can locate the new 9 where the old 4 was. (The actual location chosen doesn't matter, as long as it is between the original locations of the 4 and 5.)

26, 9, 6, 19

Here the restriction about compatibility no longer excludes any pair, since no original block lies between any two of the remaining blocks, and merging is like it was in the Huffman algorithm.

Thus, next we can merge the 9 and 6, which gives (26,15,19), then the 15 and 19, getting (26,34), and finally the whole thing together, getting 60.

We now have a tree, which may have some crossing edges in it. We look at the tree, and calculate the depth of each of the original blocks. This is the only information we will use from the tree. This gives us a list

1(3), 2(3), 23(2), 4(4), 3(4), 3(4), 5(4), 19(2).

where we have put the depth of the node in parentheses after the frequency of the block.

Now, we put down the nodes at the deepest level in the tree. (In our figures this is the highest level, since we are drawing the tree with the root at the bottom; to avoid confusion I'll use the words "closest to" and "farthest from" the root.) We then connect them in pairs with no crossing edges. In our example, the deepest level holds the blocks 4, 3, 3, 5, which are joined as (4,3) and (3,5).

Next, we put down the nodes at the next level in the tree, making sure that we keep them in the original order. That is, we want to put the nodes after all subtrees containing nodes which were before them in the original order, and before all subtrees containing nodes which were after them in the original order.

It's not immediately clear that this can always be done. If, for example, there were a subtree containing nodes lying both before and after one of the new nodes, we would be in big trouble. However, we can prove that no such subtree exists. We then again connect all the nodes on this level, both the original blocks and the combined blocks, in pairs using no crossing edges, and add the new nodes on the third level.

We now repeat construction of the tree, level by level, until we have an entire tree.
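The level-by-level construction can be sketched in Python as below. The function name `codewords_from_depths` and the tuple representation of subtrees are assumptions made for this example. The sketch places each subtree at the position of its leftmost block, which relies on the fact claimed above that no subtree straddles a newly added node; it checks the resulting order at the end rather than taking it on faith. It expects at least two blocks.

```python
def codewords_from_depths(depths):
    """Rebuild the order-preserving tree level by level and read off the code."""
    # A subtree is (position of its leftmost block, left child, right child);
    # an original block is (position, None, None).
    level = []
    for d in range(max(depths), 0, -1):
        # Add the original blocks that live on this level, keeping block order.
        level += [(k, None, None) for k, dk in enumerate(depths) if dk == d]
        level.sort(key=lambda node: node[0])
        if len(level) % 2:
            raise ValueError(f"odd number of nodes on level {d}")
        # Join neighbors in pairs; each parent sits one level closer to the root.
        level = [(level[m][0], level[m], level[m + 1])
                 for m in range(0, len(level), 2)]
    if len(level) != 1:
        raise ValueError("depths do not describe a single tree")

    # Walk from the root: a left branch is a 0 and a right branch is a 1.
    codes = [None] * len(depths)
    stack = [(level[0], "")]
    while stack:
        (pos, left, right), prefix = stack.pop()
        if left is None:
            codes[pos] = prefix
        else:
            stack.append((left, prefix + "0"))
            stack.append((right, prefix + "1"))
    if codes != sorted(codes):
        raise ValueError("the tree does not preserve the block order")
    return codes


# Depths from the example: 1(3), 2(3), 23(2), 4(4), 3(4), 3(4), 5(4), 19(2).
print(codewords_from_depths([3, 3, 2, 4, 4, 4, 4, 2]))
```

Sorting by leftmost block is a shortcut for "put each new node between the subtrees that come before and after it"; the final check catches inputs for which that shortcut would silently reorder blocks.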

Note that all the nodes are at the same level as they were in the first tree (with the crossing edges) that we constructed, but now we have eliminated the crossing edges. The codeword corresponding to a node can be read off by proceeding from the root to the node, with a left branch being a 0 and a right branch being a 1. The frequencies and codewords for the blocks in our example are thus

1:000, 2:001, 23:01, 4:1000, 3:1001, 3:1010, 5:1011, 19:11
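As a sanity check on these code words, the following Python sketch verifies that they are prefix free and in increasing order, and computes the total message length when each block occurs as often as its frequency. The helper names are hypothetical; the frequencies and code words are the ones listed above.

```python
from itertools import combinations

freqs = [1, 2, 23, 4, 3, 3, 5, 19]
codes = ["000", "001", "01", "1000", "1001", "1010", "1011", "11"]


def is_prefix_free(words):
    """No code word is a prefix of another."""
    return not any(a.startswith(b) or b.startswith(a)
                   for a, b in combinations(words, 2))


def is_order_preserving(words):
    """Code words increase in dictionary order along the list."""
    # Comparing strings of "0" and "1" in Python is dictionary order.
    return all(a < b for a, b in zip(words, words[1:]))


def total_length(freqs, words):
    """Sum of frequency times code word length over all blocks."""
    return sum(f * len(w) for f, w in zip(freqs, words))


print(is_prefix_free(codes), is_order_preserving(codes))  # True True
print(total_length(freqs, codes))                         # 153
```

The total of 153 bits over 60 block occurrences is an average of 2.55 bits per block.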

There are proofs that this method gives the best possible order-preserving (prefix-free) code, but they are surprisingly hard to find. There are two things to be shown here. First, that the construction of the Hu-Tucker coding tree always works. That is, given the levels of the nodes obtained from the original tree, the new nodes on level i can always be inserted between the subtrees constructed up to level i, so that each new node is at the proper place in the order. Second, that this construction gives the optimal order-preserving tree. We will give a proof of the first fact but not of the second.

We now show that the algorithm for constructing the Hu-Tucker tree given above always produces a tree with the nodes in the correct order. There are two important lemmas, neither of which is very hard, that I will let you prove. The first lemma says that if the pair of compatible nodes with smallest sum is merged at each step, then the interior nodes are constructed in increasing order of frequency. The second lemma says that if there is a node on some level i in the final Hu-Tucker tree whose subtree contains nodes both to the left of and to the right of a leaf C, then there is also such a node on level i in the first tree we constructed.

Now, suppose that the algorithm for constructing the Hu-Tucker tree does not work. This can only happen when there is some original leaf node C which is supposed to get added at level i, and you find a subtree (with root y, say) at level i which has leaves both before and after C in the original order. For example, in the figure below, if you want to insert C between B and E, you are in trouble.

So we want to show this can't happen. Now, either node y or some node in the subtree rooted at y will have been formed by joining two subtrees each lying entirely on one side of C. We will assume it was node y, but the proof in the other case is essentially the same. Node y must have been created after C was no longer a singleton leaf node, since we aren't allowed to join two nodes on opposite sides of a leaf node. This must then have happened after C was joined to some other node x to form w. After w was formed, u (left of C) and v (right of C) were joined across C to create y. Thus, by our first lemma, freq(y) is larger than freq(w). In this proof, we'll assume that we don't have any ties in frequency. If we allow ties, then it turns out that we will need to figure out some consistent way to break them.

Now, we're almost done. Because freq(y) is larger than freq(w), and because after they are created, node y and node w are compatible with exactly the same nodes, node y will end up on a level at least as close to the root as node w (this can be proved by induction by showing that if y isn't joined to w, the parent of y must have frequency larger than the parent of w). This means that C will either be put on the same level as u and v, or it will be put on a level farther from the root. In either case, we don't have any trouble putting C down in an order consistent with all the already-placed leaves.

For the tie-breaking rule, we can use the rule that whenever we have two equal nodes, the one on the left is considered to have smaller frequency. If we don't have a consistent tie-breaking rule, and break ties arbitrarily, the proof above no longer works, and we can find a counterexample where the algorithm fails to construct an order-preserving tree.

Exercises:

1.  Show that in the Hu-Tucker algorithm, if the two compatible nodes with smallest sum of frequencies are merged at each step, then the non-leaf nodes are created from smallest frequency to largest.

2.  Find a set of frequencies in which, if the Hu-Tucker tree is constructed with the ties being broken improperly, the algorithm breaks and fails to produce an order-preserving tree.