### Errata list for Data Compression: The Complete Reference, 3rd Edition

Last Updated 23 Mar 2008

Error found by Cosmin Truta

Page 347, line -14: "The third step uses the absolute values of the three region numbers
Q_i (there are 365 of them, since one of the 729 values is zero) ..."

should be changed to:

"The third step maps the region numbers $Q_i$ to an integer $Q$ in the
range $[0 \ldots 364]$. The tuple $(0,0,0)$ is mapped to 0, and the
728 remaining tuples are mapped to $[1 \ldots \frac{728}{2}=364]$,
such that $(a,b,c)$ and $(-a,-b,-c)$ are mapped to the same value."

[The original text mentions the use of absolute values in the third step
which implies that a tuple like (-1,+2,-3) is mapped to the same value
of $Q$ as (+1,+2,+3). This is incorrect. In fact, (-1,+2,-3) is mapped
to the same value of $Q$ as (+1,-2,+3).]

Page 606, figure 5.62. The box under 18 and 3 should be

2(1,1) 2(1,1)

1(1,1) -2(1,1)
-------

Errors found by Stephan Wolf

page 648, line 7: "With high-resolution and 60 frames per second
the transmitter must be able to send 124,416,000 bits/sec (about 14.83 Mbyte/sec)"

should be (considering that each pixel is 24 bits)

"With high-resolution and 60 frames per second
the transmitter must be able to send 2,985,984,000 bits/sec (about 356 Mbyte/sec)"

Also, the six numbers in Table 6.7 on the same page, should be corrected to reflect the 24 bits per
pixel.

refresh rate
total #
lines x pixels  of pixels       24              30           60

1080 x 1920     2,073,600  1,194,393,600  1,492,992,000  2,985,984,000
720 x 1280        921,600    530,841,600    663,552,000  1,327,104,000

Table 6.7: Resolutions and Capacities of Six DTV Formats.

-------

Errors found by Yves Charton

page 695, caption to figure 7.1. "PSL" should be "SPL".

page 738, line 1: "II and II" should be "II and III".

------

Error found by Liyang HU

Page 856, glossary item for Bitplane: "A bi-level image, for example, consists of two bitplanes."
Should be "A bi-level image, for example, consists of one bitplane."

------

Error found by Frederic Kayser

Page 92, paragraph 3: The number 6'' in the example should be 3''. (See explanation of the repeat
count on page 21.) Thus, this paragraph should read:

Example: Suppose that the character with ASCII code 52 repeats six times. Stage 1 will generate
the four bytes 52, 52, 52, 3,'' and stage 2 will replace each with a token, will increment the
entry for 52'' (entry 53 in the table) by 3, but will not increment the entry for 3'' (which is
entry 4 in the table). (The three tokens for the three bytes of 52'' may all be different, since
tokens may be swapped after each 52'' is read and processed.)

-------

25 May 2005. Error found by Dan Pugh

Page 314: I coded the matrices of the LLM method and found an error in one of the matrices. Matrix
C3 should have its fourth row, seventh element changed from a one to a zero in order to simulate
properly. If you are interested, I have Matlab code that verifies the correction, in the
case that you decide for a fourth edition of this book. I have also noted that the result of the
matrix product is off by 1/(sqrt(8)), which is probably discussed in the original LLM paper.

-------

25 June 2005. Error found by Jason Corless

Page 230: Paragraph 4. "4-bit codes" should be "3-bit codes". Also "2, 4, 4, 4, 2, 4" should be "2,
3, 3, 3, 2, 3"

-------

3 Sept 2005. Errors found by Shashank Khanvilkar

Page 228: line -11. "A length of 12 becomes edoc 266 followed by a single bit 1..."
should be
"A length of 12 becomes edoc 265 followed by a single bit 1..."

Page 230: line 3.
"The first example employs a set of six symbols A-E with probabilities..."

should be
"The first example employs a set of six symbols A-F with probabilities.. "

Page 729: line 4. Reference [Pan and Hwang 96] should be [Pan 95].

Page 729: line -1.
".. Another resource is the association of audio engineers (AES) .."
should be
".. Another resource is the audio engineering society (AES).."

Page 733: Second last paragraph.
"...(The matrix is a modified version of the well-known DCT, so it is
called the MDCT matrix)"..

This should be

"...(The matrix is a variant of MDCT, which is a modified version of the
well-known DCT algorithm)"..

Page 739: Second paragraph.
"..The 32 values q-1 are then written, as 4 bit numbers,..."
should be
"..The 32 (q-1) values are then written, as 4 bit numbers,..."

Page 743: Equation for B_f

It is given  that B_f = bitrate/sampling rate.
However this cannot be correct. Assume a bitrate of 128 Kbps and a
sampling rate of 48Khz. In this case B_f will be ~2.67.
The correct equation should be
B_f = bitrate*(samples per frame)/sampling rate.
here samples per frame = 384 for Layer -I
and 1152 for layers II and III.

Page 749: Figure 7.41

I think the figure is incorrect. The MDCT windows need to be
overlapping. For example

_______36_______ _______36_______
|                |                 |
_______36_______
|                 |
========================================================
|        |        |        |       |
---18--- ---18--- ---18--- ---18---

For simplicity i have shown all windows to be long. If both long and
short windows are used, then special windowing function needs to be used
that can transition from long windows to short windows and vice versa.

Page 752: Second paragraph.

"...This is followed by 59 bits of .."

should be

"...This is followed by 136 bits (for single channel) and  256 bits (for
dual channel) of.."
Size of side info for MPEG/Audio Layer III, for single and dual channel
are listed below:

---------------------------------------------------------------
Field Name              |         Single channel | Dual Channel |
---------------------------------------------------------------
main_Data                            9                  9
private_bits=                        5                  3
scfsi:                               4                  8

part2_3length =                     12x2              12x4
big_values  =                        9x2               9x4
global_gain =                        8x2               8x4
scalefac_compress =                  4x2               4x4
preflag:                             1x2               1x4
scalefac_scale:                      1x2               1x4
count1table_Select =                 1x2               1x4
window_Switching_flag =              1x2               1x4
(dependent on window switching flag)

Total                               136 bits           256

-------

3 Sept 2005. Error found by nitinshetty?

Page 58: Par 1.

"The values of P(n) for n = 1, 2, ..., 10  are 0.973, 0.0263, 0.007, .........."

The above values are wrong. The values of p and q have been exchanged.

Given :

P(n) = q^(n-1) * p
p = 0.027027
q = 0.972973

Hence, for n = 1, P(n) = p = 0.027027

-------

24 Oct 2005. Error found by Boris Tripolski

Page 168: Section 3.2 (Simple Dictionary Compression) states that in
the worst case the compression ratio achieved is a least 0.9375 (besides the 257 Bytes for the
dictionary). This is wrong though, because this would first of all mean that any file greater
than 4kb, even a totally random one, could be compressed. As you will easily see, this is
impossible.

The error in the calculations resides in the probability for the codes you used. You have said that
the probability of the codes with the length of 4 bit is the same as the one for 11 bit.
Unfortunately there are only 2 codes with 4 bits, while there exist 128 codes with the length of
11. So the right compression ratio for the worst case scenario would be:

(2*4+2*5+4*6+8*7+16*8+32*9+64*10+128*11) /(256*8)=2562 / 2048=1.2509765625

this results in an expansion. I have tried this for several files and have gained an expansion as
expected.

to explain the probabilities here are the codes:
4-bit codes:
000|0 000|1
5-bit codes:
001|10 001|11
6-bit codes :
010|100 010|101 010|110 010|111
7-bit codes
011|1000 011|1001 011|1010 011|1011
011|1100 011|1101 011|1110 011|1111
and so on.

Then there is an improvement possible, in order to speed up the encoding: it is said that the
search of the bytes is too slow. A solution would be to create two arrays. One would hold the
probabilities/ frequency of the bytes, the other would be an array which maps the bytes to their
indices/positions in the dictionary. To encode, no searching of the entire table is necessary any
more, you can just lookup the position and encode directly.

-------

21 Mar 2006. Error found by Artur Ferreira

Page 705: Line 13. "The G.711 standard recommends the use of A=255" should be
"The G.711 standard recommends the use of A=87.6".

-------

15 Aug 2006. Errors found by Raymond Martin

Page 522: par 2. The sentence "(Strictly speaking, the Nyquist rate is the difference between the
maximum and minimum frequencies, the bandwidth of the signal.)" should be omitted.

Page 692: par 7. "periods" should be changed to "wavelengths".

Page 697: par 2. The solution to the sampling problem is to sample sound at a little over the
Nyquist frequency} (page 522), which is twice the maximum frequency contained in the sound. Thus,
if a sound contains frequencies of up to 2 kHz, it should be sampled at a little more than 4 kHz.
(A detail that's often ignored in the literature is that signal reconstruction is perfect only if
something called the Nyquist-Shannon sampling theorem [Wikipedia 03] is used.) Such a sampling rate
guarantees true reproduction of the sound. This is illustrated in Figure7.3c, which shows 10
equally-spaced samples taken over four periods. Notice that the samples do not have to be taken
from the maxima or minima of the wave; they can come from any point.

-------------

15 March 2007. Errors found by Pedro Moreira

page 86, 7th,8th paragraph (incomplete, incorrect discussion)
page 87 fig. 2.32

"Figure 2.32d shows the tree after the frequency of A has been
incremented to 4.
Once we decide that A is the current node, its frequency (which is still
3) is compared to
that of its successor (4), and the decision is not to swap. AŐs
frequency is incremented,
followed by incrementing the frequencies of its parents."

But, when updating the frequencies, the nodes at the second level of
tree (just below the root)
(see fig. 2.32d) become in descending order (they should have been
swapped, accordingly to the algorithm)
Figure 2.32 should also be modified in respect (from 2.32d to 2.32g).
(note that they are then swapped between figures 2.32f <-> 2.32g)
Consequently, the next paragraph incorrectly discusses the swapping of
nodes (it would not be needed).

-------------

23 March 2008. Error found by Benjamin Laugraud

Page 219, Table 3.30(b). The value "7" is missing. It should be in the same position as in part
(a).

-------------

3 Apr 2011. Error found by Frédéric Kayser.

Page 233, line -20. "This table has codes 0--256 for..." should be "This table has codes 0--255 for..."

--------

"How many letters are actually read into a word by a careless person who knows what to expect, who
sets out with the idea that the message is from a certain person, how many words into the sentence?
We guess as we read, we create; everything starts from an initial mistake; the mistakes that follow
(and not only in the reading of letters and telegrams, not only in reading as a whole),
extraordinary as they may appear to a person who has not begun at the same starting-point, are all
quite natural."

Marcel Proust, (The Sweet Cheat Gone, Chapter 3, Venice)