25.08.2015 Views

In the Beginning was Information


The corresponding value for English is $H_1 = 4.04577$ bits per letter. We know that the probability of a single letter is not independent of the adjacent letters: Q is usually followed by u, and, in German, n follows e much more frequently than c or z does. If we also consider the frequencies of pairs of letters (bigrams), triplets (trigrams), etc., as given in Table 4, then the information content as defined by Shannon decreases statistically because of the relationships between letters, and we have:

$$H_0 > H_1 > H_2 > H_3 > H_4 > \dots > H_\infty \tag{13}$$

With 26 letters the number of possible bigrams is $26^2 = 676$, and there could be $26^3 - 26 = 17{,}550$ trigrams, since three identical letters are never consecutive. Taking all statistical conditions into consideration, Küpfmüller [K4] obtained the following value for the German language:

$$H_\infty = 1.6 \text{ bits/letter} \tag{14}$$

For a given language the actual entropy $H$ is far below the maximum value of the entropy. The difference between the maximum possible value $H_{\max}$ and the actual entropy $H$ is called the redundancy $R$. The relative redundancy is calculated as follows:

$$r = (H_{\max} - H)/H_{\max} \tag{15}$$

For written German, $r$ is given by $(4.755 - 1.6)/4.755 = 66\%$. Brillouin obtained the following entropy values for English [B5]:

$H_1 = 4.03$ bits/letter,
$H_2 = 3.32$ bits/letter,
$H_3 = 3.10$ bits/letter,
$H_\infty = 2.14$ bits/letter.

We find that the relative redundancy for English, $r = (4.755 - 2.14)/4.755 = 55\%$, is less than for German.
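The arithmetic in this passage can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it assumes that the maximum value 4.755 corresponds to 27 equally likely symbols (26 letters plus the space), which the text does not state explicitly here.

```python
import math


def shannon_entropy(probabilities):
    """Entropy in bits per symbol: -sum(p * log2(p)) over nonzero p."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def relative_redundancy(h_max, h):
    """Relative redundancy r = (H_max - H) / H_max, as in equation (15)."""
    return (h_max - h) / h_max


# Assumption: H_max = 4.755 arises from 27 equally likely symbols
# (26 letters plus the space character).
H_MAX = math.log2(27)

# H_infinity values quoted in the text for German [K4] and English [B5].
r_german = relative_redundancy(H_MAX, 1.6)
r_english = relative_redundancy(H_MAX, 2.14)

# Counts of letter combinations quoted in the text.
bigrams = 26 ** 2
trigrams = 26 ** 3 - 26  # excludes the 26 triples of one repeated letter

print(f"H_max = {H_MAX:.3f} bits/letter")
print(f"r (German)  = {r_german:.0%}")
print(f"r (English) = {r_english:.0%}")
print(f"bigrams = {bigrams}, trigrams = {trigrams}")
```

A uniform distribution passed to `shannon_entropy` reproduces `H_MAX`. Any non-uniform letter distribution gives a smaller value, which accounts for the first step of the chain in (13).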
In Figure 32 the redundancy of a language is indicated by the positions of the different points.

Languages usually employ more words than are really required for full comprehensibility. When there is interference, the certainty of reception is improved because messages usually contain some redundancy (e.g. illegibly written words, loss of signals in a telegraphic message, or words that are not pronounced properly).

2. Syllables: Statistical analyses of the frequencies of German syllables have resulted in the following value for the entropy when their frequency of occurrence is taken into account [K4]:
