25.08.2015 Views

In the Beginning was Information


according to equation (10). This is equal to the information contained in 750,000 typed A4 pages, each containing 2,000 characters.

Example 2: The statistical information content of the Bible: The King James version of the English Bible consists of 3,566,480 letters and 783,137 words [D1]. When the spaces between words are also counted, then n = 3,566,480 + 783,137 - 1 = 4,349,616 symbols. The average information content of a single letter (also known as entropy) amounts to H = 4.046 bits (see Table 1). The total information content of the Bible is then given by I_tot = 4,349,616 × 4.046 = 17.6 million bits. Since the German Bible contains more letters than the English one, its information content is larger in terms of Shannon's theory, although the actual contents are the same as regards their meaning. This difference is carried to extremes when we consider the Shipipo language of Peru, which is made up of 147 letters (see Figure 32 and Table 2). The Shipipo Bible then contains about 5.2 (= 994/191) times as much information as the English Bible. It is clear that Shannon's definition of information is inadequate and problematic. Even when the meaning of the contents is exactly the same (as in the case of the Bible), Shannon's theory results in appreciable differences. Its inadequacy resides in the fact that the quantity of information depends only on the number of letters, apart from the language-specific factor H in equation (6). If meaning is considered, the unit of information should result in equal numbers in the above case, independent of the language.

The first four verses of the Gospel of John are rendered in three African and four American languages in Table 2. In my book "So steht's geschrieben" ["It is written", G12, pp. 95-98] the same verses are given in 47 different European languages for purposes of comparison. The annotation "86 W, 325 L" means that 86 words and 325 letters are used. The seventh language in Table 2 (Mazateco) is a tonal language. The various values of B and L for John 1:1-4 are plotted for 54 languages in Figure 32. These 54 languages include 47 European languages (italics) and seven African and American languages. It is remarkable that the coordinates of nearly all European languages fall inside the given ellipse. Of these, the Maltese language uses the fewest words and letters, while the Shipipo Indians use the largest number of letters for expressing the same information.

The storage requirements of a sequence of symbols should be distinguished from its information content as defined by Shannon. Storage space is not concerned with the probability of the appearance of a symbol, but only with the total number of characters. In general, 8 bits (= 1 byte) are required for representing one symbol in a data processing system. It follows that the 4,349,616 letters and spaces (excluding punctuation marks) of the English Bible require eight times as many bits, namely 34.8 million.
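The arithmetic of Example 2 and of the storage estimate can be retraced with a short Python sketch. It uses only the figures quoted above (letter count, word count, H = 4.046 bits, 8 bits per stored symbol). The function and constant names are illustrative assumptions and do not come from the book.

```python
# Figures quoted in the text for the King James Bible.
LETTERS = 3_566_480
WORDS = 783_137
H_BITS_PER_SYMBOL = 4.046   # average information content per symbol (Table 1)
BITS_PER_STORED_SYMBOL = 8  # one byte per symbol, as stated in the text


def symbol_count(letters: int, words: int) -> int:
    """Letters plus the spaces between words (one fewer than the word count)."""
    return letters + words - 1


def statistical_information_bits(n: int, h: float) -> float:
    """Total statistical information I_tot = n * H, in bits."""
    return n * h


def storage_bits(n: int, bits_per_symbol: int = BITS_PER_STORED_SYMBOL) -> int:
    """Storage requirement: depends only on the number of symbols."""
    return n * bits_per_symbol


n = symbol_count(LETTERS, WORDS)
i_tot = statistical_information_bits(n, H_BITS_PER_SYMBOL)
storage = storage_bits(n)

print(f"n       = {n:,} symbols")
print(f"I_tot   = {i_tot / 1e6:.1f} million bits")
print(f"storage = {storage / 1e6:.1f} million bits")
```

The two results differ by the factor 8 / 4.046, roughly 2, because the storage figure ignores symbol probabilities while the statistical figure weights each symbol by the entropy H.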
