25.08.2015 Views

In the Beginning was Information


with equal frequency, then sequences of letters (output A in Figure 38) are obtained which do not at all reflect the simplest statistical characteristics of German, or English, or any other language. Seen statistically, we would never obtain a text which would even approximately resemble the morphological properties of a given language.

One can go a step further by writing program (2) which takes the actual frequency of letter combinations of a language into consideration (German in this case). It may happen that the statistical links between successive letters are ignored, so that we would have a first order approximation. Karl Küpfmüller’s [K4] example of such a sequence is given as output B, but no known word is generated. If we now ensure that the probabilities of links between successive letters are also accounted for, outputs C, D, and E are obtained. Such sequences can be found by means of stochastic Markov processes, and are called Markov chains.

Program (2) requires extensive inputs which take all the groups of letters (bigrams, trigrams, etc.) appearing in Table 4 into account, as well as their probability of occurrence in German. With increased ordering, synthetic words arise, some of which can be recognised as German words, but structures like “gelijkwaardig”, “ryljetek”, and “fortuitousness” are increasingly precluded by the programming. What is more, only a subset of the morphologically typical German sounding groups like WONDINGLIN, ISAR, ANORER, GAN, STEHEN, and DISPONIN are actual German words.
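The progression described above, from equally likely letters, to letters drawn with their actual frequencies, to letters linked to their predecessors, can be sketched in Python. This is only an illustration under stated assumptions: the function names, the `context` parameter, and the tiny `SAMPLE` text are hypothetical stand-ins, not the programs or the Table 4 statistics used for Figure 38.

```python
import random
from collections import Counter, defaultdict

ALPHABET = "abcdefghijklmnopqrstuvwxyz "

# Hypothetical stand-in corpus; real experiments would use large German texts.
SAMPLE = "der hund und die katze und das haus und der hund "


def zero_order(length, rng):
    """All symbols equally likely (the situation of output A)."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def first_order(corpus, length, rng):
    """Symbols drawn with their corpus frequencies; neighbours are ignored."""
    counts = Counter(corpus)
    symbols = list(counts)
    weights = [counts[s] for s in symbols]
    return "".join(rng.choices(symbols, weights=weights, k=length))


def markov(corpus, length, rng, context=1):
    """Markov chain: each symbol depends on the preceding `context` symbols.

    Assumes context >= 1 and len(corpus) > context.
    """
    # Count which symbol follows each group of `context` symbols.
    table = defaultdict(Counter)
    for i in range(len(corpus) - context):
        table[corpus[i:i + context]][corpus[i + context]] += 1

    state = corpus[:context]
    out = list(state)
    while len(out) < length:
        followers = table.get(state)
        if not followers:
            # Dead end (state only seen at the very end of the corpus):
            # restart from the opening state.
            state = corpus[:context]
            followers = table[state]
        symbols = list(followers)
        nxt = rng.choices(symbols, weights=[followers[s] for s in symbols])[0]
        out.append(nxt)
        state = (state + nxt)[-context:]
    return "".join(out[:length])


if __name__ == "__main__":
    rng = random.Random(0)
    print(zero_order(40, rng))
    print(first_order(SAMPLE, 40, rng))
    print(markov(SAMPLE, 40, rng, context=1))
    print(markov(SAMPLE, 40, rng, context=2))
```

Raising `context` makes the output look more like the source language, because only letter groups that actually occur in the corpus can be produced, but nothing in the procedure checks whether a generated group is a real word.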
Even in the case of the higher degree approximations one cannot prevent the generation of words which do not exist at all in speech usage. A next step would be program (3) where only actual German syllables and their frequency of occurrence are employed. Then, in conclusion, program (4) prevents the generation of groups of letters which do not occur in German. Such a program requires a complete dictionary to be stored, and word frequencies are also taken into account (first approximation). As a second approximation the probability of one word following another is also considered. It should be noted that the programs involved, as well as

Figure 38: “Language synthesis” experiments for determining whether information can arise by chance. Sequences of letters, syllables, and words (including spaces) are obtained by means of computer programs. The letters, all combinations of letters, syllables, and words (a complete German lexicon) were used as inputs. Their known frequencies of occurrence in German texts are fully taken into account in this “language synthesis”. The resulting random sequences A to I do not comprise information, in spite of the major programming efforts required. These sequences are semantic nonsense, and do not correspond with any aspect of reality.
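The two word-level approximations of program (4) can be sketched in the same way. This is a minimal illustration, assuming a short hypothetical word list in place of a complete German lexicon; the function names `word_unigram` and `word_bigram` are invented for this sketch and are not taken from the original programs.

```python
import random
from collections import Counter, defaultdict


def word_unigram(words, n, rng):
    """First approximation: words drawn according to their frequencies only."""
    counts = Counter(words)
    vocab = list(counts)
    return rng.choices(vocab, weights=[counts[w] for w in vocab], k=n)


def word_bigram(words, n, rng):
    """Second approximation: each word depends on the word before it."""
    counts = Counter(words)
    vocab = list(counts)
    vocab_weights = [counts[w] for w in vocab]

    # Count how often each word follows each other word.
    follows = defaultdict(Counter)
    for a, b in zip(words, words[1:]):
        follows[a][b] += 1

    out = []
    current = None
    while len(out) < n:
        followers = follows.get(current)
        if followers:
            candidates = list(followers)
            weights = [followers[w] for w in candidates]
            current = rng.choices(candidates, weights=weights)[0]
        else:
            # First word, or a word never seen with a successor:
            # fall back to plain word frequencies.
            current = rng.choices(vocab, weights=vocab_weights)[0]
        out.append(current)
    return out


if __name__ == "__main__":
    # Hypothetical miniature "dictionary" text, for illustration only.
    text = "der hund sieht die katze und die katze sieht den hund".split()
    rng = random.Random(1)
    print(" ".join(word_unigram(text, 12, rng)))
    print(" ".join(word_bigram(text, 12, rng)))
```

Every token produced this way is a dictionary word, and every adjacent pair has occurred in the source text, yet the procedure has no access to what a sentence is supposed to mean.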
