The.Algorithm.Design.Manual.Springer-Verlag.1998



Text Compression

The best general-purpose program for text compression is gzip, which implements a public domain variation of the Lempel-Ziv algorithm. It is distributed under the GNU software licence and can be obtained from ftp://prep.ai.mit.edu/pub/gnu/gzip-1.2.4.tar. Unix compress is another popular compression program, based on the patented LZW algorithm. It is available from ftp://wuarchive.wustl.edu/packages/compression/compress-4.1.tar.

A JPEG implementation is available from ftp://ftp.uu.net/graphics/jpeg/jpegsrc.v6a.tar.gz. MPEG can be found at ftp://havefun.stanford.edu/pub/mpeg/MPEGv1.2.2.tar.Z.

Algorithm 673 [Vit89] of the Collected Algorithms of the ACM is a Pascal implementation of dynamic Huffman codes, a one-pass, adaptive text compression algorithm. See Section for details on fetching this program.

Notes: Many books on data compression are available, but we highly recommend Bell, Cleary, and Witten [BCW90] and Storer [Sto88]. Another good source of information is the USENET newsgroup comp.compression. Check out its particularly comprehensive FAQ (frequently asked questions) compendium at ftp://rtfm.mit.edu/pub/usenet/news.answers/compression-faq.

Good expositions on Huffman codes [Huf52] include [AHU83, BR95, CLR90, Eve79a, Man89]. Expositions on the LZW [Wel84, ZL78] algorithm include [BR95].

There is an annual IEEE Data Compression Conference, the proceedings of which should be studied seriously before attempting to develop a new data compression algorithm. On reading the proceedings, it will become apparent that this is a mature technical area, where much of the current work (especially for text compression) is aimed at fairly marginal improvements on special applications. On a more encouraging note, we remark that this conference is held annually at a world-class ski resort in Utah.

Related Problems: Shortest common superstring, cryptography.
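To give a feel for the dictionary-building idea behind LZW, the scheme that Unix compress is based on, here is a minimal sketch in Python. The function names lzw_compress and lzw_decompress are hypothetical, chosen only for illustration. The sketch emits a plain list of integer codes and makes no attempt to reproduce the file format of compress, which also packs codes into variable-width bit fields and bounds the table size.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Encode bytes as a list of LZW codes (illustrative sketch only)."""
    # The table starts with every single byte; longer strings are added as seen.
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            # Keep extending the match while the table already knows it.
            current = candidate
        else:
            # Emit the longest known prefix and remember the new string.
            codes.append(table[current])
            table[candidate] = len(table)
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Invert lzw_compress by rebuilding the same table on the fly."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    pieces = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # The code refers to the string being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        pieces.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(pieces)


message = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(message)
restored = lzw_decompress(codes)
```

On this 24-byte message the sketch produces 16 codes, since repeated substrings such as TO, BE, and OR are replaced by single table entries. No table is transmitted: the decoder reconstructs it from the code stream alone.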
Mon Jun 2 23:33:50 EDT 1997

Cryptography

Input description: A plaintext message T or encrypted text E, and a key k.

Problem description: Encode T using k giving E, or decode E using k back to T.

Discussion: Cryptography has grown substantially in importance in recent years, as computer networks have made confidential documents more vulnerable to prying eyes. Cryptography is a way to increase security by making messages difficult to read if they fall into the wrong hands. Although the discipline of cryptography is at least two thousand years old, its algorithmic and mathematical foundations have recently solidified to the point where there can now be talk of provably secure cryptosystems.

There are three classes of cryptosystems everyone should be aware of:

● Caesar shifts - The oldest ciphers involve mapping each character of the alphabet to a different letter. The weakest such ciphers rotate the alphabet by some fixed number of characters (often 13), and thus have only 26 possible keys. It is better to use an arbitrary permutation of the letters, so there are 26! possible keys. Even so, such systems can be easily attacked by counting the frequency of each symbol and exploiting the fact that 'e' occurs more often than 'z'. While there are variants that will make this more difficult to break, none will be as secure as DES or
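The Caesar shift and the frequency-counting attack described above can be sketched in a few lines of Python. The function names caesar_shift and guess_shift are hypothetical, and the attack rests on a loud assumption: the plaintext is English and long enough that 'e' is its most common letter. On short or unusual texts the guess will often be wrong.

```python
import string
from collections import Counter

ALPHABET = string.ascii_lowercase


def caesar_shift(text: str, k: int) -> str:
    """Rotate each letter k places; other characters pass through unchanged."""
    k %= 26
    shifted = ALPHABET[k:] + ALPHABET[:k]
    table = str.maketrans(ALPHABET + ALPHABET.upper(),
                          shifted + shifted.upper())
    return text.translate(table)


def guess_shift(ciphertext: str) -> int:
    """Guess the key by assuming the most frequent letter stands for 'e'."""
    counts = Counter(c for c in ciphertext.lower() if c in ALPHABET)
    if not counts:
        raise ValueError("ciphertext contains no letters")
    most_common = counts.most_common(1)[0][0]
    return (ord(most_common) - ord("e")) % 26


plaintext = "meet me here, we see the trees near the secret entrance"
ciphertext = caesar_shift(plaintext, 13)
recovered_key = guess_shift(ciphertext)
recovered_text = caesar_shift(ciphertext, -recovered_key)
```

With only 26 possible keys, an attacker could also simply try every shift. The frequency count matters more for arbitrary permutations of the alphabet, where exhaustive search over 26! keys is not feasible but the letter statistics still leak through.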

