
The.Algorithm.Design.Manual.Springer-Verlag.1998


Dictionaries

easily experiment with different implementations of your dictionary.

In choosing the right data structure for your dictionary, ask yourself the following questions:

● How many items will you typically have in your data structure? - Will you know this number in advance? Is the problem small enough that a simple data structure will be best, or will it be so large that you must worry about using too much memory or swapping?

● Do you know the relative number of insertions, deletions, and search queries? - Will there be any modifications to the data structure after it is first constructed, or will it be static from that point on?

● Do you have an understanding of the relative frequency with which different keys will be accessed? - Can we assume that the access pattern will be uniform and random, or will it exhibit a skewed access distribution (i.e., certain elements are much more popular than others) or a sense of locality (i.e., elements are likely to be repeatedly accessed in clusters, instead of at fairly random intervals)? Usually, the world is both skewed and clustered.

● Is it critical that individual operations be fast, or only that the total amount of work done over the entire program be minimized? - When response time is critical, such as in a program controlling a heart-lung machine, you can't wait too long between steps. When you have a program that is doing a lot of queries over the database, such as identifying all sex offenders who happen to be Republicans, it is not so critical that you pick out any particular congressman quickly as that you get them all with the minimum total effort.

Once you understand what your needs are, try to identify the best data structure from the list below:

● Unsorted linked lists or arrays - For small data sets, say up to 10 to 20 items, an unsorted array is probably the easiest and most efficient data structure to maintain. Arrays are easier to work with than linked lists, and if the dictionary will be kept this small, you cannot possibly save a significant amount of space over allocating a full array. If your dictionary will be much larger, the search time will kill you in either case.

A particularly interesting and useful variant is a self-organizing list. Whenever a key is accessed or inserted, always move it to the head of the list. Thus if the key is accessed again in the near future, it will be near the front and so require only a short search to find it. Since most applications exhibit both uneven access frequencies and locality of reference, the average search time for a successful search in a self-organizing list is typically much better than in a sorted or unsorted list. Of course, self-organizing data structures can be built from arrays as well as linked lists.
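The move-to-front rule described above can be sketched in a few lines of Python. This is only an illustration built on a plain Python list; the class name `MoveToFrontList` and its methods are hypothetical names chosen for this example, not something defined in the text. The `search` method also reports how many key comparisons it made, so the effect of a repeated access is visible.

```python
class MoveToFrontList:
    """Unsorted list dictionary that moves each accessed key to the front.

    Hypothetical illustration; names are not taken from the text.
    """

    def __init__(self):
        # (key, value) pairs, most recently accessed first.
        self._items = []

    def insert(self, key, value):
        # Drop any existing entry for the key, then place it at the head.
        self._remove(key)
        self._items.insert(0, (key, value))

    def search(self, key):
        """Return (value, comparisons); value is None if the key is absent."""
        for i, (k, v) in enumerate(self._items):
            if k == key:
                # Move the entry just found to the head of the list.
                self._items.insert(0, self._items.pop(i))
                return v, i + 1
        return None, len(self._items)

    def _remove(self, key):
        for i, (k, _) in enumerate(self._items):
            if k == key:
                del self._items[i]
                return

    def keys(self):
        return [k for k, _ in self._items]


d = MoveToFrontList()
for word in ["a", "b", "c", "d"]:
    d.insert(word, word.upper())

# The list order is now d, c, b, a (most recently inserted first).
first = d.search("a")   # "a" sits at the tail, so the scan is long
second = d.search("a")  # "a" was moved to the head, so the scan is short
```

In this run the first lookup of `"a"` scans all four entries, while the second finds it immediately at the head, which is the behavior the skewed, clustered access patterns mentioned earlier tend to reward.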

● Sorted linked lists or arrays - Maintaining a sorted linked list is usually not worth the effort (unless you are trying to eliminate duplicates), since we cannot perform binary search in such a data structure. A sorted array will be appropriate if and only if there are not many insertions or deletions. When the array gets so large that it doesn't fit in real memory, think B-trees instead.
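To make the sorted-array trade-off concrete, here is a minimal sketch that uses Python's standard `bisect` module for the binary search. The class name `SortedArrayDict` and its methods are hypothetical, chosen for this example only. Searching takes a logarithmic number of comparisons because the array supports access by index, while each insertion or deletion must shift the elements that follow, which is why many modifications make this structure a poor fit.

```python
import bisect


class SortedArrayDict:
    """Dictionary kept as two parallel sorted arrays (hypothetical sketch)."""

    def __init__(self):
        self._keys = []    # keys in ascending order
        self._values = []  # _values[i] belongs to _keys[i]

    def _find(self, key):
        # Binary search: index of the first key that is >= key.
        i = bisect.bisect_left(self._keys, key)
        found = i < len(self._keys) and self._keys[i] == key
        return i, found

    def search(self, key):
        i, found = self._find(key)
        return self._values[i] if found else None

    def insert(self, key, value):
        i, found = self._find(key)
        if found:
            self._values[i] = value
        else:
            # list.insert shifts every later element: linear time.
            self._keys.insert(i, key)
            self._values.insert(i, value)

    def delete(self, key):
        i, found = self._find(key)
        if found:
            # Deleting also shifts every later element: linear time.
            del self._keys[i]
            del self._values[i]
        return found


table = SortedArrayDict()
for k in [50, 10, 40, 20, 30]:
    table.insert(k, str(k))

hit = table.search(40)
miss = table.search(35)
removed = table.delete(10)
remaining = list(table._keys)
```

The same binary search is not available on a sorted linked list, since reaching the middle element there already requires walking half the list.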

● Hash tables - For applications involving a moderate-to-large number of keys (say between 100 and 1,000,000), a hash table with bucketing is probably the right way to go. In a hash table, we
