II. Notes on Data Structuring * - Cornell University
The representation and manipulation of powersets and mappings with
infinite domains can be accomplished, provided that consideration is
restricted to sets with only a finite number of members, and mappings in which
only a finite number of elements take significant values; where "significant"
is defined as different from some specified null or default value. The powerset
of an infinite set is obviously also infinite; but since each value of the powerset
type contains only a finite number of elements, each value can be specified
simply by listing those elements in a finite period of time, and the list will
occupy only a finite amount of storage. Similarly, each value of a mapping
type with infinite domain can be finitely specified by listing all elements of
the domain which map onto significant values of the range type, together
with the value mapped in each case. A type which is restricted in this way is
known as sparse.
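The idea of listing only the significant elements of a mapping can be sketched in Python. This is a minimal illustration, not the representation the text goes on to develop: the class name `SparseMapping`, its methods, and the choice of a dictionary as the underlying list of pairs are all assumptions made for the example.

```python
class SparseMapping:
    """Mapping over an unbounded domain; only non-default entries are stored.

    Hypothetical sketch: the name and interface are not from the text.
    """

    def __init__(self, default):
        self.default = default
        self._significant = {}  # domain element -> significant value

    def __getitem__(self, key):
        # Every element not listed maps onto the default value.
        return self._significant.get(key, self.default)

    def __setitem__(self, key, value):
        # Assigning the default removes the entry, so the stored list
        # always contains exactly the significant elements.
        if value == self.default:
            self._significant.pop(key, None)
        else:
            self._significant[key] = value

    def significant_items(self):
        # The finite listing that specifies this value of the mapping type.
        return sorted(self._significant.items())


m = SparseMapping(default=0)
m[10**30] = 7   # the domain (all integers) is unbounded
m[3] = 5
m[3] = 0        # back to the default: no longer significant
```

Storage here grows with the number of significant elements only, whatever the size of the domain.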
In fact the concept of sparsity is not confined to infinite bases and domains;
it may also be applied to very large but finite powersets, when the
programmer knows that each actual set in which he is interested will contain
only a very small proportion of the potential members. For example, the
base type may contain hundreds of millions of values, but the programmer
may know that he only has to deal with sets of less than a hundred in size,
and perhaps most of them less than ten. It would be impossible to use the
bitpattern representation, since this requires hundreds of millions of bits;
but since each value actually used in a program contains only a few members,
these members can readily be listed in a comparatively small amount of
store. A powerset type of this sort is known as sparse. Similarly, arrays
with a very large domain, nearly all of which map onto the same default
value of the range, are said to belong to a sparse array type.
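The storage argument above can be made concrete with a little arithmetic, sketched here in Python. The base cardinality of 300 million is an assumed figure standing in for "hundreds of millions", and the function names are hypothetical; the listed representation is assumed to store each member in the fewest bits that can distinguish all values of the base type.

```python
def bitpattern_bits(base_cardinality):
    # One bit per potential member, whether or not it is present.
    return base_cardinality


def listed_bits(base_cardinality, members):
    # Bits needed to distinguish every value of the base type,
    # multiplied by the number of members actually listed.
    bits_per_member = (base_cardinality - 1).bit_length()
    return members * bits_per_member


BASE = 300_000_000  # assumed stand-in for "hundreds of millions of values"

print(bitpattern_bits(BASE))   # bits for one set as a bit pattern
print(listed_bits(BASE, 10))   # bits for a typical set of ten members
print(listed_bits(BASE, 100))  # bits for the largest set expected
```

Under these assumptions each member takes 29 bits, so even a set of a hundred members needs only a few thousand bits, against hundreds of millions for the bit pattern.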
Sparse sets and arrays are frequently encountered in advanced data
processing applications, and their representation and manipulation present a
number of familiar problems. Our first example is the definition of a type
whose values are sets of car numbers. The cardinality of the carnumber type
is perhaps something like four thousand million; but the programmer
wishes only to deal with sets of cars owned by a single person; most of these
will have only one member, and very few will have more than ten. The
carset type may therefore be declared as sparse powerset:
type carset = sparse powerset carnumber;<br />
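One way such a carset value might be held is as a short sorted list of its members. The Python sketch below assumes car numbers are integers in the range 0 to four thousand million; the helper names `make_carset`, `contains` and `union` are hypothetical and are not part of the notation used in the text.

```python
from bisect import bisect_left

CARNUMBER_CARDINALITY = 4_000_000_000  # "four thousand million"


def make_carset(*numbers):
    # A carset value is held as a sorted tuple of its (few) members.
    for n in numbers:
        if not 0 <= n < CARNUMBER_CARDINALITY:
            raise ValueError(f"not a car number: {n}")
    return tuple(sorted(set(numbers)))


def contains(carset, number):
    # Binary search over the listed members.
    i = bisect_left(carset, number)
    return i < len(carset) and carset[i] == number


def union(a, b):
    # The result is again a short sorted listing.
    return tuple(sorted(set(a) | set(b)))


mine = make_carset(3_141_592_653, 42)
yours = make_carset(42, 7)
both = union(mine, yours)
```

Since very few sets have more than ten members, even a linear scan would serve; the sorted order is a design choice that keeps equal sets identical as tuples.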
As an example of a sparse array, we may take the type of mappings
between car owners and the set of cars they own. Each owner is represented
by name and address; since these are of arbitrary length, the owner type
may be defined:
type owner = sequence character;