II. Notes on Data Structuring * - Cornell University


cs.cornell.edu
20.03.2013 Views

C. A. R. HOARE

        p := compile primary;
        fs := [factor (times, p)];
        while source.first = times ∨ source.first = div do
            begin sign from source;
                p := compile primary;
                fs := fs ⌢ [factor (sign, p)]
            end;
        compile term := term (s, fs)
    end;

    function compile primary: primary;
    begin s: symbol;
        s from source;
        with s do {constant: compile primary := const (value),
                   variable: compile primary := var (identifier),
                   leftbracket:
                       begin from source;
                           compile primary := bracketed (compile expression);
                           s from source;
                           if s ≠ rightbracket then go to error
                       end,
                   else go to error}
    end;

Exercise

Write programs to convert an expression from tree representation to bitstream and back again.

10. SPARSE DATA STRUCTURES

In dealing with representations of arrays and powersets, we have hitherto assumed that the base type of a powerset and the domain type of an array are reasonably small, so that it is possible to allocate a bit or larger area of store to hold the value of every potential element of the structure. The examples also were confined to such cases. In this chapter we investigate the consequences and problems which arise when the base or domain types are very large or infinite, and when the standard representations are therefore impossible.

The representation and manipulation of powersets and mappings with infinite domains can be accomplished, provided that consideration is restricted to sets with only a finite number of members, and mappings in which only a finite number of elements take significant values; where "significant" is defined as different from some specified null or default value. The powerset of an infinite set is obviously also infinite; but since each value of the powerset type contains only a finite number of elements, each value can be specified simply by listing those elements in a finite period of time, and the list will occupy only a finite amount of storage. Similarly, each value of a mapping type with infinite domain can be finitely specified by listing all elements of the domain which map onto significant values of the range type, together with the value mapped in each case. A type which is restricted in this way is known as sparse.

In fact the concept of sparsity is not confined to infinite bases and domains; it may also be applied to very large but finite powersets, when the programmer knows that each actual set in which he is interested will contain only a very small proportion of the potential members. For example, the base type may contain hundreds of millions of values, but the programmer may know that he only has to deal with sets of less than a hundred in size, and perhaps most of them less than ten. It would be impossible to use the bitpattern representation, since this requires hundreds of millions of bits; but since each value actually used in a program contains only a few members, these members can readily be listed in a comparatively small amount of store. A powerset type of this sort is known as sparse. Similarly, arrays with a very large domain, nearly all of which map onto the same default value of the range, are said to belong to a sparse array type.
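The storage argument above can be made concrete with a small sketch in Python, which is not the notation used in these notes. The class name `SparseSet`, the helper `bitpattern_bytes`, and the base size of 400 million are illustrative assumptions. The sketch simply lists the members of each set in sorted order instead of reserving one bit per potential member.

```python
from bisect import bisect_left, insort


def bitpattern_bytes(base_size: int) -> int:
    """Bytes needed by a one-bit-per-potential-member representation."""
    return (base_size + 7) // 8


class SparseSet:
    """A set over a very large base type, stored as a sorted list of members."""

    def __init__(self, members=()):
        # Remove duplicates, then keep the members in ascending order.
        self._members = sorted(set(members))

    def __contains__(self, x) -> bool:
        # Binary search: the cost depends on the number of members,
        # not on the size of the base type.
        i = bisect_left(self._members, x)
        return i < len(self._members) and self._members[i] == x

    def __len__(self) -> int:
        return len(self._members)

    def add(self, x) -> None:
        if x not in self:
            insort(self._members, x)

    def union(self, other: "SparseSet") -> "SparseSet":
        return SparseSet(self._members + other._members)


s = SparseSet([399_999_999, 17, 42])
s.add(17)  # already a member, so the set is unchanged
print(len(s), 42 in s, 43 in s)        # 3 True False
print(bitpattern_bytes(400_000_000))   # 50000000
```

Under these assumptions a set of three members occupies a three-element list, whereas the bitpattern for the same base type would need fifty million bytes for every set, however small.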
Sparse sets and arrays are frequently encountered in advanced data processing applications, and their representation and manipulation present a number of familiar problems.

Our first example is the definition of a type whose values are sets of car numbers. The cardinality of the carnumber type is perhaps something like four thousand million; but the programmer wishes only to deal with sets of cars owned by a single person; most of these will have only one member, and very few will have more than ten. The carset type may therefore be declared as sparse powerset:

    type carset = sparse powerset carnumber;

As an example of a sparse array, we may take the type of mappings between car owners and the set of cars they own. Each owner is represented by name and address; since these are of arbitrary length, the owner type may be defined:

    type owner = sequence character;
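A sparse array of this kind might be sketched in Python as a mapping that stores only those owners whose set of cars differs from the default, taken here to be the empty set. The class name `SparseArray`, the owner strings, and the car numbers are invented for illustration. Python's `str` stands in for `sequence character` and `frozenset` for a sparse `carset`.

```python
class SparseArray:
    """Mapping over a huge domain; only non-default entries are stored."""

    def __init__(self, default):
        self._default = default
        self._entries = {}

    def __getitem__(self, key):
        # Every key that was never assigned maps onto the default value.
        return self._entries.get(key, self._default)

    def __setitem__(self, key, value):
        if value == self._default:
            # Assigning the default makes the entry insignificant again.
            self._entries.pop(key, None)
        else:
            self._entries[key] = value

    def significant(self) -> int:
        """Number of domain elements that map onto a non-default value."""
        return len(self._entries)


cars_of = SparseArray(default=frozenset())
owner = "A. Driver, 1 High Street"          # hypothetical name and address
cars_of[owner] = frozenset({1234567})
cars_of[owner] = cars_of[owner] | {7654321}
print(sorted(cars_of[owner]))               # [1234567, 7654321]
print(cars_of["N. Body, Nowhere"])          # frozenset()
cars_of[owner] = frozenset()
print(cars_of.significant())                # 0
```

Storage in this sketch grows with the number of owners who actually own cars, not with the number of possible names and addresses.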

