II. Notes on Data Structuring * - Cornell University
126 C. A. R. HOARE
7.2 REPRESENTATION
In choosing a computer representation for powersets, it is desirable to
ensure that all the basic operations can be executed simply by single machine
code instructions; and further, that the amount of store occupied is
minimised. For most data structure storage methods, there is a fundamental
conflict between these two objectives, and consequently a choice between
representation methods must be made by the programmer; but in the case
of powersets the two objectives can be fully reconciled, provided that the
base type is not too large.
The recommended method of representation is to allocate as many bits
in the store as there are potential members in the set. Thus to each value
of the base type there is a single bit which takes the value one if it is in fact a
member, or zero if it is not. For example, each value of type colour can be
represented in three bits; the most significant corresponding to the primary
colour red, and the least significant corresponding to blue. Thus the orange
colour is represented as 110 and red as 100. Each set of size n is represented
as a bitpattern with exactly n ones in the appropriate positions. The null set
is accordingly represented as an all-zero bitpattern.
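
The layout just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `PRIMARY`, `encode`, and `decode` are introduced here, and the middle primary colour is assumed to be yellow, since the text names only red (most significant) and blue (least significant). The patterns 110 for orange and 100 for red are the ones given above.

```python
# Base type: three primary colours, most significant bit first.
# "yellow" in the middle is an assumption; the text names only red and blue.
PRIMARY = ("red", "yellow", "blue")
WIDTH = len(PRIMARY)  # one bit per potential member of the set


def encode(members):
    """Return the bit pattern holding a one for each member of the set."""
    pattern = 0
    for name in members:
        position = PRIMARY.index(name)  # 0 is the most significant bit
        pattern |= 1 << (WIDTH - 1 - position)
    return pattern


def decode(pattern):
    """Return the set of base-type values whose bits are one."""
    return {
        name
        for position, name in enumerate(PRIMARY)
        if pattern & (1 << (WIDTH - 1 - position))
    }


orange = encode({"red", "yellow"})
red = encode({"red"})
null_set = encode(set())

# Print each pattern as a three-bit binary string.
print(format(orange, "03b"), format(red, "03b"), format(null_set, "03b"))
```

A set of size n yields exactly n one bits, so `bin(orange).count("1")` gives 2 for orange.
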
Another example is afforded by the "hand" type, which requires fifty-two
bits for its representation, one corresponding to each value of type cardface.
In this case, it is advisable to use the minimal representation of the base
type, to avoid unused gaps in the bitpattern representation.
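
To see why the minimal representation matters, the following sketch compares two ways of numbering card faces. It assumes, hypothetically, that a card face is a pair of a suit (0 to 3) and a rank (0 to 12); the functions `minimal_index` and `padded_index` and the packed layout they compare are inventions for this illustration and do not come from the text.

```python
SUITS, RANKS = 4, 13  # assumed shape of the cardface type


def minimal_index(suit, rank):
    """Dense numbering 0..51: every bit position corresponds to a real card."""
    return suit * RANKS + rank


def padded_index(suit, rank):
    """Non-minimal numbering: rank packed in 4 bits, suit in the bits above."""
    return (suit << 4) | rank


all_cards = [(s, r) for s in range(SUITS) for r in range(RANKS)]

# Number of bit positions each numbering needs to cover every card.
minimal_bits = 1 + max(minimal_index(s, r) for s, r in all_cards)
padded_bits = 1 + max(padded_index(s, r) for s, r in all_cards)

# Positions inside the padded pattern that can never hold a card.
unused_gaps = padded_bits - len(all_cards)

# A hand is the union of the unit sets of its cards.
hand = 0
for suit, rank in [(0, 0), (1, 12), (3, 5)]:
    hand |= 1 << minimal_index(suit, rank)

print(minimal_bits, padded_bits, unused_gaps)
```

With the dense numbering a hand needs exactly fifty-two bits, while the packed numbering leaves positions in the bit pattern that never correspond to any card.
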
Since the number of values of a powerset type is always an exact power of
two, for powersets of small base there can be no more economical method
of utilising storage on a binary computer than that of the bitpattern
representation. It remains to show that the operations defined over the powerset
type can be executed with high efficiency.
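
The operations enumerated in items (1) to (4) below correspond closely to ordinary bitwise operators. The following is a minimal sketch rather than a description of any particular machine: it emulates a fixed-width word in Python by masking, with `WIDTH`, `MASK`, `SIGN`, and all function names chosen here for illustration. The `size` function counts one bits directly, in the spirit of a bit-counting instruction; it does not implement the repeated standardisation alternative.

```python
WIDTH = 3                # bits per emulated word; 3 suffices for type colour
MASK = (1 << WIDTH) - 1  # confines Python's unbounded integers to one word
SIGN = 1 << (WIDTH - 1)  # the most significant ("sign") bit


def unitset(x):
    """(1) Load a single 1 into the sign bit and shift it right x places."""
    return SIGN >> x


def member(x, s):
    """(2) Shift s up (left) x places and look at the most significant bit."""
    return (((s << x) & MASK) & SIGN) != 0


def union(a, b):
    """(3) Logical union of two sets."""
    return a | b


def intersection(a, b):
    """(3) Logical intersection of two sets."""
    return a & b


def complement(a):
    """(3) Complement, restricted to the bits of the emulated word."""
    return ~a & MASK


def size(s):
    """(4) Number of members, obtained by counting the one bits."""
    return bin(s).count("1")


# Construct a set from components as the union of their unit sets.
orange = union(unitset(0), unitset(1))  # positions 0 and 1 from the top

print(format(orange, "03b"), member(0, orange), member(2, orange), size(orange))
```

Here position 0 denotes the most significant bit, so `unitset(0)` is 100 and the union with `unitset(1)` gives the pattern 110 used for orange in the text.
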
(1) The unitset of x may be obtained by loading a single 1 into the sign-bit
position, and shifting it right x places. On computers on which shifting is
slow, the same effect may be obtained by table lookup. The construction of a
set out of components may be achieved by taking the logical union of all the
corresponding unit sets.
(2) A membership test x in s may be made by shifting s up x places and
looking at the most significant bit: 1 stands for true and 0 for false.
(3) Logical intersection, union, and complementation are often available
as single instructions on binary computers.
(4) The size of a set can sometimes be discovered by a built-in machine
code instruction for counting the bits in a word. Otherwise the size can be
determined by repeated standardisation, masking off the next-to-sign bit on