The Quick Count and Election Observation

The Quick Count and Election Observation The Quick Count and Election Observation

27.10.2014 Views

THE QUICK COUNT AND ELECTION OBSERVATION 8 and 10, the number in the middle of the observations is 7; there are three observations smaller than 7 and three observations with values that are greater than 7. The mode for this dataset, however, is 8 because 8 occurs most frequently. The arithmetic mean for this data set is 6.14. Statisticians usually report the mean, rather than the median or the mode, as the most useful measure of central tendency. 67 Measures of Dispersion A second feature of data concerns measures of dispersion, which indicate how widely, or how narrowly, observed values are distributed. From the example above, it is clear that any given data set will have an arithmetic mean. However, that mean provides no information about how widely, or narrowly, the observed values are dispersed. The following data sets have the same arithmetic mean of 3: 2, 2, 3, 4, 4 -99, -99, 3, 99, 99, These two datasets have quite different distributions. One way to express the difference in the two datasets is to consider the range of numbers. In the first set, the smallest number is 2 and the largest number is 4. The resulting range, then, is 4 minus 2, or 2. In the second set, the smallest number is negative 99 and the largest number is positive 99. The resulting range is positive 99 minus negative 99, or 198. Obviously, the different ranges of the two datasets capture one aspect of the fundamental differences between these two sets of numbers. Even so, the range is only interested in two numbers -- the largest and the smallest; it ignores all other data points. Much more information about the spread of the observations within the dataset can be expressed with a different measure, the variance. In non-technical terms, the variance expresses the average of all the distances between each observation value and the mean of all observation values. The variance takes into account the arithmetic mean of a dataset and the number of observations, in addition to each of the datapoints themselves. As a result, it includes all the information needed to explain the spread of a dataset. The variance for any set of observations can be determined in four steps: 1. Calculate the arithmetic mean of the dataset. 2. Calculate the distance between every data point and the mean, and square the distance. 3. Add all the squared distances together. 4. Divide this by the number of observations.

CHAPTER FIVE: STATISTICAL PRINCIPLES AND QUICK COUNTS 68 The formula, then, is as follows: For a dataset containing observations x 1 , x 2 , x 3 … x n s 2 = (x 1 -x)2 + (x 2 -x) 2 + (x 3 -x) 2 ... (x n -x) 2 n-1 Where s 2 = variance x 1 , x 2 , x 3 … x n are the observations x is the mean n is the number of observations In short form, it appears as: s 2 = ∑ (x-x)2 n-1 The standard deviation is the square root of the variance. Statisticians usually rely on the standard deviation because it expresses the variance in standardized units that can be meaningfully compared. The larger the standard deviation for any dataset, the more the data are spread out from the mean. The smaller the standard deviation, the more tightly are the individual data points clustered around the mean. There is one additional measurement concept that needs to be considered: the normal distribution. The preceding discussion shows that, in every data set, individual data points will cluster around an average, or mean, point. Another way to express the same idea is to consider what proportion of all of the observations fall within one standard deviation of the mean. If datasets are large enough, and if they conform to the principles of randomness, the dispersion of the data values will conform to what is called a normal distribution. The normal distribution has well-known properties: the normal curve, as seen in Figure 5-2, is bell-shaped and symmetrical, and the mean, mode and median coincide. FIGURE 5-2: NORMAL DISTRIBUTION CURVES 35 30 25 20 15 10 5 0

THE QUICK COUNT AND ELECTION OBSERVATION<br />

8 <strong>and</strong> 10, the number in the middle of the observations is 7; there are three<br />

observations smaller than 7 <strong>and</strong> three observations with values that are greater<br />

than 7. <strong>The</strong> mode for this dataset, however, is 8 because 8 occurs most frequently.<br />

<strong>The</strong> arithmetic mean for this data set is 6.14. Statisticians usually<br />

report the mean, rather than the median or the mode, as the most useful measure<br />

of central tendency.<br />

67<br />

Measures of Dispersion<br />

A second feature of data concerns measures of dispersion, which indicate how<br />

widely, or how narrowly, observed values are distributed. From the example above,<br />

it is clear that any given data set will have an arithmetic mean. However, that mean<br />

provides no information about how widely, or narrowly, the observed values are<br />

dispersed. <strong>The</strong> following data sets have the same arithmetic mean of 3:<br />

2, 2, 3, 4, 4<br />

-99, -99, 3, 99, 99,<br />

<strong>The</strong>se two datasets have quite different distributions. One way to express the<br />

difference in the two datasets is to consider the range of numbers. In the first<br />

set, the smallest number is 2 <strong>and</strong> the largest number is 4. <strong>The</strong> resulting range,<br />

then, is 4 minus 2, or 2. In the second set, the smallest number is negative 99<br />

<strong>and</strong> the largest number is positive 99. <strong>The</strong> resulting range is positive 99 minus<br />

negative 99, or 198.<br />

Obviously, the different ranges of the two datasets capture one aspect of the<br />

fundamental differences between these two sets of numbers. Even so, the range<br />

is only interested in two numbers -- the largest <strong>and</strong> the smallest; it ignores all<br />

other data points. Much more information about the spread of the observations<br />

within the dataset can be expressed with a different measure, the variance.<br />

In non-technical terms, the variance expresses the average of all the distances<br />

between each observation value <strong>and</strong> the mean of all observation values. <strong>The</strong><br />

variance takes into account the arithmetic mean of a dataset <strong>and</strong> the number<br />

of observations, in addition to each of the datapoints themselves. As a result,<br />

it includes all the information needed to explain the spread of a dataset. <strong>The</strong><br />

variance for any set of observations can be determined in four steps:<br />

1. Calculate the arithmetic mean of the dataset.<br />

2. Calculate the distance between every data point <strong>and</strong> the mean, <strong>and</strong><br />

square the distance.<br />

3. Add all the squared distances together.<br />

4. Divide this by the number of observations.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!