Statistical inference for exploratory data analysis ... - Hadley Wickham
Statistical inference for exploratory data analysis ... - Hadley Wickham
Statistical inference for exploratory data analysis ... - Hadley Wickham
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
count<br />
40<br />
30<br />
20<br />
10<br />
0<br />
40<br />
30<br />
20<br />
10<br />
0<br />
40<br />
30<br />
20<br />
10<br />
0<br />
40<br />
30<br />
20<br />
10<br />
0<br />
40<br />
30<br />
20<br />
10<br />
0<br />
1<br />
5<br />
9<br />
13<br />
17<br />
2 4 6 8 10<br />
<strong>Statistical</strong> <strong>inference</strong> <strong>for</strong> graphics 4373<br />
2<br />
6<br />
10<br />
14<br />
18<br />
2 4 6 8 10<br />
tips<br />
3<br />
7<br />
11<br />
15<br />
19<br />
2 4 6 8 10<br />
4<br />
8<br />
12<br />
16<br />
20<br />
2 4 6 8 10<br />
Figure 4. Simulation lineup: Tips Data, histograms of tip. Which histogram is most different?<br />
Explain what makes it different from the others.<br />
Figure 6 shows residuals plotted against the order in the <strong>data</strong>. Structure in the<br />
real plot would indicate the presence of lurking variables. The plot of the real<br />
<strong>data</strong> is embedded among decoys, which were produced by simulating from the<br />
residual rotation distribution, as discussed in §3 and the electronic supplementary<br />
material. Which is the plot of the real <strong>data</strong>? Why? What may be ‘lurking’?<br />
(v) Leukaemia <strong>data</strong>: This <strong>data</strong>set originated from a study of gene expression<br />
in two types of acute leukaemia (Golub et al. 1999), acute lymphoblastic<br />
leukaemia (ALL) and acute myeloid leukaemia (AML). The <strong>data</strong>set consists of<br />
Phil. Trans. R. Soc. A (2009)<br />
Downloaded from<br />
rsta.royalsocietypublishing.org on January 7, 2010