24.02.2013 Views

Optimality

Optimality

Optimality

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

1<br />

. . .<br />

1<br />

i<br />

. . .<br />

. . .<br />

R<br />

j<br />

. . .<br />

Optimal sampling strategies 267<br />

C<br />

l i,j<br />

l<br />

1,1 l1,2<br />

l2,1<br />

l2,2<br />

l2,1<br />

l2,2<br />

l1,1 l1,2<br />

leaves<br />

(a) (b)<br />

Fig 1. (a) Finite population on a spatial rectangular grid of size R × C units. Associated with<br />

the unit at position (i, j) is an unknown value ℓi,j. (b) Multiscale superpopulation model for a<br />

finite population. Nodes at the bottom are called leaves and the topmost node the root. Each leaf<br />

node corresponds to one value ℓi,j. All nodes, except for the leaves, correspond to the sum of their<br />

children at the next lower level.<br />

Denote an arbitrary sample of size n by L. We consider linear estimators of S<br />

that take the form<br />

(1.1)<br />

� S(L,α) := α T L,<br />

where α is an arbitrary set of coefficients. We measure the accuracy of � S(L,α) in<br />

terms of the mean-squared error (MSE)<br />

�<br />

(1.2) E(S|L,α) := E S− � �2 S(L,α)<br />

and define the linear minimum mean-squared error (LMMSE) of estimating S from<br />

L as<br />

(1.3) E(S|L) := min<br />

α∈R nE(S|L,α).<br />

Restated, our goal is to determine<br />

(1.4) L ∗ := arg min<br />

L E(S|L).<br />

Our results are particularly applicable to Gaussian processes for which linear estimation<br />

is optimal in terms of mean-squared error. We note that for certain multimodal<br />

and discrete processes linear estimation may be sub-optimal.<br />

1.2. Multiscale superpopulation models<br />

We assume that the population is one realization of a multiscale stochastic process<br />

(see Fig. 1(b)) (see Willsky [20]). Such processes consist of random variables organized<br />

on a tree. Nodes at the bottom, called leaves, correspond to the population<br />

ℓi,j. All nodes, except for the leaves, represent the sum total of their children at<br />

the next lower level. The topmost node, the root, hence represents the sum of the<br />

entire population. The problem we address in this paper is thus equivalent to the<br />

following: Among all possible sets of leaves of size n, which set provides the best<br />

linear estimator for the root in terms of MSE?<br />

Multiscale stochastic processes efficiently capture the correlation structure of a<br />

wide range of phenomena, from uncorrelated data to complex fractal data. They<br />

root

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!