29.07.2014 Views

qreg - Stata


qreg - Quantile regression

Example 1: Estimating the conditional median

Consider a two-group experimental design with 5 observations per group:

. use http://www.stata-press.com/data/r13/twogrp

. list

         x     y
   1.    0     0
   2.    0     1
   3.    0     3
   4.    0     4
   5.    0    95
   6.    1    14
   7.    1    19
   8.    1    20
   9.    1    22
  10.    1    23

. qreg y x

Iteration  1:  WLS sum of weighted deviations =  121.88268

Iteration  1: sum of abs. weighted deviations =        111
Iteration  2: sum of abs. weighted deviations =        110

Median regression                                   Number of obs =         10
  Raw sum of deviations      157 (about 14)
  Min sum of deviations      110                    Pseudo R2     =     0.2994

           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |         17   18.23213     0.93   0.378    -25.04338    59.04338
       _cons |          3   12.89207     0.23   0.822    -26.72916    32.72916

We have estimated the equation

$y_{\text{median}} = 3 + 17x$
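As a rough check that these coefficients minimize the sum of absolute residuals, the sketch below runs a brute-force search in Python rather than Stata. It is not the algorithm qreg uses. The integer grid (intercepts 0 to 30, slopes 0 to 40) and the helper name `sum_abs_resid` are assumptions made for illustration. The data are copied from the listing above.

```python
# Data copied from the listing of the twogrp dataset above.
x = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y = [0, 1, 3, 4, 95, 14, 19, 20, 22, 23]

def sum_abs_resid(a, b):
    """Sum of |y - (a + b*x)| over all observations (hypothetical helper)."""
    return sum(abs(yi - (a + b * xi)) for xi, yi in zip(x, y))

# Assumed search grid: integer intercepts 0..30 and integer slopes 0..40.
grid = [(a, b) for a in range(31) for b in range(41)]

# Pick the (intercept, slope) pair with the smallest sum of absolute residuals.
best = min(grid, key=lambda ab: sum_abs_resid(*ab))
best_sum = sum_abs_resid(*best)

print(best, best_sum)  # (3, 17) 110
```

On this grid the minimizer is unique and matches both the reported coefficients and the "Min sum of deviations" line in the output.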

Looking back at the data, x takes on only the values 0 and 1, so the fitted median for the x = 0 group is 3, whereas for the x = 1 group it is 3 + 17 = 20. The output reports that the raw sum of absolute deviations about 14 is 157; that is, the sum of |y − 14| is 157. Fourteen is the unconditional median of y, although in these data any value between 14 and 19 could equally be considered an unconditional median (we have an even number of observations, so the median is bracketed by those two values). In any case, the raw sum of deviations of y about the median is the same no matter which number between 14 and 19 we choose. (With a "median" of 14, the raw sum of deviations is 157. Now think of choosing a slightly larger number for the median and recalculating the sum. Half the observations will have larger negative residuals, but the other half will have smaller positive residuals, resulting in no net change.)
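The claim that the raw sum does not change anywhere between 14 and 19 can be checked by hand. The following Python sketch (Python is used here only for the arithmetic, not as part of the Stata session) recomputes the sum for a few candidate values. The candidate values and the helper name `sum_abs_dev` are chosen for illustration.

```python
# Outcome values copied from the listing of the twogrp dataset above.
y = [0, 1, 3, 4, 95, 14, 19, 20, 22, 23]

def sum_abs_dev(values, center):
    """Sum of absolute deviations of values about center (hypothetical helper)."""
    return sum(abs(v - center) for v in values)

# Illustrative candidates inside the bracketing interval [14, 19].
inside = {c: sum_abs_dev(y, c) for c in (14, 15, 16.5, 18, 19)}

# Illustrative candidates just outside the interval, for comparison.
outside = {c: sum_abs_dev(y, c) for c in (13, 20)}

print(inside)
print(outside)
```

Every candidate inside the interval gives 157, whereas 13 and 20 each give 159. Moving the center by a small amount within the interval adds that amount to five of the absolute deviations and subtracts it from the other five.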

We turn now to the actual estimated equation. The sum of the absolute deviations about the solution $y_{\text{median}} = 3 + 17x$ is 110. The pseudo-$R^2$ is calculated as $1 - 110/157 \approx 0.2994$. This result is based on the idea that the median regression is the maximum likelihood estimate for the double-exponential distribution.
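The arithmetic behind these figures can be reproduced from the ten observations. This minimal Python sketch assumes, as the discussion of the group medians suggests, that with a single 0/1 regressor the fitted values are simply the two group medians. The variable names are illustrative, and the code only mirrors the numbers reported in the output without calling Stata.

```python
from statistics import median

# Data copied from the listing of the twogrp dataset above.
x = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y = [0, 1, 3, 4, 95, 14, 19, 20, 22, 23]

# Median of y within each group defined by x.
group_median = {
    g: median([yi for xi, yi in zip(x, y) if xi == g]) for g in (0, 1)
}

# Coefficients implied by the group medians (assumption stated in the lead-in).
intercept = group_median[0]
slope = group_median[1] - group_median[0]

# Raw sum: absolute deviations about the unconditional median reported as 14.
raw_sum = sum(abs(yi - 14) for yi in y)

# Min sum: absolute deviations about the fitted line.
min_sum = sum(abs(yi - (intercept + slope * xi)) for xi, yi in zip(x, y))

pseudo_r2 = 1 - min_sum / raw_sum
print(intercept, slope, raw_sum, min_sum, round(pseudo_r2, 4))
```

The group medians are 3 and 20, which gives an intercept of 3 and a slope of 17, and the two sums are 157 and 110, so the ratio rounds to the reported 0.2994.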
