Evaluation of the Australian Wage Subsidy Special Youth ...
weighting, both the weighted (columns 2 and 4) and unweighted (columns 1 and 3) results are shown for each missing data approach. Of main interest is the comparison of the estimates under each missing data approach: column 1 versus column 3, or column 2 versus column 4. The discussion of Table A2.2 focuses on the weighted results, comparing the first panel (mean substitution) with the second panel (deletion of cases with missing data, which gives the smaller base of 2150 cases whose 1984 survey information is complete on all explanatory variables). Moving from mean substitution to the deletion approach for the weighted data does change which variables are statistically significant, because the size of the t-statistics changes: for example, ‘other city before aged 14’ becomes statistically significant, and ‘age in 1984’ becomes insignificant. It also changes the size of the coefficients for statistically significant variables: for example, both the coefficient and the t-statistic for ‘proportion of pre-June unemployment’ fall. In a probit, the coefficient size is not directly interpretable, so the interpretation of a positive influence of this variable on participation in SYETP is unchanged by the choice of missing data approach; the calculated marginal effect, however, would be affected. When a variable is not statistically significant, changing the missing data approach can also change the sign of its coefficient, as for ‘children 1984’. Although not discussed in detail here, substantively important changes in the estimates arise depending on the approach used. The chief variation is in which variables are statistically significant, so the choice of missing data treatment affects which coefficients are interpreted as statistically significant. The trade-off between bias and efficiency and using more information affects the interpretation of results.
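The point that a probit coefficient keeps its sign while its marginal effect changes can be illustrated with a minimal sketch. For a continuous regressor, the marginal effect is the standard normal density evaluated at the linear index, multiplied by the coefficient. The coefficient values and the index below are hypothetical, chosen only for illustration; they are not estimates from Table A2.2, and in practice the index itself would also differ between the two estimations.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def probit_marginal_effect(beta_k: float, index: float) -> float:
    """Marginal effect of a continuous regressor in a probit:
    dP(y=1|x)/dx_k = phi(x'beta) * beta_k, evaluated at the given index x'beta."""
    return normal_pdf(index) * beta_k


# Hypothetical values (assumptions for illustration, not thesis estimates).
index = -1.0                   # linear index x'beta at the evaluation point
beta_mean_substitution = 0.9   # coefficient under mean substitution
beta_casewise_deletion = 0.6   # smaller coefficient under casewise deletion

me_mean = probit_marginal_effect(beta_mean_substitution, index)
me_del = probit_marginal_effect(beta_casewise_deletion, index)

# Both effects are positive, so the qualitative reading is unchanged,
# but the magnitude of the marginal effect differs.
print(f"mean substitution: {me_mean:.4f}")
print(f"casewise deletion: {me_del:.4f}")
```

Holding the index fixed isolates the effect of the coefficient alone; a fuller comparison would re-evaluate the index under each set of estimates.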
In light of this, it may be worth applying the imputation algorithm of King et al. (2001) in future research, which is argued to outperform both the mean substitution and deletion methods. However, only the more commonly accepted approaches are dealt with here.

The effects of including a set of dummy variables for mean imputation are shown in the first two columns of Table 5.8. Missing information on parental occupation predicts failure perfectly (a collinearity problem), so these variables must be dropped from the regression to enable estimation. The missing information for parental qualifications,
proportion of time spent unemployed and number of siblings is controlled for using the dummies. The estimation on the data where cases with missing information on these variables are dropped is given in Appendix Table A2.3. The results in columns one and two differ slightly from those of Table 5.8. Of course, the number of observations in columns one and two of Appendix Table A2.3 is lower, at 2150, because the observations with missing information are dropped, whereas in Table 5.8 it is 2368. The Akaike Information Criterion does not vary much in size between the models, and so does not assist much in model selection here (because the sample and variables change between the models, this fit measure is more relevant). The arguments of King et al. (2001) suggest that dropping those cases, casewise deletion, gives the correct standard errors, although the estimates do suffer from bias. Whether the analyst prefers to accept this bias is then a subjective choice; however, correct standard error estimation is essential if the statistical significance of the coefficients is important to the analysis. In light of this, it is deemed more useful to apply casewise deletion than mean imputation dummies.

5.7.2 Sample reduction effects on model of SYETP participation

Columns 3 and 4 of Table 5.8 give the probit results for SYETP participation for the final data set after sample reduction. Column 3 [136] shows the unweighted results, and column 4 shows the results weighted with the survey weight. [137] As for the whole sample discussed earlier, the variables that are statistically significant alter with the use of the weight. The variables that gain significance when using the weight are married in 1984, attended a private school, interviewed in Western Australia/Tasmania, longest job held before 1984 was 3 years or more, and mostly lived in a city until aged 14.
The variables that lose statistical significance are CEP referrals in 1984, father held a post-school qualification when respondent aged 14, and mother worked as plant operative when respondent aged 14. A worrying change is the loss of statistical significance for CEP referrals in 1984. This is further discussed later in the modelling of the treatment effect of SYETP, because this is a key element of the identifying restriction in the bivariate probit of employment

[136] The results in column 3 are equivalent to the univariate probit estimated in Richardson (1998).
[137] Note that no account has been made of sample reduction from the 1984 survey in this weight. This is treated next.