
How to use HLM 6 for hierarchical linear modeling

(aka “mixed modeling”; “generalized estimating equations” are a related but distinct approach)

Use HLM when you have lower-level observations (e.g., repeated outcomes over time) nested within higher-level units (e.g., participants). Options for the procedure include “PROC MIXED” in SAS, “PROC GLM” with a “RANDOM” statement in SAS, “Repeated Measures” under the GLM menu in SPSS, or the HLM program.

The main disadvantage of using a GLM-based procedure is that it cannot deal with missing data. If not all participants have data at all time intervals (especially if observations are scattered over varying amounts of time or at different intervals for each person, e.g., in the case of cued diary-type measures), then GLM-based analyses will drop the time intervals at which not all participants have data (or else drop the participants who do not have data at all time intervals). This weakness is overcome by PROC MIXED in SAS (Littell, Milliken, Stroup, & Wolfinger, 1996) or by the HLM program designed specifically for hierarchical linear modeling (Raudenbush, Bryk, Cheong, & Congdon, 2001). “The basic theory on which PROC MIXED [in SAS] is based holds even with unbalanced and missing data, so long as the missing data are random” (Littell et al., 1996). With multiple repeated measures for specific participants, another issue is that the observations for a given participant are serially autocorrelated, violating the assumption of independence of observations that is fundamental to GLM procedures. For these reasons, it is preferable to use HLM or SAS PROC MIXED when dealing with multiple repeated measures, rather than the GLM-based Repeated Measures ANOVA option in SPSS.
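The cost of dropping incomplete cases can be illustrated with a small sketch. The participant IDs and scores below are hypothetical, invented only for this illustration; the code simply counts how much data survives listwise deletion in a wide (one row per participant) layout compared with a long (one row per observation) layout of the kind a multilevel program works from.

```python
# Hypothetical wide-format data: one row per participant, four trials each.
# None marks a missing observation. These values are made up for illustration.
wide = {
    "p1": [12, 10, 7, 5],
    "p2": [15, None, 9, 6],
    "p3": [11, 9, None, None],
}

# Listwise deletion (GLM-style): keep only participants with complete data.
complete_cases = {pid: scores for pid, scores in wide.items() if None not in scores}

# Long format: keep every observed score, one row per (participant, trial).
long_rows = [
    (pid, trial, score)
    for pid, scores in wide.items()
    for trial, score in enumerate(scores, start=1)
    if score is not None
]

print(len(complete_cases), "of", len(wide), "participants survive listwise deletion")
print(len(long_rows), "observations are retained in long format")
```

In this made-up example only one participant has complete data, while the long layout keeps all nine observed scores.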

The basic HLM procedure is not specifically limited by sample size, although some procedures do require larger samples in order to be reliable (e.g., use of “robust standard errors” to improve estimates of beta- and gamma-weights in the HLM program). Assumptions of the procedure include:

• Residuals are normally distributed (both the level-1 errors and the level-2 random effects)
• Level-2 cases are independent of one another (Level-1 cases are expected to be dependent)
• There is homogeneity of variance for the variability in the Level-1 cases (the HLM program has an option to test whether this assumption is violated)

UCDHSC Center for Nursing Research    Updated 5/20/06


STEPS FOR USING HLM:

1. Configure data:
   – Level 1 file has multiple observations per case (change over time for the case, with a “time” or “sequence” variable and an “outcome” variable)
   – Level 2 file has only one observation per case (additional descriptors for the case)

2. Import into the HLM program and “make new MDM” file. There are a number of steps here, and it’s important to do them all in the right order:
   – select “stat package input”
   – select “HLM2” for the type of analysis (2 levels)
   – set “input file type” to “SPSS”
   – make up a name for the MDM file (with “.mdm” extension)
   – identify the files using the “browse” buttons
   – use “choose variables” to define the Level 1 & Level 2 variables of interest
   – select the variable that is the primary key between the Level 1 & Level 2 files; it gets flagged as “ID” in both
   – if there are any missing data, select “missing data” in the Level 1 file. If you are using the student version, select “delete missing data when making MDM” because this makes the analysis less complex. Otherwise, select “delete missing data when running analyses,” because this will conserve statistical power as much as possible.
   – click “save mdmt file” and give it a file name
   – click the “make MDM” button
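The two-file layout from step 1 and the shared “ID” key from step 2 can be sketched as follows. This is only an illustration under stated assumptions: the rows are hypothetical, the CSV text stands in for whatever stat-package file you actually import, and sorting both files by ID is an added precaution rather than a step stated above.

```python
import csv
import io

# Hypothetical Level 1 rows: several observations per participant.
level1 = [
    {"ID": 2, "TRIAL": 1, "SCORE": 17},
    {"ID": 1, "TRIAL": 1, "SCORE": 18},
    {"ID": 1, "TRIAL": 2, "SCORE": 14},
    {"ID": 2, "TRIAL": 2, "SCORE": 12},
]
# Hypothetical Level 2 rows: exactly one row per participant.
level2 = [
    {"ID": 2, "ANXIETY": 2, "TENSION": 1},
    {"ID": 1, "ANXIETY": 1, "TENSION": 2},
]


def to_csv(rows, sort_keys):
    """Return the rows as CSV text, sorted by the given keys."""
    ordered = sorted(rows, key=lambda row: tuple(row[k] for k in sort_keys))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(ordered)
    return buffer.getvalue()


# Check the key: Level 2 IDs must be unique, and every Level 1 ID needs a match.
level2_ids = [row["ID"] for row in level2]
duplicate_ids = len(level2_ids) != len(set(level2_ids))
orphans = {row["ID"] for row in level1} - set(level2_ids)

level1_csv = to_csv(level1, ["ID", "TRIAL"])
level2_csv = to_csv(level2, ["ID"])
print(level1_csv)
print(level2_csv)
```

Checking for duplicate or unmatched IDs before importing is cheaper than diagnosing a failed or silently truncated MDM file afterward.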


3. Click on the “check stats” button to check descriptive stats for each variable:

LEVEL-1 DESCRIPTIVE STATISTICS

VARIABLE NAME     N     MEAN      SD   MINIMUM   MAXIMUM
SCORE            48    10.00    5.17      1.00     19.00
TRIAL            48     2.50    1.13      1.00      4.00

LEVEL-2 DESCRIPTIVE STATISTICS

VARIABLE NAME     N     MEAN      SD   MINIMUM   MAXIMUM
ANXIETY          12     1.50    0.52      1.00      2.00
TENSION          12     1.50    0.52      1.00      2.00

4. Click on the “Done” button to go to the next screen.


5. Specify the HLM model: beta coefficients for Level 1 variables; gamma coefficients for Level 2 variables. Start by specifying the “outcome” variable at Level 1, then add other variables to the model at Level 1 and Level 2. (Left-click on each variable name in the list on the left-hand side of the screen to specify its role in the equation.)

6. Clicking either “Basic Settings” or “Outcome” lets you say where to save the output file, whether to graph results, etc.

7. Save the model, under the “File” menu.

8. Click on “Run Analysis” to see results.

9. “View Output” is under the “File” menu. Results include the model coefficients and tests for the statistical significance of each predictor. The output also gives you the level-1 regression equation (intercept and slope) for each level-2 unit:

SAMPLE OUTPUT

Summary of the model specified (in equation format)
---------------------------------------------------

Level-1 Model

   Y = B0 + B1*(TRIAL) + R

Level-2 Model

   B0 = G00 + U0
   B1 = G10 + G11*(ANXIETY) + G12*(TENSION) + U1
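Substituting the two level-2 equations into the level-1 equation gives the combined form of the same model, which may make it easier to see that ANXIETY and TENSION act as cross-level interactions with TRIAL. The subscripts t (trial) and i (participant) are added here for illustration and do not appear in the program output.

```latex
\begin{aligned}
Y_{ti} &= B_{0i} + B_{1i}\,\mathrm{TRIAL}_{ti} + R_{ti} \\
       &= G_{00} + G_{10}\,\mathrm{TRIAL}_{ti}
          + G_{11}\,\mathrm{ANXIETY}_{i}\,\mathrm{TRIAL}_{ti}
          + G_{12}\,\mathrm{TENSION}_{i}\,\mathrm{TRIAL}_{ti} \\
       &\quad + U_{0i} + U_{1i}\,\mathrm{TRIAL}_{ti} + R_{ti}
\end{aligned}
```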


Level-1 OLS regressions
-----------------------

Level-2 Unit     INTRCPT1     TRIAL slope
------------------------------------------------------------------------------
     1           22.00000       -3.80000
     2           23.00000       -4.90000
     3           18.00000       -4.00000
     4           20.00000       -3.80000
     5           15.00000       -3.20000
     6           22.50000       -5.60000
     7           19.00000       -3.80000
     8           21.50000       -5.50000
     9           21.00000       -4.80000
    10           23.00000       -3.90000

The average OLS level-1 coefficient for INTRCPT1 = 20.12500
The average OLS level-1 coefficient for TRIAL = -4.05000

Least Squares Estimates
-----------------------

sigma_squared = 5.60202

The outcome variable is SCORE

Least-squares estimates of fixed effects
----------------------------------------------------------------------------
                                    Standard
Fixed Effect          Coefficient   Error       T-ratio   d.f.   P-value
----------------------------------------------------------------------------
For INTRCPT1, B0
   INTRCPT2, G00       20.125000    0.836811     24.050     44     0.000
For TRIAL slope, B1
   INTRCPT2, G10       -5.216667    0.611120     -8.536     44     0.000
   ANXIETY, G11         0.361111    0.249489      1.447     44     0.155
   TENSION, G12         0.416667    0.249489      1.670     44     0.102
----------------------------------------------------------------------------

Interpretation: In this example, the level-2 intercept G00 (the average of the level-1 intercepts, beta-zero) is significantly different from zero. Beta-zero is the participant’s predicted level of performance on the SCORE variable at the start of the trajectory, when TRIAL equals zero. The level-2 intercept G10 (the expected level-1 slope, beta-one, when the level-2 predictors equal zero) is also significantly different from zero, so SCORE changes reliably across trials. Neither anxiety level (G11, p = .155) nor tension level (G12, p = .102) had a strong enough effect to be considered statistically significant as a predictor of the within-person change in the SCORE variable over time (i.e., beta-one).
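Each T-ratio in the fixed-effects table is the coefficient divided by its standard error, which can be checked with a short sketch. The coefficients and standard errors are copied from the sample output; the critical value of about 2.015 (two-tailed, alpha = .05, 44 d.f.) is an assumption taken from standard t tables, not something the program prints.

```python
# Coefficient and standard error for each fixed effect, from the sample output.
fixed_effects = {
    "G00": (20.125000, 0.836811),
    "G10": (-5.216667, 0.611120),
    "G11": (0.361111, 0.249489),
    "G12": (0.416667, 0.249489),
}

# Assumed approximate two-tailed .05 critical value of t with 44 d.f.
T_CRITICAL = 2.015

# T-ratio = coefficient / standard error.
t_ratios = {name: coef / se for name, (coef, se) in fixed_effects.items()}
significant = {name: abs(t) > T_CRITICAL for name, t in t_ratios.items()}

for name, t in t_ratios.items():
    print(f"{name}: t = {t:.3f}, significant at .05: {significant[name]}")
```

The two intercept terms exceed the critical value and the two level-2 predictors do not, which matches the P-value column.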

SAMPLE OUTPUT (CONTINUED)

Final estimation of variance components:
-----------------------------------------------------------------------------
Random Effect      Standard     Variance     df    Chi-square   P-value
                   Deviation    Component
-----------------------------------------------------------------------------
INTRCPT1, U0        1.87129      3.50174     11     73.95178     0.000
level-1, R          1.56446      2.44754
-----------------------------------------------------------------------------

Interpretation: The variance component for the level-1 intercept (U0, i.e., people’s starting score) is significantly greater than zero (chi-square = 73.95, df = 11, p < .001). This means that people are significantly different from one another (there is variability among the level-2 units), even though the level-2 predictors weren’t able to account for this variability.
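The variance components can also be turned into a rough proportion of between-person variance. The sketch below uses the two variance components from the sample output; treating their ratio as an intraclass-correlation-style summary is an assumption made here, and because TRIAL is already in the model it is a conditional share, not the intraclass correlation from an empty model.

```python
import math

# Variance components copied from the sample output.
var_intercept = 3.50174  # INTRCPT1, U0 (between-person variance in intercepts)
var_level1 = 2.44754     # level-1, R (within-person residual variance)

# The "Standard Deviation" column is the square root of each variance component.
sd_intercept = math.sqrt(var_intercept)
sd_level1 = math.sqrt(var_level1)

# Share of the remaining variance that lies between people.
between_share = var_intercept / (var_intercept + var_level1)

print(f"SD(U0) = {sd_intercept:.5f}, SD(R) = {sd_level1:.5f}")
print(f"Between-person share of variance = {between_share:.3f}")
```

Under these assumptions, somewhat more than half of the variance not explained by TRIAL lies between participants.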

Statistics for current covariance components model
--------------------------------------------------
Deviance = 198.591730
Number of estimated parameters = 2

Interpretation: The deviance statistic is the same as a -2 log likelihood, and the larger it is, the worse the fit between the model and the data. The deviance has no absolute cutoff, however: its size depends on the number of observations, so a value such as 198.59 cannot show by itself that a model fits poorly. It becomes informative when compared with the deviance of a competing model fit to the same data. Because neither level-2 predictor was significant, other predictors or other combinations of variables should be considered in trying to account for individual participants’ outcomes on the SCORE variable.


You can also test one HLM model against another by using the “hypothesis testing” command under the “other settings” menu. Just put in this model’s deviance and df (the deviance and number of estimated parameters reported in the output above), specify a different model, and re-run the analysis to compare them.
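The comparison behind this kind of model test is a chi-square test on the difference in deviances, with degrees of freedom equal to the difference in the number of estimated parameters. The sketch below is a hand calculation of that idea, not the HLM program's own routine. The first model's values come from the sample output; the second model's deviance and parameter count are hypothetical. The models are assumed to be nested and fit to the same data, and under restricted maximum likelihood such a comparison is usually taken to be appropriate only for models that differ in their variance components.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^i / i!.
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: two-sided normal tail at sqrt(x) plus a finite correction sum.
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2.0))
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return tail + 2.0 * density * total


# Model A: values from the sample output.
deviance_a, params_a = 198.591730, 2
# Model B: HYPOTHETICAL larger model, invented only to show the arithmetic.
deviance_b, params_b = 190.0, 4

chi_square = deviance_a - deviance_b
df_diff = params_b - params_a
p_value = chi2_sf(chi_square, df_diff)

print(f"chi-square = {chi_square:.3f}, df = {df_diff}, p = {p_value:.4f}")
```

A small p-value would suggest that the larger model fits significantly better than the smaller one; with real output, replace the hypothetical Model B values with those from your second run.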

One other great new feature in HLM 6 is that you can graph each individual participant’s level-1 regression line to see the overall pattern and any outliers. Here’s how: use the “graph equations – level 1 equation graphing” command in the “file” menu.

In the next screen, again select your outcome variable, and the level-1 predictor and level-2 predictor that you specifically want to focus on. You can select either a subset of cases (e.g., “first ten cases”) if the sample size is very large, or all cases. In this example the total n was only 12 cases, so we included all of them.


The output graph shows you each individual participant’s score (y-axis) over time (x-axis) as a separate regression line. It further highlights people with the two different levels of tension (the level-2 predictor) in different colors. This graph is consistent with our statistical results, showing that the “tension” variable didn’t significantly differentiate among people, even though there was a significant association for everyone between “score” and “time.”
