
A Fortran 90 Program for the Generalization of the Order-Restricted Information Criterion

Rebecca M. Kuiper
Utrecht University

Herbert Hoijtink
Utrecht University

Abstract

The generalized order-restricted information criterion (GORIC) can evaluate hypotheses that are closed convex cones for multivariate normal linear models. It can examine the traditional hypotheses $H_0: \beta_{1,1} = \cdots = \beta_{t,k}$ and $H_u: \beta_{1,1}, \ldots, \beta_{t,k}$ and hypotheses containing simple order restrictions $H_m: \beta_{1,1} \geq \ldots \geq \beta_{t,k}$, where any "$\geq$" may be replaced by "$=$", $\beta_{h,j}$ denotes the parameter for the $h$th dependent variable and the $j$th predictor in a $t$-variate regression model with $k$ predictors (which might include the intercept), and $m$ is the model/hypothesis index. The GORIC can also be applied to restrictions of the form $H_m: R_1 \beta \geq r_1, R_2 \beta = r_2$, with $\beta$ a vector of length $tk$, $R_1$ a $c_{m1} \times tk$ matrix, $r_1$ a vector of length $c_{m1}$, $R_2$ a $c_{m2} \times tk$ matrix, and $r_2$ a vector of length $c_{m2}$. Note that $[R_1', R_2']'$ should be of full rank when $[r_1', r_2']' \neq 0$. A Fortran 90 program is presented that enables researchers to compute the GORIC for hypotheses in the context of multivariate regression models.

Keywords: Fortran 90, inequality constraint, model selection, order restriction, regression model.

1. Introduction

Anraku (1999) proposes the order-restricted information criterion, ORIC. The ORIC is applied to models of the form $y_{ij} = \beta_j + \epsilon_{ij}$, where $y_{ij}$ is observation $i$ (with $i = 1, \ldots, N_j$) for group $j$ (with $j = 1, \ldots, k$), $\beta_j$ is the mean for group $j$, and $\epsilon_{ij}$ is the error term, which follows a normal distribution with mean 0 and variance $\sigma^2$. This model selection criterion can only be used to select the best of a set of hypotheses that can be written as simple order restrictions (e.g., $H_1: \beta_1 \geq \ldots \geq \beta_k$ and $H_2: \beta_1 = \ldots = \beta_{k'} \geq \ldots \geq \beta_k$). Kuiper, Hoijtink, and Silvapulle (2011) propose a generalization of the ORIC, called the GORIC, that can be applied to a more general form of order restrictions, namely $H_m: R\beta \geq 0$ for $m \in M$, where $M$ is the set of hypothesis indices, $\beta$ a vector of length $k$, and $R$ a $c_m \times k$ matrix. Special cases


of these matrix order restrictions are the simple order (i.e., $H_m: \beta_1 \geq \ldots \geq \beta_k$) and the tree order (i.e., $H_m: \beta_1 \geq \beta_2, \ldots, \beta_1 \geq \beta_k$). Kuiper, Hoijtink, and Silvapulle (unpublished) extend the use of the GORIC to univariate and multivariate normal linear models with hypotheses of the type $H_m: \beta \in C_m$, where $C_m$ is a closed convex cone or a relocated one and $\beta$ is a vector of length $tk$ containing the parameters of a $t$-variate normal linear model, with $k$ the number of predictors (which can include an intercept), as discussed below. The hypotheses of interest, and therewith the closed convex cones, are further discussed in Section 2.2.

In the next section, the GORIC is presented in the context of multivariate regression models. The GORIC comprises a likelihood part and a penalty part. The likelihood is computed using order-restricted maximum likelihood estimators. The iteration process employed to obtain these estimators is described in Section 3. In Section 4, we elaborate on the penalty part. Section 5 illustrates the application of the GORIC in the context of univariate and multivariate analysis of variance. Appendix A contains a user manual for the software.

2. The GORIC

In this section, we provide the GORIC applicable to hypotheses of the form $H_m: \beta \in C_m$ formulated for a $t$-variate regression model. The derivation is shown in Kuiper et al. (2011). First, we briefly discuss the $t$-variate regression model. Then, we give the expression of the GORIC. Finally, we elaborate on the hypotheses that can be evaluated by it.

2.1. The t-variate regression model

A multivariate regression model with $t$ dependent variables can be written as

$$
\begin{aligned}
y_{1i} &= \beta_{1,1} d_{1i} + \ldots + \beta_{1,k'} d_{k'i} + \beta_{1,k'+1} x_{k'+1,i} + \ldots + \beta_{1,k} x_{ki} + \epsilon_{1i} \\
&\;\;\vdots \\
y_{ti} &= \beta_{t,1} d_{1i} + \ldots + \beta_{t,k'} d_{k'i} + \beta_{t,k'+1} x_{k'+1,i} + \ldots + \beta_{t,k} x_{ki} + \epsilon_{ti}
\end{aligned}
\tag{1}
$$

where $y_{hi}$ denotes the score of the $i$th person on the $h$th dependent variable for $i = 1, \ldots, N$ and $h = 1, \ldots, t$. The $d$ variables are the predictors that represent group membership: when $d_{ji} = 1$, person $i$ belongs to group $j$ for $j = 1, \ldots, k'$. The mean of dependent variable $h$ of group $j$ (conditional upon the $x$ variables) is denoted by $\beta_{h,j}$. The $x$ variables are continuous predictors, where $x_{ji}$ reflects the score of the $i$th person on the $j$th predictor for $j = k'+1, \ldots, k$. The relationship between $x_{ji}$ and $y_{hi}$ (controlled for the other predictors) is denoted by $\beta_{h,j}$. Finally, it is assumed that

$$
\begin{bmatrix} \epsilon_{1i} \\ \vdots \\ \epsilon_{ti} \end{bmatrix}
\sim N_t \left( \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix},\;
\Sigma = \begin{bmatrix} \sigma_1^2 & \cdots & \sigma_{1t} \\ \vdots & \ddots & \vdots \\ \sigma_{1t} & \cdots & \sigma_t^2 \end{bmatrix} \right).
$$

It is noteworthy that the $\beta$s associated with $x$ variables regarding the same dependent variable are only comparable when the corresponding $x$ variables are standardized. Moreover, $\beta$s associated with $x$ variables belonging to different dependent variables can solely be examined if both the dependent variables and the $x$ variables are standardized.
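To make the role of the group-membership variables in Equation (1) concrete, the following minimal Python sketch builds the $d$ variables from a group label and evaluates the systematic part of the model for one person. The parameter values, the group label, and the function names are made up for illustration; none of this is part of the authors' Fortran software.

```python
def dummy_row(group, n_groups):
    """Return [d_1i, ..., d_k'i]: d_ji is 1 if person i belongs to group j, else 0."""
    return [1.0 if j == group else 0.0 for j in range(1, n_groups + 1)]


def predict(beta, x_row):
    """Systematic part of Equation (1) for one person.

    beta[h][j] holds beta_{h+1, j+1}; x_row holds d_1i..d_k'i followed by
    the continuous predictors x_{k'+1,i}..x_ki.
    """
    return [sum(b * x for b, x in zip(beta_h, x_row)) for beta_h in beta]


# Hypothetical example: t = 2 dependent variables, k' = 2 groups,
# and one continuous predictor, so k = 3.
beta = [[1.0, 2.0, 0.5],      # parameters for dependent variable 1
        [10.0, 20.0, -1.0]]   # parameters for dependent variable 2

x_i = dummy_row(group=2, n_groups=2) + [4.0]  # person in group 2 with x = 4
y_hat = predict(beta, x_i)
print(y_hat)
```

For a person in group 2, only the second group mean contributes, plus the slope times the continuous predictor, which is exactly how the $d$ variables select a group mean in Equation (1).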


Journal of Statistical Software

2.2. The hypotheses of interest

Let $\beta = (\beta_{1,1}, \ldots, \beta_{1,k}, \ldots, \beta_{t,1}, \ldots, \beta_{t,k})$ and $\beta_l$ the $l$th element of $\beta$ for $l = 1, \ldots, tk$. The GORIC can be applied to hypotheses that are closed convex cones or relocated ones, both denoted by $C_m$. In this article, we focus on

$$
H_m: R_1 \beta \geq r_1, \; R_2 \beta = r_2, \tag{2}
$$

where $R_1$ is a $c_{m1} \times tk$ matrix, $R_2$ a $c_{m2} \times tk$ matrix, $r_1$ a vector of length $c_{m1}$, and $r_2$ a vector of length $c_{m2}$. For closed convex cones it holds true that $r_1 = r_2 = 0$. Special cases of closed convex cones are the simple order, the tree order, and the matrix order (Silvapulle and Sen 2005, p. 82). In case of a relocated closed convex cone, that is, for $[r_1', r_2']' \neq 0$, a requirement is needed (see Kuiper et al. 2011 and Section 4): $R = [R_1', R_2']'$ is of full rank. Note that full rank of $R$ may be obtained by discarding redundant restrictions. For example, a set of restrictions containing $\beta_l \geq r_{11}, \beta_l \leq r_{12}$ is not a relocated closed convex cone for $r_{11} \neq r_{12}$, since $R$ is not of full rank and there are no redundant restrictions. For $\beta_l \geq r_{11}, \beta_{l'} \geq r_{12}, \beta_l + \beta_{l'} \geq r_{13}$ with $l \neq l'$, $R$ is not of full rank either. However, when $r_{11} + r_{12} \geq r_{13}$, the restriction $\beta_l + \beta_{l'} \geq r_{13}$ is redundant. If this redundant restriction is discarded, $R$ is of full rank, that is, $H_m: \beta_l \geq r_{11}, \beta_{l'} \geq r_{12}$ is a relocated closed convex cone.

2.3. The GORIC

Let

$$
Y = \begin{bmatrix} y_{11} & \cdots & y_{t1} \\ \vdots & & \vdots \\ y_{1N} & \cdots & y_{tN} \end{bmatrix},
\quad y_i = [y_{1i}, \ldots, y_{ti}]',
$$

$$
X = \begin{bmatrix} d_{11} & \cdots & d_{k'1} & x_{k'+1,1} & \cdots & x_{k1} \\ \vdots & & \vdots & \vdots & & \vdots \\ d_{1N} & \cdots & d_{k'N} & x_{k'+1,N} & \cdots & x_{kN} \end{bmatrix},
\quad x_i = [d_{1i}, \ldots, d_{k'i}, x_{k'+1,i}, \ldots, x_{ki}]', \tag{3}
$$

$$
B = \begin{bmatrix} \beta_{1,1} & \cdots & \beta_{t,1} \\ \vdots & & \vdots \\ \beta_{1,k} & \cdots & \beta_{t,k} \end{bmatrix}.
$$

According to Kuiper et al. (unpublished), it holds true for $t$-variate regression models with $H_m: \beta \in C_m$ that

$$
\text{GORIC}_m = -2 \log f(Y | X, \tilde{B}_m, \tilde{\Sigma}_m) + 2\, PT_m, \tag{4}
$$

with

$$
\log f(Y | X, \tilde{B}_m, \tilde{\Sigma}_m) = -\frac{tN}{2} \log\{2\pi\} - \frac{N}{2} \log |\tilde{\Sigma}_m| - \frac{1}{2} \sum_{i=1}^{N} \epsilon_i' (\tilde{\Sigma}_m)^{-1} \epsilon_i
$$

and

$$
PT_m = 1 + \sum_{l=1}^{tk} w_l(tk, W, C_m)\, l,
$$


where $\log f(Y | X, \tilde{B}_m, \tilde{\Sigma}_m)$ is the log-likelihood, $\tilde{B}_m$ and $\tilde{\Sigma}_m$ are the order-restricted maximum likelihood estimators of $B$ and $\Sigma$, respectively, $PT_m$ is the penalty part, $w_l(tk, W, C_m)$ denotes the level probability for level $l$, and

$$
\epsilon_i = y_i - (\tilde{B}_m)' x_i, \qquad W = \hat{\Sigma} \otimes [X'X]^{-1}, \tag{5}
$$

with

$$
\hat{\Sigma} = N^{-1} (Y - X\hat{B})' (Y - X\hat{B}) \tag{6}
$$

and

$$
\hat{B} = (X'X)^{-1} X'Y.
$$

Hence, $\hat{\Sigma}$ and $\hat{B}$ are the (unrestricted) maximum likelihood estimators of $\Sigma$ and $B$, respectively. The derivation of the penalty can be found in Kuiper et al. (unpublished). In that derivation, $\Sigma$ is assumed to be known up to a positive constant, that is, $\Sigma = \sigma^2 S$ with $S$ a known $t \times t$ matrix and $\sigma^2$ a constant which represents the variance when $t = 1$. Since $\Sigma$ is often not known, it is estimated by $\hat{\Sigma}$, see Equation (6). The GORIC is easily applied: the hypothesis/model $H_m$ (see Equation (2)) with the lowest GORIC value (see Equation (4)) is the preferred one.

In the next two sections, we elaborate on the order-restricted maximum likelihood estimators $\tilde{B}_m$ and $\tilde{\Sigma}_m$ and on the penalty term $PT_m$, respectively.

3. Order-restricted maximum likelihood estimators

The order-restricted maximum likelihood estimators, $\tilde{B}_m$ and $\tilde{\Sigma}_m$, are obtained by

$$
\min_{\beta \in H_m, \Sigma} \sum_{i=1}^{N} (y_i - B' x_i)' \Sigma^{-1} (y_i - B' x_i).
$$

From this it follows that

$$
\tilde{B}_m = \arg\min_{\beta \in H_m} \sum_{i=1}^{N} (y_i - B' x_i)' (\tilde{\Sigma}_m)^{-1} (y_i - B' x_i), \tag{7}
$$

$$
\tilde{\Sigma}_m = N^{-1} (Y - X\tilde{B}_m)' (Y - X\tilde{B}_m). \tag{8}
$$

It should be stressed that in univariate regression (i.e., for $t = 1$) the $\beta$ parameters do not depend on $\tilde{\Sigma}_m = \tilde{\sigma}_m^2$. In multivariate regression (i.e., for $t > 1$), $\tilde{B}_m$ depends on the unknown $\tilde{\Sigma}_m$ and $\tilde{\Sigma}_m$ on the unknown $\tilde{B}_m$, so iterations are required to calculate them. The iteration process comprises the following steps:

1. Set $\tilde{B}_m^0$ equal to $\hat{B} = (X'X)^{-1} X'Y$, the (unrestricted) maximum likelihood estimator of $B$. Note that any value for $\tilde{B}_m^0$ can be chosen. We employ $\hat{B}$ to increase the speed of convergence and, therefore, to reduce computing time.

2. Optimize $\tilde{\Sigma}_m^p$ by substituting $\tilde{B}_m$ with $\tilde{B}_m^{p-1}$ in Equation (8) for $p = 1, \ldots, P$.

3. Optimize $\tilde{B}_m^p$ by replacing $\tilde{\Sigma}_m$ with $\tilde{\Sigma}_m^p$ in Equation (7) for $p = 1, \ldots, P$. For the calculation of $\tilde{B}_m$, one can use a quadratic programming algorithm like the IMSL subroutine QPROG (Visual Numerics 2003, pp. 1307–1310) in Fortran 90.


4. Continue steps 2 and 3 until convergence is reached (at step $P$) and set $\tilde{B}_m$ and $\tilde{\Sigma}_m$ equal to $\tilde{B}_m^P$ and $\tilde{\Sigma}_m^P$, respectively. We base the convergence criterion on the values of the parameter estimates. Namely, we stop iterating when the elements of $|\tilde{\beta}_m^p - \tilde{\beta}_m^{p-1}|$ and $|\tilde{\Sigma}_m^p - \tilde{\Sigma}_m^{p-1}|$ are less than 1e-10.

4. The penalty part

In this section, we elaborate on the calculation of the penalty term. First, we briefly describe level probabilities in case $\Sigma$ is known up to a positive constant. In that case, $\hat{\Sigma}$ in Equation (5) is replaced by $\Sigma$. Moreover, we give an interpretation of the penalty term. Then, we discuss the consequences of estimating $\Sigma$ from the data by $\hat{\Sigma}$.

A level probability $w_l(tk, W, C_m)$ is the probability that there are $l$ levels among the $tk$ order-restricted maximum likelihood estimators, which are in accordance with $C_m$, given that the parameters $\beta$ are generated from a normal distribution with a mean vector of zeros and covariance matrix $W$ (see also Anraku (1999); Silvapulle and Sen (2005, pp. 77–83); Robertson, Wright, and Dykstra (1988, p. 69)).

According to Kuiper et al. (unpublished), all closed convex cones ($r_1 = r_2 = 0$) and relocated ones ($r = [r_1', r_2']' \neq 0$) can be written in the form $H_m: R_1 \beta^* \geq 0, R_2 \beta^* = 0$, with $\beta^* = \beta$ when $r_1 = r_2 = 0$, and $\beta^* = \beta - q$ with $[R_1', R_2']' q = r$ when $r \neq 0$. Note that $q$ only exists when $[R_1', R_2']'$ is of full rank (after discarding redundant restrictions). Let $C_m = \{\beta \in \mathbb{R}^{tk} : R_1 \beta^* \geq 0, R_2 \beta^* = 0\}$.

Below, we first assume that $\Sigma$ is known up to the positive constant $\sigma^2$: $\Sigma = \sigma^2 S$ with $S$ a known matrix. After that, we discuss the consequences of $\Sigma$ being estimated from the data.

The calculation of the level probabilities can be done via simulation (Silvapulle and Sen 2005, pp. 78–81). The simulation consists of 5 steps:

1. Generate $z$ (of dimension $tk$) from $N_{tk}(\beta_0 = 0, W)$, with $W = \sigma^2 S \otimes [X'X]^{-1}$, where $S$ is a known matrix. Silvapulle and Sen (2005, p. 86) and Robertson et al. (1988, p. 69) prove that the calculation of the level probabilities does not depend on the mean value $\beta_0$ for closed convex cones. Furthermore, Robertson et al. (1988, p. 69) demonstrate for closed convex cones that the calculation of the level probabilities is invariant for positive constants like $\sigma^2$ and $N$. However, there is one exception, which is discussed below.

2. Compute $\tilde{z}_m$ via

$$
\tilde{z}_m = \arg\min_{\beta^* \in \{\beta^* \in \mathbb{R}^{tk} : R_1 \beta^* \geq 0, R_2 \beta^* = 0\}} (z - \beta^*)' W^{-1} (z - \beta^*),
$$

such that the parameters are in accordance with $R_1 \beta^* \geq 0, R_2 \beta^* = 0$, the hypothesis of interest. To implement this in software, one requires a quadratic programming algorithm. For example, one can use the IMSL subroutine QPROG (Visual Numerics 2003, pp. 1307–1310) in Fortran 90.

3. Determine the number of levels in $\tilde{z}_m$ and denote this by $L_m$. Let restriction $a$ be denoted by $R_{1a} \beta^* \geq 0$ for $a = 1, \ldots, c_{m1}$, let $A = \{a : R_{1a} \tilde{z}_m = 0\}$ be the set of restriction indices for which the restriction is binding, and let $\phi = \{\beta : R_{1a} \beta^* = 0 \;\forall\; a \in A\}$. Then, $L_m$ is the dimension of $\phi$.


4. Repeat the previous steps $T$ (e.g., $T = 100{,}000$) times. To examine the stability of the penalty term, one could calculate it a second time with another seed value. If the two penalties are dissimilar, one should increase the value of $T$.

5. Estimate the level probability $w_l(tk, W, C_m)$ by the proportion of times $L_m$ is equal to $l$ ($l = 1, \ldots, tk$) in the $T$ simulations.

As discussed in the first simulation step, the level probabilities are invariant for the mean value $\beta_0$ and the variance term $\sigma^2$. This almost always holds true for closed convex cones $H_m: R_1 \beta \geq 0, R_2 \beta = 0$ and relocated ones $H_m: R_1 \beta \geq r_1, R_2 \beta = r_2$, where $[r_1', r_2']' \neq 0$ and $[R_1', R_2']'$ is of full rank after discarding redundant restrictions. There is one exception, namely restrictions of the type $\beta_l \geq r_{11}$ (including $r_{11} = 0$) for $l = 1, \ldots, tk$. When the hypothesis of interest contains this type of restriction, one must use $\beta_0 = 0$. This results in level probabilities that are invariant for the value of $\sigma^2$.

Notably, the level probabilities for $H_m: \beta_l \geq r_{11}$ are the same as for $H_m: \beta_l \geq 0$, that is, there is no difference in complexity between these two hypotheses. When sampling $z$ from $N_1(0, W)$ with $W$ a scalar, half of the time $H_m: z \geq 0$ is valid and $\tilde{z}_m$ has one level; the other half of the time $H_m: z \geq 0$ is invalid and $\tilde{z}_m$ has zero levels. As a consequence, the expected dimension of $\beta_l$ for $H_m: \beta_l \geq r_{11}$ is a half.

The penalty term

$$
PT_m = 1 + \sum_{l=1}^{tk} w_l(tk, W, C_m)\, l
$$

can be seen as the expected dimension of the parameters, that is, the expected dimension of the $\beta$ values plus 1 because of the unknown variance term $\sigma^2$ in $\Sigma = \sigma^2 S$ with $S$ a known matrix.

Until now, we have assumed in the calculation of the level probabilities that $\Sigma$ is known up to the constant $\sigma^2$. Often $\Sigma$ is unknown; in that case, one should estimate it to determine the level probabilities. However, when $t = 1$, no estimation of $\Sigma = \sigma^2$ is required, since the level probabilities are invariant for positive constants like $\sigma^2$ (see Step 1). In contrast, $\Sigma$ needs to be estimated for $t > 1$. One can estimate $\Sigma$ by $\hat{\Sigma}$, see Equation (6), as is done in the software.

If $\Sigma$ is estimated from the data, the dimension of $\Sigma$, which is the number of unknown distinct elements of $\Sigma$, is $(t + 1)t/2$ instead of 1. Since the restrictions are always on the $\beta$ parameters and never on the elements of $\Sigma$, the number of unknown distinct elements is equal for all hypotheses of interest ($H_m$). So, although the penalty should then be corrected, the correction is equal for all $H_m$ for $m \in M$.

In the next section, we demonstrate the evaluation of hypotheses with the GORIC for different types of models.

5. The GORIC illustrated

5.1. Analysis of variance (ANOVA)

In this section, we illustrate the GORIC with real data for which the descriptive statistics are available in Lievens and Sanchez (2007). They investigated the effect of training


on the quality of ratings made by consultants. One variable of interest is the signal detection accuracy index, which "refers to the extent to which individuals were accurate in discerning essential from nonessential competencies for a given job" and is measured by "standardized proportion of hits - standardized proportion of false alarms" (Lievens and Sanchez 2007, p. 817). Three groups of consultants are distinguished: 1) expert, 2) training, and 3) control. There are 21 raters in the expert group, 25 in the training group, and 26 in the control group. Hence, the ANOVA model can be written as Equation (1) with $t = 1$, $k' = 3$, and $N = \sum_{j=1}^{k} n_j = 21 + 25 + 26 = 72$, where $d_1$, $d_2$, and $d_3$ denote group membership variables. Since $t = 1$, we drop the first subscript in the index for ease of notation and use $\beta_j$ instead of $\beta_{1,j}$. Note that for $t = 1$ no iteration is required between $\tilde{B}_m$ and $\tilde{\Sigma}_m$ (see Section 3), and that $\Sigma$ does not need to be estimated to calculate the level probabilities (see Section 4).

The authors expected that accuracy of competency ratings would be higher among experts and trained raters than among raters in the control group (i.e., $\beta_1 \geq \beta_3$ and $\beta_2 \geq \beta_3$) and, furthermore, that it would be highest among raters who already had competency modeling experience (i.e., $\beta_1 \geq \beta_2$). These expectations can be represented by the hypothesis $H_1: \beta_1 \geq \beta_2 \geq \beta_3$. Another theory could be that the accuracy of the training group is at least twice as high as that of the control group and that the accuracy of the expert group is higher than that of the training group. This leads to $H_2: \beta_1 \geq \beta_2 \geq 2\beta_3$. Since both can be bad/weak hypotheses, it is informative to evaluate the unconstrained hypothesis ($H_u$) as well, in which there are no restrictions on the parameters. Its inclusion ensures that no weak hypothesis is selected, since $H_u$ will be preferred if the other two hypotheses are weak, that is, do not fit the data. The set of hypotheses, therefore, consists of

$H_1: \beta_1 \geq \beta_2 \geq \beta_3$,
$H_2: \beta_1 \geq \beta_2 \geq 2\beta_3$,
$H_u: \beta_1, \beta_2, \beta_3$.

Table 1 displays the order-restricted means $\tilde{\beta}_j^m$ in Equation (7), the log-likelihood values $\log f(Y | X, \tilde{B}_m, \tilde{\Sigma}_m)$, the penalty terms $PT_m$, and the GORIC values in Equation (4) for the three hypotheses of interest. Since the sample means are in accordance with the restrictions in all three hypotheses, the order-restricted means of these hypotheses are equal to the sample means. Therefore, the three hypotheses have the same log-likelihood, and the distinction between the three is based on the penalty, that is, the complexity of the hypotheses. Since $H_1$ is less complex than $H_2$ and $H_u$ (i.e., $PT_1 < PT_2$ and $PT_1 < PT_u$), $H_1$ is the preferred hypothesis. As a result, the first theory is preferred over the second and it is not a weak theory.

m    beta~_1^m   beta~_2^m   beta~_3^m   log f(Y|X, B~_m, Sigma~_m)   PT_m   GORIC_m
1    0.79        0.64        0.29        -24.85                       2.84   55.38 *
2    0.79        0.64        0.29        -24.85                       2.90   55.50
u    0.79        0.64        0.29        -24.85                       4.00   57.70

Note. The asterisk indicates the lowest value.

Table 1: GORIC of the three specified hypotheses.
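The penalty of $H_1$ in this ANOVA example can be approximated with the simulation steps of Section 4. The sketch below is written under stated assumptions rather than as the authors' implementation: because the only predictors are group-membership variables, $W$ is diagonal with entries proportional to $1/n_j$, so for the simple order the projection in simulation step 2 reduces to weighted antitonic regression, and the pool-adjacent-violators algorithm is swapped in for the quadratic programming routine. The function names, the seed, and the choice $T = 20{,}000$ are illustrative.

```python
import random


def antitonic_fit(z, w):
    """Weighted least-squares fit of z under b_1 >= b_2 >= ... >= b_k
    using the pool-adjacent-violators algorithm.
    Returns the fitted values and the number of levels (distinct blocks)."""
    blocks = []  # each block: [weighted mean, total weight, pooled element count]
    for value, weight in zip(z, w):
        blocks.append([value, weight, 1])
        # Pool neighbours while the required order is violated.
        while len(blocks) > 1 and blocks[-2][0] < blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(w1 * m1 + w2 * m2) / (w1 + w2), w1 + w2, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted, len(blocks)


def simulate_penalty(group_sizes, T, seed):
    """Estimate level probabilities w_1..w_k and PT = 1 + sum_l l * w_l
    for the simple order, following simulation steps 1 to 5."""
    rng = random.Random(seed)
    k = len(group_sizes)
    counts = [0] * (k + 1)
    for _ in range(T):
        # Step 1: z ~ N(0, W) with W = diag(1 / n_j); sigma^2 is set to 1.
        z = [rng.gauss(0.0, (1.0 / n) ** 0.5) for n in group_sizes]
        # Steps 2 and 3: project onto the cone and count the levels.
        _, levels = antitonic_fit(z, group_sizes)
        counts[levels] += 1
    probs = [c / T for c in counts]
    penalty = 1.0 + sum(l * probs[l] for l in range(1, k + 1))
    return probs[1:], penalty


level_probs, pt1 = simulate_penalty([21, 25, 26], T=20000, seed=123)
goric1 = -2.0 * (-24.85) + 2.0 * pt1  # Equation (4) with the Table 1 log-likelihood
print(level_probs, round(pt1, 2), round(goric1, 2))
```

Up to simulation error, the estimated penalty should be close to the value 2.84 reported for $H_1$ in Table 1; rerunning with another seed is the stability check suggested in simulation step 4.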


5.2. Multivariate analysis of variance (MANOVA)

In this section, we illustrate the GORIC with real data which are available on page 10 of Silvapulle and Sen (2005) and in a report prepared by Litton Bionetics Inc in 1984. These data were used in an experiment to find out whether vinylidene fluoride gives rise to liver damage. Since increased levels of serum enzyme are inherent in liver damage, the focus is on whether enzyme levels are affected by vinylidene fluoride.

Hence, the variable of interest is the serum enzyme level. Three types of enzymes are inspected, namely SDH, SGOT, and SGPT. To study whether vinylidene fluoride has an influence on the three serum enzymes, four dosages of this substance are examined. In each of these four treatment groups, ten male Fischer-344 rats received the substance. The MANOVA model can be written as Equation (1) with $t = 3$, $k' = 4$, and $N = 4 \times 10 = 40$. Hence, $(y_{1i}, y_{2i}, y_{3i})'$ denotes the observations on the three enzymes for rat $i$, $d_1$ to $d_4$ are the group membership variables, and $\beta_{h,j}$ denotes the mean response for dose $j$ and dependent variable $h$.

If vinylidene fluoride induces liver damage, we expect that each serum level increases with the dosage of the substance, see $H_1$ below. Another theory could be that there is no effect of dosage, see $H_0$ below. Since both can be bad/weak hypotheses, it is informative to evaluate the unconstrained hypothesis ($H_u$), in which there are no restrictions on the parameters. The set of hypotheses, therefore, comprises

$H_0: \beta_{h,1} = \beta_{h,2} = \beta_{h,3} = \beta_{h,4}$ for all $h = 1, 2, 3$,
$H_1: \beta_{h,1} \geq \beta_{h,2} \geq \beta_{h,3} \geq \beta_{h,4}$ for all $h = 1, 2, 3$,
$H_u: \beta_{h,1}, \beta_{h,2}, \beta_{h,3}, \beta_{h,4}$ for all $h = 1, 2, 3$.

Note that there are twelve parameters in total.

Since the covariance matrix $\Sigma$ is unknown, it is estimated from the data by the maximum likelihood estimator of $\Sigma$:

$$
\hat{\Sigma} = \begin{bmatrix} 10.79750 & -0.85750 & -0.07000 \\ -0.85750 & 226.75750 & 21.00500 \\ -0.07000 & 21.00500 & 24.67500 \end{bmatrix}.
$$

The formula of $\hat{\Sigma}$ is displayed in Equation (6). This estimate, $\hat{\Sigma}$, is used in determining the level probabilities (see Section 4).

     SDH (h = 1)                    SGOT (h = 2)                       SGPT (h = 3)
m    j=1    j=2    j=3    j=4      j=1     j=2     j=3     j=4       j=1    j=2    j=3    j=4
0    24.13  24.13  24.13  24.13    105.38  105.38  105.38  105.38    59.70  59.70  59.70  59.70
1    24.13  24.13  24.13  24.13    105.37  105.37  105.37  105.37    63.00  63.00  60.64  52.16
u    22.70  22.80  23.70  27.30     99.30  108.40  100.90  112.90    61.90  63.80  60.20  52.90

Table 2: The order-restricted means $\tilde{\beta}_{h,j}^m$ for predictor $j$, dependent variable $h$, and hypothesis $m$.

Table 2 displays the order-restricted means $\tilde{\beta}_{h,j}^m$ in Equation (7). Furthermore, Table 3 presents the log-likelihood values ($\log f(Y | X, \tilde{B}_m, \tilde{\Sigma}_m)$), the penalty terms ($PT_m$), and the GORIC values in Equation (4) for the three hypotheses of interest. The hypothesis with the lowest GORIC value is the preferred hypothesis. Thus, from this table it is concluded that


$H_u$ is the preferred hypothesis. Although $H_1$ is preferred over $H_0$, it is not preferred over the unconstrained hypothesis $H_u$. Hence, $H_1$ is a weak theory.

Acknowledgments

We are grateful to Mervyn J. Silvapulle for his comments. In addition, this research is financed by the Dutch Organization for Scientific Research NWO-VICI-453-05-002.

References

Anraku K (1999). "An Information Criterion for Parameters under a Simple Order Restriction." Biometrika, 86, 141–152.

Kuiper RM, Hoijtink H, Silvapulle MJ (2011). "An Akaike-type Information Criterion for Model Selection under Inequality Constraints." Biometrika, 98(2), 495–501. doi:10.1093/biomet/asr002.

Kuiper RM, Hoijtink H, Silvapulle MJ (unpublished). "Generalization of the Order-Restricted Information Criterion for Multivariate Normal Linear Models."

Lievens F, Sanchez JI (2007). "Can Training Improve the Quality of Inferences Made by Raters in Competence Modeling? A Quasi-Experiment." Journal of Applied Psychology, 92, 812–819.

Robertson T, Wright F, Dykstra R (1988). Order Restricted Statistical Inference. Chichester: Wiley.

Silvapulle MJ, Sen PK (2005). Constrained Statistical Inference. New Jersey: Wiley.

Visual Numerics (2003). IMSL Fortran Library User's Guide MATH/LIBRARY Volume 2 of 2: Mathematical Functions in Fortran.

m    log f(Y|X, B~_m, Sigma~_m)   PT_m    GORIC_m
0    -406.54                      4.00    821.09
1    -396.85                      7.48    808.66
u    -388.80                      13.00   803.61 *

Note. The asterisk indicates the lowest value.

Table 3: The GORIC values of the three specified hypotheses.
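The GORIC values of the MANOVA example can be rechecked from the printed log-likelihoods and penalties with Equation (4). The short Python sketch below does this and also recomputes the two penalties that need no simulation; the variable names are illustrative. Because the inputs are the two-decimal values as printed, the recomputed GORIC values can differ from the reported ones in the last digit.

```python
def goric(log_lik, penalty):
    """GORIC_m = -2 log f + 2 PT_m, see Equation (4)."""
    return -2.0 * log_lik + 2.0 * penalty


# (log-likelihood, penalty) per hypothesis, as printed (rounded to two decimals).
reported = {"0": (-406.54, 4.00), "1": (-396.85, 7.48), "u": (-388.80, 13.00)}

values = {m: goric(ll, pt) for m, (ll, pt) in reported.items()}
preferred = min(values, key=values.get)  # hypothesis with the lowest GORIC

# Penalties that follow directly from counting free beta parameters
# (assumption of this sketch: the nine equality restrictions are non-redundant).
t, k = 3, 4
pt_h0 = 1 + (t * k - 9)  # H0: nine equalities leave three free means
pt_hu = 1 + t * k        # Hu: all twelve means are free

for m in ("0", "1", "u"):
    print(m, round(values[m], 2))
print("preferred:", preferred, "| PT_0 =", pt_h0, "| PT_u =", pt_hu)
```

The counts 4 and 13 agree with the penalties reported for $H_0$ and $H_u$; only the penalty of $H_1$ depends on simulated level probabilities.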


A. GORIC.exe user manual

This user manual describes and illustrates the options available in GORIC.exe (published along with this article and also available at http://www.fss.uu.nl/ms/Kuiper/). It also includes a directory with the input and output files of the ANOVA and MANOVA examples given in this article. The program is written in Fortran 90 using the Intel Visual Fortran Compiler 9.1 for Windows. This compiler uses IMSL 5.0.

GORIC.exe is free; however, when results obtained with this program are published, please refer to this paper, Kuiper et al. (2011), and Kuiper et al. (unpublished).

A.1. GORIC.exe

In the software, the notation differs a bit from the one in Equation (1). First, all the $d$ and $x$ variables are combined, resulting in an $N \times k$ matrix $X$, as in Equation (3). Note that a group membership variable is obtained by filling in ones and zeros at the appropriate places in a predictor/vector. Furthermore, the order of the predictors is not of importance, that is, the group membership variables do not need to come first. In addition, when there are no group variables, one should include an intercept by adding a vector of ones to $X$. Second, the parameters are taken together as well, leading to a vector of $tk$ parameters $\beta$ with indices 1 to $tk$. Notably, when $k = 0$, they will be denoted by $\theta$, a vector of $t$ variable/group means. The order of the parameters corresponds to the order of the $k$ predictors and the order of the $t$ dependent variables. Namely, the first $k$ parameters belong to the first dependent variable, and so on, and the last $k$ parameters belong to the last one. Stated differently, $(\beta_1, \ldots, \beta_k, \ldots, \beta_{(t-1)k+1}, \ldots, \beta_{tk})$ corresponds to $\beta = (\beta_{1,1}, \ldots, \beta_{1,k}, \ldots, \beta_{t,1}, \ldots, \beta_{t,k})$. Bear in mind that $\beta_1, \beta_{k+1}, \ldots, \beta_{(t-1)k+1}$ reflect the intercepts when the first column of $X$ consists of ones.

As discussed in Step 4 in Section 3, we stop iterating when the elements of $|\tilde{\beta}_m^p - \tilde{\beta}_m^{p-1}|$ and $|\tilde{\Sigma}_m^p - \tilde{\Sigma}_m^{p-1}|$ are less than $C$ = 1e-10. To limit computing time, however, $C$ is relaxed to $C$ = 1e-9 after 50,000 iterations and to $C$ = 1e-8 after 100,000 iterations. When convergence is still not achieved after 200,000 iterations, the program uses the current estimates $\tilde{\beta}_m^P$ and $\tilde{\Sigma}_m^P$ and displays these estimates together with $\tilde{\beta}_m^{P-1}$ and $\tilde{\Sigma}_m^{P-1}$ in the DOS box and the output file. The consequence of relaxing $C$ is that the procedure might not result in good approximations of $\tilde{\beta}_m$ and $\tilde{\Sigma}_m$. However, slow convergence only occurs when the hypothesis of interest does not fit the data.

A.2. Modification input files

No matter what analysis should be performed, two text files have to be modified (such that they apply to your data), namely Input.txt and Data.txt.

It should be noted that:

• The names of the text files are fixed and cannot be changed. These files have to be text files (also known as ASCII files).

• The format of these files should not be changed, that is, do not add empty lines and do not delete lines containing labels.

• The data in Data.txt should be complete, that is, missing data are not allowed.
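The parameter ordering described in Section A.1 determines which column of a restriction matrix belongs to which parameter. As a minimal sketch (the helper names are hypothetical and not part of GORIC.exe), the position $l$ of $\beta_{h,j}$ in the stacked vector $(\beta_1, \ldots, \beta_{tk})$ is $l = (h - 1)k + j$:

```python
def flat_index(h, j, k):
    """1-based position l of beta_{h,j} in (beta_1, ..., beta_tk):
    the first k parameters belong to dependent variable 1, the next k to
    dependent variable 2, and so on."""
    return (h - 1) * k + j


def intercept_positions(t, k):
    """Positions beta_1, beta_{k+1}, ..., beta_{(t-1)k+1}; these are the
    intercepts when the first column of X consists of ones."""
    return [flat_index(h, 1, k) for h in range(1, t + 1)]


# Example with t = 3 dependent variables and k = 4 predictors.
t, k = 3, 4
print(flat_index(2, 1, k))        # position of beta_{2,1}
print(flat_index(3, 4, k))        # position of beta_{3,4}
print(intercept_positions(t, k))  # positions of the first predictor's parameters
```

With $t = 3$ and $k = 4$, $\beta_{2,1}$ lands at position 5 and $\beta_{3,4}$ at position 12, so a restriction on these parameters goes in columns 5 and 12 of the restriction matrix.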


Data.txt

The file Data.txt looks as follows (in the MANOVA example):

    18 101 65 1 0 0 0
    ...
    27 88 56 1 0 0 0
    25 113 65 0 1 0 0
    ...
    27 98 65 0 1 0 0
    22 88 54 0 0 1 0
    ...
    21 107 61 0 0 1 0
    31 104 57 0 0 0 1
    ...
    29 99 48 0 0 0 1

In the data file, an $N \times (t + k)$ matrix must be given. The $t$ dependent variables must be given first, followed by the $k$ predictors. In this example, the predictors only consist of group membership variables, denoted by $d$ in Equation (1). In case there are no group membership variables, a vector of ones should be included, which represents the intercept.

It should be stressed that a dot (".") should be used as decimal separator. When a comma (",") is used, only the number preceding it is read (e.g., "1,9" is read as "1"). Furthermore, text or extra hard returns/enters should not be added to Data.txt.

Input.txt

The file Input.txt looks as follows (in the MANOVA example):

    t k intercept N Stand x Stand y
    3 4 0 40 0 0
    Seed T
    123 100000
    M
    3
    Number of Equality (c_2) and Order (c_1) Restrictions for Each Model
    (resulting in M lines with 2 numbers)
    9 0
    0 9
    0 0
    R for Model 1
    1 -1 0 0 0 0 0 0 0 0 0 0
    ...

    0 0 0 0 0 0 0 0 0 0 1 -1
    R for Model 2
    1 -1 0 0 0 0 0 0 0 0 0 0
    ...
    0 0 0 0 0 0 0 0 0 0 1 -1
    R for Model 3
    r for Model 1
    0
    ...
    0
    r for Model 2
    0
    ...
    0
    r for Model 3

t, k, and N: t is the number of dependent variables, k the number of predictors, and N the number of observations, see Section 2.1; for k see also the item below.

intercept: This should be a 1 if you want the software to incorporate the intercept and a 0 when you do not.
When you want the software to add a vector of ones to the set of predictors, the software will change $k$ into $k + 1$. Consequently, the restrictions should be given for $t(k + 1)$ parameters as opposed to $tk$. Note that the first parameter will represent the intercept.
When your data (represented by the $N \times k$ matrix $X$) include a vector of ones, the number of predictors ($k$) should include the intercept (see Section 2.1). In that case, "intercept" should be set to 0, otherwise the program will fail to continue.

Stand x and Stand y: If you set "Stand x" to 1, the predictors ($X$) will be standardized. The analogue holds true for "Stand y".
The parameters regarding the same dependent variable are only comparable when the $x$ variables are standardized (see Section 2.1). Additionally, the parameters belonging to different dependent variables can solely be examined if both the dependent variables and the corresponding $x$ variables (if any) are standardized.

Seed and T: The seed value is represented by "Seed" and the number of iterations required for computing the penalty part of the GORIC by T. These are discussed in simulation step 4 in Section 4.

M, c_2, and c_1: M denotes the number of models/hypotheses, and c_2 and c_1 denote the number of equality restrictions ($c_{m2}$) and order restrictions ($c_{m1}$), respectively, see Section 2.2.

R and r: R is the restriction matrix and equals $[R_2', R_1']'$, and r is the right hand side and equals $[r_2', r_1']'$ (see Section 2.2).
The models are of the form $H_m: R_2 \beta = r_2, R_1 \beta \geq r_1$. It should be stressed that the order of the restrictions is of importance: the $c_2$ equality restrictions must be given


first and the $c_1$ order restrictions second.
One must give a restriction matrix ($R = [R_2', R_1']'$) and a right-hand side ($r = [r_2', r_1']'$) for each model. Hence, you need to fill in $M$ restriction matrices, each with a heading, and then $M$ right-hand side vectors, each with a heading. Note that there is only a heading when there are no restrictions, that is, in case of the unconstrained model.
Bear in mind that the ordering of the columns in the restriction matrix depends on the ordering of the parameters. In the software, the first $k$ parameters belong to the first dependent variable ($h = 1$), ..., and the last $k$ to the last dependent variable ($h = t$). Hence, in the example, $\beta_1$ corresponds to $\beta_{1,1}$, $\beta_2$ to $\beta_{1,2}$, ..., $\beta_5$ to $\beta_{2,1}$, ..., and $\beta_{12}$ to $\beta_{3,4}$.

As in Data.txt, text or extra hard returns/enters should not be added to Input.txt, except for headings for additional models.

A.3. Error messages

In the program GORIC.exe, error messages are incorporated to detect wrongly stated input. However, it is possible to make a mistake that we have not foreseen. In that case, check the input and compare it to the data. If you cannot solve the problem, send the input and data file to r.m.kuiper@uu.nl.

The requirement that $R = [R_2', R_1']'$ should be of full rank when $r = [r_2', r_1']' \neq 0$ (see Kuiper et al. (2011) and Section 4) is checked in the software. However, note that $R$ is not examined for redundant restrictions. Therefore, the software does not detect hypotheses that are not relocated closed convex cones. A warning appears when $R$ is not of full rank while $r \neq 0$, and the user is asked to investigate whether the additional restrictions are redundant. By pressing the enter button, the program proceeds.
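As an illustration of the column ordering and the full-rank requirement described above, the following Python sketch builds rows of a restriction matrix and checks its rank before the input is handed to the program. It is not part of GORIC.exe: the function names `column_index`, `difference_row`, `matrix_rank`, and `has_full_row_rank` are hypothetical, and "full rank" is assumed here to mean full row rank.

```python
from fractions import Fraction


def column_index(h, j, k):
    """0-based column of beta_{h,j}, assuming the k parameters of each
    dependent variable h = 1..t are stored consecutively."""
    return (h - 1) * k + (j - 1)


def difference_row(h, j1, j2, t, k):
    """Row of R encoding beta_{h,j1} - beta_{h,j2} (to be compared with r)."""
    row = [0] * (t * k)
    row[column_index(h, j1, k)] = 1
    row[column_index(h, j2, k)] = -1
    return row


def matrix_rank(rows):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((i for i in range(rank, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, len(m)):
            factor = m[i][col] / m[rank][col]
            if factor != 0:
                m[i] = [a - factor * b for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def has_full_row_rank(rows):
    """True when no row of R is a linear combination of the others."""
    return matrix_rank(rows) == len(rows)


# MANOVA example: t = 3 dependent variables, k = 4 groups.
t, k = 3, 4
R = [difference_row(1, 1, 2, t, k), difference_row(3, 3, 4, t, k)]
print(R[1])                             # last two columns hold 1 and -1
print(has_full_row_rank(R))             # two independent restrictions
print(has_full_row_rank([[1], [-1]]))   # range restriction on one parameter
```

With $t = 3$ and $k = 4$, `column_index(2, 1, 4)` returns 4, that is, the fifth column, in line with $\beta_5$ corresponding to $\beta_{2,1}$. The last call shows that a range restriction on a single parameter yields a matrix that is not of full row rank, which is the kind of situation the warning is meant to flag.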
It should be stressed that the program stops without a warning in case of conflicting restrictions (e.g., $H_m: \beta_l \leq -r_{11}, \beta_l \geq r_{11}$ for $r_{11} > 0$).
Moreover, the GORIC is calculated in the presence of non-redundant restrictions, like range restrictions (e.g., $H_m: \beta_l \geq -r_{11}, \beta_l \leq r_{11}$ for $r_{11} > 0$), which do not form a (relocated) closed convex cone. In that case, the GORIC should be interpreted with care for two reasons. First, the GORIC is not (yet) defined for these types of restrictions. Second, the level probabilities are then no longer invariant with respect to $\beta_0$ and $\sigma^2$. In the software, we use $\beta_0 = 0$. As a consequence, $H_m: \beta_l = 0$ is examined in determining the penalty.

A.4. Save and close

When you have modified Input.txt and Data.txt (such that they apply to your data), you should save and close them.

A.5. Run GORIC.exe

When GORIC.exe is completed, the output file Output.txt will be created in the folder you are working in.

Output.txt

The output is given in Output.txt and will look as follows (in case of the MANOVA example):


    This program is free. However, when results obtained with this program are
    published, please refer to:

    Rebecca M. Kuiper, Herbert Hoijtink, and Mervyn J. Silvapulle (2011).
    An Akaike-type Information Criterion for Model Selection under Inequality
    Constraints. Biometrika, 98 (2), 495-501.

    Rebecca M. Kuiper, Herbert Hoijtink, and Mervyn J. Silvapulle (unpublished).
    Generalization of the Order-Restricted Information Criterion for Multivariate
    Normal Linear Models.

    Rebecca M. Kuiper and Herbert Hoijtink (unpublished).
    A Fortran 90 Program for the Generalization of the Order-Restricted
    Information Criterion.

    N.B. The latter is included in this software and the second is available upon
    request (R.M.Kuiper@uu.nl).

    - Summary of observed data -

    - Number of observations (N) -
    N = 40

    - Sigma estimated from the data -
    h, estimated Sigma
    1   10.79750   -0.85750   -0.07000
    2   -0.85750  226.75750   21.00500
    3   -0.07000   21.00500   24.67500

    - Order-restricted betas -
    Note that the first 4 parameters belong to the first dependent variable, ...,
    and the last 4 to the last dependent variable.

    Group number:       1      2      3      4      5      6      7      8      9     10
                       11     12
    Sample betas:   22.70  22.80  23.70  27.30  99.30 108.40 100.90 112.90  61.90  63.80
                    60.20  52.90
    Hypothesis 1    24.13  24.13  24.13  24.13 105.38 105.38 105.38 105.38  59.70  59.70
                    59.70  59.70
    Hypothesis 2    24.13  24.13  24.13  24.13 105.37 105.37 105.37 105.37  63.00  63.00
                    60.64  52.16
    Hypothesis 3    22.70  22.80  23.70  27.30  99.30 108.40 100.90 112.90  61.90  63.80
                    60.20  52.90

    - GORIC -
    The value of the Generalized Order-Restricted Information Criterion (GORIC) =
    -2 * log likelihood + 2 * penalty:
    for Hypothesis 1, GORIC = -2 * -406.54 + 2 * 4.00 = 821.09
    for Hypothesis 2, GORIC = -2 * -396.85 + 2 * 7.48 = 808.66
    for Hypothesis 3, GORIC = -2 * -388.80 + 2 * 13.00 = 803.61

    According to the Order-Restricted Information Criterion,
    out of the set of hypotheses the preferred one is number 3,
    which is the unconstrained model, that is, the model without restrictions on
    the parameters.

Number of observations (N): See Section 2.1.

Sigma estimated from the data: In the software, $\Sigma$ is estimated by $\hat{\Sigma}$ in Equation (6), the maximum likelihood estimator of $\Sigma$. Bear in mind that $\Sigma$ is only estimated when $t > 1$. For more details, see Section 4.

Order-restricted betas: The order-restricted $\beta$s can be found in Equation (7); see also Section A.1. Note that the subscripts are 1 to 12 in the software, where $\tilde{\beta}^m_1$ corresponds to $\tilde{\beta}^m_{1,1}$, $\tilde{\beta}^m_2$ to $\tilde{\beta}^m_{1,2}$, ..., $\tilde{\beta}^m_5$ to $\tilde{\beta}^m_{2,1}$, ..., and $\tilde{\beta}^m_{12}$ to $\tilde{\beta}^m_{3,4}$ in Equation (1).

GORIC: The expression of the GORIC is displayed in Equation (4).
The model/hypothesis with the lowest GORIC value is the preferred one: hypothesis “number 3”, that is, $H_u: \beta_1, \beta_2, \beta_3, \beta_4, \beta_5, \beta_6, \beta_7, \beta_8, \beta_9, \beta_{10}, \beta_{11}, \beta_{12}$.
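The arithmetic in the GORIC part of Output.txt can be retraced with a few lines of Python. This is a minimal sketch, not the program's own code: the function name `goric` and the dictionary `reported` are hypothetical, and the inputs are the rounded log likelihoods and penalties printed above.

```python
def goric(log_likelihood, penalty):
    """GORIC = -2 * log likelihood + 2 * penalty."""
    return -2.0 * log_likelihood + 2.0 * penalty


# (log likelihood, penalty) per hypothesis, copied from Output.txt.
reported = {
    1: (-406.54, 4.00),
    2: (-396.85, 7.48),
    3: (-388.80, 13.00),
}

values = {m: goric(ll, pen) for m, (ll, pen) in reported.items()}

# The hypothesis with the lowest GORIC value is the preferred one.
preferred = min(values, key=values.get)

for m in sorted(values):
    print(f"Hypothesis {m}: GORIC = {values[m]:.2f}")
print(f"Preferred hypothesis: {preferred}")
```

Recomputing from the two-decimal inputs gives values that differ by 0.01 from the printed ones for hypotheses 1 and 3, presumably because the program works with unrounded quantities; the ordering of the hypotheses is unaffected.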


Affiliation:
Rebecca M. Kuiper and Herbert Hoijtink
Department of Methodology and Statistics
Utrecht University
P.O. Box 80140
3508TC Utrecht, the Netherlands
E-mail: r.m.kuiper@uu.nl
URL: http://www.fss.uu.nl/ms/Kuiper/
