# Ordered Alternatives

Again consider the null hypothesis of equal distributions. Test this hypothesis vs. the ordered alternative hypothesis

$$H_A : F_i(x) \geq F_{i+1}(x) \quad \forall x$$

for all indices $i \in \{1, \ldots, K-1\}$, with strict inequality at some $x$ and some $i$.

This alternative restricts the parameter space to a small fraction of its former size. One might use as the test statistic

$$J = \sum_{i<j} U_{ij}, \qquad (4.19)$$

where $U_{ij}$ is the Mann-Whitney-Wilcoxon statistic for testing group $i$ vs. group $j$. Reject the null hypothesis when $J$ is large. This statistic may be expressed as $\sum_{k=1}^{K} c_k \bar{R}_{k\cdot}$ plus a constant, for some $c_k$ satisfying (4.5); that is, $J$ may be defined as a contrast of the rank means, and the approach of this subsection may be viewed as the analog of the parametric approach of §4.1.1.

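Critical values for $J$ can be calibrated using a Gaussian approximation. Under the null hypothesis, the expectation of $J$ is

$$E_0[J] = \sum_{i<j} \frac{M_i M_j}{2} = \frac{N^2 - \sum_{i=1}^{K} M_i^2}{4},$$

and the variance is

$$\mathrm{Var}_0[J] = \mathrm{Var}_0\Big[\sum_{i=2}^{K} U_i\Big] = \sum_{i=2}^{K} \mathrm{Var}_0[U_i] = \sum_{i=2}^{K} \frac{M_i \, m_{i-1} (m_i + 1)}{12}. \qquad (4.20)$$

Here $U_i$ is the Mann-Whitney statistic for testing group $i$ vs. all preceding groups combined, and $m_i = \sum_{j=1}^{i} M_j$. The second equality in (4.20) follows from independence of the values $U_i$ (Terpstra, 1952). A simpler expression for this variance is

$$\mathrm{Var}_0[J] = \frac{N^2 (2N + 3) - \sum_{i=1}^{K} M_i^2 (2 M_i + 3)}{72}. \qquad (4.21)$$

This test might be corrected for ties, and has certain other desirable properties (Terpstra, 1952).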

Jonckheere (1954), apparently independently, suggested a statistic that is twice J, centered to have zero expectation, and calculated the variance, skewness, and kurtosis. The resulting test is generally called the Jonckheere-Terpstra test.
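The calculation described above can be sketched in base R. This is a minimal illustration, not code from the text: the function name `jt_gaussian` is hypothetical, ties are given half credit by assumption, and the null mean $(N^2 - \sum_k M_k^2)/4$ and variance $[N^2(2N+3) - \sum_k M_k^2(2M_k+3)]/72$ are the standard moments for untied data, so no tie correction is applied.

```r
# Jonckheere-Terpstra statistic with a one-sided Gaussian-approximation p-value.
# x: numeric responses; g: group labels whose sorted order is the hypothesized order.
jt_gaussian <- function(x, g) {
  g <- as.integer(factor(g))            # recode groups as 1..K
  K <- max(g)
  M <- tabulate(g, K)                   # group sizes
  N <- sum(M)
  J <- 0
  for (i in seq_len(K - 1)) for (j in (i + 1):K) {
    xi <- x[g == i]
    xj <- x[g == j]
    # U_ij: number of pairs with the group-j value larger; ties count 1/2 (assumption)
    J <- J + sum(outer(xi, xj, "<")) + 0.5 * sum(outer(xi, xj, "=="))
  }
  mean0 <- (N^2 - sum(M^2)) / 4
  var0 <- (N^2 * (2 * N + 3) - sum(M^2 * (2 * M + 3))) / 72
  z <- (J - mean0) / sqrt(var0)
  list(J = J, mean0 = mean0, var0 = var0,
       p.value = pnorm(z, lower.tail = FALSE))   # reject for large J
}

res <- jt_gaussian(c(1, 2, 3, 4, 5, 6, 7, 8, 9), rep(1:3, each = 3))
```

For small samples or tied data, a permutation-based or tie-corrected implementation would be the safer choice.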

Example 4.6.1 Consider again the Maize data from area TEAN in Example 4.3.1. The treatment variable contains three digits; the first indicates nitrogen level, with four levels, and is extracted in the code in Example 4.3.1. Apply the Jonckheere-Terpstra test, and the comparative three degree of freedom Kruskal-Wallis test, using

library(clinfun) # For the Jonckheere-Terpstra test

jonckheere.test(tean\$wght,tean\$nitrogen)

cat(' K-W Test for Maize, to compare with JT ')

kruskal.test(tean\$wght,tean\$nitrogen)

The Jonckheere-Terpstra test gives a p-value of 0.3274, as compared with the Kruskal-Wallis p-value of 0.4994.

# Powers of Tests

This section considers powers of tests calculated from linear and quadratic combinations of indicators of pairwise orderings, $I(X_{ik} < X_{jl})$, comparing observation $k$ in group $i$ with observation $l$ in group $j$. The Jonckheere-Terpstra statistic (4.19) is of this form, as is the Kruskal-Wallis statistic (4.17), since the constituent rank sums can be written in terms of pairwise variable comparisons. Powers will be expressed in terms of the expectations

$$\kappa_{ij} = P[X_{ik} < X_{jl}]. \qquad (4.22)$$

Under the null hypothesis of equal populations, $\kappa_{ij} = 1/2$ for all $i \neq j$.

In the case of multidimensional alternative hypotheses, effect size and efficiency calculations are more difficult than in earlier one-dimensional cases. In the case with $K$ ordered categories, there are effectively $K - 1$ identifiable parameters: because the location of the underlying common null distribution for the data is unspecified, group location parameters $\theta_j$ can all be increased or decreased by the same constant amount while leaving the underlying model unchanged. On the other hand, the notion of relative efficiency requires calculating an alternative parameter value corresponding to, at least approximately, the desired power, as specified by (2.23). This single equation can determine only a single parameter value, and so relative efficiency calculations in this section will consider alternative hypotheses of the form

$$\theta^A = \Delta \theta^\dagger \qquad (4.23)$$

for a fixed direction $\theta^\dagger$. Arguments requiring solving for the alternative will reduce to solving for $\Delta$. The null hypothesis is still $\theta = 0$.

## Power of Tests for Ordered Alternatives

The statistic $J$ of (4.19) has an approximate Gaussian distribution, and so powers of tests of ordered alternatives based on $J$ are approximated by (2.18). Under both null and alternative hypotheses,

$$E[J] = \sum_{i<j} M_i M_j \kappa_{ij}, \qquad (4.24)$$

with $\kappa_{ij}$ defined as in (4.22). Alternative values for $\kappa_{ij}$ under shift models (4.2) are calculated as in (3.25). Without loss of generality, one may take $\theta_1 = 0$.

Consider parallels with the two-group setup of §3. The cumulative distribution function $F_1$ of (4.2) corresponds to $F$ of (3.1), and $F_2$ corresponds to $G$ of (3.1). Then $\mu(\theta)$ of (3.25) corresponds to $\kappa_{12}$. Calculations of $\kappa_{kl}$, defined in (4.22), applied to particular pairs of distributions, such as the Gaussian in (3.26) and the logistic in (3.27), and other calculations from the exercises of §3, hold in this case as well. Each of the difference probabilities $\kappa_{kl}$, for $k \neq l$, depends on the alternative distribution only through $\theta_l - \theta_k$.
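As a rough numerical check on this correspondence, the probability that an observation from one group falls below an observation from a group shifted by $\delta$ can be obtained by one-dimensional integration. The helper name `kappa_shift` is hypothetical, and the sketch assumes a pure location shift of a common continuous distribution; the Gaussian closed form $\Phi(\delta/\sqrt{2})$ is included for comparison.

```r
# kappa(delta) = P[X < Y + delta], for X and Y independent draws from the same
# continuous distribution with density ddist and distribution function pdist.
kappa_shift <- function(delta, pdist = pnorm, ddist = dnorm) {
  integrate(function(y) pdist(y + delta) * ddist(y), -Inf, Inf)$value
}

kappa_gauss <- kappa_shift(0.5)                  # numerical integral, Gaussian case
closed_form <- pnorm(0.5 / sqrt(2))              # Gaussian closed form
kappa_logis <- kappa_shift(0.5, plogis, dlogis)  # logistic case, same shift
```

With zero shift the function returns 1/2 up to integration error, matching the null value of the pairwise probabilities.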

Power may be calculated from (2.18).

Example 4.7.1 Consider $K = 3$ groups of observations, Gaussian, with unit variance and expectations $\theta_1 = 0$, $\theta_2 = 1/2$, and $\theta_3 = 1$, and all groups of size $M_i = 20$. Consider a one-sided level 0.025 test. Applying (3.26), $\kappa_{12} = \kappa_{23} = \Phi(0.5/\sqrt{2}) = 0.638$, and $\kappa_{13} = \Phi(1/\sqrt{2}) = 0.760$. The null and alternative expectations of $J$ are $20^2 \times 3/2 = 600$ and $20^2 (\kappa_{12} + \kappa_{13} + \kappa_{23}) = 814.6$ respectively, from (4.24). The null variance of $J$, from (4.21), is $(60^2 \times 123 - 3 \times 20^2 \times 43)/72 = 5433.3$. Applying (2.18), power is $1 - \Phi((600 - 814.6)/\sqrt{5433.3} + 1.96) = 0.829$. I used (2.18), rather than (2.20), since the variance of the distribution of the statistic was most naturally given above without division by sample size, and rather than (2.17), because calculating the statistic variance under the alternative is tedious.

This may be computed in R using

library(MultNonParam)

terpstrapower(rep(20,3),(0:2)/2,"normal")

and the approximate power may be compared to a value determined by Monte Carlo; this value is 0.857.
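The arithmetic of Example 4.7.1 can also be traced step by step in base R, without the package call. This sketch assumes Gaussian shift alternatives, so that each pairwise probability is $\Phi((\theta_j - \theta_i)/\sqrt{2})$, and uses the null variance for the alternative as well; the variable names are illustrative only.

```r
M <- rep(20, 3)                  # group sizes
theta <- c(0, 0.5, 1)            # group locations
alpha <- 0.025                   # one-sided level
N <- sum(M)

pairs <- t(combn(length(M), 2))  # all index pairs (i, j) with i < j
kappa <- pnorm((theta[pairs[, 2]] - theta[pairs[, 1]]) / sqrt(2))

mean0 <- sum(M[pairs[, 1]] * M[pairs[, 2]]) / 2        # null expectation of J
meanA <- sum(M[pairs[, 1]] * M[pairs[, 2]] * kappa)    # alternative expectation
var0 <- (N^2 * (2 * N + 3) - sum(M^2 * (2 * M + 3))) / 72

# Gaussian approximation to power, with the null variance used throughout
power <- 1 - pnorm((mean0 - meanA) / sqrt(var0) + qnorm(1 - alpha))
```

Changing `M` or `theta` shows how the approximate power responds to unequal group sizes or unequal spacing.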

## Power of Tests for Unordered Alternatives

Power for unordered alternatives is not most directly calculated as an extension of (2.22). In this unordered case, as noted above, the approximate null distribution for $W_H$ is $\chi^2_{K-1}$. One might attempt to act by analogy with (2.17), and calculate power using the alternative hypothesis expectation and variance matrix. This is possible, but difficult, since $W_H$ (and the other approximately $\chi^2_{K-1}$ statistics in this chapter) are standardized using the null hypothesis variance structure. Rescaling this under the alternative hypothesis gives a statistic that may be represented approximately as the sum of squares of independent Gaussian random variables; however, not only do these variables have non-zero means, which is easily addressed using a non-central $\chi^2$ argument as in (1.3), but they also have unequal variances; an analog to (1.3) would have unequal weights attached to the squared summands.

A much easier path to power involves analogy with (2.22): Approximate the alternative distribution using the null variance structure. This results in the non-central chi-square approximation using (1.4).

Because the Mann-Whitney and Wilcoxon statistics differ only by an additive constant, the Kruskal-Wallis test may be re-expressed as

$$W_H = \sum_{k=1}^{K} \frac{\left(T_k - M_k (N - M_k) \kappa^\circ\right)^2}{\psi^2 (N + 1) N M_k}. \qquad (4.25)$$

Here $\kappa^\circ = 1/2$; this is the null value of $\kappa_{kl}$, and the null hypothesis specifies that this does not depend on $k$ or $l$, and $\psi = 1/\sqrt{12}$, a multiplicative constant arising in the variance of the Mann-Whitney-Wilcoxon statistic. Many of the following equations follow from (4.25); furthermore, (4.25) also approximately describes other statistics to be considered later, and analogous consequences may be drawn for these statistics as well, with a different value for $\psi$. Hence the additional complication of leaving a variable in (4.25) whose value is known will be justified by using consequences of (4.25) later without deriving them again. Here,

$$T_k = \sum_{l \neq k} \sum_{i=1}^{M_k} \sum_{j=1}^{M_l} I(X_{ki} < X_{lj}) \qquad (4.26)$$

is the Mann-Whitney statistic for testing whether group $k$ differs from all of the other groups, with all of the other groups collapsed.

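The variance matrix for the rank sums comprising $W_H$ is singular (that is, it does not have an inverse), and the argument justifying (1.4) relied on the presence of an inverse. The argument of §4.2.2 calculated the appropriate quadratic form, dropping one of the categories to obtain an invertible variance matrix, and then showed that this quadratic form is the same as that generating $W_H$. The same argument shows that the appropriate non-centrality parameter is

$$\xi = \sum_{k=1}^{K} \frac{\left(E_A[T_k] - M_k (N - M_k) \kappa^\circ\right)^2}{\psi^2 (N + 1) N M_k},$$

where $E_A[T_k] = M_k \sum_{l=1, l \neq k}^{K} M_l \kappa_{kl}$. The non-centrality parameter is

$$\xi = \frac{1}{\psi^2 N (N + 1)} \sum_{k=1}^{K} M_k \Big( \sum_{l=1, l \neq k}^{K} M_l (\kappa_{kl} - \kappa^\circ) \Big)^2. \qquad (4.27)$$

The restriction $l \neq k$ on the range of the inner summation may be dropped in (4.27), because, with the convention $\kappa_{kk} = \kappa^\circ$, the additional term is zero.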

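Let $G_k(\cdot\,; \xi)$ and $G_k^{-1}(\cdot\,; \xi)$ be the chi-square cumulative distribution and quantile functions respectively, with $k$ degrees of freedom and non-centrality parameter $\xi$, as in §4.3.1. The power for the Kruskal-Wallis test of level $\alpha$ with $K$ groups, under alternative (4.2), is approximately

$$1 - G_{K-1}\big(G_{K-1}^{-1}(1 - \alpha; 0); \xi\big), \qquad (4.28)$$

with $\xi$ given by (4.27).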

Example 4.7.2 Continue Example 4.7.1. Again, consider $K = 3$ groups of observations, Gaussian, with unit variance and expectations $\theta_1 = 0$, $\theta_2 = 1/2$, and $\theta_3 = 1$, and all groups of size $M_k = 20$. The three inner sums in (4.27) are $20 \times (0.5 - 0.5) + 20 \times (0.638 - 0.5) + 20 \times (0.760 - 0.5) = 7.968$, $20 \times (0.362 - 0.5) + 20 \times (0.5 - 0.5) + 20 \times (0.638 - 0.5) = 0$, and $20 \times (0.240 - 0.5) + 20 \times (0.362 - 0.5) + 20 \times (0.5 - 0.5) = -7.968$. Squaring, multiplying each of these by $M_k = 20$, and adding gives 2539.6. Multiplying by $12/(60 \times 61)$ gives 8.327. The critical value for a test of level 0.05 is given by the $\chi^2$ distribution with 2 degrees of freedom, and is 5.99. The tail probability associated with the non-central $\chi^2$ distribution with non-centrality parameter 8.327 and two degrees of freedom, beyond 5.99, is 0.736; this is the power for the test. As expected, this power is less than that given in Example 4.7.1 for the Jonckheere-Terpstra test. This might be compared with a Monte Carlo approximation of 0.770.

This may be computed using the R package MultNonParam using

kwpower(rep(20,3),(0:2)/2,"normal")
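The steps of Example 4.7.2 can be reproduced directly with the non-central chi-square functions in base R. This is a sketch under the same Gaussian shift assumptions as the example; the matrix `kappa` and the other variable names are illustrative, and the diagonal entries are taken to equal the null value 1/2.

```r
M <- rep(20, 3)
theta <- c(0, 0.5, 1)
alpha <- 0.05
N <- sum(M)
K <- length(M)

# kappa[k, l] = P[X_k < X_l] = Phi((theta_l - theta_k) / sqrt(2)); diagonal is 1/2
kappa <- pnorm(outer(theta, theta, function(a, b) b - a) / sqrt(2))

inner <- as.vector((kappa - 0.5) %*% M)          # inner sums, one per group
xi <- 12 * sum(M * inner^2) / (N * (N + 1))      # non-centrality parameter

crit <- qchisq(1 - alpha, df = K - 1)            # central chi-square critical value
power <- pchisq(crit, df = K - 1, ncp = xi, lower.tail = FALSE)
```

The factor 12 is $1/\psi^2$ for the Kruskal-Wallis statistic; a different norming constant would replace it for other statistics of the same form.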

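Relation (4.27) may be approximated to give a simpler relation between the non-centrality parameter and sample size, allowing for the calculation of the sample size producing a desired power, denoted in this subsection as $1 - \beta$. In (4.28), sample size enters only through the non-centrality parameter. As in standard one-dimensional sample size calculations, re-express the relation between power and non-centrality as

$$G_{K-1}\big(G_{K-1}^{-1}(1 - \alpha; 0); \xi\big) = \beta. \qquad (4.29)$$

From (4.27),

$$\xi \approx \frac{N}{\psi^2} \sum_{k=1}^{K} \lambda_k \Big( \sum_{l=1}^{K} \lambda_l (\kappa_{kl} - \kappa^\circ) \Big)^2 \qquad (4.30)$$

for $\psi = 1/\sqrt{12}$, and for $\lambda_k = \lim_{N \to \infty} M_k/N$. Assume that $\lambda_k > 0$ for all $k$. Then

$$N \approx \frac{\psi^2 \xi}{\sum_{k=1}^{K} \lambda_k \big( \sum_{l=1}^{K} \lambda_l (\kappa_{kl} - \kappa^\circ) \big)^2}. \qquad (4.31)$$

Hence, to determine the sample size needed for a level $\alpha$ test to obtain power $1 - \beta$, for an alternative with group differences $\kappa_{kl}$, and with $K$ groups in proportions $\lambda_k$, first solve (4.29) for $\xi$, and then apply (4.31). An old-style approach to solving (4.29) involves examining tables of the sort in Haynam et al. (1982a) and Haynam et al. (1982b).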

Example 4.7.3 Again, consider $K = 3$ groups of observations, Gaussian, with unit variance and expectations $\theta_1 = 0$, $\theta_2 = 1/2$, and $\theta_3 = 1$. Calculate the sample size needed for the level $\alpha = 0.05$ Kruskal-Wallis test involving equal-sized groups to obtain power 0.8. Quantities $\kappa_{kl}$ were calculated in Examples 4.7.1 and 4.7.2, and take the values 0.240, 0.362, 0.5, 0.638, and 0.760. The three inner sums in (4.31) are $(0.5 - 0.5)/3 + (0.638 - 0.5)/3 + (0.760 - 0.5)/3 = 0.133$, $(0.362 - 0.5)/3 + (0.5 - 0.5)/3 + (0.638 - 0.5)/3 = 0$, and $(0.240 - 0.5)/3 + (0.362 - 0.5)/3 + (0.5 - 0.5)/3 = -0.133$. Squaring, multiplying each of these by $\lambda_k = 1/3$, adding, and multiplying by 12 gives 0.1408. Solving the equation (4.29) gives $\xi = 9.63$, and the sample size is $9.63/0.1408 \approx 69$, indicating 23 subjects per group.

This may be computed using the R package MultNonParam using

kwsamplesize((0:2)/2,"normal")
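A base R sketch of the same two-step calculation follows: solve numerically for the non-centrality parameter that gives the target power, then convert it to a total sample size. The use of `uniroot` and the search interval are choices made for this illustration, not part of the package; the pairwise probabilities again assume Gaussian shifts.

```r
theta <- c(0, 0.5, 1)
lambda <- rep(1 / 3, 3)     # group proportions
alpha <- 0.05
beta <- 0.2                 # one minus the target power
K <- length(theta)

crit <- qchisq(1 - alpha, df = K - 1)
# Non-centrality at which the probability of falling below crit equals beta
xi <- uniroot(function(x) pchisq(crit, df = K - 1, ncp = x) - beta,
              interval = c(0.01, 100), tol = 1e-8)$root

kappa <- pnorm(outer(theta, theta, function(a, b) b - a) / sqrt(2))
inner <- as.vector((kappa - 0.5) %*% lambda)
N <- xi / (12 * sum(lambda * inner^2))   # total sample size, before rounding
per_group <- ceiling(N / K)
```

Unrounded intermediate values differ slightly from the rounded ones quoted in the example, but the rounded-up sample sizes agree.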

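As in the one-dimensional case, one can solve for effect size by approximating the probabilities as linear in the shift parameters. Express

$$\kappa_{kl} \approx \kappa^\circ + \kappa' (\theta_l - \theta_k), \qquad (4.32)$$

and explore the multiplier $\Delta$ from (4.23) giving the alternative hypothesis in a given direction. Then

$$N \approx \frac{\psi^2 \xi}{\Delta^2 (\kappa')^2 \zeta^2} \qquad (4.33)$$

for $\zeta = \sqrt{\sum_{k=1}^{K} \lambda_k (\theta_k^\dagger)^2 - \big(\sum_{k=1}^{K} \lambda_k \theta_k^\dagger\big)^2}$. Here $\zeta$ plays a role analogous to its role in §3.8, except that here it incorporates the vector giving the direction of departure from the null hypothesis.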

The sum with respect to $j$ disappears from (4.33), since the sum of the proportions $\lambda_j$ is 1.

Example 4.7.4 Under the same conditions as in Example 4.7.3, one might use (4.33) rather than (4.31). Take $\theta^\dagger = \theta^A = (0, 1/2, 1)$ and $\Delta = 1$. The derivative $\kappa'$ is tabulated, for the Gaussian and logistic distributions, in Table 3.6 as $\mu'(0)$. In this Gaussian case, $\kappa' = \mu'(0) = (2\sqrt{\pi})^{-1} = 0.282$. Also, $\zeta^2 = 0^2/3 + (1/2)^2/3 + 1^2/3 - (0/3 + (1/2)/3 + 1/3)^2 = 5/12 - 1/4 = 1/6$. The non-centrality parameter, solving (4.29), is $\xi = 9.63$. The approximate sample size is $9.63/(12 \times 1^2 \times 0.282^2/6) = 61$, or approximately 21 observations per group; compare this with an approximation of 23 per group using a Gaussian approximation, without expanding the $\kappa_{ij}$.

This may be computed using the R package MultNonParam using

kwsamplesize((0:2)/2,"normal",taylor=TRUE)
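The linearized sample size calculation reduces to a few lines of arithmetic, sketched here in base R. The non-centrality value 9.63 is taken as given from the example rather than recomputed, and the variable names are illustrative.

```r
xi <- 9.63                           # non-centrality for level 0.05, power 0.80, 2 df
theta_dir <- c(0, 0.5, 1)            # direction of the alternative
lambda <- rep(1 / 3, 3)              # group proportions
Delta <- 1                           # multiplier along the direction
kappa_prime <- 1 / (2 * sqrt(pi))    # Gaussian derivative, about 0.282

# Weighted variance of the direction vector
zeta2 <- sum(lambda * theta_dir^2) - sum(lambda * theta_dir)^2

N <- xi / (12 * Delta^2 * kappa_prime^2 * zeta2)   # total sample size, before rounding
```

Because `zeta2` is a weighted variance of the direction vector, adding a constant to every entry of `theta_dir` leaves the result unchanged.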

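Relation (4.33) may be solved for the effect size $\Delta$, to yield

$$\Delta = \frac{\psi \sqrt{\xi / N}}{\kappa' \zeta}. \qquad (4.34)$$

In order to determine the effect size necessary to give a level $\alpha$ test power $1 - \beta$ to detect an alternative hypothesis $\Delta \theta^\dagger$, determine $\xi$ from (4.28), and then $\Delta$ from (4.34).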

Example 4.7.5 Again use the same conditions as in Example 4.7.3. Consider three groups with 20 observations each; then $N = 60$. Recall that $\psi = 1/\sqrt{12}$. Example 4.7.4 demonstrates that $\xi = 9.63$, $\kappa' = 0.282$, $\zeta^2 = 1/6$, and $\Delta = \sqrt{9.63/(60 \times 12 \times (1/6))}/0.282 = 1.004$. As $\Delta$ is almost exactly 1, the alternative parameter vector in the direction of $(0, 1/2, 1)$ and corresponding to a level 0.05 test of power 0.80 with 60 total observations is $(0, 1/2, 1)$.

This may be computed using the R package MultNonParam using

kweffectsize(60,(0:2)/2,"normal")
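The effect size arithmetic of Example 4.7.5 can be checked in base R as follows. This sketch takes the quantities from the example as given, and the variable names are illustrative; it also confirms that the sample size relation returns the original $N$ when evaluated at the computed multiplier.

```r
xi <- 9.63                           # non-centrality for level 0.05, power 0.80, 2 df
N <- 60                              # total sample size
zeta2 <- 1 / 6                       # weighted variance of the direction (0, 1/2, 1)
kappa_prime <- 1 / (2 * sqrt(pi))    # Gaussian derivative, about 0.282
psi <- 1 / sqrt(12)                  # Kruskal-Wallis norming constant

Delta <- psi * sqrt(xi / (N * zeta2)) / kappa_prime

# Round trip: the sample size implied by this multiplier
N_back <- psi^2 * xi / (Delta^2 * kappa_prime^2 * zeta2)
```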

Figure 4.3 reflects the accuracy of the various approximations in this section to the power of the Kruskal-Wallis test. Tests with $K = 3$ groups, with the same number of observations per group, and of level 0.05, were considered. Group sizes between 5 and 30 were considered (corresponding to total sample sizes between 15 and 90). For each sample size, (4.34) was used to generate an alternative hypothesis with power approximately 0.80, in the direction of equally-spaced alternatives. The dashed line represents a Monte Carlo approximation to the power, based on 50,000 observations. The dotted line represents the standard non-central chi-square approximation (4.28). The solid line represents this same approximation, except also incorporating the linear approximation (4.32) representing the exceedance probabilities as linear in the alternative parameters, and hence the non-centrality parameter as quadratic in the alternative parameters.

The solid line is almost exactly horizontal at the target power. The discrepancy from horizontal arises from the error in approximating (4.27) by (4.30). For small sample sizes (that is, group sizes of less than 25, or total sample sizes of less than 75), (4.28) is not sufficiently accurate; for larger sample sizes it should be accurate enough.

All curves in Figure 4.3 use the approximate critical value in (4.18).
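A Monte Carlo check of the kind summarized by the dashed line can be sketched with `kruskal.test` from base R, which uses the chi-square approximation for its p-value. This illustration uses the configuration of Example 4.7.2 (three Gaussian groups of size 20 with locations 0, 1/2, 1) rather than the alternatives behind the figure; the seed and the number of replications are arbitrary choices, and far fewer replications are used than for the figure.

```r
set.seed(1)                       # arbitrary seed, for reproducibility
M <- rep(20, 3)
theta <- c(0, 0.5, 1)
alpha <- 0.05
g <- factor(rep(seq_along(M), M)) # group labels
n_rep <- 2000                     # number of simulated data sets (assumption)

reject <- replicate(n_rep, {
  x <- rnorm(sum(M), mean = theta[as.integer(g)])   # unit-variance Gaussian data
  kruskal.test(x, g)$p.value <= alpha
})

mc_power <- mean(reject)                                # estimated power
mc_se <- sqrt(mc_power * (1 - mc_power) / n_rep)        # its standard error
```

The standard error indicates how closely a simulation of this size can be expected to track the non-central chi-square approximation.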

FIGURE 4.3: Approximate Powers for the Kruskal-Wallis Test

# Efficiency Calculations

Power calculations for one-dimensional alternative hypotheses made use of (2.22), applying a Gaussian approximation with exact values for means under the null and alternatives, and approximating the variance under the alternative by the variance under the null. Efficiency calculations of §2.4 and §3.8 approximated means at the alternative linearly using the derivative of the mean function at the null. Tests for ordered alternatives from this chapter will be addressed similarly.

## Ordered Alternatives

Consider first the one-sided Jonckheere-Terpstra test of level $\alpha$. Let $T_J = J/N^2$. In this case, the subscript $J$ represents a label, and not an index. Denote the critical value by $t_J^\circ$, satisfying $P_{\theta^\circ}[T_J \geq t_J^\circ] = \alpha$.

As in (4.23), reduce the alternative hypothesis to a single dimension by letting the alternative parameter vector be a fixed vector $\theta^\dagger$ times a multiplier $\Delta$. The power function $\varpi_{J,n}(\Delta) = P_{\theta^A}[T_J \geq t_J^\circ]$ satisfies (2.15), (2.19), and (2.21), and hence the efficiency tools for one-dimensional hypotheses developed in §2.4.2 may be used. Expressing $\mu_J(\Delta)$ as a Taylor series with constant and linear terms,

$$\mu_J(\Delta) \approx \sum_{i<j} \lambda_i \lambda_j \left[\kappa^\circ + \kappa' \Delta (\theta_j^\dagger - \theta_i^\dagger)\right]$$

for $\lambda_i = M_i/N$, where again $\kappa^\circ$ is the common value of $\kappa_{ij}$ under the null hypothesis, and $\kappa'$ is the derivative of the probability in (3.25), as a function of the location shift between two groups, evaluated at the null hypothesis, and calculated for various examples in §3.8.2. Hence

$$\mu_J'(0) = \kappa' \sum_{i<j} \lambda_i \lambda_j (\theta_j^\dagger - \theta_i^\dagger).$$

Recall that $\kappa_{ij}$ depended on two group indices $i$ and $j$ only because the locations were potentially shifted relative to one another; the value $\kappa^\circ$ and its derivative $\kappa'$ are evaluated at the null hypothesis of equality of distributions, and hence do not depend on the indices. Furthermore, from (4.21), $\mathrm{Var}[T_J] \approx \frac{1}{36}\big[1 - \sum_{k=1}^{K} \lambda_k^3\big]/N$. Consider the simple case in which $\lambda_k = 1/K$ for all $k$, and in which $\theta_j^\dagger - \theta_i^\dagger = j - i$. Then $\mu_J'(0) = \kappa'(K^2 - 1)/(6K)$, $\mathrm{Var}[T_J] \approx \frac{1}{36}\left[1 - 1/K^2\right]/N$, and the efficacy is

$$e_J = \kappa' \sqrt{K^2 - 1}.$$

The efficacy of the Gaussian-theory contrast, from (4.6), is $\sqrt{K^2 - 1}/(2\sqrt{3}\sigma)$. Hence the asymptotic relative efficiency of the Jonckheere-Terpstra test to the contrast of means is

$$12 \sigma^2 (\kappa')^2. \qquad (4.35)$$

This is the same as the asymptotic relative efficiency in the two-sample case in Table 3.6.
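The claim that the number of groups cancels out of this efficiency can be checked numerically. The sketch below assumes equal group proportions and equally spaced shifts, takes the efficacy to be the derivative of the mean divided by the square root of $N$ times the variance, and uses $\sqrt{K^2 - 1}/(2\sqrt{3}\sigma)$ for the contrast of means; the function name `are_jt` is hypothetical, and the default derivative is the Gaussian value $1/(2\sqrt{\pi})$.

```r
# Squared ratio of efficacies: Jonckheere-Terpstra vs. contrast of means,
# for K equal-sized groups with equally spaced shifts theta_k = k.
are_jt <- function(K, kappa_prime = 1 / (2 * sqrt(pi)), sigma = 1) {
  lambda <- rep(1 / K, K)
  theta <- seq_len(K)
  ij <- t(combn(K, 2))                                  # index pairs with i < j
  mu_prime <- kappa_prime *
    sum(lambda[ij[, 1]] * lambda[ij[, 2]] * (theta[ij[, 2]] - theta[ij[, 1]]))
  n_var <- (1 - sum(lambda^3)) / 36                     # N times Var[T_J]
  e_jt <- mu_prime / sqrt(n_var)                        # efficacy of T_J
  e_contrast <- sqrt(K^2 - 1) / (2 * sqrt(3) * sigma)   # efficacy of the contrast
  (e_jt / e_contrast)^2
}

are_values <- sapply(2:6, are_jt)
```

For unit-variance Gaussian data every entry of `are_values` equals $3/\pi$, the familiar two-sample value.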

## Unordered Alternatives

While techniques exist to transform non-central chi-square statistics to an approximate Gaussian distribution (Sankaran, 1963), the most direct approach to comparing efficiency of various tests based on approximately $\chi^2$ statistics is to compare approximate sample sizes for a fixed level and power, as in (4.33). Prentice (1979) does this using ratios of the non-centrality parameter. For test $j$, let $N_j$ be the sample size needed to provide power $1 - \beta$ for the test of level $\alpha$ against alternative $\theta^A$.

Calculations (4.27), and (4.30) through (4.34), were motivated specifically for the Kruskal-Wallis test, using Mann-Whitney sums (4.26), but hold more broadly for any group summary statistics replacing (4.26), so long as $E_0[T_k] \approx M_k \kappa^\circ$ for some differentiable $\kappa$, and as long as (4.25), with the new definition of $T_k$, is approximately $\chi^2_{K-1}$. In particular, a rescaled version of the $F$-test statistic (4.3), $(K-1) W_A = \sum_{k=1}^{K} M_k (\bar{X}_{k\cdot} - \bar{X}_{\cdot\cdot})^2/\sigma^2$, is such a test, with $\psi^2$ the variance of the $X_{kj}$, $T_k = \bar{X}_{k\cdot} - \bar{X}_{\cdot\cdot}$, $\kappa^\circ = 0$, and $\kappa' = 1$, approximated as a $\chi^2$ statistic by treating the variance as known. Suppose that the value of $\zeta^2$ remains unchanged for two such tests, and suppose these tests have values of $\kappa'$ at the null hypothesis distinguished by indices (for concreteness, label these as $\kappa'_H$ and $\kappa'_A = 1$ for the correctly-scaled Kruskal-Wallis and $F$ tests respectively), and similarly have the norming factors $\psi$ distinguished by indices (again, $\psi_H = 1/\sqrt{12}$ and $\psi_A = \sigma$). Equate the alternatives in (4.34), to obtain

$$\frac{\psi_H \sqrt{\xi / N_H}}{\kappa'_H \zeta} = \frac{\psi_A \sqrt{\xi / N_A}}{\kappa'_A \zeta}.$$

The ratio of sample sizes needed for approximately the same power for the same alternative from the $F$ and Kruskal-Wallis tests is approximately

$$\frac{N_A}{N_H} \approx \frac{\psi_A^2 (\kappa'_H)^2}{\psi_H^2 (\kappa'_A)^2} = 12 \sigma^2 (\kappa'_H)^2,$$

which is the same as in the ordered case (4.35) and in the two-sample case.
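To put numbers on this ratio, the derivative can be evaluated by numerical integration and combined with the variance of the observations. The sketch relies on the standard identity that, for a location shift of a density $f$, the derivative of $P[X < Y + \theta]$ at $\theta = 0$ is $\int f(x)^2 \, dx$; the function name `are_vs_f` is hypothetical.

```r
# Sample size ratio 12 * sigma^2 * kappa'^2 for a given density and variance,
# with kappa' computed as the integral of the squared density.
are_vs_f <- function(ddist, sigma2) {
  kappa_prime <- integrate(function(x) ddist(x)^2, -Inf, Inf)$value
  12 * sigma2 * kappa_prime^2
}

are_gauss <- are_vs_f(dnorm, 1)            # standard Gaussian, variance 1
are_logis <- are_vs_f(dlogis, pi^2 / 3)    # standard logistic, variance pi^2 / 3
```

These evaluate to approximately $3/\pi$ and $\pi^2/9$ respectively, so the rank-based test loses a little under Gaussian data and gains a little under logistic data.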

# Exercises

1. The data set

http://ftp.uni-bayreuth.de/math/statlib/datasets/federalistpapers.txt

gives data from an analysis of a series of documents. The first column gives document number, the second gives the name of a text file, the third gives a group to which the text is assigned, the fourth represents a measure of the use of first person in the text, the fifth presents a measure of inner thinking, the sixth presents a measure of positivity, and the seventh presents a measure of negativity. There are other columns that you can ignore. (The version at Statlib, above, has odd line breaks. A reformatted version can be found at stat.rutgers.edu/home/kolassa/Data/federalistpapers.txt).

a. Test the null hypothesis that negativity is equally distributed across the groups using a Kruskal-Wallis test.

b. Test at $\alpha = 0.05$ the pairwise comparisons for negativity between groups using the Bonferroni adjustment, and repeat for Tukey’s HSD.

2. The data set

http://ftp.uni-bayreuth.de/math/statlib/datasets/Plasma_Retinol

gives the results of a study on concentrations of certain nutrients among potential cancer patients. This set gives data relating various quantities, including smoking status (1 never, 2 former, 3 current) in column 3 and beta plasma in column 13. Perform a nonparametric test to investigate an ordered effect of smoking status on beta plasma.

3. Consider using a Kruskal-Wallis test of level 0.05, testing for equality of distribution for four groups of size 15 each. Consider the alternative that two of these groups have observations centered at zero, and the other two have observations centered at 1, with all observations from a Cauchy distribution.

a. Approximate the power for this test to successfully detect the above group differences.

b. Check the quality of this approximation via simulation.

c. Approximate the size in each group if a new study were planned, with four equally-sized groups, and with the groups centered as above, with a target power of 90%.

4. The data at

http://lib.stat.cmu.edu/datasets/CPS_85_Wages

reflects wages from 1985. The first 42 lines of this file contain a description of the data set, and an explanation of variables; delete these lines first, or skip them when you read the file. All fields are numeric. The tenth field is sector, and the sixth field is hourly wage; you can skip everything else. Test for a difference in wage between various sector groups, using a Kruskal-Wallis test.
