On “The Analysis of Ranked Data Derived from Completely Randomized Factorial Designs”

Extensions of the Kruskal-Wallis procedure for a factorial design are reviewed and examined under various degrees and kinds of nonnullity. The distributions of these test statistics were found to be a function of effects other than those being tested, except under the completely null situation, and their use is discouraged.


INTRODUCTION
The lack of satisfactory rank-sum methods for factorial designs has historically been one of the main shortcomings of nonparametric statistics. Gaito (1959), Gardner (1975), and Siegel (1956) illustrate the applied researcher's comment that suitable tests for main effects and interactions were needed to complete the package of methods that do not make the normality assumption. However, few statistics textbooks offer the applied researcher rank-sum methods for two-factor designs with more than one observation per cell. Where tests are given for factorial designs (see Bradley, 1968), they are less than suitable: they are low in efficiency, and no unified approach is given for all tests. Typically, the Kruskal-Wallis (or Friedman for matched data) tests are done on various sums or differences of the observations; that is, the researcher collapses the data over one main effect to test for the other main effect and then applies the Kruskal-Wallis (or Friedman) test to the resulting sum. Various differences, the number of which depends on the design, are taken to measure interaction, and the appropriate test is done on these differences. The major deficiency of these methods is the obvious loss in power efficiency from not considering variability due to the other effects. The lack of a common procedure for each test in each design also detracts from the usefulness of these approaches.

Scheirer, Ray, and Hare (1976) also mention this problem, citing several sources which present solutions, including Mehra and Sen (1969) and Patel and Hoel (1973), who give tests on interaction effects from various models. The contribution by Mehra and Sen of a test for interaction is part of the area of "aligned ranks," which proceeded from a suggestion by Hodges and Lehmann (1962) and resulted in tests for main effects in an additive model (Mehra & Sarangi, 1967; Sen, 1968), a test for interaction in a completely randomized factorial design (McSweeney, 1967), and tests for interaction in mixed models or split-plot designs (Koch, 1969; Koch & Sen, 1968). Marascuilo and McSweeney (1977) and Puri and Sen (1971) are texts which cover aligned ranks methods. An additional rank procedure, which is a promising competitor to the aligned rank methods, is the rank transform procedure of using the usual parametric analysis of variance on the ranks (see Iman, 1974

METHOD
A computer program was written to perform simulations of the Scheirer et al. statistics and, for comparison purposes, the ANOVA for a 3 x 2 factorial design with five observations per cell. Data were generated for a normal population using the method due to Box and Muller (1958), which transforms independent unit uniform pseudo-random numbers, here obtained from a procedure due to Chen (1971). For the interaction effect, one null and two nonnull cases were examined, where the interaction effects in the nonnull cases were chosen so as to give power of approximately .60 and .90 for the ANOVA F-test when $\alpha = .05$. All main effects and interaction effects are given in Table I. Factorial combination of these cases gave 12 sets of means, for which the empirical probability of rejecting $H_0$ was recorded for one main effect test and the interaction test over 1,000 replications. The .10, .05, .025, and .01 levels of significance were used. Finally, all cases were examined for the same 1,000 replications so as to facilitate comparison of rejection rates for different combinations of cases.

Table note: ... than $2\sigma_p$ from theoretical $\alpha$ ($\sigma_p = \sqrt{\alpha(1-\alpha)/1000}$)
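The simulation described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' program: Python's built-in generator stands in for the Chen (1971) uniform procedure, the column effect of 2.0 is a made-up placeholder rather than a value from Table I, and the function names (`box_muller`, `srh_statistics`, `chi2_sf`, `rejection_rates`) are hypothetical. The statistic is taken in its commonly described form, an effect sum of squares on the ranks divided by the total mean square of the ranks and referred to chi-square.

```python
import math
import random


def box_muller(rng):
    """One standard normal deviate from two independent uniform deviates."""
    u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u1) is defined
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def srh_statistics(cells):
    """Rank-based H statistics (rows, columns, interaction) for a balanced
    two-factor layout; cells[i][j] lists the observations in cell (i, j).
    Ties are not handled, which is harmless for continuous simulated data."""
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    N = a * b * n
    labelled = sorted((x, i, j) for i in range(a) for j in range(b)
                      for x in cells[i][j])
    rank_sum = [[0.0] * b for _ in range(a)]
    for rank, (_, i, j) in enumerate(labelled, start=1):
        rank_sum[i][j] += rank
    cf = (N * (N + 1) / 2.0) ** 2 / N  # correction factor for ranks 1..N
    ss_a = sum(sum(row) ** 2 for row in rank_sum) / (b * n) - cf
    ss_b = sum(sum(rank_sum[i][j] for i in range(a)) ** 2
               for j in range(b)) / (a * n) - cf
    ss_cells = sum(s * s for row in rank_sum for s in row) / n - cf
    ss_ab = ss_cells - ss_a - ss_b
    ms_total = N * (N + 1) / 12.0  # total mean square of untied ranks
    return ss_a / ms_total, ss_b / ms_total, ss_ab / ms_total


def chi2_sf(x, df):
    """Upper-tail chi-square probability; closed forms for df = 1 and 2 only."""
    x = max(x, 0.0)  # guard against tiny negative rounding error
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or 2 are supported in this sketch")


def rejection_rates(row_fx, col_fx, reps=1000, n=5, alpha=0.05, seed=1):
    """Empirical rejection rates of the row, column, and interaction tests
    for an additive model (no interaction) with unit-variance normal errors."""
    rng = random.Random(seed)  # same seed => same replications in every case
    a, b = len(row_fx), len(col_fx)
    dfs = (a - 1, b - 1, (a - 1) * (b - 1))
    rejected = [0, 0, 0]
    for _ in range(reps):
        cells = [[[row_fx[i] + col_fx[j] + box_muller(rng) for _ in range(n)]
                  for j in range(b)] for i in range(a)]
        for k, (h, df) in enumerate(zip(srh_statistics(cells), dfs)):
            if chi2_sf(h, df) < alpha:
                rejected[k] += 1
    return [count / reps for count in rejected]


# 3 x 2 design, five observations per cell, null row effect in both runs.
null_rates = rejection_rates([0.0, 0.0, 0.0], [0.0, 0.0])
shifted_rates = rejection_rates([0.0, 0.0, 0.0], [0.0, 2.0])  # placeholder effect
print("row-test rejection rate, null column effect:   ", null_rates[0])
print("row-test rejection rate, nonnull column effect:", shifted_rates[0])
```

Reusing one seed across cases mirrors the use of the same 1,000 replications for every combination of effects, so differences in rejection rates reflect the effects and not the random stream.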

Toothaker and Chang
the null case $X_{ijk} = \mu + e_{ijk}$, $\alpha_{HM} = .036$ and $\alpha_{HI} = .049$, but for nonnull effects for the other main effect, $\alpha_{HM} = .006$ and $\alpha_{HI}$ = [value illegible in source]. For nonnull interaction effects only, $\alpha_{HM} = .017$ and .006 (low and high interaction power), while for nonnull effects for the other main effect plus nonnull interaction effects, $\alpha_{HM} = .004$ and zero (low and high interaction power). The effect on the power of the tests is equally devastating; for example, the power of HM is .381 when the power of the main effect F is .910 (high interaction power, nonnull case for the other main effect). Clearly, for the normal case, the Scheirer et al. tests become more conservative as a function of the magnitude of the other effects.
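As a rough check on the rates quoted above, the sampling band for an empirical rejection rate over 1,000 replications can be worked out from the binomial standard error. The calculation assumes the quoted rates refer to the nominal .05 level, which this passage does not state explicitly.

$$\sigma_p = \sqrt{\frac{\alpha(1-\alpha)}{1000}} = \sqrt{\frac{(.05)(.95)}{1000}} \approx .0069, \qquad \alpha \pm 2\sigma_p \approx (.036,\ .064)$$

On that assumption, .049 falls inside the band and .036 sits at its lower edge, while .017, .006, .004, and zero all fall well below it.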

CONCLUSIONS
Although rank-sum procedures for factorial designs are attractive methodologies to include in a list of useful statistics, the tests due to Scheirer et al. neither control $\alpha$ at the set value nor retain power in the presence of effects other than those being tested. Hence, the researcher desiring rank-sum procedures for a factorial design would be wise to consider the better-known aligned rank methods, if the model for which the tests are proposed indeed fits the researcher's data (e.g., additive model, model including replication effects, etc.), or the rank transform. Alternatively, the researcher can rely upon the well-known robustness of the ANOVA to nonnormality and proceed with parametric analysis of the data. Under no circumstances can the tests due to Scheirer et al. be recommended for use.
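For readers who want to try the rank transform mentioned above, the following is a minimal sketch of the usual two-way ANOVA F ratios computed on the ranks of a balanced design. The function names (`midranks`, `rank_transform_f`) and the dictionary layout of the result are hypothetical choices for illustration, and the sketch stops at the F ratios and their degrees of freedom because the Python standard library has no F distribution.

```python
def midranks(values):
    """Ranks 1..N of the values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average = (pos + end) / 2.0 + 1.0  # mean of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = average
        pos = end + 1
    return ranks


def rank_transform_f(cells):
    """ANOVA F ratios on ranks for a balanced two-factor layout.
    cells[i][j] lists the observations in cell (i, j).
    Returns {effect: (F, df_effect, df_error)}."""
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    N = a * b * n
    ranks = midranks([x for row in cells for cell in row for x in cell])
    # Sum the ranks cell by cell, in the same order they were flattened.
    sums = [[sum(ranks[(i * b + j) * n:(i * b + j + 1) * n]) for j in range(b)]
            for i in range(a)]
    cf = sum(ranks) ** 2 / N
    ss_total = sum(r * r for r in ranks) - cf
    ss_a = sum(sum(row) ** 2 for row in sums) / (b * n) - cf
    ss_b = sum(sum(sums[i][j] for i in range(a)) ** 2
               for j in range(b)) / (a * n) - cf
    ss_cells = sum(s * s for row in sums for s in row) / n - cf
    ss_ab = ss_cells - ss_a - ss_b
    df_error = N - a * b
    ms_error = (ss_total - ss_cells) / df_error  # within-cell mean square
    df_a, df_b, df_ab = a - 1, b - 1, (a - 1) * (b - 1)
    return {
        "A": (ss_a / df_a / ms_error, df_a, df_error),
        "B": (ss_b / df_b / ms_error, df_b, df_error),
        "AB": (ss_ab / df_ab / ms_error, df_ab, df_error),
    }


# Toy 2 x 2 layout with two observations per cell.
print(rank_transform_f([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]))
```

The denominator here is the within-cell mean square of the ranks, so rank variation attributable to the other effects is removed before each test. A statistic that divides by the total mean square of the ranks leaves that variation in the denominator.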