Confidence Intervals, Power Calculation, and Sample Size Estimation for the Squared Multiple Correlation Coefficient under the Fixed and Random Regression Models: A Computer Program and Useful Standard Tables

In this article, the authors introduce a computer package, written for Mathematica, that performs a number of difficult iterative computations involving the squared multiple correlation coefficient under the fixed and random models. These computations include, among others, the upper and lower bounds of confidence intervals, power, the sample size required for a specified power level, and estimates of shrinkage in cross-validating the squared multiple correlation under both the random and fixed models. Attention is given to some of the technical issues regarding the selection of, and working with, these two types of models, as well as to issues concerning the construction of confidence intervals.

Much emphasis has been placed lately on the importance of using confidence intervals in data analysis, either in addition to or in place of null hypothesis testing. Some authors have gone so far as to suggest the replacement or outright banishment of hypothesis testing (e.g., Schmidt, 1996). Those who would replace hypothesis testing argue that parameter estimates should be accompanied by their margins of error, that is, by confidence intervals. This recommendation is especially pertinent when sample size is either very large or very small. Large samples can produce statistical tests that are overly sensitive and thus lead to the finding of statistically significant differences where only minuscule differences between parameters actually exist. On the other hand, small samples produce tests that are not very sensitive and lead to the detection of only large differences between parameters. There are other reasons why these authors have argued against tests of hypotheses, but we will not discuss them here. We are not taking a position regarding the merits or demerits of hypothesis testing; others have already done so (e.g., Baril & Cannon, 1995; Cohen, 1994; Frick, 1996; Hagen, 1997). Instead, we will focus on some of the technical issues inherent in constructing confidence intervals. Confidence intervals on means are well understood and generally simple to construct, so we will not discuss them in detail. Confidence intervals on correlations are more difficult to construct, especially when they involve the multiple correlation coefficient. We will focus on the procedure involved in obtaining a confidence interval on the squared multiple correlation.
Multiple correlation procedures are widely used in education and the social sciences. The underlying statistical models for these procedures are of two types, referred to as the fixed and random models or, respectively, the regression and correlation models. A mathematical exposition of the two models is given in Sampson (1974). Mathematically, the fixed model can often be viewed as the random model conditioned on the observed values of the independent variables. The random model is more appropriate for nonexperimental situations in which the levels of the independent variables are not fixed a priori. It is common in education and the other social sciences to have studies in which the levels of the multiple independent variables for each experimental unit cannot be controlled and are available only after the observations are made. This type of design clearly falls under the random model. In contrast, under the fixed model, the levels of the independent variables are fixed before data collection.
Ordinarily, when running a multiple regression analysis in a statistical package such as SAS or SPSS, the researcher need not worry about whether the data fit the assumptions of the random or the fixed model: Tests of hypotheses and estimates of parameters are the same under both models (see Figure 1). One does need to be aware of the model under which one is working, however, when establishing the sample size necessary for a desired power level, when computing power (although for large samples the power calculations under the two models give similar results), or when finding a confidence interval for the squared multiple correlation. Lee (1972) gave tables for power calculations under the random model for the multiple correlation coefficient, but these tables are difficult to use. An easier set of tables is found in Gatsonis and Sampson (1989), but those tables provide only the sample size necessary to obtain a specified level of power. Sample size estimation and power calculations under the fixed model are given in Cohen (1988).
Power calculations, sample size determination, and confidence interval estimation are more complex under the random model than under the fixed model. Under the fixed model, the noncentral F distribution is used for calculations. This is a well-known distribution for which many tables and computer programs are available. Unfortunately, not many computer resources exist to aid the researcher working with the random model.
Obtaining a confidence interval on the population squared multiple correlation ρ² requires that we identify a model. A confidence interval obtained under the fixed model is not the same as one obtained under the random model; intervals obtained under the fixed model are usually shorter than those obtained under the random model. The implication is that we must determine which model is appropriate before we can embark on computing the confidence interval for ρ². Figure 2 illustrates the appropriate distribution that must be used for constructing confidence intervals on ρ² under the random or fixed model.
The construction of a confidence interval on ρ² under either the random model or the fixed model is somewhat complex and requires the use of a computer or special tables. We will outline the procedures used to construct confidence intervals under both models. The construction of these intervals is similar in principle but different in execution. Constructing a confidence interval under the fixed model requires access to the noncentral F distribution. The experienced researcher can use this distribution, which is available in SAS, to obtain the interval. We have written and compiled a package in Mathematica 3.0 that includes a set of functions that give the confidence interval easily and directly under either the fixed or the random model. We will illustrate the use of these functions later in this article. Additional useful functions, including a function to compute power under both models, are included in the package. See the appendix for information about accessing the Mathematica package, which is available from the authors.

Confidence Intervals
Setting a confidence interval on the mean of a normal distribution in the standard situation is a straightforward procedure. In this situation, the processes involved in obtaining a confidence interval and testing a null hypothesis are very similar: The confidence interval is obtained by "inverting" the hypothesis test. Consider the null hypothesis H0: µ = µ0. The test of this null hypothesis is performed by checking whether the interval µ0 ± tS/√N covers the observed sample mean x̄. If the interval does not include x̄, we reject the null hypothesis. Using this same interval with the sample mean in place of µ0, we obtain the confidence interval on µ, x̄ ± tS/√N. (The t in both cases refers to the appropriate cutoff point of the central t distribution, and S refers to the sample standard deviation.) Before leaving the subject of a confidence interval on the mean, we point out that because a regression weight β is a conditional mean, its estimation (point or interval) is the same under the fixed and random models (see Sampson, 1974).
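The t-based interval just described is easily computed in any language. As a sketch (in Python rather than the article's Mathematica, purely for illustration), the following builds x̄ ± tS/√N directly:

```python
# Illustration (not part of the authors' package): the textbook t-based
# confidence interval on a mean, obtained by "inverting" the t test.
import numpy as np
from scipy import stats

def t_confidence_interval(x, level=0.95):
    """Two-sided CI on the mean: x_bar +/- t * S / sqrt(N)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    x_bar = x.mean()
    s = x.std(ddof=1)                       # sample standard deviation S
    t = stats.t.ppf(1 - (1 - level) / 2, df=n - 1)
    half = t * s / np.sqrt(n)
    return x_bar - half, x_bar + half

rng = np.random.default_rng(0)
sample = rng.normal(loc=10.0, scale=2.0, size=40)  # made-up data
lo, hi = t_confidence_interval(sample)
```

The same cutoff t serves both the test of H0: µ = µ0 and the interval, which is exactly the inversion relationship described above.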
The inversion process is more complex in situations in which we are unable to find a function of the estimator that is independent of the parameters of the sampling distribution. This is the case for the squared multiple correlation ρ². The confidence interval on ρ² is not a simple inverse of the procedure used to test the null hypothesis. The problem lies in the fact that the variance of the sampling distribution of R² depends on ρ². Finding a confidence interval under this dependency requires working directly with the sampling distribution of the squared multiple correlation R². Briefly, we must identify the values h1 and h2 that demarcate an area under the sampling distribution equal to the chosen (1 − α)% level of confidence for each possible value of ρ² (see Mood & Graybill, 1963). The values are selected such that the sum of the conditional probabilities, P(R² < h1 | ρ²) and P(R² > h2 | ρ²), equals α. Without loss of generality, we focus on a specific n (sample size) and p (number of independent variables) to illustrate the procedure. Let us consider a 90% confidence interval on ρ² when p = 4 and n = 40. To find the interval, we first find the values of h1 and h2 that make the sum of the conditional probabilities .10. The h1 and h2 are found for many consecutive values of ρ² in the interval (0, 1). (Note that h1 and h2 yield the lower and upper bounds of R², respectively.) Plotting h1 and h2 for each ρ², we can see that h1 and h2 define two curves on the ρ²-R² plane. Given a specific value of R² (say R*²), the confidence interval on ρ² is obtained by drawing a vertical line through R*² parallel to the ρ² axis. This line intersects the two curves at points h*1 and h*2. When these points are projected onto the ρ² axis, we obtain the upper and lower bounds of the confidence interval.
In Figure 3, we see that if the sample squared multiple correlation is .45, we can say with 90% confidence that ρ² lies in the interval between .18 and .57. We discuss this procedure at length in the next section.
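The h1/h2 construction can be sketched numerically. The following Python illustration uses the fixed model, where the distribution of R² maps monotonically onto a noncentral F; the noncentrality convention λ = nρ²/(1 − ρ²) is one common choice (cf. Cohen, 1988) and is an assumption here, as is the use of scipy in place of the authors' own noncentral F routine. The example matches the text's case of p = 4, n = 40, and a 90% interval:

```python
# Sketch of the equal-tail cutoffs h1, h2 under the FIXED model, where
# R^2 maps monotonically to a noncentral F statistic.  The noncentrality
# convention lam = n * rho2 / (1 - rho2) is an assumption (cf. Cohen, 1988).
from scipy import optimize, stats

def r2_cdf_fixed(x, rho2, n, p):
    """P(R^2 <= x | rho^2) under the fixed model via the noncentral F."""
    lam = n * rho2 / (1 - rho2)
    f = x * (n - p - 1) / (p * (1 - x))    # monotone map from R^2 to F
    return stats.ncf.cdf(f, p, n - p - 1, lam)

def h1_h2(rho2, n=40, p=4, alpha=0.10):
    """Cutoffs with P(R^2 < h1) = P(R^2 > h2) = alpha / 2."""
    eps = 1e-9
    h1 = optimize.brentq(
        lambda x: r2_cdf_fixed(x, rho2, n, p) - alpha / 2, eps, 1 - eps)
    h2 = optimize.brentq(
        lambda x: r2_cdf_fixed(x, rho2, n, p) - (1 - alpha / 2), eps, 1 - eps)
    return h1, h2

h1, h2 = h1_h2(rho2=0.30)   # one point on each of the two curves
```

Sweeping `h1_h2` over a grid of ρ² values traces out the two curves on the ρ²-R² plane described above.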

One-Sided Confidence Interval on ρ²
As we previously mentioned, the computation of a confidence interval on ρ² is complex because we cannot find a function of the sample multiple correlation R² (and ρ²) that is independent of the parameters of the sampling distribution. When we cannot find such a function, we must work directly with the sampling distribution. Specifically, we work with the integral of the sampling distribution of R² for a given value of ρ². For a one-sided (1 − α)% confidence interval, we must solve for h such that

∫_h^1 g(R²; ρ²) dR² = α, (1)

where the function g(R²; ρ²) represents the sampling distribution of R² under the fixed or random model. If h were only a function of ρ², we could construct a figure similar to Figure 3 and find the one-sided (1 − α)% confidence interval. Unfortunately, the sampling distribution of R² also depends on n and p, so the approach taken in Figure 3 is not practical: h is a function of ρ², n, p, and α, and we are working not with a single curve but with a multidimensional surface. One practical solution is to write a computer program that can integrate the sampling distribution of R². Under the fixed regression model, the sampling distribution of R² is a function of the noncentral F distribution. Under the random model, the distribution function of R² can instead be written as the series

P(R² ≤ h | ρ²) = Σ_{k=0}^∞ C_k, (2)

where each C_k is a function of the incomplete beta function B,

C_k = [((n − 1)/2)_k / k!] ρ^{2k} (1 − ρ²)^{(n−1)/2} B_h(p/2 + k, (n − p − 1)/2),

B_h denotes the regularized incomplete beta function evaluated at h, and (l)_0 = 1, (l)_k = (l)(l + 1) . . . (l + k − 1). (See Gatsonis & Sampson, 1989, for more details.) We present here documentation for a program in Mathematica 3.0 to numerically integrate this sampling distribution. This program is part of a larger package that we have developed to compute confidence intervals and power. The Mathematica 3.0 program is based on the following algorithm. For an observed value of R², say R*², the program sets h = R*² and ρ² = R*² and computes P(R² ≥ R*²).
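The random-model series can be evaluated numerically in a few lines. The Python sketch below is our reading of the standard series representation (cf. Lee, 1972; Gatsonis & Sampson, 1989), in which R² is a negative-binomial mixture of beta variables; it is an illustration, not the authors' Mathematica code:

```python
# P(R^2 <= x | rho^2) under the RANDOM model, via the series of
# negative-binomial weights times regularized incomplete beta terms.
# This form is our assumption about the series the package integrates.
import numpy as np
from scipy import special

def r2_cdf_random(x, rho2, n, p, terms=2000):
    """Truncated series for the random-model CDF of R^2."""
    a, b1, b2 = (n - 1) / 2.0, p / 2.0, (n - p - 1) / 2.0
    if rho2 <= 0.0:
        # central case: a single regularized incomplete beta term
        return float(special.betainc(b1, b2, x))
    k = np.arange(terms)
    # log of the mixing weights, computed stably with log-gamma
    logw = (special.gammaln(a + k) - special.gammaln(a)
            - special.gammaln(k + 1)
            + k * np.log(rho2) + a * np.log1p(-rho2))
    return float(np.sum(np.exp(logw) * special.betainc(b1 + k, b2, x)))
```

When ρ² = 0 the series collapses to a single central beta term with parameters p/2 and (n − p − 1)/2, which is a useful sanity check on the implementation.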
If this probability is not equal to α, the value of ρ² is changed, and the probability is computed again. The program searches through different values of ρ² until it finds a ρ² such that P(R² ≥ R*² | ρ²) = α. The ρ² identified in this manner is the lower bound of the one-sided (1 − α)% confidence interval. The search is conducted with an interval-halving algorithm similar to the one given in MacCallum, Browne, and Sugawara (1996).
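The search just described can be sketched as follows. In this Python illustration, scipy's `brentq` root finder stands in for the interval-halving routine, and the random-model CDF is the negative-binomial beta-mixture series, which is our assumption about the form the package integrates; the example numbers (n = 63, p = 2, R² = .31) are the ones used later in the article:

```python
# Sketch of the lower-bound search: find the rho^2 that puts exactly
# alpha of the sampling distribution's mass at or above the observed R*^2.
import numpy as np
from scipy import optimize, special

def r2_cdf_random(x, rho2, n, p, terms=2000):
    """Random-model CDF of R^2 (assumed series form, truncated)."""
    a, b1, b2 = (n - 1) / 2.0, p / 2.0, (n - p - 1) / 2.0
    if rho2 <= 0.0:
        return float(special.betainc(b1, b2, x))
    k = np.arange(terms)
    logw = (special.gammaln(a + k) - special.gammaln(a)
            - special.gammaln(k + 1)
            + k * np.log(rho2) + a * np.log1p(-rho2))
    return float(np.sum(np.exp(logw) * special.betainc(b1 + k, b2, x)))

def lower_bound(r2_obs, n, p, level=0.95):
    """One-sided lower confidence bound on rho^2 under the random model."""
    alpha = 1 - level
    # P(R^2 >= r2_obs | rho2) - alpha increases with rho2
    def excess(rho2):
        return (1 - r2_cdf_random(r2_obs, rho2, n, p)) - alpha
    if excess(0.0) >= 0:            # even rho2 = 0 is plausible
        return 0.0
    return optimize.brentq(excess, 0.0, r2_obs)

lb = lower_bound(0.31, n=63, p=2)   # the Allen et al. numbers used later
```

At the returned bound, the tail probability P(R² ≥ R*² | ρ²) equals α to within the root finder's tolerance, which is exactly the stopping condition described in the text.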
(All other search procedures in our Mathematica package are also based on this algorithm.)

The Mathematica Package

The Mathematica package contains a set of functions for computing power, constructing a confidence interval, and estimating the level of prediction under either the random or the fixed model. Table 1 lists these functions and gives a short definition of each. We did not use Mathematica's noncentral F distribution; instead, we created our own, because we found the Mathematica function somewhat unstable when we tried to integrate it under some extreme conditions. Our noncentral F probability routine resembles the one given in SAS 6.11. One of the functions in the package computes the confidence interval under the random model, whereas another computes the confidence interval under the fixed model. The package also includes functions for power and sample size calculations under both models. The functions in this Mathematica package were checked against existing tables for accuracy and were always extremely close to the table values. In addition, we took precautions to stay within the accuracy limitations of a 32-bit processor. We feel confident in the accuracy of our functions. We have used the random confidence function in the package to create Tables 2 through 8. Researchers without access to Mathematica can use these tables to find a confidence interval for ρ² under the random model.

A Confidence Interval on the Cross-Validated Multiple Correlation

Sometimes, in assessing an obtained sample regression equation b, it is desirable to estimate how well it will predict in future samples. This situation is likely to be encountered often: The researcher generates a regression equation in a validation study and would like to know how well the obtained equation is likely to predict in future samples.
Not exactly the same question, but a related one, is how well the obtained regression equation would predict if it were applied to the population. We call this measure the predictive precision of the equation, and it is given by

ρc² = (b′Σxy)² / (σy² b′Σxx b), (3)

where b is the vector of sample regression weights and σy² is the variance of the dependent variable. The vector Σxy contains the population covariances between the independent and dependent variables. Similarly, Σxx is the variance-covariance matrix of the independent variables. The predictive precision given in Equation 3 is a random variable prior to data collection but a fixed parameter once we obtain the regression equation (see Mendoza, 1977). The ρc² ranges from 0 to ρ². Under the assumption of multivariate normality, Park and Dudycha (1974) have shown that the sampling distribution of ρc² follows the noncentral F distribution. For a given ρ², then, we can find the probability that ρc² is at least some value L, that is,

P(ρc² ≥ L | ρ²) = 1 − α. (4)

By working backward, we can identify a lower bound for ρc² that will satisfy the probability statement in Equation 4. We have created a function "crossR2confidence" that, for a given ρ², n, p, and a specified confidence level, returns a (1 − α) lower bound for the level of predictive precision ρc². For the usual situation in which we do not know ρ², we could use the sample multiple correlation R² or, if we wish to be conservative, the lower bound of ρ², which can be obtained with our random confidence interval function. By using the sample R² in the function "crossR2confidence," we obtain an estimate of the "shrinkage": the difference between the returned value and the sample R².
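The predictive precision formula (Equation 3 in the text, as we have reconstructed it) has a useful property worth verifying numerically: it is maximized, and equals ρ², exactly when b equals the population regression weights. The Python sketch below uses made-up population covariance values for illustration:

```python
# Predictive precision rho_c^2 = (b' S_xy)^2 / (sigma_y^2 * b' S_xx b):
# the squared population correlation between y and the linear score b'x.
# The covariance values below are hypothetical, chosen for illustration.
import numpy as np

def predictive_precision(b, S_xx, S_xy, var_y):
    num = float(b @ S_xy) ** 2
    den = var_y * float(b @ S_xx @ b)
    return num / den

S_xx = np.array([[1.0, 0.3],    # hypothetical population covariances
                 [0.3, 1.0]])
S_xy = np.array([0.5, 0.4])
var_y = 1.0

b_pop = np.linalg.solve(S_xx, S_xy)    # population regression weights
rho2 = float(S_xy @ b_pop) / var_y     # population squared multiple corr.
```

Any sample-based b that deviates from b_pop yields a precision strictly below ρ², which is why ρc² ranges from 0 to ρ² and why shrinkage is expected in cross-validation.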
A Practical Example

Allen, Cipielewski, and Stanovich (1992) investigated the relationship between the amount of time that 63 elementary school children spent reading and a number of indicator variables. Two independent variables contributed significantly to the prediction of reading time and were retained in their model, with R = .56, R² = .31. Assuming the random model, a one-sided confidence interval is easily established with the Mathematica package. Using "confidence[63,2,.31,.95]" in the Mathematica package, we find that the lower bound of a 95% confidence interval on ρ² is .14; that is, we can say with 95% confidence that ρ² is between .14 and 1. We could also get an idea of the magnitude of the lower bound by looking at Table 5. Finally, we wish to obtain an estimate of the shrinkage we would expect to see in a cross-validation study. Using the estimate of ρ² obtained from the sample, submitted to the function in the form crossR2confidence[63,2,.31,.95], we find that the cross-validated lower bound returned by the function is .265. Therefore, the estimate of shrinkage is .045 (.31 − .265).
What of a similar study conducted with, again, two independent variables and the same observed R² of .31, but now with a sample size of only 30? The decrease in estimation precision is seen in the lengthening of the confidence interval, the lower bound of which is now returned as .06. As we would expect, the power returned is considerably less than before, although at .84 it is still good. The increased imprecision is also reflected in the estimate of the cross-validated lower bound, which decreases to .21, thus increasing the shrinkage estimate to .10. Next, we consider the calculation of power.
In the process of planning an experiment, a researcher is interested in detecting a medium effect size (ρ² = .13; Cohen, 1992) with a sample of 50 individuals and three predictors. To calculate power, the researcher enters "powerR2[50,3,.13,.05]," and Mathematica returns .56. If the researcher is not content with power of .56 and instead wants power of .85, the researcher may enter "findsample[3,.13,.85,.05]," and Mathematica returns 90. In this situation, the researcher needs 90 subjects to detect a medium effect with power of .85.
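The same power and sample-size logic can be sketched under the fixed model, where only the ordinary noncentral F is needed. This Python illustration assumes the noncentrality convention λ = nρ²/(1 − ρ²) (Cohen, 1988, writes λ = f²(u + v + 1) = nf²); because the article's .56 and n = 90 come from the random-model routine, the fixed-model values below will differ somewhat:

```python
# Fixed-model power for the overall F test of H0: rho^2 = 0, and a
# simple scan for the smallest n reaching a target power.  The
# noncentrality convention lam = n * rho2 / (1 - rho2) is an assumption.
from scipy import stats

def power_fixed(n, p, rho2, alpha=0.05):
    """Power of the overall F test with p and n - p - 1 df."""
    dfd = n - p - 1
    crit = stats.f.ppf(1 - alpha, p, dfd)          # central F cutoff
    lam = n * rho2 / (1 - rho2)                    # noncentrality
    return 1 - stats.ncf.cdf(crit, p, dfd, lam)

def find_sample_fixed(p, rho2, target_power, alpha=0.05, n_max=10000):
    """Smallest n whose fixed-model power reaches the target."""
    for n in range(p + 3, n_max):
        if power_fixed(n, p, rho2, alpha) >= target_power:
            return n
    raise ValueError("target power not reached by n_max")

pw = power_fixed(50, 3, 0.13)           # medium effect, three predictors
n_needed = find_sample_fixed(3, 0.13, 0.85)
```

A production routine would replace the linear scan with interval halving, as the package does, but the scan keeps the logic transparent.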

Other Features
The Mathematica package can also be used to make a few observations regarding the effect of the number of predictors and the sample size on the size of the confidence interval. To illustrate the relation between p and the size of the confidence interval, consider Figure 4. By increasing the number of predictors from two to six (while keeping R² at about .4), the width of the confidence interval on ρ² is increased from .82 (1 − .18) to .92 (1 − .08). As p increases, so does the size of the confidence interval. We would need an increase in R² of approximately .08 just to keep the size of the confidence interval at the same level. So, unless we could get an increase greater than .08 in R², we would decrease precision by increasing the number of predictors from two to six.
Our routine can be used to obtain this kind of information. We begin by typing "confidence[41,2,.4,.95]" in Mathematica after loading the package, and Mathematica returns the value .175. The same operation for "confidence[41,6,.4,.95]" yields a value of .0875. We can see the decrease in precision as we move from two predictors to six. By entering "confidence[41,2,.48,.95]" (= .1725), we see that it would take an increase of at least .08 in R² to get back to the original level of precision. Evaluating the relation between the number of predictors and the size of the confidence interval thus provides yet another tool for identifying the "best" regression model. The package may also be used to illustrate the effect of p on power: We can show that, for a given difference, power decreases as p increases.
Before you begin, it is important to know that Mathematica commands are case sensitive. You must enter the commands and arguments exactly as they appear in this text. Also, to submit commands and arguments to Mathematica for evaluation, the shift key and the enter key must be pressed simultaneously.

Accessing the Package
If you are accessing the package from a floppy disk, at the Mathematica edit screen, with the disk in drive A, submit the following string:

<<A:\MultipleR2.m

All the functions within the MultipleR2 package will now be available after simultaneously pressing the SHIFT and ENTER keys.
If you are accessing the package from a fixed disk, we assume that the platform under which you are running Mathematica 3.0 is Windows 95, 98, or NT. Although Mathematica runs essentially the same under other platforms, the file structure may differ. If you wish to use the package from a hard drive, you must first copy the package into a subdirectory (folder) under Mathematica 3.0. Although it can be copied into an existing folder, we recommend creating a dedicated folder under Mathematica 3.0 for personal programs; we call this folder "myprograms." Once the package is copied into the desired folder, type the following to retrieve the package:

<<foldername`MultipleR2.m

where foldername is replaced with the name of the folder into which you have copied the package, and the ` is the grave accent symbol found in the upper-left portion of your keyboard. The functions within the package will now be available to you.

Using the MultipleR2 Functions
This section lists most of the functions available in the MultipleR2 package, together with a description of the output of each. The function names are written in the correct and required case. As noted in this article, for all functions, N represents the sample size, and p represents the number of predictors in the model.
The probr function (probr[N,p,r2,X]) returns the probability that R² < X under the random regression model, in which X is the value against which the estimate from the sample is compared.
The confidence function (confidence[N,p,r2,plevel]) returns the upper or lower bound of a plevel confidence interval for ρ², given an observed R², under the random regression model.
The crossR2sample function (crossR2sample[p,rho2,pcross,diff]) returns the sample size necessary to obtain a cross-validated R² that does not differ from ρ² by more than diff, with probability pcross.
The findsample function (findsample[p,Rho2,power,alpha]), for the given power and α, computes the sample size necessary to reject the hypothesis that ρ² = 0 when ρ² = Rho2 under the random regression model.