Dimensions Underlying Student Ratings of Instruction: A Multidimensional Scaling Analysis

Multidimensional scaling (MDS) was used to derive the dimensions underlying student ratings of instruction. A weighted MDS revealed that departmental affiliation, course structure, and intensity of teaching were particularly salient to the raters. The findings are discussed in light of the current literature on student ratings of instruction.

imposed on the subjects in the data collection process. Finally, MDS results can be used in conjunction with regression analysis to link previously used techniques to the approach presented here.
MDS refers to a class of statistical procedures that transforms proximity information into a spatial configuration of points reflecting the underlying structure of a data set. The methods for converting the proximities into points in a Euclidean space are based on the theorems of Young and Householder (1938) and subsequent work by Torgerson (1958). In the present context, MDS allows the derivation of the dimensions underlying student ratings.
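As a minimal sketch of that construction (illustrative only; this is classical Torgerson scaling, not the weighted analysis reported below), the recovery of point coordinates from a dissimilarity matrix by double centering can be written as:

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical (Torgerson) MDS: embed an n x n dissimilarity
    matrix D into k Euclidean dimensions via double centering."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered squared dissimilarities
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:k]    # keep the k largest eigenvalues
    scale = np.sqrt(np.maximum(eigvals[order], 0.0))
    return eigvecs[:, order] * scale         # n x k point coordinates

# Toy check: distances among three collinear points are recovered exactly.
pts = np.array([[0.0], [1.0], [3.0]])
D = np.abs(pts - pts.T)
X = classical_mds(D, k=1)
```

For dissimilarities that really are Euclidean distances, the interpoint distances of the recovered configuration match the input matrix up to reflection.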

Subjects
Fifty-eight undergraduate students from a small liberal arts college in Oklahoma served as subjects. All were enrolled in upper division courses in the social and behavioral sciences. Ninety-three percent indicated that they knew at least five of the eight instructors in the Division.

Instruments
The first instrument used, a similarities instrument, consisted of a list of the 28 possible pairs of instructors from the Division. The pairs were listed in random order and counterbalanced so that no instructor was consistently listed first within a pair. Following each pair was a 4½-inch line labeled "exact same" at one end and "very different" at the other. The similarity judgments given on this instrument were used to derive the MDS configuration.
A second set of instruments was used to rate each instructor on communication skills, fairness, intensity of teaching, student relations, course structure, influence on students, enthusiasm, course workload, interest as a lecturer, appearance, course difficulty, dynamism/charisma, personal lifestyle, and classroom atmosphere. These scales were derived primarily from the research literature. Following each scale was a 4½-inch line spanning the continuum for each dimension, with explanatory labels at each end. These responses were used to validate the MDS configuration.

Procedure
Subjects were given the similarities instrument with a cover sheet indicating that they were to consider the instructors' teaching qualities in the classroom. Subjects were instructed to mark an X on the line to indicate the perceived similarity between each pair of instructors. Upon completing the first form, the subject returned it and was given the rating forms, where an X was marked along the line indicating the rating for each instructor on each of the 14 scales.

Numerical values of the similarity judgments were derived by dividing the response line into nine equal units and observing where the X was marked. This produced a square, symmetric matrix of similarities between the eight instructors for each subject. Numerical values for the rating scales were derived by the same method. An additional nominal scale of departmental affiliation was defined by assigning instructors in the same department the same numerical value, producing a total of 15 scales. This yielded an instructor-by-rating-scale rectangular matrix for each subject.
To account for individual differences among subjects, a weighted MDS analysis was performed. ALSCAL (Young & Lewyckyj, 1979) was used to analyze the similarity judgments. Analyses were performed in two, three, and four dimensions using ordinal and interval measurement levels.
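ALSCAL itself is not widely available today; as a rough modern stand-in (pooling subjects by averaging instead of ALSCAL's individual-differences weighting, and using metric rather than ordinal scaling), a group stimulus configuration could be sketched with scikit-learn. All data below are random placeholders, not the study's judgments:

```python
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
n_subj, n_instr = 58, 8                       # 58 raters, 8 instructors

# Hypothetical subject-by-instructor-by-instructor dissimilarities (1-9 units)
d = rng.uniform(1.0, 9.0, size=(n_subj, n_instr, n_instr))
d = (d + d.transpose(0, 2, 1)) / 2.0          # force symmetry within each subject
for matrix in d:
    np.fill_diagonal(matrix, 0.0)             # zero self-dissimilarity

group = d.mean(axis=0)                        # pool subjects by simple averaging
mds = MDS(n_components=3, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(group)             # 8 x 3 stimulus configuration
```

A faithful reproduction would instead fit a common space plus per-subject dimension weights (the INDSCAL model); averaging merely illustrates the input-output shape of the group analysis.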

RESULTS
Following the guidelines given in Kruskal and Wish (1978), the three-dimensional, ordinal solution was chosen for interpretation on the basis of goodness of fit and interpretability. The stimulus configuration is given in Figure 1. In this figure, each letter reflects one instructor and the distance between the points reflects the degree of dissimilarity between instructors.
The three dimensions of the space were interpreted as intensity, course structure, and departmental affiliation. The vertical dimension corresponded to intensity. In general, Instructors H, B, F, and E are less intense in their teaching style than are A, C, D, and G. The horizontal dimension was interpreted as course structure. Instructors D and H teach loosely structured courses, while Instructor C's are highly structured. The last dimension corresponded almost perfectly to departmental affiliation. Instructors A, B, F, and H are behavioral scientists (psychologists, sociologists), whereas G, E, and C are social scientists (political scientists, historians).

[FIGURE 1. Stimulus (instructors) space for the three-dimensional, ordinal MDS configuration.]
The interpretations presented above were based on the researchers' knowledge of these instructors and were therefore somewhat subjective. To validate these dimensions, a linear regression analysis was performed using the 15 rating scales. The regression analysis, with the stimulus coordinates as predictors and the rating scales as criteria, defined the least-squares projection of each rating scale into the stimulus space. The resultant squared multiple correlation (R²) indicated the variability in the rating scale accounted for by the stimulus configuration.
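That property-fitting regression can be sketched with synthetic stand-in numbers (the study's actual coordinates and ratings are not reproduced here): each scale is regressed on the stimulus coordinates, R² summarizes the fit, and the normalized slope weights give the direction cosines of the fitted scale.

```python
import numpy as np

rng = np.random.default_rng(1)
coords = rng.normal(size=(8, 3))               # hypothetical 8 x 3 stimulus coordinates
# Hypothetical rating scale aligned mostly with the first axis, plus noise
rating = coords @ np.array([2.0, 0.5, 0.1]) + rng.normal(scale=0.1, size=8)

# Least-squares projection of the scale into the stimulus space
A = np.column_stack([np.ones(8), coords])      # add an intercept column
coef, *_ = np.linalg.lstsq(A, rating, rcond=None)
fitted = A @ coef

ss_res = np.sum((rating - fitted) ** 2)
ss_tot = np.sum((rating - rating.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot                     # squared multiple correlation

w = coef[1:]                                   # slope weights on the three axes
direction_cosines = w / np.linalg.norm(w)      # orientation of the fitted scale
```

A scale whose direction cosines load almost entirely on one axis supports naming that axis after the scale, which is the logic applied to affiliation, intensity, and structure below.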
Five of the rating scales had R² values that exceeded .90 and were statistically significant (p < .05). In descending order, these were difficulty, structure, affiliation, intensity, and workload. Inspection of the direction cosines suggested that the three axes corresponded to departmental affiliation, intensity of teaching, and course structure. Table I provides information concerning some characteristics of the instructors, along with the R² values for the relevant dimensions.
Difficulty and workload also had high R² values but were highly correlated with each other. In weighted MDS, however, the interpretable directions are expected to correspond closely to the axes (Kruskal & Wish, 1978). According to the direction cosines, the best interpretations for the axes through the stimulus space were affiliation, intensity, and structure.
It is interesting to note the negligible relationships of factors such as appearance, influence on students, and student relations (all R² < .26). Also, interest as a lecturer, enthusiasm, dynamism, and communication skills had only moderate relationships (.43 < R² < .56), suggesting that these characteristics are not as salient as is often supposed.
In addition to the group stimulus configuration, weighted MDS also produces a subject space. Inspection of this space indicated that most subjects used all three dimensions in making their similarity judgments, with substantial variability in the students' judgments being accounted for by the derived dimensions.

DISCUSSION
Many statistical techniques have been used to define and assess student ratings of instruction. Previous research indicates that such ratings are multidimensional with a number of factors proposed as relevant to the instructional process. This study identified which of those many dimensions were most salient to the student rater. Using MDS, the underlying dimensions of student ratings were derived rather than imposed. These were departmental affiliation, intensity of teaching, and course structure. Difficulty and workload were also relevant, though not as interpretable.
Two measurement tasks were used in this study. The similarity task is fundamentally different from that used in most student evaluations. The traditional approach to evaluation corresponds more closely to the second task in this study, that is, rating the instructor on a number of proposed scales. Asking the student to make similarity judgments between instructors raises the possibility of their using any number of objective criteria, such as age, sex, rank, and so on. It is interesting that only one of the three derived dimensions is objective (affiliation) while the other two are relatively subjective (structure and intensity). Uncovering two dimensions that are subjective lends support to an assumption implicit throughout this study, that the dimensions that are salient to raters in making similarity judgments are also salient to them in making the more traditional evaluative ratings.
These findings support many of the results found in the literature on student ratings. Unique to this investigation, however, is the perceived salience of departmental affiliation. Although affiliation is not necessarily related to good teaching, when students are asked to make similarity judgments about instructors' teaching behavior, the academic discipline of the instructor is salient. Affiliation might have an indirect influence on ratings in that students' preference for a given discipline may influence their predisposition to like or dislike the instructor teaching that subject.
Certainly, the dimensions of teaching that are most salient to the student rater are not the only ones of interest. Nevertheless, knowledge of the aspects of instruction to which students pay particular attention can help in the interpretation of the evaluation instruments commonly used. It is hoped that this new perspective can give insight into the complex relationship between instructional evaluation and teaching effectiveness.