Factor Structure of the Criterion Task Set

A large-scale experimental study was conducted involving the training and testing of 123 human subjects on the Criterion Task Set (Version 1.0). Testing was performed under baseline and stressor conditions. The performance data and Subjective Workload Assessment Technique ratings for the first baseline trial (Trial 6) were analyzed using the SAS VARCLUS procedure to evaluate the structure of the CTS. Seven clusters of response time variables were identified for the nine tasks. In general, the Memory Search, Linguistic Processing and Mathematical Processing tasks were grouped in one cluster with each of the other clusters representing a single task. Five clusters were identified for the SWAT ratings with clusters differentiated along the dimensions of task difficulty and processing stage.


INTRODUCTION
From its initial development in 1983 and subsequent release in 1984, the Criterion Task Set (CTS) has been widely disseminated as a research tool for evaluating workload assessment techniques and as an instrument for human performance assessment. Literature exists concerning the content of the task battery and the development of the individual tasks (Shingledecker, 1984), training characteristics of the battery (Schlegel and Shingledecker, 1985), and preliminary results from the development of a large-scale data base of CTS performance data, Subjective Workload Assessment Technique (SWAT) ratings and individual difference variables.
A major selling point of the CTS is its selection of tasks based on processing stage and multiple resource theories. Table 1 summarizes the characteristics of the nine tasks (25 individual task levels) presented in alphabetical order.
Data collection has recently been completed for the large-scale standardization study of the CTS.
Performance and Subjective Workload Assessment Technique (SWAT) data were collected for approximately 123 subjects (96 males, 27 females) performing all nine tasks of the CTS at all levels for nine days. Five days were allocated for training, two days for baseline testing and two days for testing under various stressors.
Details of the methodology were reported in Schlegel, Gilliland and Schlegel (1986).
In addition to providing standardization data for the CTS Version 1.0 tasks, the study provided an opportunity to examine the relationships among the dependent measures for the nine tasks and the SWAT ratings for the tasks. Specifically, data from the first baseline trial following training (Trial 6) were analyzed using the Statistical Analysis System (SAS) VARCLUS procedure to cluster the nine tasks and twenty-five individual task levels. The results of these analyses will help determine whether all levels of a particular CTS task draw from the same resource pool and how much overlap in resource demands exists between tasks.

THE SAS VARCLUS PROCEDURE
Several techniques may be employed to determine the underlying structure of a set of measures. These include factor analysis, principal components analysis and multidimensional scaling. Another approach is cluster analysis. The SAS VARCLUS procedure divides a set of numeric variables into either disjoint or hierarchical clusters in such a way that each cluster can be interpreted as essentially unidimensional (SAS, 1985).
The clusters are chosen to maximize the variation accounted for by either the first principal component or the centroid component of each cluster. Specifically, the procedure attempts to maximize the sum across clusters of the variance of the original variables that is explained by the cluster components. VARCLUS is a type of oblique component analysis related to multiple group factor analysis. It can be used as a variable reduction method to replace a large set of variables by the set of cluster components.
A given number of cluster components does not typically explain as much variance as the same number of principal components, but the cluster components are usually easier to interpret than the principal components, even after rotation.
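The criterion described above can be illustrated with a minimal Python sketch using NumPy. This is not the SAS implementation and it omits the iterative splitting and reassignment search that VARCLUS performs; it only evaluates the quantity being maximized for a given partition, taking the variance explained by a cluster's first principal component to be the largest eigenvalue of that cluster's correlation submatrix. The function names, the simulated data and the two candidate partitions are all hypothetical and are not drawn from the CTS study.

```python
import numpy as np

def first_component_variance(corr, members):
    """Variance explained by the first principal component of one cluster,
    i.e. the largest eigenvalue of the cluster's correlation submatrix."""
    sub = corr[np.ix_(members, members)]
    return float(np.linalg.eigvalsh(sub)[-1])  # eigvalsh sorts ascending

def clustering_criterion(corr, clusters):
    """Sum across clusters of the variance explained by each cluster component."""
    return sum(first_component_variance(corr, m) for m in clusters)

# Hypothetical data: six variables driven by two independent latent factors.
rng = np.random.default_rng(0)
n = 500
f1 = rng.normal(size=n)
f2 = rng.normal(size=n)
X = np.column_stack(
    [f1 + 0.5 * rng.normal(size=n) for _ in range(3)]
    + [f2 + 0.5 * rng.normal(size=n) for _ in range(3)]
)
corr = np.corrcoef(X, rowvar=False)

# A partition that follows the latent factors and one that mixes them.
good = [[0, 1, 2], [3, 4, 5]]
mixed = [[0, 3, 4], [1, 2, 5]]
score_good = clustering_criterion(corr, good)
score_mixed = clustering_criterion(corr, mixed)

# Variance explained by the same number (two) of ordinary principal components.
top2 = float(np.sum(np.linalg.eigvalsh(corr)[-2:]))

print(f"criterion, factor-aligned partition: {score_good:.3f}")
print(f"criterion, mixed partition:          {score_mixed:.3f}")
print(f"top two principal components:        {top2:.3f}")
```

In this sketch the factor-aligned partition scores higher than the mixed one, and the two principal components explain at least as much variance as the two cluster components, which is the trade-off noted above: cluster components give up some explained variance in exchange for each component being tied to a disjoint, more interpretable set of variables.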

RESULTS
Four separate analyses were performed, three involving the performance data and one with the SWAT data.

CTS Performance Data
The first clustering analysis included the response time measures for the discrete stimulus tasks and probability monitoring plus the mean tapping rate (IPMN) and variability score (IPVS) for interval production and the absolute error and edge violations for unstable tracking.
This provided nine variables for each of the three levels of eight of the tasks plus two variables for Interval Production for a total of 29 performance measures. Each of the task difficulty levels was included separately in order to determine whether the different levels tap the same resource as indicated by the clustering. A summary of the clusters generated from this analysis is given in Table 2.
The second clustering analysis examined the accuracy measures of proportion correct in place of the response time measures for the discrete stimulus tasks.
Performance measures for the non-central processing tasks were not included in the analysis. The clustering exhibited more task overlap (Table 3) than in the analysis of response times.
This was probably due to the relatively high level of accuracy in several tasks.
Several significant findings are evident from the response time clustering (Table 2). With one major exception, measures from different levels of the same task were placed in the same cluster, indicating that the various workload levels or difficulty manipulations of any given task draw from the same resource pool.
The exception is the Continuous Recall (CR) task at the low level, which was placed in the largest cluster with Linguistic Processing (LP), Mathematical Processing (MP) and Memory Search (MS). This indicates the much lower difficulty of CR at the low level and its closer association with LP and MP as a symbol manipulation task.
Each task occupied a separate cluster with the exception of LP, MP and MS. This indicates minimal resource overlap for all tasks except these three. The overlap of these three is probably due to the relative ease of the tasks and the similarities of simple symbol manipulation, whether linguistic, mathematical or simple memory update.

A final clustering analysis with the performance data combined the response time and accuracy measures for a total of 47 variables. Ten clusters were defined as shown in Table 4.
It is evident that the Probability Monitoring measures do not form a separate cluster, nor do they clearly associate with other clusters. With minor exceptions, the other measures form logical clusters along the lines of the first two analyses.
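The variable counts quoted for the performance analyses can be cross-checked with a short tally. The sketch below is only a bookkeeping aid: the per-level split into seven response time measures (six discrete stimulus tasks plus probability monitoring) is an inference from the stated totals rather than a figure given directly in the text, and the variable names are hypothetical.

```python
LEVELS = 3                 # low, medium and high levels
MULTI_LEVEL_TASKS = 8      # every task except Interval Production

# Assumption inferred from the stated totals: seven response time measures
# per level (six discrete stimulus tasks plus probability monitoring).
rt_measures_per_level = 7
tracking_measures_per_level = 2   # absolute error and edge violations
interval_production_measures = 2  # IPMN and IPVS, single level

# Task levels across the battery.
task_levels = MULTI_LEVEL_TASKS * LEVELS + 1

# First analysis: nine variables per level, plus Interval Production.
per_level = rt_measures_per_level + tracking_measures_per_level
first_analysis = per_level * LEVELS + interval_production_measures

# Combined analysis: add proportion correct for the discrete stimulus tasks
# only (assumed to be the response time tasks minus probability monitoring).
accuracy_measures = (rt_measures_per_level - 1) * LEVELS
combined_analysis = first_analysis + accuracy_measures

print(task_levels, first_analysis, combined_analysis)  # prints: 25 29 47
```

Under these assumptions the tally reproduces the 25 task levels, the 29 measures in the first analysis and the 47 variables in the combined analysis.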

SWAT Ratings
The SWAT ratings reflect the perceived difficulty of the task rather than the actual performance.
Cluster analysis of the SWAT ratings for the 25 task levels produced the results given in Table 5.
In terms of subjective ratings, the tasks and task levels were clustered along the dimensions of task difficulty and processing stage.
It is evident from the table that cluster 1 contains discrete stimulus central processing tasks of low and moderate difficulty while cluster 2 contains the more difficult central processing tasks.
Cluster 3 contains the easy levels of the motor output tasks, a spatial task and a math processing task. Cluster 4 contains the difficult levels of the visual/spatial tasks and cluster 5 contains the difficult levels of the motor output tasks.

DISCUSSION
The results of the cluster analysis help to validate the design goals of the Criterion Task Set as a battery of tasks that tap separate information processing resources and stages. The primary area of concern is the overlap among the Linguistic Processing, Mathematical Processing and Memory Search tasks, all of which involve somewhat simple symbol manipulation.
Some differences existed between the cluster structure for the performance data and that for the SWAT ratings, indicating, as might be expected, that subjects' performance does not always correspond to their own estimates of task difficulty.
Additional analyses using different techniques are being performed to substantiate these results.