The Use of Flight Progress Strips While Working Live Traffic: Frequencies, Importance, and Perceived Benefits

The Federal Aviation Administration's effort to automate air traffic control (ATC) requires that the functionality provided today be captured in future systems. We report the first quantitative naturalistic observation of paper flight progress strip interactions during operational use. Strip use was similar in a variety of situations, but some uses varied as a function of altitude, staffing, or the cooperative style used by controller teams. Design of automation should proceed by prioritizing changes based on frequency of use and importance and should ensure that an effective method of interacting with flight information is incorporated. In addition to applied relevance to the ATC domain, the results touch on several theoretical concerns relevant to dynamic environments. Actual and potential applications of this research include the establishment of a database of strip activity and an arsenal of information valuable to system designers.


INTRODUCTION

The Air Traffic Control System
Highly trained men and women using sophisticated technologies and intricate procedures accomplish the safe and expeditious movement of the nation's air traffic. Air traffic control (ATC) separation responsibilities are divided among four types of controllers who work at three different kinds of facilities. This paper focuses on en route controllers responsible for the high-altitude, high-speed components of the flights. En route controllers perform a multitude of functions associated with a flight. They clear aircraft to appropriate flight levels, coordinating with other en route facilities and Terminal Radar Approach Control (TRACON). They change headings and routings (also ascents, descents, and speeds), granting pilot requests to maximize both the safety and efficiency of a flight. At airports without active control towers, en route controllers can also perform the functions of arrival and departure controllers, granting approach and departure clearances.
Facilities. En route controllers work in one of the 22 Air Route Traffic Control Centers (ARTCCs) in the United States. Each ARTCC is responsible for a defined airspace that typically covers several states and extends from the ground to 60 000 feet, except where other types of ATC are enacted. En route air traffic control services are provided to aircraft on instrument flight rules (IFR) flight plans when the aircraft are operating between departure and destination terminal airspace. When equipment, capabilities, and workload permit, certain advisory/assistance services may be provided to visual flight rules (VFR) aircraft.
Airspace. A center's airspace is divided further into contiguous segments of airspace called areas of specialization. Areas of specialization are further segmented into volumes of airspace called sectors. A certified professional controller (CPC) is qualified to work in all sectors in one area of specialization. The size and configuration of a sector is often determined by factors such as traffic volume and flow, types of aircraft operating within the sector, location and activity of nearby terminal facilities, special operations and procedures (e.g., military operations), equipment limitations, and radar/radio coverage. Although several variables can define differences between sectors, it may be possible to capture many of the differences by considering a sector's altitude. Sectors are usually labeled as low or high, depending on the altitude limits of the airspace.
High and low sectors near a busy terminal facility may be further broken down into superhigh or superlow sectors, in order to better distribute the workload for the circumstances (e.g., arrivals, departures).
Equipment. The controllers have access to computer-augmented radar information displayed over a situation display (also called a plan view logical display) and can use data entry and display devices to enter and retrieve information about an aircraft's flight plan. Controllers use a variety of communication devices, including radio and telephone landlines that allow contact with aircraft, other facilities, and other sector workstations within the ARTCC. Finally, they have flight progress strips, small pieces of paper that contain pertinent information from the pilot's flight plan. One strip, occasionally more, is printed for each flight in the controller's sector. Each strip is a 1-7/16 × 6-7/16 inch (3.65 × 16.35 cm) piece of paper containing 30 fields for information (see Figure 1). In addition to providing access to stored flight plan information, the controller can manipulate and write additional information on the strip.
Staffing. The workstations used to control traffic at a sector within the ARTCC can be staffed by one or two controllers and sometimes more. The tasks to be completed remain the same regardless of the staffing. However, when more than one controller is present, there is no universal approach to the division of responsibilities. The team, as a whole, has the responsibility for determining how functions are accomplished. However, some general guidelines for tasks performed by different controller positions are available.
For instance, the radar (R) controller communicates directly with aircraft and uses radar information as the primary means of separation. In general, the R controller ensures separation, issues control instructions to pilots, monitors and operates radio communications, accepts and initiates automated handoffs, assists the radar associate (RA) controller with nonautomated handoffs and coordination when needed, scans the radar display, correlates radar and flight progress strip data, ensures that computer entries and strip markings reflect clearances issued or received, and adjusts radar equipment so it can be used by all members of the sector team.
The RA controller, also referred to as the D-side or manual controller, assists the R controller. In general, the D-side controller ensures separation, initiates control instructions, operates the interphones (to communicate with controllers at the same or other facilities), accepts and initiates nonautomated handoffs, assists the R controller by accepting or initiating automated handoffs, ensures that the R controller is immediately made aware of any action taken, coordinates (including making pointouts), monitors radio communications, scans flight progress strips and correlates flight progress strip and radar data, manages flight progress strips, ensures that computer entries and strip markings reflect clearances issued or received, and adjusts equipment at the D-side position so it can be used by all members of the sector team.
Because a scientific observation of controllers in the field has not been conducted prior to the current work, it is not clear how teams of controllers actually perform these tasks, especially regarding the use of flight progress strips. There is no reason to believe that a particular strategy used by individuals or teams of controllers will apply to all controllers at a facility or that a strategy used by most controllers at a given facility will be used by controllers at other facilities.

Strip Marking
The Aviation Investment and Reform Act for the 21st Century passed by Congress has fueled the Federal Aviation Administration's (FAA's) interest in facilitating "operational acceptability of electronic flight information to replace paper strips" (Wickens, Mavor, Parasuraman, & McGee, 1998, p. 121). An important prerequisite, or at least corequisite, to the development of an electronic "flight object" (FAA, 1999) is to gain a better understanding of the current use of the strip. Any attempt to move the strip from paper to glass should be informed by the important functionality of the paper strip. An important part of understanding this functionality is to collect data on the frequencies with which certain markings or activities occur. This frequency information, when combined with information about the subjective importance of a mark and the benefits controllers perceive the mark to provide, can be a valuable part of the designer's arsenal. Ideally, designers can implement changes in the automation for markings that are both frequent and important. The way in which these changes are implemented can be informed by the kind of benefit that a particular mark provides.
Strip marking requirements exist partly to satisfy the need to maintain a legal record (see FAA, 2001) of the activities that took place should an incident (e.g., operational error or accident) occur. Because much of the flight information and controller-pilot interactions are recorded electronically, some of the required strip markings are redundant and no longer serve the legal function that they once did. A complicating factor is that controllers may not follow regulations involving strip marking in every case. Indeed, if markings were made only because they are required, for example, to establish a redundant legal record, it would be simple to eliminate strip markings. However, there are at least two lines of argument that suggest such a move would be premature. First, there are disadvantages of screen presentation as compared with paper presentation (Luff, Heath, & Greatbatch, 1992), and even casual analyses in the popular press have argued to retain paper (Gladwell, 2002). A second argument is that as technological aids become part of a human-technical system, the aids can come to serve functions other than those originally intended. For example, Hutchins (1995) discussed how "speed bugs" in a pilot's cockpit came to serve functions other than those for which they were designed. Similarly, the flight strip may have developed benefits unintended by the original designers.
During the last decade, consideration of the ancillary cognitive benefits of flight progress strips (e.g., Hopkin, 1991) has given rise to a number of empirical studies, in addition to continued thought and speculation. Much of the empirical work suggests that strip marking and board management do not particularly aid the cognitive processing of controllers and that the controller can easily compensate for a lack of strips.
In a study using FAA Academy instructors, Vortac, Edwards, Fuller, and Manning (1993) prohibited marking and moving the strips; in fact, they took away the controller's pencil and glued the flight progress strip holders to each other so the strips could not be moved. Vortac et al. (1993) found no deficits, as compared with a condition in which controllers had normal use of strips, and did find some evidence for improved prospective memory performance in the restricted strip condition. In another study, a one-line electronic representation yielded performance similar to that for the flight progress strip (Vortac et al., 1996). Finally, in a study using field controllers from Atlanta Center (ZTL), Albright, Truitt, Barile, Vortac, and Manning (1995) found no deficits even when strips were removed completely. Controllers easily compensated for the lack of strips by gathering information from the situation display or from flight plan readouts. In fact, eliminating board management responsibilities allowed the controllers to spend more time studying the situation display.
Although such findings are encouraging precursors to attempts to eliminate the paper strip, they do not supply a complete picture of the situations in which controllers rely on the flight progress strip. Over the course of several days, in a variety of sectors, staffing configurations, and facilities, strip usage may occur at significant rates that would not occur in simulated traffic controlled for only 30 min. In addition, the issue may not be whether flight progress strips can be replaced but, rather, how an interim system should develop that would transition the workforce away from paper strips (Durso & Manning, 2002). It is important to understand current flight strip usage when designing such a transitional system.
There have been some field investigations of the use of flight progress strips (e.g., Berndtsson & Normark, 1999; Hughes, Randall, & Shapiro, 1992; MacKay, 1999). However, they have focused on more qualitative aspects of strip marking in control environments outside the United States. Thus there is no existing database of the rate at which U.S. controllers use the strips operationally, the types of marks they make, or the situations in which they make them.
There is also little work at the level of the particular strip mark. Such work is important because it is often the case that a particular marking is cited for retaining the paper flight strip. For example, some controllers believe that strips are important for recording holding clearances, pointouts, and so on. Nevertheless, it is difficult to evaluate these claims because there is no FAA database of the frequency with which such marks and activities occur. In fact, there is no easily accessible database of information as straightforward as how often a sector is staffed by an individual versus a team. This kind of frequency information would be valuable to any design-and-development effort to change the controllers' workplace. Second, there are no data to confirm an individual's assertion that the strip is valuable because particular marks are recorded there. Finally, there are no data that indicate the reasons controllers use the particular marking; it might simply be because the FAA requires it, or it may be perceived as having a true benefit. For those marks perceived as beneficial, there is no evidence to suggest the way in which they are beneficial.
The purpose of the present report was to supply information about the current use of flight progress strips. First, through the first quantitative observation of strip marking during live traffic, we supplied a database representative of the rates of strip marking and actions and how those rates change as a function of facility, position observed, and type of airspace. Second, through a series of interviews with controllers who recently marked strips, we gathered information about why particular marks tended to be made. Finally, we secured judgments of importance of the markings from controllers serving on a national automation team. Whereas the naturalistic observation allowed us to ascertain the value of markings as indicated by their frequency of occurrence, the interviews and importance judgments allowed us to ascertain the value and functionality of the markings.
There were 24 observational sessions at each facility (12 10-min observations per session). The trained observers were provided with a booklet of strip marking observation forms. Each form was used to record all strip markings that occurred during a 10-min period within a particular sector. The form was constructed from marking documentation in national and facility strip marking guides and from input provided by en route instructors at the FAA Academy in Oklahoma City and by our expert observers.
The areas and sectors to be observed were randomly predetermined. If a selected sector was combined with another, the observer monitored the combined sector. Counterbalancing was used to determine the order in which the position would be observed. Thus the observer monitored a predetermined random position and switched to a different, randomly determined position if that original position was not operational; in this way the individual, the R-side member of the team, or the D-side member of the team was chosen. These random selection procedures have a number of advantages over a more targeted approach: Biases of the experimenters and observers are eliminated from the selection process, and sectors, staffing, altitude, and so on are represented in the database according to their occurrence in the field, thus providing base rate information that would not be available otherwise. Thus overrepresentation of particular traffic situations that may result (e.g., by observing the busy sectors) is avoided, allowing inferences to the population to be made.
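The selection logic described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the sector labels, and the rule that a one-person sector offers only the "individual" position are hypothetical and are not taken from the study materials.

```python
import random

# The three positions named in the text.
POSITIONS = ("individual", "R-side", "D-side")


def operational_positions(staffing):
    """Positions assumed to be staffed: one for a lone controller, two for a team."""
    return ("individual",) if staffing == "individual" else ("R-side", "D-side")


def plan_observations(sectors, n, rng):
    """Randomly predetermine (sector, planned position) pairs before a session."""
    return [(rng.choice(sectors), rng.choice(POSITIONS)) for _ in range(n)]


def choose_position(planned, staffing, rng):
    """Keep the planned position if it is operational; otherwise draw another at random."""
    available = operational_positions(staffing)
    if planned in available:
        return planned
    return rng.choice(available)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
plan = plan_observations(["sector A", "sector B", "sector C"], 12, rng)
fallback = choose_position("individual", "team", rng)  # planned position not staffed
```

Because neither the sector nor the position is picked by the observer, busy and quiet situations enter the sample in proportion to how often they occur, which is the base-rate property the paragraph emphasizes.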
Observers began by recording information about the context in which the observations would take place, including the number of full data blocks (FDBs) on the scope, staffing, time of day, occurrence of training, and so on. Data blocks are multiple line displays that move across the radar screen coupled to the blip representing the aircraft. The data block contains information about call sign, altitude, ground speed, and beacon codes. Occasionally, controllers compress a data block of a flight no longer critical to control of the sector. Of relevance here is the fact that the uncompressed, or full, data blocks are a good indication of the number of flights for which the controller is responsible. The observers also recorded environmental factors that may have had an impact on strip marking during the observation (e.g., weather, equipment outages).
The observers tallied strip interactions at a randomly chosen position at the sector for 10 min. To increase focus on strip use and to meet confidentiality concerns, the observer did not listen to the radio or communications information (i.e., the observer did not "plug in"). Table 1 presents a listing and a description of the strip activities tallied by the observers. Six types of clearances were tallied. For each of these, the observer also determined whether it was issued, coordinated, or planned. There were also eight types of nonclearance markings, plus an "other" category that allowed observers to note a variety of rare and idiosyncratic markings. Table 1 does not present the "other" category. Finally, three strip activities were tallied that did not involve markings (e.g., move strip).
Procedural details. Observations at each center occurred over a 3-day period. Each of four observers collected data during two 2-hr sessions per day. Thus 48 hr of strip marking observations were recorded at each facility. A session to collect 2 hours' worth of data lasted from 2 to 3 hr and occurred each morning and each afternoon/early evening. The sessions included both heavy traffic periods (i.e., rushes) and low-traffic periods.
If a strip was identified during the observation as a candidate for further investigation (see Phase 2), the observer invited the controller to participate in an interview. After completing an observation at a sector, observers continued to make 10-min observations at their own pace until 12 observations for that session were completed.
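The sampling totals quoted in this section follow from simple arithmetic. The sketch below only restates the figures given in the text; the constant names are ours, and the five-facility count assumes the five centers named in the analyses (ZKC, ZDC, ZOB, ZAU, ZTL).

```python
OBSERVERS = 4            # observers per facility
SESSIONS_PER_DAY = 2     # sessions per observer per day
DAYS = 3                 # days of observation per facility
HOURS_PER_SESSION = 2    # hours of recorded data per session
OBSERVATION_MIN = 10     # length of one observation period, in minutes
FACILITIES = 5           # assumption: the five centers named in the analyses

sessions_per_facility = OBSERVERS * SESSIONS_PER_DAY * DAYS
hours_per_facility = sessions_per_facility * HOURS_PER_SESSION
observations_per_session = HOURS_PER_SESSION * 60 // OBSERVATION_MIN
observations_per_facility = sessions_per_facility * observations_per_session
planned_observations = observations_per_facility * FACILITIES

print(sessions_per_facility, hours_per_facility, observations_per_session)
print(observations_per_facility, planned_observations)
```

The first line of output reproduces the 24 sessions, 48 hr, and 12 observations per session stated in the text; the second gives the number of 10-min observations planned per facility and overall, before any exclusions.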

Results
There were 1320 observations included in the analysis, yielding approximately 34 000 tallies. Observations were excluded if we were unable to determine the position observed from the data or if training occurred at the sector during the time of observation. Sectors were staffed by individuals 66.7% of the time and by teams 33.3% of the time. Observations of high-altitude (52.9%) and low-altitude (47.1%) sectors were approximately equal. Weather was specified as an environmental factor affecting sector operations in 2.7% of the cases, outages in 1.2%, military operations in 2.3%, and personnel changes in 14.4%. The average number of flights under control was 10.6 FDBs in a sector staffed by a team of controllers and 7.5 in a sector staffed by only one controller.
The mean number of tallies made by each observer for each of the strip activities was analyzed to determine if the tallies were consistent across observers. Of the correlations, 93% were above .90, with the lowest r² indicating 77% overlap. Because no two observers ever witnessed exactly the same event sequence, these values should not be taken as reliability estimates in the traditional sense. In addition to being based on aggregate data, these correlations captured the similarity in activity across centers as well as the extent to which observers noted those activities. All statistics used number of FDBs, reflecting sector busyness, as a covariate. Thus the patterns we discuss are not attributable to sector activity during the time of observation. (In all analyses using FDBs, the covariate was significant.) All statistics reported were significant at an alpha level of .05 or better. The mean number of marks or actions observed during an average 10-min period appears in Table 1 along with a description of the activity. Overall, controllers marked or manipulated the strips 25.9 times in a period, or roughly once every 20 s. Extending this mean to an 8-hr day of activity for one sector suggests that the typical sector will witness about 1200 strip interactions, and a center with 50 sectors will give rise to almost 60 000 strip interactions during a single 8-hr work shift. Thus a difference of as little as half an activity per 10-min observation yields a difference at the typical facility of more than a thousand interactions per work shift.
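The extrapolation in the preceding paragraph can be checked directly. The sketch below uses only the figures stated in the text (25.9 interactions per 10-min period, an 8-hr shift, a 50-sector center); the variable names are ours.

```python
MEAN_INTERACTIONS_PER_PERIOD = 25.9  # mean strip marks/actions per observation
PERIOD_MIN = 10                      # observation length in minutes
SHIFT_HOURS = 8                      # length of one work shift
SECTORS_PER_CENTER = 50              # illustrative center size used in the text

periods_per_shift = SHIFT_HOURS * 60 // PERIOD_MIN
seconds_between_interactions = PERIOD_MIN * 60 / MEAN_INTERACTIONS_PER_PERIOD
per_sector_shift = MEAN_INTERACTIONS_PER_PERIOD * periods_per_shift
per_center_shift = per_sector_shift * SECTORS_PER_CENTER

# Effect of a difference of half an activity per 10-min observation.
half_activity_difference = 0.5 * periods_per_shift * SECTORS_PER_CENTER

print(f"{seconds_between_interactions:.1f} s between interactions")
print(f"{per_sector_shift:.0f} per sector per shift")
print(f"{per_center_shift:.0f} per center per shift")
print(f"{half_activity_difference:.0f} extra interactions per shift")
```

The exact values come out slightly above the rounded figures quoted in the text: about 23 s between interactions, roughly 1240 interactions per sector per shift, and roughly 62 000 per center. A half-activity difference scales to 1200 interactions per shift across 50 sectors.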
Overall, a large number of moves/resequences (6.92 moves per 10-min period) was observed, more than any other strip marking or activity. This was followed by incoming radar/communications (2.97), altitude clearances issued (2.90), outgoing communications (2.87), and outgoing radar markings (2.54). Although subsequent analyses will reveal that rates varied as a function of facility, staffing, and altitude, as well as the interactions of these variables, note that these five strip activities always occurred at the highest rates.
Strip markings. Markings were analyzed in a 14 × 5 × 3 × 2, Mark Type × Facility × Position (individual, R-side controller, or D-side controller) × Altitude, analysis of covariance (ANCOVA) with an alpha of .05. Clearances were represented in this analysis by the issued clearances alone because issued clearances were the vast majority of clearance markings; thus in this analysis the 14 markings comprised 6 clearances and 8 nonclearance markings.
We used lower bound statistical estimates, a conservative inferential procedure that can be used for data that do not meet sphericity assumptions (Hays, 1994). The analysis revealed, not surprisingly, that marks occurred at different rates, F(1, 1240) = 20.21. There was a significant main effect of facility, F(4, 1240) = 3.07. Facility significantly interacted with the mark type, F(4, 1240) = 7.31, suggesting that the relative frequencies of strip markings varied across facilities.
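The lower bound procedure explains why effects involving the 14-level mark type factor are reported with so few numerator degrees of freedom. A minimal sketch follows, assuming that mark type is treated as a repeated (within-observation) factor and that the correction is the standard lower bound epsilon of 1/(k - 1) for a factor with k levels; the function names are ours, not the authors'.

```python
from fractions import Fraction


def lower_bound_epsilon(levels):
    """Lower bound sphericity correction for a repeated factor with `levels` levels."""
    if levels < 2:
        raise ValueError("a factor needs at least two levels")
    return Fraction(1, levels - 1)


def corrected_numerator_df(within_levels, between_df=1):
    """Numerator df after the lower bound correction.

    The uncorrected df is (within_levels - 1) * between_df, where between_df
    is the df of the between-observation effect that interacts with the
    repeated factor (use 1 for the repeated factor's own main effect).
    """
    uncorrected = (within_levels - 1) * between_df
    return uncorrected * lower_bound_epsilon(within_levels)


mark_type_df = corrected_numerator_df(14)                     # 13 * (1/13)
mark_by_facility_df = corrected_numerator_df(14, between_df=4)  # 52 * (1/13)
print(mark_type_df, mark_by_facility_df)
```

Under these assumptions the mark type main effect is tested on 1 numerator df and the Mark Type × Facility interaction on 4, matching the F(1, 1240) and F(4, 1240) values reported. The error df is scaled by the same factor, which makes the test conservative.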
There was also a main effect of position, F(2, 1240) = 38.33, and an interaction of position and facility, F(8, 1240) = 35.76. The Facility × Position interaction appears in Figure 2. The interaction occurred primarily because the facilities differed in how strip-marking duties were divided among R- and D-side team members. Overall, D-side controllers at ZKC, ZDC, and ZOB seemed to make more marks, whereas at ZAU and ZTL more markings were made by the R controller. This Facility × Position interaction was further modified by mark type, F(8, 1240) = 10.26, suggesting that not all marks were allocated according to the pattern in Figure 2. Specifically, coordination-related marks, such as pointouts and control released/received, were made by the D-side controller regardless of facility. In addition, when we analyzed clearances more closely by comparing issued and coordinated clearances, we found that coordinated clearance marks were the province of D-side controllers in all facilities. Thus, whereas coordination marks (both coordinated clearances and nonclearance coordinations) were universally made by the D side, issued clearance marks varied with facility.
It is interesting to speculate on the origins of these differences among centers. Training at the FAA Academy proceeded with a division of roles similar to that found in ZKC, ZDC, and ZOB. The ZTL and ZAU division of roles probably evolved to handle particular requirements of those types of centers. ZTL and ZAU have two of the largest airports in the world (Hartsfield and O'Hare, respectively), and thus the evolution of the roles may have been influenced by the amount or type of traffic. For example, in Chicago and Atlanta the D-side controller does a lot of coordination and does it in such a way that he or she cannot hear what the R-side controller is doing. For our purpose, the design of any electronic aid should recognize that in today's environment issued clearance marks are facility dependent, being made by different members of a controller team. Thinking of an electronic strip replacement as a "D-side tool" is, therefore, oversimplified. Electronic flight data representations that allow issued clearances to be recorded by either member of a controller team are less likely to disrupt the integrity of the roles and thus would be more likely to be accepted by the controller workforce.
The overall analysis of strip marking frequencies also demonstrated a significant main effect of altitude, F(1, 1240) = 19.44, with observations made at low-altitude sectors generally being associated with a greater number of observed strip markings (1.09, vs. 0.88 for high-altitude sectors). Altitude also significantly interacted with mark type, F(1, 1240) = 17.98. This pattern differed across facilities, F(4, 1240) = 3.09, but not positions. Frequencies of strip markings are presented in Figure 3 as a function of altitude. Although more overall marking occurred at low-altitude sectors, marks associated with aircraft entering or exiting the sector occurred more frequently in high-altitude sectors. More control actions are likely to occur in the lower altitudes, where aircraft are transitioning to and from approach and departure. Overflights would tend to occupy the higher altitudes, and although they would enter and exit sectors frequently, it might be expected that other control actions would occur less often. The data document these expectations, at least to the extent that they affect flight strip markings.
Nonmark strip activities. In addition to marks on the strips, actions also seem to be a significant component of the controllers' interplay with the strips. In fact, movement was the single most frequent activity involving the strips. A 3 × 5 × 3 × 2, Activity (moves, offsets, points) × Facility × Position × Altitude, ANCOVA was performed on the nonmark strip-related activities. The analysis revealed a significant main effect of activity, F(1, 1240) = 133.76, and a significant Activity × Facility interaction, F(4, 1240) = 2.88. This interaction occurred primarily because ZDC controllers pointed to and offset the strips frequently, whereas ZTL controllers rarely did so.
As with the strip marking data, there were effects of position, F(2, 1240) = 263.07, and a Facility × Position × Activity interaction, F(8, 1240) = 3.07. Unlike strip marking, however, this interaction was merely one of magnitude, with the D-side controller at all facilities assuming responsibility for moving the strips.
The analysis of nonmark flight progress strip activities also revealed effects of altitude, F(1, 1240) = 13.11, and an Altitude × Activity interaction, F(1, 1240) = 18.80: Whereas more moves/resequences occurred during observations at high-altitude sectors than at low-altitude sectors, there were more offsets and points at low-altitude sectors.
In summary, strip markings differed from one another in observed frequency. The pattern of strip markings differed among centers most significantly during team control situations. There seemed to be two cooperative styles. In one style (ZAU and ZTL), the R-side controller marked issued clearances more often, whereas at ZKC, ZDC, and ZOB the D-side controller marked issued clearances more often. This pattern did not apply to coordination, nor did it apply to strip actions such as moving the strip. Furthermore, frequencies varied as a function of altitude, with greater strip marking frequencies, along with offsets and points, occurring in lower-altitude sectors. Once again, the pattern did not hold for all strip-marking types: Markings associated with aircraft entering and exiting the airspace and strip movements occurred more often in high-altitude sectors.

PHASE 2: BENEFITS
In addition to determining the frequency with which markings and actions occurred in the field, a second goal of the project was to determine the benefits controllers perceived to be associated with using strips. Data about the types of benefits perceived in the current strip environment would be valuable to designers developing an electronic representation of flight data. An important, frequent marking that served the controller as an external memory aid would suggest a different replacement than would an important, frequent marking that served as a communication aid. In this phase of the research, we assessed the type of perceived benefits that emerged from open-ended questions and categorized them. This was followed by a more focused assessment of the benefits.

Method and Results
Two interview procedures were used. The first interview procedure was open-ended and was used in ZKC, ZAU, and ZTL to explore the range of benefits provided by strip markings/activities and to see the benefits that would naturally emerge. The second interview procedure used closed-ended Likert items derived from the analysis of the initial interviews. The second procedure was used in ZDC and ZOB to capture and quantify the specific benefits of each mark/activity. The interview questions were developed in accordance with established cognitive interviewing techniques (Geiselman & Fisher, 1997; Klein, Calderwood, & MacGregor, 1989) that have been demonstrated to provide more complete and accurate details than do other interviewing procedures.
For invitations to preliminary interviews, subject-matter experts were given guidelines to invite controllers who had made (a) numerous markings, (b) unusual markings, (c) critical markings, or (d) markings representative of the particular control situation (e.g., holding) on a particular strip. That strip was delivered to the interviewers when it was no longer required in the control area. For invitations to the focused interviews, the subject-matter experts attempted to represent markings from all of the strip activities in Table 1.
In both interview procedures, the observers used "receipts" to specify the flight progress strip of interest. One copy of the receipt was given to the controller who was observed during the 10-min observational period and another copy was given to the area supervisor to indicate the strip needed to be retrieved for the interview process.
Method: Preliminary interviews. Of the 196 controllers invited for interviews from ZKC, ZAU, and ZTL, 186 complied. The first step of the interview procedure was to ensure that the controller recognized the strip and that he or she had handled and marked it. Once this was established, the controller was asked to indicate those markings he or she made (as opposed to a teammate or relieved controller) by highlighting them on a photocopy of the target strip. The controller was then asked to indicate the chronological order in which he or she made the strip markings and to indicate whether he or she had resequenced, offset, or pointed to the strip. We then tried to recover the original context by asking controllers to construct a timeline that included the occurrence of each strip marking/activity, as well as the events leading up to each occurrence. The timelines themselves were not subjected to analysis.
From this timeline, a single target strip marking/activity was then selected for further analysis. Selection of a target marking was based on the recommendation of the observer or on the choice of the interviewer if the observer's notes were not specific. An effort was made to include marks that were less mundane in the hope of obtaining information about less frequent, but presumably informative, markings in the interviews. Each controller then wrote responses to three open-ended questions that referenced the specific target strip marking/activity selected during the previous part of the interview. The specific questions were as follows: (a) "Why did you make the target strip marking/target action?" (b) "Did you make the target marking because it was required? Because it benefited you? Or both? Explain further." (c) "Hypothetically, how else could you have accomplished the same goals without making a strip marking/action?" The interviewers provided general instructions about the questions and made themselves available to clarify the questions.
Results: Preliminary interviews. For the 84% of controllers who indicated that the target marking provided some benefit, the answers to the question were further coded according to categories that emerged from the responses. Thus initial categorization was not based on a predefined set of categories. Instead, clusters of related benefits were identified and labeled by the researchers. These emergent category labels were then given to two researchers, who classified the individual responses. Answers to the perceived benefits question could be coded as fitting more than one category of perceived benefits. Initial agreement between researchers reached 87.5%. Differences between coders were then negotiated until full agreement was reached.
The categories of perceived benefits that emerged were (a) that the target strip marking/activity facilitated communication between teams of controllers or between the controller making the target strip marking/activity and the relieving controller; (b) that the target strip marking helped to save time or eliminate unnecessary repetitive actions among individuals or control teams; (c) that the target strip marking/activity provided an external memory aid or an external reference to important sources of information; and (d) that the target strip marking/activity aided the controller in the perceptual or cognitive organization of information, aiding in locating the strip.
Each of the emergent benefits proved reliable for some particular activity; however, memory, communication, or a combination of the two was suggested for a variety of marks. We used the emergent categories from the preliminary interviews to create Likert scales to tap into the four types of benefits that tended to emerge from controllers' open responses.
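The coder agreement reported for the preliminary interviews can be illustrated with a minimal sketch. It assumes that agreement was counted as an exact match between the sets of categories the two coders assigned to a response; the source does not state how the 87.5% figure was computed. The function name `percent_agreement` and the coded responses are hypothetical and were invented for illustration.

```python
# Hypothetical illustration: category names follow the four emergent benefits,
# but every coded response below is invented for this example.
CATEGORIES = ("communication", "workload", "memory", "organization")


def percent_agreement(coder_a, coder_b):
    """Percentage of responses given identical category sets by both coders.

    Assumption: a response counts as agreed only when the two sets match
    exactly, because one answer may be coded into more than one category.
    """
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must classify the same responses")
    matches = sum(1 for a, b in zip(coder_a, coder_b) if set(a) == set(b))
    return 100.0 * matches / len(coder_a)


# Eight invented responses; the coders differ only on the last one.
coder_a = [{"memory"}, {"communication", "memory"}, {"workload"},
           {"organization"}, {"memory"}, {"communication"},
           {"memory"}, {"communication"}]
coder_b = [{"memory"}, {"communication", "memory"}, {"workload"},
           {"organization"}, {"memory"}, {"communication"},
           {"memory"}, {"communication", "workload"}]

agreement = percent_agreement(coder_a, coder_b)
print(f"{agreement:.1f}%")  # 7 of 8 responses match
```

Exact-match agreement is a strict criterion for multiply coded answers; a per-category comparison would be a reasonable alternative and would generally give a higher figure.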
Method: Likert questions. Interviews at ZDC and ZOB allowed us to quantify more precisely the benefits that emerged and to determine the extent to which the benefits were perceived in individual- and team-staffed situations. Of the 196 controllers invited for interviews, 109 complied. During the interview, controllers indicated the markings that they made and indicated the chronological order of the markings. Two target markings (A and B) were then selected as the subject matter for the ensuing interview, with an effort made to represent all activities listed in Table 1.
The new questionnaire included eight probes that presented interviewees with prepared statements, two probes for each of the benefit categories: memory (e.g., "The target marking/activity was beneficial because it allowed me to refer to information I would have otherwise had to remember"), communication (e.g., "...was beneficial because it allowed me to communicate information with sector teammates or other controllers without directly speaking with them"), workload (e.g., "...I saved excess work for myself or my sector teammate"), and organization (e.g., "...helped me organize control-related information in a more meaningful way"). The interviewees were to reflect their level of agreement with the statement by circling any number between 1 (strongly disagree) and 7 (strongly agree), with 4 (no opinion) as the midpoint. Interviewees were not asked to answer the Likert probes if they initially indicated that the marking provided no benefit other than fulfilling national or facility strip marking requirements. After completing the questionnaire for Target Marking A, participants went through the same procedure for Target Marking B.
Results: Likert questions. Overall, there were 210 target marks resulting from the interview phase of the follow-up study. Of those marks, 155 were perceived by the interviewee as beneficial and were subsequently the subject of the interview probes. An average score was calculated for each of the four benefits categories. These scores were then analyzed with a multivariate analysis of variance, with the four benefits measures as criteria and facility and mark type as predictors.
The means of the interview responses, as well as the results of the significance tests comparing means with scale midpoints, are presented in Table 2 as a function of mark type. Issued and coordinated clearances marks were perceived as giving all benefits. Planned clearance marks showed similar benefits, except that they were not viewed as valuable for communication. However, markings that did not involve clearances (incoming/outgoing radar/communications, nonclearance coordinations, and information updates) were beneficial only as communication aids. Finally, nonmarking strip-related activities were considered beneficial only as aids in the perceptual/cognitive organization of information.
These means capture well the perception of benefits across facilities, with the exception that organization varied with facility, F(1, 83) = 4.49, with ZDC controllers seeing more organizational benefits than did ZOB controllers. Facility and mark type did not interact for any of the dependent measures.
The interview data from individuals operating within a team during the strip marking observation are also summarized in Table 2. Issued clearances included all benefits except aiding in the organization of information. Nonmarking activities and coordinated clearance markings were considered beneficial across all benefits categories. None of the benefits of planning markings were significantly different from the midpoint; however, their communication and memory values approached significance. Incoming/outgoing radar/communication marks were considered an important source of communication for individuals operating within a control team but were not considered especially beneficial in other categories. Nonclearance coordination markings were considered beneficial for communication as well as workload reduction. Within a team environment, the benefits were similar across facilities.
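The scoring behind Table 2 can be sketched as follows: the two probes for a benefit category are averaged into one score per target mark, and the mean of those scores is compared with the scale midpoint of 4. The sketch assumes a one-sample t test; the source says only that significance tests compared means with scale midpoints. The function names and all ratings are hypothetical.

```python
import math
import statistics

MIDPOINT = 4  # "no opinion" on the 1 (strongly disagree) to 7 (strongly agree) scale


def category_score(probe_1, probe_2):
    """Average the two Likert probes that target one benefit category."""
    return (probe_1 + probe_2) / 2


def t_against_midpoint(scores, midpoint=MIDPOINT):
    """Return the one-sample t statistic and its degrees of freedom.

    Assumption: a one-sample t test is used here for illustration only.
    Requires at least two scores that are not all identical.
    """
    n = len(scores)
    standard_error = statistics.stdev(scores) / math.sqrt(n)
    return (statistics.mean(scores) - midpoint) / standard_error, n - 1


# Invented (probe 1, probe 2) memory ratings for six target marks.
memory_probes = [(6, 7), (5, 6), (7, 7), (4, 5), (6, 6), (5, 7)]
memory_scores = [category_score(a, b) for a, b in memory_probes]

t_stat, df = t_against_midpoint(memory_scores)
print(f"mean = {statistics.mean(memory_scores):.2f}, t({df}) = {t_stat:.2f}")
```

A positive t statistic indicates a mean above the "no opinion" midpoint, that is, a mark type perceived as providing the benefit in question.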

PHASE 3: IMPORTANCE RATINGS
Although frequencies of observed strip markings and activities provide an understanding of performance differences across facilities and positions, they do not necessarily provide information regarding the perceived importance of those strip markings to actual ATC operations. To account for perceived importance of strip markings, 10 CPCs from various ARTCC facilities were asked to judge strip markings/activities on a scale of 1 (low importance) to 100 (high importance). Evaluations were conducted out of context, forcing judges to give a summary judgment. The mean intraclass correlation was .901, suggesting a good level of reliability among judgments.
These decontextualized importance judgments, when crossed with the more contextualized frequencies, can make tractable the task of determining critical strip activities. Importance ratings and frequencies were uncorrelated, r(27) = -.189, ns. A median split was performed on both the subjective importance and observed frequencies rankings. Four different classifications of strip markings were then established according to placement in a 2 (high importance, low importance) × 2 (high frequency, low frequency) contingency table (Table 3). Strip markings identified as both high importance and high frequency included all issued clearances (except for the rarer approach/departure and holding), coordinated altitude clearances, eliminate/revise control information, and pointouts. We will return to these critical marks in the Discussion section.
Strip markings identified as high importance, low frequency included most clearance coordinations, the remaining clearances, and control release. Strip markings identified as of low importance but high frequency included marks that typically occur when an aircraft first enters a sector. Also included in this category were marks dealing with time and all of the nonmarking strip actions. Finally, the low-importance/low-frequency category included all planned clearance markings.
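The median-split classification described above can be sketched in a few lines. The mark names, importance ratings, and frequencies below are invented for illustration and are not taken from Table 3; the treatment of values that fall exactly on the median (assigned to "low") is also an assumption, because the source does not say how ties were handled. Splitting the raw values gives the same grouping as splitting their ranks.

```python
import statistics


def median_split(values):
    """Label each mark 'high' if its value exceeds the median, else 'low'.

    Assumption: values equal to the median are labeled 'low'.
    """
    cutoff = statistics.median(values.values())
    return {mark: ("high" if value > cutoff else "low")
            for mark, value in values.items()}


def classify(importance, frequency):
    """Place each mark in a cell of the 2 x 2 (importance, frequency) table."""
    importance_level = median_split(importance)
    frequency_level = median_split(frequency)
    return {mark: (importance_level[mark], frequency_level[mark])
            for mark in importance}


# Invented ratings (1 to 100) and observed frequencies for four marks.
importance = {"altitude issued": 90, "holding issued": 85,
              "strip offset": 30, "planned altitude": 20}
frequency = {"altitude issued": 12.0, "holding issued": 0.4,
             "strip offset": 9.0, "planned altitude": 0.8}

cells = classify(importance, frequency)
for mark, (imp, freq) in cells.items():
    print(f"{mark}: {imp} importance, {freq} frequency")
```

Marks landing in the high-importance, high-frequency cell are the ones the classification singles out as critical.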

Implications for Air Traffic Control
Even when we used statistical procedures to eliminate the effect of the number of aircraft, we found considerable variability attributable to the facility and the mark made. Movement of the strips, along with outgoing communications, incoming radar/communication, altitude issued, and outgoing radar markings, always occurred at rates suggesting frequent contact with the strips, but of these only the altitude issued mark was viewed as an important mark.
Most marks reflected a different division of team responsibilities in ZAU and ZTL as compared with the other three ARTCCs. The D-side controller at ZKC, ZDC, and ZOB had primary responsibility for marking issued clearances, whereas the R-side controller marked most of the issued clearances at ZAU and ZTL. (It may be the case that other facilities follow either the ZKC/ZDC/ZOB approach or the ZAU/ZTL approach. We hope to address this limitation in upcoming work.) This difference in division of responsibility has consequences for how automation plans should proceed. Coordination marks, such as pointouts and coordinated altitude, did not show this facility dependence.
The variability among the marks and actions was also attributable, in part, to the altitude of the sector being observed. Observations made at low-altitude sectors often, but not always, involved more strip markings. Those few strip markings observed more often in high-altitude sectors were generally perceived to be of low importance. Given that six of seven frequent, important marks occurred primarily in low-altitude sectors, sector altitude should also be considered when assessing an electronic replacement for flight progress strips.

Implications for Design
Considering importance allowed us to make an initial step toward distinguishing among strip activities. It was often the case that very frequent marks (in fact, several of the most frequent marks) were viewed as less important by our panel of controllers. In other cases, markings viewed as important by our panel occurred only rarely. The benefits perceived by controllers augment the frequency and importance information. A design team should consider these three dimensions (frequency, importance, and client-perceived benefits) in deciding how to prioritize the automation of the strips.
Combining information from the interviews with frequency and importance data supplies an arsenal of information valuable to system designers. As an example from Table 4, consider efforts to capture the functionality of heading-issued markings. The designer should recognize not only that it is a primarily low-altitude marking and is made by different members of a team, depending on the facility, but also that it serves both a memory and a communication function. Given the communication function between the R-side and D-side controllers (e.g., notifying the D side that the R side had assigned a heading to a pilot), some effort should be made to make notations about headings issued by the R side available to the D side, either by allowing both controllers to view the same display or by echoing the information on the D-side controller's display. Given the memory function, the designer may want to display the information to both controllers in an accessible form that is consistent with its reminder role.

Implications for Theories of Dynamic Environments
In addition to the obvious applied benefits of the current results, the findings have theoretical value in our characterization of operators and operator teams in dynamic environments. Three findings are of particular theoretical value: First, dynamic environments have sometimes been thought to include a plethora of cognitive activities (attention, pattern recognition, memory, projection, and so on), and artifacts have been mentioned as possible aids for all of these (e.g., Olson & Olson, 1999). However, in this case only four benefits emerged from the controllers' comments, and only two of these, memory and communication, were confirmed repeatedly as benefits conferred by the strips. Although other artifacts in other domains may suggest other benefits, it is also possible that cognitive artifacts in all dynamic environments will have memorial and communicative benefits simply because of the nature of controlling a dynamic situation with a team of resource-limited human operators. In fact, the value of artifacts in dynamic environments has been reported previously for both communicative (e.g., Olson & Olson, 1999) and memorial (e.g., Herrmann, Brubaker, Yoder, Sheets, & Tio, 1999) roles. A large component of the burgeoning area of computer-supported cooperative work addresses just such concerns.
As for memory, the study of external memory aids has a long history. In dynamic environments, the operator will be required to remember actions to be taken (i.e., prospective memory; e.g., Einstein & McDaniel, 1996). We also believe that some retrospective memory value will be found in the artifacts of controllers or any other operator of a dynamic environment. Although it matters little (and in fact may be harmful) if the operator remembers past events that have no consequences for the present, it is often of value for people to note that they have performed an action and thus need not perform it again. We believe that communication and memory aids will appear routinely as other dynamic environments are considered.
Second, dynamic environments have been thought to require considerable amounts of planning (e.g., Gronlund, Dougherty, Durso, Canning, & Mills, 2003). In our data, we found little evidence of long-term planning, at least as reflected by strip markings. We suggest that the extent to which planning is relevant depends on several factors, including the predictability of the environment, the amount of time available, and the value of having a plan. In some dynamic environments, reacting to the developing situation or taking a tactical role can be effective. Some controllers and researchers have thought of those taking a tactical role as somehow performing less well than those with a strategic approach. However, Klein's (1989) and Zsambok and Klein's (1997) view of recognition-primed decision making opens the possibility that true experts would be more likely to approach situations tactically. This is similar to early research suggesting that experts are more likely to work forward to solve a problem, presumably because they expect to have available the knowledge and skills needed for subsequent steps when that phase of the problem presents itself (e.g., Larkin, McDermott, Simon, & Simon, 1980).
Finally, the current work is of value to those interested in understanding how teams function in dynamic environments. We noted that communication was a prominent benefit listed by controllers for a number of marks. Clearly, R and RA controllers "share" information and work, as Cooke, Salas, Cannon-Bowers, and Stout (2000) noted, not by doing the same job but by taking their respective parts of the job. It is especially interesting that the roles of the two controllers change from facility to facility. Although the R-side controller has defining responsibilities that are consistent across facilities, some tasks taken by the R side at some facilities are the province of the D side at others. This difference is not trivial; in fact, it was common for some controllers to ask, "Well then, what does your D side do?" This strikes us as an interesting difference from other teams, such as the pilot, navigator, and photographer of unoccupied aerial vehicles, whose roles do not have the fluidity of the members of controller teams. In fact, the "radar team concept" of the FAA states that "there are no absolute divisions of responsibilities" and that "the team as a whole, has the responsibility" (FAA, 2001, Section 6-2-1a).
Although naturalistic research is limited in its ability to confirm causality, it does, as in this case, suggest hypotheses that can be addressed under more controlled circumstances. As the ATC system evolves into a paperless one, information about frequency of use and importance of currently used strip annotations can help prioritize the changes made. It seems critical that developers attend to the functionality of the marks throughout that transition. Memorial and communicative functionality, especially of the most frequent and important marks, must be part of that evolution. The study of how strips function in ATC bears on theoretical advances in the understanding of artifacts and dynamic environments.