Effects of an Online Visual Procedure on Task Completion, Time, and Attitude

Although substantial literature exists regarding learning with visuals, most studies consider text the primary channel, with varying amounts of visuals explored as a secondary channel. This study considered the effectiveness of visuals-only procedural guides versus visuals plus added text, using visuals as the primary channel and using visuals developed from screen shots to eliminate the need to create a visual, stand-in vocabulary. There was no difference in the level of successful task completion between treatment groups. The time required to complete the task was measured and there were significant differences in the amount of time required by treatment group, age, and sex. Both treatment groups responded favorably to the procedures on a follow-up attitude questionnaire. Implications of the study and suggestions for further research are discussed.

established visual and display guidelines hold true with newer media. For example, Bradshaw and Johari (2002) explored traditional assumptions regarding white space in the more modern context of online instruction. That study suggests that where quality is otherwise high, general heuristics regarding white space and text structure may not be as critical in an online environment as in more traditional media because users may be more accustomed to poorly displayed text and other content in online contexts. Fischman (2001, p. 30) also has pointed out the importance of re-examining "the traditional assumption that texts, words, and images reinforce each other through fixed or transparent connections." Several researchers (Berger, 1972; Chaplin, 1994; Sontag, 1977; Tagg, 1993) have challenged that assumption, and view the relationship between words, texts, and images as dynamic interactions.

INSTRUCTIONAL VISUALS
The relationship between visuals and text in instructional materials has been explored in numerous ways. Researchers have manipulated size, placement, proximity, image color or quality, and whether subjects learn better when visuals are present or absent. Most studies investigating learning with pictures have examined the effects of text supplemented with pictures (Alesandrini, 1981).

Effectiveness
The effectiveness of a visual learning environment depends on many factors, including the amount of realistic detail in the visuals, the method of presentation of the visuals, and the techniques used to focus student attention on the essential learning characteristics (Dwyer, 1985). Words and pictures together generally have been found more effective than words alone. Diagrams, for example, have been found most effective as study aids and learning tools when text is integrated directly with the pictorial image (Chandler & Sweller, 1992; Guri-Rozenblit, 1988; Holiday, Brummer, & Donais, 1977). A meta-analysis of 74 studies comparing conventional versus visual-based instruction indicated students had slightly more success from visual instruction (Cohen, Ebeling, & Kulik, 1981). Levie and Lentz's (1982) meta-analysis of 55 studies indicated that "Illustrations can sometimes be used as effective/efficient substitutes for words or as providers of extra linguistic information" (p. 226). Forty-six of the studies they reviewed compared text and text with realistic pictures. They found that learning from the illustrated text was more effective than learning from non-illustrated text.

Dual Coding
The dual-coding hypothesis suggests people possess both visual and verbal encoding mechanisms (Hannafin & Rieber, 1989). Using visuals as the primary channel has been found to increase both speed and accuracy (Booher, 1975; Schorr, 1984; Stone & Glock, 1981). Booher compared the relative comprehensibility of procedural instructions that were presented as text only, picture only, picture-related text, text-related pictures, picture-redundant text, and text-redundant pictures. Subjects were naval aircraft maintenance personnel and the procedural instructions concerned work-related job skills. Booher found that pictures are better for presenting contextual information and information that helps readers focus on the specific objects on which operations must be performed. Text is better for specific operational steps. Using pictures to provide action-step information increased the number of errors committed by his subjects. Using words to provide context or focus increased the time required for comprehension. His results suggest that procedural instruction is best conveyed with a combination of both pictorial and verbal information. Schorr (1984) also compared instructional treatments utilizing text, pictures, and text and pictures together, and found that instructions with explicit textual instructions led to greater accuracy, while instructions that used pictures led to greater speed. In his study, the explicitness of the instruction seemed the most important key to differences, regardless of the presentation mode. Consensus among researchers is that visuals can be more effective than written language when used to facilitate certain types of instruction, including performing procedures (Williams, 1993).

Procedural Learning
Learning can be categorized in three domains: cognitive, affective, and psychomotor (Bloom, Hastings, & Madaus, 1971). Psychomotor outcomes involve manipulative skills and abilities, including handling and manipulating the materials of scientific investigation, following instructions, and making accurate observations (Kempa, 1985). Procedural learning and learning to perform procedures are common where computer use is required in order to complete a learning task or a technology-related project. Whether students are learning to use technology-related tools and techniques in a supportive classroom environment or whether they are expected to learn them on their own, significant amounts of time in a lab setting are generally necessary. Students often are frustrated by the procedural manuals intended to help them learn to use software tools. Many such manuals have relied heavily upon text to convey instructions, although graphics are being included more frequently and more extensively, even to the point recently of using visuals as the primary communication channel, as in the case of the Master Visually book series (for example, Cable & Harris, 1999). Dechsri, Jones, and Heikkinen (1997) found that procedures with visual information-processing characteristics helped learners achieve higher scores with regard to achievement and psychomotor skills.

Advance Organizer and Functional Feedback
Visuals used to support learning can be used as advance organizers (Ausubel, 1960). Particularly where a learner has no working schema of a topic or task, a visual can provide a preview and a context. Studies (Dean & Enemoh, 1983; Dean & Kulhavy, 1981) have indicated that students with no prior knowledge of treatment content, who were shown a map-like drawing in advance of difficult-to-understand text materials, were able to comprehend and remember text as well as those with significant prior knowledge of the topic. In most visual versus text studies, pictures are used as stand-in vocabulary or syntactical ideas. For example, a "thumbs up" graphic was used in the Booher (1975) study to signify the verification step of the procedure, such as "Check Power Lamp A Lights." Stand-in graphics often create a new problem because what the developer intends for a graphic to mean may not be the same as what a reader interprets from the same graphic. For computer-related tasks, sample screen shots can be used to display both an outcome and the steps along the way. When used for comparison during a procedural task, screen shots can provide a form of feedback that is immediate and concise, and that also provides a means of supporting and requiring interaction and engagement from learners. Williams (1993) discusses how visuals can be used as both advance organizer and a means of verification. Visuals, then, not only provide us with needed contextual information at the beginning of a task but also provide us with confirmatory information, information that allows us incrementally to ensure that we have performed the procedures properly: "Does my screen now look like the one in the picture?" (p. 46).
With screen-captured images, a user does not need to read the visual in terms of interpreting meaning, as is the case from stand-in syntax such as a "thumbs up" visual. Rather, a user simply needs to look to see whether his or her screen matches the screen provided as an example. Visually verifying a match between the example and the user's own screen is more efficient than would be equivalent textual instruction, such as "Go to 'Start.' From 'Programs,' select 'Authorware 5.2,' then 'Authorware 5.2 Web Packager.'" For this kind of procedural task, textual instruction involves abstraction by the developer and subsequent reinterpretation by the reader, thus requiring a different kind of interpretation than a more immediate visual verification method.
In the present study, the graphics used were screen shots that represented exactly what a learner would see on the actual screen during an online procedure. The task involved carrying out a procedure via a software program and an Internet connection. No stand-in visual vocabulary was needed. Rather, screen images were used to show each of the steps involved, with the entire procedure broken down and presented via 18 images that were adapted from screen shots.
The purpose of the study was to determine whether an online instructional procedure guide that used visuals only (modified screen shots) would be as effective as one that used the same visuals with added text. This study is linked to previous research in that it compares the effectiveness of procedural instruction in two modes, textual and visual. Previous research has considered text-based instruction as the primary mode, with text and visuals together as a second option. Much less research has been conducted that regards visuals only as the primary mode, with visuals and added text as the second option. The present study uses this approach and compares the effectiveness of visuals only with visuals and added text. This study also updates previous studies by using an online environment as the medium of delivery.
Hypothesis 1: Subjects receiving treatment one, with procedural steps presented via visuals only, will complete the task as successfully as those who receive treatment two, the procedural steps via visuals with added text.
Hypothesis 2: Subjects receiving treatment one, with procedural steps presented via visuals only, will complete the procedural task as fast or faster than will subjects receiving treatment two.
Hypothesis 3: Subjects receiving treatment one, with procedural steps presented via visuals only, will respond more favorably to the follow-up questionnaire items than will subjects receiving treatment two.

Subjects
Subjects were 36 students enrolled in an undergraduate instructional multimedia design department at a small, public university in the southwestern United States during the spring 2000 semester. All participants had previous experience developing and publishing a homepage and participation was voluntary. Subjects were randomly assigned to one of two treatment groups. Immediately following the treatment, subjects completed a belief questionnaire and responded to three open-ended questions regarding the treatment. Two-thirds (24) of the subjects were male. Sixteen were aged 17-25, 10 were aged 26-35, and 10 were aged 36 or above.

Treatments
An online procedural module was developed using JavaScript to guide subjects through the task of "shocking" a simple Authorware project to a Web server. The module was designed as a window-style scaffold to run on the desktop along with the actual Authorware application for use by students saving Authorware programs as Shockwave files to run via the Internet. The treatments were developed to be visual guides that served as both instruction and feedback. The first version of the module, which was visual only, with no additional text prompts, served as Treatment 1 (Figure 1). A second version of the module, containing supplemental text prompts, served as Treatment 2 (Figure 2). Both treatments included a tracking function that produced a time log (Figure 3) indicating the length of time the learner had remained on each of the procedural steps. The time logs were collected by the researchers at the end of the procedure.
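The kind of per-step tracking described above can be illustrated with a minimal sketch. The treatments themselves were written in JavaScript and their source is not reproduced in this article, so the Python below is only an assumed illustration: the class name `StepTimeLog`, its methods, and the injectable clock are hypothetical and are not the authors' implementation.

```python
import time


class StepTimeLog:
    """Hypothetical sketch: record how long a learner stays on each step."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock           # injectable so the log can be tested
        self._current_step = None     # step currently on screen, if any
        self._entered_at = None       # clock reading when that step opened
        self.seconds_per_step = {}    # step number -> accumulated seconds

    def enter_step(self, step):
        """Close out the step being viewed (if any) and start timing `step`."""
        now = self._clock()
        if self._current_step is not None:
            elapsed = now - self._entered_at
            previous = self.seconds_per_step.get(self._current_step, 0.0)
            self.seconds_per_step[self._current_step] = previous + elapsed
        self._current_step = step
        self._entered_at = now

    def finish(self):
        """Close out the final step and return a copy of the completed log."""
        self.enter_step(None)
        return dict(self.seconds_per_step)
```

In this sketch, revisiting a step adds to that step's accumulated time rather than overwriting it, which is one plausible way to handle learners who go back to an earlier screen; whether the original time log did so is not stated.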

Instruments
Data collection instruments included 14 Likert-style questions regarding subjects' attitudes toward the task and treatments, and three open-ended questions regarding the procedure. Data regarding the amount of time spent on each step of the procedure were collected via a time log function developed as part of the treatments. This log was not visible or apparent during the treatment but was available upon completion of the procedure.

Procedures and Data Analysis
Subjects were randomly assigned to one of the two treatments when they entered the computer lab. Their task was to publish an already constructed simple Authorware program to the subjects' homepage on the university's server.
A very brief orientation explained to subjects that their task was to publish the Authorware program using only the online procedure as a guide. Subjects were invited to take as much time as they needed to complete the task and did not receive any additional guidance from researchers during the process. Time logs were collected for each subject after they completed the task (i.e., when the researchers could view the published project via the subjects' homepage). Following the treatment, subjects were asked to fill in a questionnaire regarding their experience.
The procedural task was considered completed when the participant had successfully published the shocked Authorware file to their server space. A research assistant verified task completion following each subject's participation. Data from the time log were converted from minutes and seconds to seconds prior to analysis. Data for both the time on task and the Likert-style response items were analyzed using one-way analysis of variance (ANOVA). Significance was set at the .05 level. SPSS was used to analyze the data.
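The two analysis steps described above, converting logged times to seconds and computing a one-way ANOVA, can be sketched as follows. The study used SPSS; this Python version is only an illustration of the standard calculation, and the function names `to_seconds` and `one_way_anova` as well as the sample times are hypothetical, not the study's data.

```python
def to_seconds(minutes, seconds):
    """Convert a time-log entry in minutes and seconds to total seconds."""
    return minutes * 60 + seconds


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - mean) ** 2
        for group, mean in zip(groups, means)
        for x in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical completion times, for illustration only (not the study data).
visuals_only = [to_seconds(2, 0), to_seconds(6, 30), to_seconds(12, 39)]
visuals_text = [to_seconds(4, 11), to_seconds(9, 23), to_seconds(18, 29)]
f_stat, df_b, df_w = one_way_anova([visuals_only, visuals_text])
```

The sketch stops at the F statistic and its degrees of freedom. Obtaining the p value requires the F distribution; in Python, `scipy.stats.f_oneway` returns both the statistic and the p value.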

RESULTS
The study considered three hypotheses. The first predicted that there would be no difference between treatment groups regarding successful completion of the treatment task. The second predicted that any difference by treatment in the time spent on the task would favor treatment one, and the third predicted that subjects receiving the visuals-only version would respond more favorably to the items on the follow-up questionnaire.

Task Completion
Subjects were considered to have completed the task successfully when the Authorware project provided to them had been "shocked" and published to their personal Web sites. It was hypothesized that subjects in both groups would do equally well in successfully completing the task. This hypothesis was supported; all participants succeeded in completing the task.

Time
At the completion of the task, a time log was collected for each subject. Time required to complete the task ranged from 2 minutes, 0 seconds, to 12 minutes, 39 seconds for Treatment 1 (visual only), and from 4 minutes, 11 seconds to 18 minutes, 29 seconds for Treatment 2 (visuals with text). Mean times were 6 minutes, 30 seconds for Treatment 1 and 9 minutes, 23 seconds for Treatment 2. Analysis of variance indicated significant difference between the two treatments, F(1, 34) = 6.77, p = .016 (Table 1).
There also was significant difference in the time required for completion by age (Table 2), with younger subjects requiring less time to complete the task than older subjects. The mean for those aged 17-25 was 6.31; for those 26-35, the mean was 8.19; and for those 36 and older, the mean was 10.32. Further analysis revealed the significance to be between the youngest and the oldest learners, F(1, 24) = 9.89, p = .004.
There also was significant difference in time by sex (Table 3). The mean completion time for males was 7.1 and for females, 9.6. Within treatment groups, the difference disappeared in Treatment 2 (Table 4), while in Treatment 1 significance was at the .01 level (Table 5).

Attitude
All subjects completed a 17-item belief and attitude questionnaire. Responses for the first 14 items were collected via a 4-point bipolar scale with 1 being strongly disagree and 4 being strongly agree. With the exception of item 10 ("I would rather develop software using traditional book text procedure"), there were no significant differences between treatment groups for the attitude items. In order to achieve a total value for the attitude items, item 10 was reverse scored (Table 6). Although both treatment groups indicated general disagreement with item 10, with mean ratings of 1.64 for Treatment 1 (visuals only), and 2.39 for Treatment 2 (visuals and added text), the difference was significant (Table 7). Mean responses by treatment group to individual attitude items are presented in Table 8.
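Reverse scoring of the kind applied to item 10 can be illustrated with a short sketch. On a 4-point scale running from 1 to 4, a reversed rating is 5 minus the original rating, so that strong disagreement with a negatively worded item counts the same as strong agreement with a positively worded one. The function names and the example responses below are hypothetical and are not taken from the study's data.

```python
SCALE_MIN, SCALE_MAX = 1, 4   # 4-point scale described in the text
REVERSED_ITEMS = {10}         # item 10 is the negatively worded item


def reverse_score(rating, low=SCALE_MIN, high=SCALE_MAX):
    """Mirror a rating around the midpoint of the scale (1 <-> 4, 2 <-> 3)."""
    if not low <= rating <= high:
        raise ValueError(f"rating {rating} is outside the {low}-{high} scale")
    return low + high - rating


def total_attitude(responses, reversed_items=REVERSED_ITEMS):
    """Sum ratings keyed by item number, reversing negatively worded items."""
    return sum(
        reverse_score(rating) if item in reversed_items else rating
        for item, rating in responses.items()
    )


# Hypothetical subject: agrees with item 3, strongly disagrees with item 10.
example_total = total_attitude({3: 3, 10: 1})
```

With the hypothetical responses shown, the rating of 1 on item 10 contributes 4 to the total, so the two items sum to 7.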
There was no significant difference by sex on any of the questionnaire items. By age, there was a significant difference on two of the questionnaire items (item 3, "I like this kind of procedure more than books"; and item 4, "I believe teachers need to use this kind of procedure in their teaching"), with the highest agreement rating from the youngest subjects. For item 3, means were: 3.81, 3.2, and 3.2 for the age ranges of 17-25, 26-35, and 36 and older, respectively. For item 4: 3.7, 3.4, and 3.2.
Responses to the three open-ended questions were similar across treatment groups. Sample, representative responses to each are presented in Tables 9, 10, and 11. For Question 15, subjects generally reported liking the way the procedure included all the necessary steps in as concise a manner as possible. For questions 16, "What could be done to improve the procedure?" and 17, "Any other comments or suggestions?" responses were similar across treatments, although for both items there were slightly more "nothing" responses from participants in Treatment 2.

DISCUSSION
This study considered the effectiveness of visuals-only procedural guides versus visuals plus added text. Much of the previous research had considered the comparison primarily from the perspective of visuals as support rather than primary channel, and most of those that did implement visuals as primary channel used pictures to represent words or phrases. This study used visuals in the form of screen shots to show subjects exactly what their own screens should look like as they progressed through the procedural task, without the need to create a visual, stand-in vocabulary. As no stand-in visual syntax was used in the treatments, generalization is limited to realistic representation visual procedures, such as those with visuals based on actual screen shots.
Regarding task completion, there was no difference in the successful task completion rate between treatment groups. All subjects successfully shocked the Authorware file and published it to their personal Web sites. This is not surprising, given that both treatments were clearly developed and the procedure was broken down into small steps. One of the authors had taught Authorware for several years and had found the process of "shocking" Authorware program files for use on the Web to be a difficult and intimidating task for many students, generally requiring up to two class sessions for students to understand and complete. The speed with which students were able to complete this task is all the more surprising in that subjects were able to do it alone with the online procedure as a scaffold, without discussion or assistance from an instructor or peers.
Time and efficiency were important factors in this study. Subjects who received the visual-only treatment were able to complete the task as well as subjects who received the treatment with both visuals and added text, although it took the visuals with added text group significantly longer. This result is interesting in light of Booher's (1975) study in which subjects who were provided only with print instructions were able to perform the required procedures as accurately as were those subjects provided with pictures and text; it simply took them longer. In both cases, the visuals group was faster than the text group: in Booher's case, text alone; in the current case, text in addition to the visuals. This can be explained to some degree by dual coding theory. Where both channels are sufficiently well developed, reading and understanding a message via two channels could be expected to take longer than reading and understanding the message from a single channel. Although no new information is being offered, time is required for that to be determined by the reader. Other researchers (Pettersson, 1999) have indicated that visuals and text used together may be most effective when the two channels are redundant (as opposed to contradictory). Redundancy is a safeguard against insufficient information by a single channel. If there are weaknesses in either channel, the other can be used to compensate. Stone and Glock's (1981) study used eye-movement data and they noted that, although both text and accompanying visual were carefully constructed to provide completely redundant information, readers frequently scanned back and forth between text and visuals.

Table 9. Sample Responses to Item 15: What did you like best about this kind of procedure?
Treatment    Response
1            "The small window size, the way items were highlighted that were important."
1            "The instructions did not require the learner to decipher textual intent."
1            "I like it because it saves time."
1            "I prefer visual procedures because it makes learning quicker."
2            "It was easy because it took out the unnecessary words."
2            "That it visually takes you through each step of the procedure. It is easy to use and understand."
2            "The use of visual steps helps in the learning process."
2            "It was hands on. All the instruction was there. I had access to it to go back if I missed a step."

In the current study, the visuals were the dominant channel and, because the visual channel was so easy to understand, the second channel, text, did not add any missing information, but only required extra time for verification. This may not have been the case if the visuals were not so dominant nor the meaning so transparent. Transparency was accomplished because the visuals were developed from actual screen shots. Had the task utilized stand-in or syntactical visuals, rather than actual representations, there would have been a greater need for redundancy.

Sample open-ended responses (table caption and treatment labels not recoverable from the source): "I hope to see more of this kind of learning."; "Impressive."; "This is a great method."; [field left blank]; [field left blank]; "Very good"; "Good job, but some people also learn by person-to-person interaction."

Regarding attitude, both treatment groups responded favorably to the procedures. Both groups generally agreed that the procedure was easy, short and to the point, and that the visual aspect of the procedure was positive. There was significant difference by treatment only to Item 10, "I would rather develop software using traditional book text procedures," with the visuals-only group disagreeing significantly more strongly. Both treatment groups preferred the online visual procedures to traditional book procedures, but the visuals-only group was even less inclined to prefer traditional text procedures. Treatment 1, the visuals-only group, had no textual instruction during the procedure they had just completed. Treatment 2, the visuals-with-added-text group, did have simple textual statements to reinforce the visual instruction. While group 2 instruction was primarily visual, textual statements were still included in the procedures. Although the text directives were redundant, subjects may have perceived the treatment as less radically different from traditional text-based procedures than was perceived by subjects in the visuals-only group.
Regardless of treatment, there was significant difference by age in the amount of time required to successfully complete the task. Differences in time by age were not surprising given the fact that some level of computer use prior to college is much more common for people in the youngest age group than in the oldest. It is likely that comfort with the technology in general and confidence in their abilities to work with it contributed to the greater speed of the youngest group of subjects. Females took longer on average to complete the task than did males. This may be related to general self-efficacy and frequency of use differences by sex reported elsewhere, although the current study does not provide additional direct evidence.
For the purposes of this study, the researchers intended the treatments to be used as a procedural prompt for use whenever students have the need to shock files. The process of successfully shocking files for the Web is fairly complex. A strength of this study is its use of the shocking process as the basis for treatment development. Despite the intimidating nature of the task, the procedures were effective and efficient, with the result that subjects completed the task within an astonishingly short time. This was true for both treatments, although the time required for the visuals-only treatment was significantly less. Results suggest that other tasks, whether simple or fairly complex, also could be performed fairly rapidly with scaffolds of this type. It is important to note that the researchers committed a substantial amount of time to treatment development to ensure that all steps in the procedure were addressed and that the information needed was visually accessible. As is often the case in instructional contexts, an inverse relationship exists between the time and effort put forth in the development of instruction and the time and effort required by end users.
This study attempted to recapture some neglected focus on visuals for learning (Fischman, 2001), with specific attention to the use of procedural guides. Such guides are very common in educational and training settings. Students and even professional developers might not need to use the "shocking" procedure that often. Even if the procedure is learned well, some students might forget a step or two if the procedure is not used for a long period of time. This is similar to situations very common in corporate settings, where employees often rely heavily on "job aids" for infrequently performed procedures. Nonetheless, future studies could include delayed follow-up performance assessments to explore whether differences exist with regard to subjects' abilities to perform the procedure without the scaffold. Particularly in business and corporate training settings, the speed and accuracy with which an individual can complete a procedural task can be an important key to the success of both the individual and the company. This study supports findings from previous studies (Booher, 1975; Schorr, 1984; Stone & Glock, 1981) that the use of visuals enhances speed. In this study, enhanced screen-shot procedural guides were used that do not require the same kind of interpretation as stand-in syntax visuals, and there was no difference in accuracy. Future research should explore the issue of accuracy with more targeted treatments that allow for a greater range of success and to see whether reduced accuracy related to visuals is limited to studies in which the visuals used are syntactical stand-in vocabulary. Follow-up studies also should incorporate self-efficacy measures to see whether correlations exist between age, sex, comfort with the technology, and time on task. Related studies could include procedural guides that include combinations of screen shots and animated elements. Adding auditory narration at the beginning of each screen and evaluating the impact also could be worthwhile.