
Evaluating Student Learning Gains in Two Versions of AutoTutor

Natalie K. Person 1, Laura Bautista 2, Arthur C. Graesser 2, Eric Mathews 1, and The Tutoring Research Group 2

1 Department of Psychology, Rhodes College, 2000 North Parkway, Memphis, TN 38112
2 Department of Psychology, The University of Memphis, Memphis, TN 38152-6400

Abstract: The pedagogical effectiveness of two versions of AutoTutor was assessed in a student learning outcome study. Sixty students enrolled in a computer literacy course received tutoring from one of the versions of AutoTutor on one of the following topics: Hardware, Operating Systems, and the Internet. Students were also required to reread material on two of the previously mentioned topics. All participants then received a comprehension test on all three topics. A within-subjects design enabled the following conditions to be compared: AutoTutor versus a reread condition versus a control condition. Results indicated that AutoTutor was an effective pedagogical tool compared to the other learning controls. Both versions of AutoTutor provided an effect size increment of approximately .5 standard deviation units when compared to the reread and control conditions.

1 Background<br />

AutoTutor is an animated pedagogical agent that participates in a conversation with the learner while simulating the dialog moves that are frequently used by typical human tutors. AutoTutor is currently designed to help college students learn about topics that are typically covered in an introductory computer literacy course (e.g., hardware, operating systems, and the Internet). Elaborated descriptions of AutoTutor's architecture have been discussed in previous publications and therefore will only receive brief mention in this paper [8, 11, 15, 16, 20, 25, 28, 34, 35].

AutoTutor's discourse patterns and pedagogical strategies are based on a previous project that dissected 100 hours of naturalistic tutoring sessions [12, 13, 31]. Instead of merely being an information delivery system that bombards the student with a large volume of information, AutoTutor serves as a discourse facilitator or collaborative scaffold that assists the student in actively constructing knowledge. Hence, a central educational philosophy behind AutoTutor is that effective learning occurs when students actively do the following: (1) construct subjective explanations and elaborations of the material, (2) ask and answer questions, and (3) solve problems that require deep reasoning [2, 4, 7, 23].

We currently have two versions of AutoTutor. AutoTutor 1.1 simulates the dialog moves of normal, untrained human tutors, whereas AutoTutor 2.0 simulates dialog moves that are motivated by more sophisticated, ideal tutoring strategies. Our analyses of human tutoring sessions revealed that typical human tutors do not use most of the ideal tutoring strategies that have been identified in education and the intelligent tutoring system enterprise. These strategies include the Socratic method [5], modeling-scaffolding-fading [6], reciprocal teaching [29], anchored situated learning [1], error identification and correction [1, 23, 38], building on prerequisites [10], and sophisticated motivational techniques [22]. Detailed discourse analyses have been performed on small samples of accomplished tutors in an attempt to identify sophisticated tutoring strategies [9, 17, 26, 27, 36]. However, we discovered that the vast majority of these sophisticated tutoring strategies were virtually nonexistent in the untrained tutoring sessions that we videotaped and analyzed. Tutors clearly need to be trained in how to use sophisticated tutoring skills, because such skills do not routinely emerge in naturalistic tutoring with untrained tutors. In this paper we report a study that assessed the impact of two versions of AutoTutor on student learning gains. We begin with a brief overview of AutoTutor and then report the results of the empirical study.

2 Brief Overview of AutoTutor

AutoTutor works by having a conversation with the learner. AutoTutor appears as an animated agent that acts as a dialog partner with the learner. The animated agent delivers AutoTutor's dialog moves with synthesized speech, intonation, facial expressions, and gestures. The major question or problem that is being worked on is both spoken by AutoTutor and printed at the top of the screen. The major questions/problems are generated systematically from a curriculum script, a module discussed below. AutoTutor's major questions and problems are not the fill-in-the-blank, true/false, or multiple-choice questions that are so popular in the US educational system. Instead, the questions and problems invite lengthy explanations and deep reasoning (e.g., answers to why, how, and what-if questions). The goal is to encourage students to articulate lengthier answers that exhibit deep reasoning, rather than to recite small bits of shallow knowledge. There is a continuous multi-turn tutorial dialog between AutoTutor and the learner during the course of answering a question (or solving a problem). When considering the turns of both the learner and AutoTutor, it typically takes 10 to 20 conversational turns to answer a single question or solve a problem from the curriculum script. The learner types her/his contributions on the keyboard during the exchange. For some topics, there are graphical displays and animations with components that AutoTutor refers to. The ultimate goal is to have AutoTutor be a good conversational partner that comprehends, speaks, points, and displays emotions, all in a coordinated fashion.

3 AutoTutor's Architecture

3.1 Curriculum Script

A curriculum script is a loosely ordered set of skills, concepts, example problems, and question-answer units. Each topic in the curriculum script is represented as a structured set of words, sentences, or paragraphs in a free text format. Associated with each topic (problem or question) is a focal question, a set of basic noun-like concepts, a set of ideal good answer aspects (each being roughly a sentence of 10-20 words), different forms of expressing or eliciting each ideal answer aspect (i.e., a hint, a prompt, or an assertion), a set of anticipated bad answers (i.e., bugs, misconceptions), a correction for each bad answer, a summary of the answer or solution, and a set of anticipated topic-related questions and answers.
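To make this structure concrete, a single curriculum script topic might be represented as sketched below. The Python field names and the example content are our own illustration, not AutoTutor's actual data format; the hint and prompt strings are borrowed from examples that appear later in this paper.

```python
from dataclasses import dataclass

@dataclass
class AnswerAspect:
    text: str        # ideal good answer aspect (roughly a 10-20 word sentence)
    hint: str        # least specific way of eliciting the aspect
    prompt: str      # more specific elicitation
    assertion: str   # most specific: the tutor states the aspect outright

@dataclass
class Topic:
    focal_question: str
    key_concepts: list[str]          # basic noun-like concepts
    good_aspects: list[AnswerAspect]
    bad_answers: dict[str, str]      # anticipated misconception -> correction
    summary: str
    anticipated_qas: dict[str, str]  # anticipated topic-related question -> answer

# Made-up hardware topic; the hint and prompt strings are the examples used later
# in this paper, everything else is invented for illustration.
hardware_topic = Topic(
    focal_question="Why do large programs need both RAM and a hard disk?",
    key_concepts=["RAM", "ROM", "hard disk", "virtual memory"],
    good_aspects=[AnswerAspect(
        text="RAM holds the instructions and data that the CPU is actively using.",
        hint="What about the size of the programs you need to run?",
        prompt="The primary memories of the CPU are ROM and _____",
        assertion="RAM is the working memory a program uses while it runs.")],
    bad_answers={"RAM permanently stores files.": "RAM is volatile; files persist on disk."},
    summary="Large programs are swapped between RAM and the hard disk as needed.",
    anticipated_qas={"What does RAM stand for?": "Random access memory."},
)
```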


3.2 Natural Language Extraction and Speech Act Classification

AutoTutor must be able to classify the speech acts of student contributions in order to flexibly respond to what the student types in. First, AutoTutor segments the string of words and punctuation marks within a learner's turn into speech act units, relying on punctuation to perform this segmentation. Then each speech act is assigned to one of the following speech act categories: Assertion, WH-question, YES/NO question, Metacognitive comment (e.g., "I don't understand"), Metacommunicative act (e.g., "Could you repeat that?"), and Short Response.
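As a rough illustration (not AutoTutor's actual classifier), a punctuation- and keyword-based pass over a learner turn might look as follows; the category names come from this section, but the segmentation and classification rules are hypothetical.

```python
import re

def segment(turn: str) -> list[str]:
    # Split a learner turn into speech act units at sentence-final punctuation.
    return [s.strip() for s in re.split(r"(?<=[.?!])\s+", turn.strip()) if s.strip()]

def classify(speech_act: str) -> str:
    # Hypothetical keyword rules standing in for AutoTutor's real classifier.
    s = speech_act.lower()
    if any(p in s for p in ("repeat that", "say that again", "what did you say")):
        return "Metacommunicative"
    if any(p in s for p in ("don't understand", "i'm lost", "not sure")):
        return "Metacognitive"
    if re.match(r"^(who|what|when|where|why|how|which)\b", s):
        return "WH-question"
    if s.endswith("?"):
        return "YES/NO question"
    if len(s.split()) <= 3:
        return "Short Response"
    return "Assertion"

turn = "I don't understand. Could you repeat that?"
print([(unit, classify(unit)) for unit in segment(turn)])
# -> Metacognitive for the first unit, Metacommunicative for the second
```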

3.3 Latent Semantic Analysis

The fact that world knowledge is inextricably bound to the process of comprehending language and discourse is widely acknowledged, but researchers in computational linguistics and artificial intelligence have not had a satisfactory approach to handling the deep abyss of world knowledge. Recently, latent semantic analysis (LSA) has been proposed as a statistical representation of a large body of world knowledge [20, 21]. LSA capitalizes on the fact that particular words appear in particular texts (called "documents"); a large word-by-document co-occurrence matrix is compressed, via singular value decomposition, into K dimensions. Each word, sentence, or text ends up being a weighted vector on the K dimensions. The "match" (i.e., similarity in meaning, conceptual relatedness) between two words, sentences, or texts is computed as a geometric cosine (or dot product) between the two vectors, with values ranging from 0 to 1. AutoTutor has successfully used LSA as the backbone for assessing the quality of student assertions, based on matches to good answers and anticipated bad answers in the curriculum script [15].
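For example, the match between the LSA vector for a student assertion and the vector for a good answer aspect is a cosine of the following form; the sketch below illustrates the metric with made-up low-dimensional vectors rather than AutoTutor's actual LSA space.

```python
import math

def cosine(u, v):
    # Cosine between two LSA vectors; values near 1 indicate high conceptual relatedness.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Made-up 4-dimensional vectors standing in for K-dimensional LSA vectors.
student_assertion = [0.40, 0.10, 0.05, 0.30]
good_answer_aspect = [0.35, 0.12, 0.00, 0.28]
print(round(cosine(student_assertion, good_answer_aspect), 3))  # close to 1.0 (high match)
```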

3.4 Dialog Move Generator

AutoTutor currently generates the following dialog moves: main questions, short feedback (i.e., positive, neutral, negative), pumps ("uh huh", "tell me more"), prompts ("The primary memories of the CPU are ROM and _____"), hints, assertions, corrections, and summaries. As mentioned earlier, we currently have two versions of AutoTutor, AutoTutor 1.1 and AutoTutor 2.0. AutoTutor 1.1 simulates the dialog moves of untrained (yet effective) human tutors, whereas AutoTutor 2.0 is a hybrid between naturalistic tutorial dialog and ideal pedagogical strategies. The two versions primarily differ in terms of the mechanisms that control the particular dialog moves that are generated after a student contribution. The dialog move mechanisms for both AutoTutor versions are discussed later.

3.5 Dialog Advancer Network

The Dialog Advancer Network (DAN) is a mechanism that manages the conversation that occurs between a student and AutoTutor [30, 32, 33, 34]. The DAN comprises a set of customized pathways that are tailored to particular student speech act categories (e.g., Assertion, Metacognitive comment). The DAN enables AutoTutor to micro-adapt each tutor-generated dialog move to the preceding student turn. For example, if a student wants AutoTutor to repeat the last dialog move, the DAN contains a Metacommunicative pathway that allows AutoTutor to adapt to the student's request and respond appropriately. A DAN pathway may include one or a combination of the following components: (1) discourse markers (e.g., "Okay" or "Moving on"), (2) AutoTutor dialog moves (e.g., Positive Feedback, Pump, or Assertion), (3) answers to WH- or Yes/No questions, or (4) canned expressions (e.g., "That's a good question, but I can't answer that right now").
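As a rough illustration of how a pathway can map a student speech act category to a response pattern (our own sketch; the published DAN's states and transitions are described in [30, 32, 33, 34]):

```python
# Hypothetical dispatch table keyed by student speech act category; the response
# building blocks mirror the four pathway components listed above.
DAN_PATHWAYS = {
    "Assertion":         ["short feedback", "next dialog move from the move generator"],
    "Metacommunicative": ["repeat the previous dialog move"],
    "Metacognitive":     ["discourse marker: 'Okay'", "dialog move: Hint"],
    "WH-question":       ["answer from the curriculum script, or the canned expression "
                          "'That's a good question, but I can't answer that right now'"],
    "Short Response":    ["dialog move: Pump"],
}

def advance(speech_act_category: str) -> list[str]:
    # Fall back to a generic Pump if a category has no dedicated pathway.
    return DAN_PATHWAYS.get(speech_act_category, ["dialog move: Pump"])

print(advance("Metacommunicative"))  # -> ['repeat the previous dialog move']
```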

3.6 Animated Agent

The persona for AutoTutor was created in MetaCreations Poser 3 and is controlled by Microsoft Agent. AutoTutor is a three-dimensional embodied agent who remains on the screen throughout the entire tutoring session. AutoTutor communicates with the learner via synthesized speech, facial expressions, and simple hand gestures. Each of these communication parameters can be adjusted to maximize AutoTutor's overall effectiveness as a tutor and conversational partner. Although a great deal more could be said about the workings of the animated agent, these mechanisms have been described elsewhere [25, 35] and are simply beyond the scope of this paper.

4 Two Versions of AutoTutor

4.1 AutoTutor 1.1

The dialog moves in AutoTutor 1.1 are generated by 15 fuzzy production rules [19] that primarily exploit data provided by the LSA module [15, 34]. AutoTutor 1.1's production rules are tuned to the following LSA parameters: (a) Student Assertion Quality, (b) Student Ability Level, and (c) Topic Coverage. Each production rule specifies the LSA parameter values for which a particular dialog move should be generated. For example, consider the following dialog move rules:

(1) IF [Student Assertion match with good answer text = HIGH or VERY HIGH] THEN [select POSITIVE FEEDBACK dialog move]

(2) IF [Student Ability = MEDIUM or HIGH & Student Assertion match with good answer text = LOW] THEN [select HINT dialog move]

In Rule (1) AutoTutor will provide Positive Feedback (e.g., "Right") in response to a high quality student Assertion, whereas in Rule (2) AutoTutor will generate a Hint to bring the relatively high ability student back on track (e.g., "What about the size of the programs you need to run?"). The dialog move generator currently controls 12 dialog moves: Pump, Hint, Splice, Prompt, Prompt Response, Elaboration, Summary, and five forms of immediate short feedback (positive, positive-neutral, neutral, negative-neutral, and negative).
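A crisp (non-fuzzy) approximation of such rules is sketched below; the numeric thresholds and the simple if/else chain are our stand-ins for the 15 fuzzy production rules, not the actual rule set.

```python
# Hypothetical, simplified stand-in for AutoTutor 1.1's fuzzy production rules.
# The three inputs are the LSA parameters named above, scaled here to [0, 1];
# the thresholds are made up.
def select_dialog_move(assertion_quality: float,
                       student_ability: float,
                       topic_coverage: float) -> str:
    if assertion_quality >= 0.7:                               # HIGH or VERY HIGH match
        return "POSITIVE_FEEDBACK"                             # cf. Rule (1)
    if assertion_quality <= 0.3 and student_ability >= 0.5:    # cf. Rule (2)
        return "HINT"
    if assertion_quality <= 0.3 and student_ability < 0.5:
        return "PROMPT"          # more specific help for a lower-ability student
    if topic_coverage >= 0.8:
        return "SUMMARY"         # most aspects already covered
    return "PUMP"                # default: keep the student talking

print(select_dialog_move(assertion_quality=0.2, student_ability=0.8, topic_coverage=0.4))  # HINT
```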

During the tutorial conversation for each tutoring topic, AutoTutor must keep track of which good answer aspects have been covered along with which dialog moves have been previously generated. AutoTutor 1.1 uses the LSA Topic Coverage metric to track the extent to which each good answer aspect (Ai) for a topic has been covered in the tutorial conversation. That is, LSA computes the extent to which the various tutor and student turns cover the good answer aspects associated with a particular topic. The Topic Coverage metric varies from 0 to 1 and gets updated for each good answer aspect with each tutor and student turn. If some threshold (t) is met or exceeded, then the Ai is considered covered. AutoTutor also must decide which good answer aspect to cover next. In AutoTutor 1.1, the selection of the next good answer aspect to cover is determined by the zone of proximal development. AutoTutor 1.1 decides on the next aspect to cover by selecting the aspect that has the highest subthreshold coverage score. Therefore, AutoTutor 1.1 builds on the fringes of what the student knows or what has occurred in the discourse history of the tutorial conversation. A topic is finished when all of the aspects have coverage values that meet or exceed the threshold t.
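A minimal sketch of this selection policy, assuming coverage scores in [0, 1] and a single fixed threshold t (the threshold value below is made up):

```python
# Minimal sketch of AutoTutor 1.1's coverage-based aspect selection.
THRESHOLD_T = 0.75   # made-up value for the coverage threshold t

def next_aspect(coverage):
    """Pick the uncovered aspect with the highest subthreshold coverage score,
    i.e., build on the fringe of what the student already knows."""
    uncovered = {aspect: score for aspect, score in coverage.items() if score < THRESHOLD_T}
    if not uncovered:
        return None          # every aspect meets the threshold: the topic is finished
    return max(uncovered, key=uncovered.get)

coverage_scores = {"A1": 0.80, "A2": 0.55, "A3": 0.10}   # made-up LSA coverage values
print(next_aspect(coverage_scores))   # -> 'A2', the closest-to-threshold uncovered aspect
```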

4.2 AutoTutor 2.0

We believe that the most effective computer tutor will be a hybrid between naturalistic tutorial dialog and ideal pedagogical strategies. AutoTutor 2.0 incorporates tutoring tactics that attempt to get the student to articulate the good answer aspect that is selected. AutoTutor 1.1 considers Ai as covered if it is articulated by either the student or the tutor, whereas AutoTutor 2.0 counts only what the student says when evaluating coverage. Therefore, if Ai is not articulated by the student, it is not considered covered. This forces the student to articulate the explanations in their entirety, an extreme form of constructivism. In order to flesh out a particular Ai, AutoTutor 2.0 uses discourse patterns that organize dialog moves in terms of their progressive specificity. Hints are less specific than Prompts, and Prompts are less specific than Elaborations. Thus, AutoTutor 2.0 cycles through a Hint-Prompt-Elaboration pattern until the student articulates the Ai. The other dialog moves (e.g., short feedback and summaries) are controlled by the fuzzy production rules that were described for AutoTutor 1.1.

AutoTutor 2.0 has two additional features for selecting the next Ai to be covered. First, AutoTutor 2.0 enhances discourse coherence by selecting the next Ai that is most similar to the previous aspect that was covered. Second, AutoTutor 2.0 selects pivotal aspects that have a high family resemblance to the remaining uncovered aspects; that is, AutoTutor 2.0 attempts to select an aspect that has the greatest content overlap with the remaining aspects to be covered. Whereas AutoTutor 1.1 capitalizes on the zone of proximal development exclusively, AutoTutor 2.0 also considers conversational coherence and pivotal aspects when selecting the next good answer aspect to cover.
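Under the assumption that relatedness is measured with the same LSA cosine described in Section 3.3, the two selection heuristics might be combined as sketched below; the equal weighting and the toy word-overlap stand-in for the cosine are our own choices.

```python
# Sketch of AutoTutor 2.0's aspect selection: prefer an uncovered aspect that
# (a) coheres with the previously covered aspect and (b) has high family
# resemblance (content overlap) with the remaining uncovered aspects.
def word_overlap(a: str, b: str) -> float:
    """Toy relatedness measure (stand-in for an LSA cosine): shared-word proportion."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def next_aspect_2_0(uncovered, previous):
    def score(aspect):
        coherence = word_overlap(aspect, previous)
        others = [a for a in uncovered if a != aspect]
        family = sum(word_overlap(aspect, a) for a in others) / len(others) if others else 0.0
        return 0.5 * coherence + 0.5 * family      # made-up equal weighting
    return max(uncovered, key=score)

uncovered_aspects = [
    "RAM is volatile primary memory used while programs run",
    "The hard disk is non-volatile secondary storage",
    "Virtual memory swaps pages between RAM and the hard disk",
]
print(next_aspect_2_0(uncovered_aspects, previous="RAM holds active program data"))
# -> the virtual memory aspect, which overlaps most with the previous and remaining aspects
```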

5 Evaluation of Student Learning Outcomes

5.1 Methods

The methodologies for testing the two versions of AutoTutor (i.e., versions 1.1 and 2.0) were identical. The participants were 60 students in a computer literacy course at the University of Memphis. Thirty-six students participated in the AutoTutor 1.1 testing and 24 in the AutoTutor 2.0 testing. The students received extra credit in the computer literacy course for participating in the experiment. There were three experimental conditions: AutoTutor (the student interacted with AutoTutor to learn about one of the three computer literacy topics, Hardware, Operating Systems, or the Internet), Reread (the student reread material in the course textbook about one of the three topics), and no-read Control (the student neither reread material nor interacted with AutoTutor for one of the three topics). It should be noted that the students were rereading material that they had previously covered in the computer literacy course, not learning it for the first time. That is, students had received lectures on the material, had been assigned relevant chapters to read, and had been tested on the topics by the course instructor. A repeated-measures design ensured that all students participated in each of the three conditions. The assignment of the three conditions to the three computer literacy topics was counterbalanced across subjects to control for possible order effects. All conditions occurred sequentially with minimal time elapsing between conditions. The time spent rereading the material and interacting with AutoTutor was restricted. For AutoTutor 1.1, students were given 45 minutes to reread the material and 45 minutes to interact with AutoTutor. These times were extended to 55 minutes for the AutoTutor 2.0 sessions because AutoTutor 2.0 interactions are (by design) longer.
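With three conditions and three topics, the counterbalancing described above can be realized by rotating condition-to-topic assignments across participants; the sketch below is our own illustration of such a rotation, not the authors' actual assignment procedure.

```python
from itertools import permutations

# Hypothetical counterbalancing scheme: rotate through all six condition-to-topic
# assignments so that each topic appears equally often in each condition.
topics = ["Hardware", "Operating Systems", "Internet"]
conditions = ["AutoTutor", "Reread", "Control"]

assignments = [dict(zip(topics, perm)) for perm in permutations(conditions)]
for participant_id in range(6):
    print(participant_id, assignments[participant_id % len(assignments)])
```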

5.2 Outcome Measures

There were three outcome measures. We selected a sample of 18 multiple-choice questions from the test bank that accompanies the textbook used in the computer literacy course. An equal number of questions was selected for each of the three topics. We discovered that all of the test-bank questions were shallow according to Bloom's taxonomy. A computer literacy expert therefore constructed a sample of 12 deep multiple-choice questions, four for each of the three topics, that tapped causal inferences and reasoning. Finally, there was a cloze test that had 4 critical words deleted from the ideal answers of each topic; the students filled in a total of 72 blanks with answers. The three measures were combined into a composite score for each student. The proportion of correct responses on the composite served as the metric of student learning gains. Students were given unlimited time to complete the tests.
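Assuming the composite for a given topic (and hence condition) is simply the proportion of that topic's items answered correctly, with an even three-way split of the 18, 12, and 72 items described above, the scoring reduces to the following; this is our reading, not a procedure stated explicitly in the paper.

```python
# Hypothetical composite scoring for one topic/condition, assuming the composite is
# the proportion correct over that topic's items: 6 shallow multiple-choice questions,
# 4 deep multiple-choice questions, and 24 cloze blanks.
def composite_proportion(shallow_correct: int, deep_correct: int, cloze_correct: int) -> float:
    total_items = 6 + 4 + 24
    return (shallow_correct + deep_correct + cloze_correct) / total_items

print(round(composite_proportion(shallow_correct=4, deep_correct=2, cloze_correct=9), 2))  # 0.44
```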

5.3 Results

A 2 (AutoTutor Version) x 3 (Experimental Condition) repeated-measures ANOVA was performed to determine whether the composite score means differed in the various conditions. The results of this analysis indicated that there were significant differences among the three experimental conditions, with means of .43, .37, and .35 in the AutoTutor, Reread, and Control conditions, respectively, F(2, 70) = 6.10, p < .05. Planned comparisons showed the following pattern: AutoTutor > Reread = Control. The effect size of AutoTutor over Control was .50 standard deviations. This is encouraging given that students spent the same amount of time in the AutoTutor (50.6 minutes) and Reread (49 minutes) conditions. Surprisingly, there was no main effect for AutoTutor Version or any significant interactions.
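For reference, the reported effect size presumably follows the conventional standardized mean difference, using the AutoTutor and Control composite means reported above; the paper does not state which standard deviation (control or pooled) was used, so the formula below is our assumption.

```latex
% Standardized mean difference (assumed formula; only the resulting value .50 is reported).
d = \frac{M_{\text{AutoTutor}} - M_{\text{Control}}}{SD} = \frac{.43 - .35}{SD} \approx .50
```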

A repeated-measures ANOVA was performed that crossed the three conditions with the three types of tests (Shallow, Deep, and Cloze). There was a significant main effect of condition, F(2, 70) = 48.03, p < .05, MSe = .038, a significant main effect of test, F(2, 70) = 3.06, p < .05, MSe = .037, and no significant interaction. The effect size advantages of AutoTutor over Control were .15 for the shallow test questions, .28 for the deep questions, and .64 for the cloze test.

6 Conclusions

The results support the conclusion that AutoTutor has a significant impact on student learning gains compared to the other learning and control conditions. We are encouraged by these findings for two reasons. First, AutoTutor is (to our knowledge) the first animated conversational computer tutor to produce such learning outcomes in students. Second, students and educators alike should be pleased that sessions with AutoTutor do not require time commitments beyond those that students would normally make studying the material.

We anticipated that the more sophisticated strategies of AutoTutor 2.0 would lead to more positive learning outcomes than the rule-based generation in AutoTutor 1.1; however, as reported above, the two versions produced equivalent learning gains. One possible reason for this non-difference between the AutoTutor versions is that AutoTutor 2.0 sessions were approximately twice as long as the version 1.1 sessions, 160.58 turns versus 88.35 turns, respectively. We reported above that there was no difference in the amount of time students spent in the AutoTutor versus the Reread condition; however, there were significant differences in the average amounts of time students spent interacting with AutoTutor 1.1 versus 2.0. On average, students spent 38.4 minutes interacting with AutoTutor 1.1 and 69.0 minutes with the 2.0 version. Hence, it may be the case that AutoTutor 2.0 is a better overall tutor, but that students experienced fatigue in the considerably lengthier sessions, possibly masking the effects of the 2.0 version.

References

[1] Anderson, J. R., Corbett, A. T., Koedinger, K. R., & Pelletier, R. (1995). Cognitive tutors: Lessons learned. The Journal of the Learning Sciences, 4, 167-207.
[2] Bransford, J. D., Goldman, S. R., & Vye, N. J. (1991). Making a difference in people's ability to think: Reflections on a decade of work and some hopes for the future. In R. J. Sternberg & L. Okagaki (Eds.), Influences on children (pp. 147-180). Hillsdale, NJ: Erlbaum.
[3] Cassell, J., & Thorisson, K. R. (1999). The power of a nod and a glance: Envelope vs. emotional feedback in animated conversational agents. Applied Artificial Intelligence, 13, 519-538.
[4] Chi, M. T. H., de Leeuw, N., Chiu, M., & LaVancher, C. (1994). Eliciting self-explanations improves understanding. Cognitive Science, 18, 439-477.
[5] Collins, A. (1985). Teaching reasoning skills. In S. F. Chipman, J. W. Segal, & R. Glaser (Eds.), Thinking and learning skills (Vol. 2, pp. 579-586). Hillsdale, NJ: Erlbaum.
[6] Collins, A., Brown, J. S., & Newman, S. E. (1989). Cognitive apprenticeship: Teaching the craft of reading, writing, and mathematics. In L. B. Resnick (Ed.), Knowing, learning, and instruction: Essays in honor of Robert Glaser (pp. 453-494). Hillsdale, NJ: Erlbaum.
[7] Conati, C., & VanLehn, K. (1999). Teaching metacognitive skills: Implementation and evaluation of a tutoring system to guide self-explanation while learning from examples. In S. P. Lajoie & M. Vivet (Eds.), Artificial Intelligence in Education (pp. 297-304). Amsterdam: IOS Press.
[8] Foltz, P. W. (1996). Latent semantic analysis for text-based research. Behavior Research Methods, Instruments, and Computers, 28, 197-202.
[9] Fox, B. (1993). The human tutorial dialog project. Hillsdale, NJ: Erlbaum.
[10] Gagné, R. M. (1977). The conditions of learning (3rd ed.). New York: Holt, Rinehart, & Winston.
[11] Graesser, A. C., Franklin, S., Wiemer-Hastings, P., & the Tutoring Research Group (1998). Simulating smooth tutorial dialog with pedagogical value. Proceedings of the American Association for Artificial Intelligence (pp. 163-167). Menlo Park, CA: AAAI Press.
[12] Graesser, A. C., & Person, N. K. (1994). Question asking during tutoring. American Educational Research Journal, 31, 104-137.
[13] Graesser, A. C., Person, N. K., & Magliano, J. P. (1995). Collaborative dialog patterns in naturalistic one-on-one tutoring. Applied Cognitive Psychology, 9, 359-387.
[14] Graesser, A. C., Wiemer-Hastings, K., Wiemer-Hastings, P., Kreuz, R., & TRG (1999). AutoTutor: A simulation of a human tutor. Journal of Cognitive Systems Research, 1, 35-51.
[15] Graesser, A. C., Wiemer-Hastings, P., Wiemer-Hastings, K., Harter, D., Person, N., & TRG (2000). Using latent semantic analysis to evaluate the contributions of students in AutoTutor. Interactive Learning Environments.
[16] Hu, X., Graesser, A. C., & the Tutoring Research Group (1998). Using WordNet and latent semantic analysis to evaluate the conversational contributions of learners in the tutorial dialog. Proceedings of the International Conference on Computers in Education, Vol. 2 (pp. 337-341). Beijing, China: Springer.
[17] Hume, G. D., Michael, J. A., Rovick, A., & Evens, M. W. (1996). Hinting as a tactic in one-on-one tutoring. The Journal of the Learning Sciences, 5, 23-47.
[18] Johnson, W. L., Rickel, J. W., & Lester, J. C. (in press). Animated pedagogical agents: Face-to-face interaction in interactive learning environments. International Journal of Artificial Intelligence in Education.
[19] Kosko, B. (1992). Neural networks and fuzzy systems. New York: Prentice Hall.
[20] Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review.
[21] Landauer, T. K., Foltz, P. W., & Laham, D. (1998). An introduction to latent semantic analysis. Discourse Processes, 25, 259-284.
[22] Lepper, M. R., Woolverton, M., Mumme, D. L., & Gurtner, J. L. (1991). Motivational techniques of expert human tutors: Lessons for the design of computer-based tutors. In S. P. Lajoie & S. J. Derry (Eds.), Computers as cognitive tools (pp. 75-105). Hillsdale, NJ: Erlbaum.
[23] Lesgold, A., Lajoie, S., Bunzo, M., & Eggan, G. (1992). SHERLOCK: A coached practice environment for an electronics troubleshooting job. In J. H. Larkin & R. W. Chabay (Eds.), Computer-assisted instruction and intelligent tutoring systems (pp. 201-238). Hillsdale, NJ: Erlbaum.
[24] Mayer, R. E., & Moreno, R. (1998). A split attention effect in multimedia learning: Evidence for dual processing systems in working memory. Journal of Educational Psychology, 90, 312-320.
[25] McCauley, L., Gholson, B., Hu, X., Graesser, A. C., & the Tutoring Research Group (1998). Delivering smooth tutorial dialog using a talking head. Proceedings of the Workshop on Embodied Conversation Characters (pp. 31-38). Tahoe City, CA: AAAI and ACM.
[26] Merrill, D. C., Reiser, B. J., Ranney, M., & Trafton, J. G. (1992). Effective tutoring techniques: A comparison of human tutors and intelligent tutoring systems. The Journal of the Learning Sciences, 2, 277-305.
[27] Moore, J. D. (1995). Participating in explanatory dialogues. Cambridge, MA: MIT Press.
[28] Olde, B. A., Hoeffner, J., Chipman, P., Graesser, A. C., & the Tutoring Research Group (1999). A connectionist model for part of speech tagging. Proceedings of the American Association for Artificial Intelligence (pp. 172-176). Menlo Park, CA: AAAI Press.
[29] Palincsar, A. S., & Brown, A. (1984). Reciprocal teaching of comprehension-fostering and comprehension-monitoring activities. Cognition & Instruction, 1, 117-175.
[30] Person, N. K., Bautista, L., Kreuz, R. J., Graesser, A. C., & the Tutoring Research Group (2000). The Dialog Advancer Network: A conversation manager for AutoTutor. ITS 2000 Proceedings of the Workshop on Modeling Human Teaching Tactics and Strategies. Montreal, Canada.
[31] Person, N. K., & Graesser, A. C. (1999). Evolution of discourse in cross-age tutoring. In A. M. O'Donnell & A. King (Eds.), Cognitive perspectives on peer learning (pp. 69-86). Mahwah, NJ: Erlbaum.
[32] Person, N. K., Graesser, A. C., & the Tutoring Research Group (2000). Designing AutoTutor to be an effective conversational partner. Proceedings of the 4th International Conference of the Learning Sciences. Ann Arbor, MI.
[33] Person, N. K., Graesser, A. C., Harter, D., Mathews, E. C., & the Tutoring Research Group (2000). Dialog move generation and conversation management in AutoTutor. Proceedings of the AAAI Fall Symposium Series: Building Dialogue Systems for Tutorial Applications. Falmouth, MA.
[34] Person, N. K., Graesser, A. C., Kreuz, R. J., Pomeroy, V., & the Tutoring Research Group (2000). Simulating human tutor dialog moves in AutoTutor. International Journal of Artificial Intelligence in Education.
[35] Person, N. K., Klettke, B., Link, K., Kreuz, R. J., & the Tutoring Research Group (1999). The integration of affective responses into AutoTutor. Proceedings of the International Workshop on Affect in Interactions (pp. 167-178). Siena, Italy.
[36] Putnam, R. T. (1987). Structuring and adjusting content for students: A study of live and simulated tutoring of addition. American Educational Research Journal, 24, 13-48.
[37] Soller, A., Linton, F., Goodman, B., & Lesgold, A. (1999). Toward intelligent analysis and support of collaborative learning interaction. In S. P. Lajoie & M. Vivet (Eds.), Artificial Intelligence in Education (pp. 75-82). Amsterdam: IOS Press.
[38] VanLehn, K. (1990). Mind bugs: The origins of procedural misconceptions. Cambridge, MA: MIT Press.
