How Computerized Tutors Are Learning to Teach Humans

Neil Heffernan was listening to his fiancée, Cristina Lindquist, tutor one of her students in mathematics when he had an idea. Heffernan was a graduate student in computer science, and by this point — the summer of 1997 — he had been working for two years with researchers at Carnegie Mellon University on developing computer software to help students improve their skills. But he had come to believe that the programs did little to assist their users. They were built on elaborate theories of the student mind — attempts to simulate the learning brain. Then it dawned on him: what was missing from the programs was the interventions teachers made to promote and accelerate learning. Why not model a computer program on a human tutor like Lindquist?

Over the next few months, Heffernan videotaped Lindquist, who taught math to middle-school students, as she tutored, transcribing the sessions word for word, hoping to isolate what made her a successful teacher. A look at the transcripts suggests the difficulties he faced. Lindquist’s tutoring sessions were highly interactive: a single hour might contain more than 400 lines of dialogue. She asked lots of questions and probed her student’s answers. She came up with examples based on the student’s own experiences. She began sentences, and her student completed them. Their dialogue was anything but formulaic.

Lindquist: Do you know how to calculate average driving speed?

Student: I think so, but I forget.

Lindquist: Well, average speed — as your mom drove you here, did she drive the same speed the whole time?

Student: No.

Lindquist: But she did have an average speed. How do you think you calculate the average speed?

Student: It would be hours divided by 55 miles.

Lindquist: Which way is it? It’s miles per hour. So which way do you divide?

Student: It would be 55 miles divided by hours.

As the session continued, Lindquist gestured, pointed, made eye contact, modulated her voice. “Cruising!” she exclaimed, after the student answered three questions in a row correctly. “Did you see how I had to stop and think?” she inquired, modeling how to solve a problem. “I can see you’re getting tired,” she commented sympathetically near the end of the session. How could a computer program ever approximate this?

In a 1984 paper that is regarded as a classic of educational psychology, Benjamin Bloom, a professor at the University of Chicago, showed that being tutored is the most effective way to learn, vastly superior to being taught in a classroom. The experiments headed by Bloom randomly assigned fourth-, fifth- and eighth-grade students to classes of about 30 pupils per teacher, or to one-on-one tutoring. Children tutored individually performed two standard deviations better than children who received conventional classroom instruction — a huge difference.
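To put the two-standard-deviation figure in concrete terms, one can ask what share of classroom students a typical tutored student would outscore. The sketch below assumes test scores are roughly normally distributed, which is an assumption made here for illustration and not something stated above; the function name is likewise hypothetical.

```python
import math

def normal_cdf(z: float) -> float:
    """Fraction of a standard normal distribution that falls below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Effect size reported above: tutored students performed
# two standard deviations better than classroom students.
EFFECT_SIZE_SD = 2.0

# Share of classroom students outscored by the average tutored student,
# ASSUMING normally distributed scores (an illustrative assumption).
share_outperformed = normal_cdf(EFFECT_SIZE_SD)
print(f"{share_outperformed:.1%}")  # prints 97.7%
```

Under that assumption, the average tutored student would land near the 98th percentile of a conventional class.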

Affluent American parents have since come to see the disparity Bloom identified as a golden opportunity, and tutoring has ballooned into a $5 billion industry. Among middle- and high-school students enrolled in New York City’s elite schools, tutoring is a common practice, and the most sought-after tutors can charge as much as $400 an hour.

But what of the pupils who could most benefit from tutoring — poor, urban, minority? Bloom had hoped that traditional teaching could eventually be made as effective as tutoring. But Heffernan was doubtful. He knew firsthand what it was like to grapple with the challenges of the classroom. After graduating from Amherst College, he joined Teach for America and was placed in an inner-city middle school in Baltimore. Some of his classes had as many as 40 students, all of them performing well below grade level. Discipline was a constant problem. Heffernan claims he set a school record for the number of students sent to the principal’s office. “I could barely control the class, let alone help each student,” Heffernan told me. “I wasn’t ever going to make a dent in this country’s educational problems by teaching just a few classes of students at a time.”

Heffernan left teaching, hoping that some marriage of education and technology might help “level the playing field in American education.” He decided that the only way to close the persistent “achievement gap” between white and minority, high- and low-income students was to offer universal tutoring — to give each student access to his or her own Cristina Lindquist. While hiring a human tutor for every child would be prohibitively expensive, the right computer program could make this possible.

So Heffernan forged ahead, cataloging more than two dozen “moves” Lindquist made to help her students learn (“remind the student of steps they have already completed,” “encourage the student to generalize,” “challenge a correct answer if the tutor suspects guessing”). He incorporated many of these tactics into a computerized tutor — called “Ms. Lindquist” — which became the basis of his doctoral dissertation. When he was hired as an assistant professor at Worcester Polytechnic Institute in Massachusetts, Heffernan continued to work on the program, joined in his efforts by Lindquist, now his wife, who also works at W.P.I. Together they improved the tutor, which they renamed ASSISTments (it assists students while generating an assessment of their progress). Seventeen years after Heffernan first set up his video camera, the computerized tutor he designed has been used by more than 100,000 students, in schools all over the country. “I look at this as just a start,” he told me. But, he added confidently, “we are closing the gap with human tutors.”

Grafton Middle School, a public school in a prosperous town a few miles outside Worcester, has been using ASSISTments since 2010. Last spring, I visited the home of Tyler Rogers, a tall boy with reddish-blond hair who was just finishing seventh grade at Grafton and who used the program to do his math homework. (While ASSISTments has made a few limited forays into tutoring other subjects, it is almost entirely dedicated to teaching math.) His teachers described him as “conscientious” and “mature,” but he had struggled in his pre-algebra class that year. “Sometime last fall, it started to get really hard,” he said as he opened his laptop.

Tyler breezed through the first part of his homework, but 10 questions in he hit a rough patch. “Write the equation in function form: 3x-y=5,” read the problem on the screen. Tyler worked the problem out in pencil first and then typed “5-3x” into the box. The response was instantaneous: “Sorry, wrong answer.” Tyler’s shoulders slumped. He tried again, his pencil scratching the paper. Another answer — “5/3x” — yielded another error message, but a third try, with “3x-5,” worked better. “Correct!” the computer proclaimed.

ASSISTments incorporates many of the findings made by researchers who, spurred by the 1984 Bloom study, set out to discover what tutors do that is so helpful to student learning. First and foremost, they concluded, tutors provide immediate feedback: they let students know whether what they’re doing is right or wrong. Such responsiveness keeps students on track, preventing them from wandering down “garden paths” of unproductive reasoning.

The second important service tutors provide, researchers discovered, is guiding students’ efforts, offering nudges in the right direction. ASSISTments provides this, too, in the form of a “hint” button. Tyler chose not to use it that evening, but if he had, he would have been given a series of clues to the right answer, “scaffolded” to support his own problem-solving efforts. For the answer “5-3x,” the computer responded: “You need to take a closer look at your signs. Notice there is a minus in front of the ‘y.’ ”
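The two services described so far, immediate right-or-wrong feedback and a hint tailored to a particular mistake, can be sketched in a few lines. This is only an illustration built from the messages quoted above; the table, the function names and the fallback hint are hypothetical and are not drawn from ASSISTments itself.

```python
from typing import Optional

# Hypothetical data for the problem "Write the equation in function form: 3x-y=5".
CORRECT_ANSWER = "3x-5"
TARGETED_HINTS = {
    # Wrong answer -> hint aimed at the mistake behind it (text quoted in the article).
    "5-3x": "You need to take a closer look at your signs. "
            "Notice there is a minus in front of the 'y.'",
}

def check(answer: str) -> str:
    """Immediate feedback: tell the student at once whether the answer is right."""
    return "Correct!" if answer == CORRECT_ANSWER else "Sorry, wrong answer."

def hint_for(answer: str) -> Optional[str]:
    """Return a hint written for this specific wrong answer, if one exists."""
    return TARGETED_HINTS.get(answer)

print(check("5/3x"))
print(hint_for("5-3x"))
```

A wrong answer with no entry in the table, such as "5/3x", gets only the generic message, which is one way to picture the limited repertoire of responses such a program has.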

Tyler’s father, Chris Rogers, who manages complex networks of computers for a living, is pleased that his son’s homework employs technology. “Everyone works with computers these days,” he told me later. “Tyler might as well get used to using them now.” But his mother, Andrea, is more skeptical. Andrea is studying for a master’s in education and plans to become an elementary-school teacher. She is not opposed to the use of educational technology, but she objects to the flat affect of ASSISTments. In contrast to a human tutor, who has a nearly infinite number of potential responses to a student’s difficulties, the program is equipped with only a few. If a solution to a problem is typed incorrectly — say, with an extra space — the computer stubbornly returns the “Sorry, incorrect answer” message, though a human would recognize the answer as right. “In the beginning, when Tyler was first learning to use ASSISTments, there was a lot of frustration,” Andrea says. “I would sit there with him for hours, helping him. A computer can’t tell when you’re confused or frustrated or bored.”
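The extra-space problem Andrea describes is the kind of thing a program can be taught to forgive. The following is a minimal sketch of one possible approach, assuming answers are compared as text; it is not a description of how ASSISTments checks answers, and both function names are invented for the example.

```python
def normalize(answer: str) -> str:
    """Remove all whitespace and lowercase, so '3x - 5' and '3x-5' compare equal."""
    return "".join(answer.split()).lower()

def is_correct(submitted: str, expected: str) -> bool:
    """Compare a student's answer with the expected one, ignoring spacing and case."""
    return normalize(submitted) == normalize(expected)

print(is_correct("3x - 5", "3x-5"))  # True: the stray spaces no longer matter
print(is_correct("5-3x", "3x-5"))    # False: the signs are still wrong
```

Text cleanup of this kind handles typing slips only. It would still reject an answer written in a different but mathematically equivalent form, which a human tutor would accept without a second thought.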

Heffernan, as it happens, is working on that. Dealing with emotion — helping students regulate their feelings, quelling frustration and rousing flagging morale — is the third important function that human tutors fulfill. So Heffernan, along with several researchers at W.P.I. and other institutions, is working on an emotion-sensitive tutor: a computer program that can recognize and respond to students’ moods. One of his collaborators on the project is Sidney D’Mello, an assistant professor of psychology and computer science at the University of Notre Dame.

“The first thing we had to do is identify which emotions are important in tutoring, and we found that there are three that really matter: boredom, frustration and confusion,” D’Mello said. “Then we had to figure out how to accurately measure those feelings without interrupting the tutoring process.” His research has relied on two methods of collecting such data: applying facial-expression recognition software to spot a furrowed brow or an expression of slack disengagement; and using a special chair with posture sensors to tell whether students are leaning forward with interest or lolling back in boredom. Once the student’s feelings are identified, the thinking goes, the computerized tutor could adjust accordingly — giving the bored student more challenging questions or reviewing fundamentals with the student who is confused.

Of course, as D’Mello puts it, “we can’t install a $20,000 butt-sensor chair in every school in America.” So D’Mello, along with Heffernan, is working on a less elaborate, less expensive alternative: judging whether a student is bored, confused or frustrated based only on the pattern of his or her responses to questions. Heffernan and a collaborator at Columbia’s Teachers College, Ryan Baker, an expert in educational data mining, determined that students enter their answers in characteristic ways: a student who is bored, for example, may go for long stretches without answering any problems (he might be talking to a fellow student, or daydreaming) and then will answer a flurry of questions all at once, getting most or all correct. A student who is confused, by contrast, will spend a lot of time on each question, resort to the hint button frequently and get many of the questions wrong.
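The response patterns described above can be turned into a rough rule of thumb. The sketch below is purely illustrative: the researchers' actual models are not described here, and every threshold, field name and label in the code is an invented assumption.

```python
from dataclasses import dataclass
from statistics import median

@dataclass
class Response:
    seconds_since_previous: float  # time elapsed since the previous answer
    hints_used: int                # hint-button presses on this question
    correct: bool

def guess_state(responses: list[Response]) -> str:
    """Label a session 'bored', 'confused' or 'unclear' using made-up thresholds."""
    if not responses:
        return "unclear"
    gaps = [r.seconds_since_previous for r in responses]
    accuracy = sum(r.correct for r in responses) / len(responses)
    avg_hints = sum(r.hints_used for r in responses) / len(responses)

    # Bored: a long idle stretch, then a flurry of quick, mostly correct answers.
    if max(gaps) >= 300 and median(gaps) <= 30 and accuracy >= 0.8:
        return "bored"
    # Confused: slow on each question, frequent hints, many wrong answers.
    if median(gaps) >= 120 and avg_hints >= 1 and accuracy <= 0.5:
        return "confused"
    return "unclear"

bored = [Response(600, 0, True)] + [Response(10, 0, True) for _ in range(5)]
confused = [Response(150, 2, False), Response(180, 1, False), Response(200, 3, True)]
print(guess_state(bored), guess_state(confused))  # bored confused
```

Hand-written rules like these would misread plenty of students, which is why measuring how far a detector's accuracy rises above chance matters.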

“Right now we’re able to accurately identify students’ emotions from their response patterns at a rate about 30 percent better than chance,” Baker says. “That’s about where the video cameras and posture sensors were a few years ago, and we’re optimistic that we can get close to their current accuracy rates of about 70 percent better than chance.” Human judges of emotion, he notes, reach agreement on what other people are feeling about 80 percent of the time.

Heffernan is also experimenting with ways that computers can inject emotion into the tutoring exchange — by flashing messages of encouragement, for example, or by calling up motivational videos recorded by the students’ teachers. The aim, he says, is to endow his computerized tutor “with the qualities of humans that help other humans learn.”

But is humanizing computers really the best way to supply students with effective tutors? Some researchers, like Ken Koedinger, a professor of human-computer interaction and psychology at Carnegie Mellon University, take a different view from Heffernan’s: computerized tutors shouldn’t try to emulate humans, because computers may well be the superior teachers. Koedinger has been working on computerized tutors for almost three decades, using them not only to help students learn but also to collect data about how the learning process works. Every keystroke a student makes — every hesitation, every hint requested, every wrong answer — can be analyzed for clues to how the mind learns. A program Koedinger helped design, Cognitive Tutor, is currently used by more than 600,000 students in 3,000 school districts around the country, generating a vast supply of data for researchers to mine. (The program is owned by a company called Carnegie Learning, which was sold to the Apollo Group last year for $75 million; Apollo also owns the for-profit University of Phoenix.)

Koedinger is convinced that learning is so unfathomably complex that we need the data generated by computers to fully understand it. “We think we know how to teach because humans have been doing it forever,” he says, “but in fact we’re just beginning to understand how complicated it is to do it well.”

As an example, Koedinger points to the spacing effect. Decades of research have demonstrated that people learn more effectively when their encounters with information are spread out over time, rather than massed into one marathon study session. Some teachers have incorporated this finding into their classrooms — going over previously covered material at regular intervals, for instance. But optimizing the spacing effect is a far more intricate task than providing the occasional review, Koedinger says: “To maximize retention of material, it’s best to start out by exposing the student to the information at short intervals, gradually lengthening the amount of time between encounters.” Different types of information — abstract concepts versus concrete facts, for example — require different schedules of exposure. The spacing timetable should also be adjusted to each individual’s shifting level of mastery. “There’s no way a classroom teacher can keep track of all this for every kid,” Koedinger says. But a computer, with its vast stores of memory and automated record-keeping, can. Koedinger and his colleagues have identified hundreds of subtle facets of learning, all of which can be managed and implemented by sophisticated software.
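The schedule Koedinger describes, short intervals at first and longer ones later, can be sketched as a simple expanding timetable. The starting gap, the growth factor and the function name below are assumptions chosen for illustration; they are not values from his research.

```python
def expanding_schedule(first_gap_days: float, growth: float, reviews: int) -> list[float]:
    """Return review days (counted from first exposure), each gap `growth` times the last."""
    days = []
    gap = first_gap_days
    elapsed = 0.0
    for _ in range(reviews):
        elapsed += gap       # wait out the current gap, then review
        days.append(elapsed)
        gap *= growth        # lengthen the wait before the next review
    return days

# Hypothetical settings: first review after 1 day, each gap twice the previous one.
print(expanding_schedule(1, 2, 5))  # [1.0, 3.0, 7.0, 15.0, 31.0]
```

Tailoring the timetable to the type of material, or to a student's shifting level of mastery, would amount to choosing different values of `first_gap_days` and `growth` for each student and topic, which is the kind of bookkeeping software can do and a classroom teacher cannot.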

Yet some educators maintain that however complex the data analysis and targeted the program, computerized tutoring is no match for a good teacher. It’s not clear, for instance, that Koedinger’s program yields better outcomes for students. A review conducted by the Department of Education in 2010 concluded that the product had “no discernible effects” on students’ test scores, while costing far more than a conventional textbook, leading critics to charge that Carnegie Learning is taking advantage of teachers and administrators dazzled by the promise of educational technology. Koedinger counters that “many other studies, mostly positive,” have affirmed the value of the Carnegie Learning program. “I’m confident that the program helps students learn better than paper-and-pencil homework assignments.”

Heffernan isn’t susceptible to the criticism that he is profiting from school districts, because he gives ASSISTments away free. And so far, the small number of preliminary, peer-reviewed studies he has conducted on his program support its value: one randomized controlled trial found that the use of the computerized tutor improved students’ performance in math by the equivalent of a full letter grade over the performance of pupils who used paper and pencil to do their homework.

But Heffernan does face one serious hurdle: any student who wishes to use ASSISTments needs a computer and Internet access. More than 20 percent of U.S. households are not equipped with a computer; about 30 percent have no broadband connection. Heffernan originally hoped to try ASSISTments out in Worcester’s mostly urban school district, but he had to scale back the program when he found that few students were consistently able to use a computer at home. So ASSISTments has mainly been adopted by affluent suburban schools like Grafton Middle School and Bellingham Memorial Middle School in Massachusetts — populated, Heffernan said ruefully, by students who already have the advantages of high-functioning schools and educated, involved parents. But, he told me brightly, he recently received a grant from the Department of Education to supply ASSISTments to almost 10,000 public-school students in Maine — a largely poor, largely rural state in which all seventh and eighth grade students are supplied with a laptop, thanks to a state initiative. Heffernan hopes that by raising the Maine students’ test scores with ASSISTments, he will inspire more officials in states around the country to see the virtue of making tutoring universal.

The morning after I watched Tyler Rogers do his homework, I sat in on his math class at Grafton Middle School. As he and his classmates filed into the classroom, I talked with his teacher, Kim Thienpont, who has taught middle school for 10 years. “As teachers, we get all this training in ‘differentiated instruction’ — adapting our teaching to the needs of each student,” she said. “But in a class of 20 students, with a certain amount of material we have to cover each day, how am I really going to do that?”

ASSISTments, Thienpont told me, made this possible, echoing what I heard from another area math teacher, Barbara Delaney, the day before. Delaney teaches sixth-grade math in nearby Bellingham. Each time her students use the computerized tutor to do their homework, the program collects data on how well they’re doing: which problems they got wrong, how many times they used the hint button. The information is automatically collated into a report, which is available to Delaney on her own computer before the next morning’s class. (Reports on individual students can be accessed by their parents.) “With ASSISTments, I know none of my students are falling through the cracks,” Delaney told me.

After completing a few warm-up problems on their school’s iPod Touches, the students turned to the front of the room, where Thienpont projected a spreadsheet of the previous night’s homework. Like stock traders going over the day’s returns, the students scanned the data, comparing their own grades with the class average and picking out the problems that gave their classmates trouble. (“If you got a question wrong, but a lot of other people got it wrong, too, you don’t feel so bad,” Tyler explained.)

Thienpont began by going over “common wrong answers” — incorrect solutions that many students arrived at by following predictable but mistaken lines of reasoning. Or perhaps, not so predictable. “Sometimes I’m flabbergasted by the thing all the students get wrong,” Thienpont said. “It’s often a mistake I never would have expected.” Human teachers and tutors are susceptible to what cognitive scientists call the “expert blind spot” — once we’ve mastered a body of knowledge, it’s hard to imagine what novices don’t know — but computers have no such mental block. Highlighting “common wrong answers” allows Thienpont to address shared misconceptions without putting any one student on the spot.
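Finding the “common wrong answers” in a night's homework is, at bottom, a counting exercise. Here is a minimal sketch, assuming each student's answer to a problem is stored as text; the sample submissions and the function name are made up, and nothing here reflects how ASSISTments builds its reports.

```python
from collections import Counter

def common_wrong_answers(submissions: list[str], correct: str, top: int = 3):
    """Tally the incorrect answers and return the most frequent ones with their counts."""
    wrong = Counter(answer for answer in submissions if answer != correct)
    return wrong.most_common(top)

# Hypothetical class submissions for "Write the equation in function form: 3x-y=5".
submissions = ["3x-5", "5-3x", "5-3x", "5/3x", "3x-5", "5-3x"]
print(common_wrong_answers(submissions, correct="3x-5"))
# [('5-3x', 3), ('5/3x', 1)]
```

Because the tally names answers and not students, a teacher can discuss the most frequent mistake with the whole class without singling anyone out.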

I saw another unexpected effect of computerized tutoring in Delaney’s Bellingham classroom. After explaining how to solve a problem that many got wrong on the previous night’s homework, Delaney asked her students to come up with a hint for the next year’s class. Students called out suggested clues, and after a few tries, they arrived at a concise tip. “Congratulations!” she said. “You’ve just helped next year’s sixth graders learn math.” When Delaney’s future pupils press the hint button in ASSISTments, the former students’ advice will appear.

Unlike the proprietary software sold by Carnegie Learning, or by education-technology giants like Pearson, ASSISTments was designed to be modified by teachers and students, in a process Heffernan likens to the crowd-sourcing that created Wikipedia. His latest inspiration is to add a button to each page of ASSISTments that will allow students to access a Web page where they can get more information about, say, a relevant math concept. Heffernan and his W.P.I. colleagues are now developing a system of vetting and ranking the thousands of math-related sites on the Internet.

For all his ambition, Heffernan acknowledges that this technology has limits. He has a motto: “Let computers do what computers are good at, and people do what people are good at.” Computers excel in following a precise plan of instruction. A computer never gets impatient or annoyed. But it never gets excited or enthusiastic either. Nor can a computer guide a student through an open-ended exploration of literature or history. It’s no accident that ASSISTments and other computerized tutoring systems have focused primarily on math, a subject suited to computers’ binary language. While a computer can emulate, and in some ways exceed, the abilities of a human teacher, it will not replace her. Rather, it’s the emerging hybrid of human and computer instruction — not either one alone — that may well transform education.

Near the end of my visit to Worcester, I told Heffernan about a scene I witnessed in Barbara Delaney’s class. She had divided her sixth graders into what she called “flexible groups” — groupings of students by ability that shift daily depending on the data collected in her ASSISTments report. She walked over to the group that struggled the most with the previous night’s homework and talked quietly with one girl who looked on the brink of tears. Delaney pointed to the girl’s notebook, then to the ASSISTments spreadsheet projected on a “smart” board at the front of the room. She touched the girl’s shoulder; the student lifted her face to her teacher and managed a crooked smile.

When I finished recounting the incident, Heffernan sat back in his chair. “That’s not anything we put into the tutoring system — that’s something Barbara brings to her students,” he remarked. “I wish we could put that in a box.”