Why Multiple Strategies Is Not A Fluency Building Technique

An excerpt from a forthcoming book by Dr. Amanda VanDerHeyden, Every Student’s Right: The Science Behind Instruction That Works

The instructional hierarchy says that when a new concept or skill is introduced to a student (i.e., the student has not yet acquired the skill), the goal of the instruction is to establish correct skill performance. In behavioral psychology this process is called “discrimination” and means that the student understands the conditions under which a given response is correct and a different response is incorrect. Because the goal of acquisition instruction is discrimination, certain tactics will work instructionally whereas others will be contraindicated (either not helpful or likely to worsen learning).

Again, the goal of the instruction is to ensure the student understands how to correctly respond. The clarity of the task presentation is critical, and teachers must take care to avoid creating misunderstanding through sloppy presentation of the task (Engelmann, 1993) or unsupervised exploratory problem solving, where errors are highly likely to occur and go undetected, which causes student misunderstanding. If these are the critical features of acquisition instruction, then one can readily see how tactics that allow for minimal guidance from an adult and unguided exploration, like those that are so popular in math instruction, will actually undermine or worsen learning when students are in the acquisition stage.

Somehow math education prizes errors as useful. While it is true that in discrimination training it is helpful to demonstrate both correct and incorrect problem solving as a way to help the student understand the conditions under which a given response is correct and a different response is incorrect, this is not the same as prizing the occurrence of errors, especially undetected errors. In the acquisition stage of learning, errorless learning (Touchette & Howard, 1981) is actually the goal, because errors are deadly to discrimination. Again, I will explain this science in greater detail later in this book.

When teachers use tactics like multiple strategies or discovery or a period of productive struggle or flipping the classroom—all enormously popular strategies—during the acquisition stage of learning, they are selecting tactics that are perfectly predicted to worsen, rather than benefit, learning.

Abysmal math performance on state tests is evidence that “train and hope” does not work. Learning is engineered by teachers. It is a complex science, but every teacher can learn to deliver more effective instruction. When teachers understand how to deliver effective instruction, the results are apparent immediately and the teacher’s use of effective instruction becomes powerfully reinforced. (In other words, teachers become highly motivated and delighted by “making the dots go up for students,” as my colleague Scott Ardoin has said). When teachers understand the science of learning, they begin to use the tools in their classroom more effectively because they can rapidly identify the weaknesses in a given program or curriculum and they know how to modify those procedures or materials to produce better learning.

One of the best features of behavioral psychology in education is that the focus is on shaping adaptive behaviors that maintain and become useful to the learner in their everyday life environments. In other words, generalization is the goal of instruction even though that is not where instruction begins. So if generalization is our ultimate goal, we can organize our instruction at each stage of learning with this goal in mind.

Trevor Stokes (1992) famously wrote “There is nothing but discrimination and generalization,” (p. 429). In learning science, truer words have never been spoken.

Discrimination

Discrimination is the student’s ability to understand the conditions under which one response is correct and a different response is incorrect: it is the goal of acquisition instruction. In my opinion, the weakest part of instructional delivery in most classrooms is management of the task presentation, something Engelmann (1993) articulated beautifully.

Technically, discrimination is the occurrence of the trained response in the presence of the discriminative stimulus (S-D or S+) and not in the presence of another stimulus for which reinforcement is not available (S-Delta or S-). For example, a child says or writes “8” when presented with 5 + 3, but not when presented with 5 + 2. When presented with 5 + 3, saying “8” will be reinforced. When presented with 5 + 2, saying “8” will not be reinforced. It is important to understand that you cannot teach a skill without providing instruction in the context of both S-Ds and S-Deltas. When discrimination is the goal, the teacher controls two things: the stimuli (task) and the consequences (affirmation, rewards, corrective feedback).

There are some general findings from basic and applied research over many decades that can inform the actions we can take to predictably bring about rapid discrimination. First, the greater the difference between the S+ and the S-, the faster the discrimination will be made. For example, think about teaching a student to identify the greater quantity with decimal quantities to the 100ths. Presenting 3 and 0.01 will be “easier” than presenting 0.80 and 0.79. Yet, discrimination is not complete until the student can respond correctly in the presence of similar stimuli like 0.80 and 0.79. The take-away for teachers is that we start with “easier” problem types and introduce more similar S-Ds over subsequent trials in instruction if we want to aid discrimination.
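The sequencing idea in the decimal example can be sketched in a few lines of code. This is only an illustration: the two middle pairs are hypothetical, and treating the numeric difference between the two quantities as a stand-in for how easy the pair is to discriminate is an assumption made for the sketch, not a measurement procedure from the research.

```python
from decimal import Decimal

# Candidate comparison pairs. (3, 0.01) and (0.80, 0.79) come from the text;
# the other two pairs are hypothetical fillers for illustration.
pairs = [
    (Decimal("0.80"), Decimal("0.79")),
    (Decimal("0.55"), Decimal("0.45")),
    (Decimal("3"), Decimal("0.01")),
    (Decimal("0.9"), Decimal("0.2")),
]

def order_by_discriminability(stimulus_pairs):
    """Sort pairs from most different (assumed easiest) to most similar."""
    return sorted(stimulus_pairs, key=lambda p: abs(p[0] - p[1]), reverse=True)

sequence = order_by_discriminability(pairs)
for a, b in sequence:
    print(f"Which is greater: {a} or {b}? (difference {abs(a - b)})")
```

The ordering begins with 3 versus 0.01 and ends with 0.80 versus 0.79, mirroring the move from “easier” to more similar stimuli across trials.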

Discrimination is the goal of acquisition instruction. To facilitate discrimination, we must prevent and minimize errors during instruction, provide sufficient stimulus exemplars, and provide sufficient response exemplars.

Errors Interfere with Acquisition

Perhaps the most important feature of the task presentation is its capacity to produce responding without errors. Errors are deadly to discrimination. Contrary to popular opinion, errors during discrimination do not benefit learning. Rather, errors are often undetected during instruction and those errors establish misconceptions that interfere with future skill development. The best way to prevent the deadly effect of errors in discrimination is the careful design of the task itself along with guided instruction so that errors are mostly prevented, immediately identified if they occur, and corrected with adjustments to the task to aid further discrimination.

Errors Reduce the Student’s Rate of Reinforcement

Additionally, errors disrupt the cadence of the instruction and reduce the student’s rate of reinforcement, making the session less engaging and less rewarding for the student. Herrnstein’s Matching Law (1961, 1970) tells us that humans will choose or allocate their responding based on relative rates of reinforcement, choosing or engaging in those behaviors that yield more reinforcing consequences relative to other behaviors. In a brilliant paper, Skinner et al. (1996) explained how problem completion rates may function as conditional reinforcers for assignment completion. In other words, completing the incremental steps of solving each math problem may be reinforcing to the student’s behavior of completing those problems because those incremental problem completion steps lead to a completed assignment for which the student is likely to avoid unpleasant consequences (a low grade, a note home to parents, staying in at recess to complete the assignment) and/or receive rewarding consequences (teacher praise, a high grade, or an earned privilege like game time). Errors reduce the student’s experience of reinforcing outcomes because they reduce correct problem completion.

Further, students will allocate their responding to tasks that they can successfully complete because doing so yields them a higher rate of reinforcement. This law of human behavior predicts that students will typically avoid challenging problem types and resist attempting those problem types. In fact, getting students to naturally engage with new or difficult tasks has been the subject of much research. Errors reduce the student’s experience of success (rate of reinforcement) and this effect will predictably worsen student engagement and problem completion.
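For readers who want the relation in symbols, the matching law for two response alternatives is commonly written as shown below. Here B stands for the rate of responding on each alternative and R for the rate of reinforcement obtained from it. Mapping the two alternatives onto “easier” and “harder” problem types is an illustrative reading, not a quantity measured in this chapter.

```latex
% Two-alternative matching relation:
% B_1, B_2 = rates of responding on alternatives 1 and 2
% R_1, R_2 = rates of reinforcement obtained from alternatives 1 and 2
\[
  \frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2}
\]
```

Read this way, when errors lower the reinforcement a student obtains from one problem type (a smaller R), the relation predicts a smaller share of the student’s responding on that problem type.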

Errors Increase Behaviors that Interfere with Learning

Finally, errors will also cause an elevation of behaviors that interfere with learning, and this has been demonstrated in animal (Terrace, 1966) and human studies (Gickling & Armstrong, 1978; Touchette, 1971). For example, Gickling and Armstrong (1978) found that task difficulty was associated with on-task behavior and work completion. Specifically, when the task was too easy (they called this Independent, which is the same as Mastery) or too difficult (defined as Frustrational), task completion and on-task behavior were worse for all participants than was the case when the task was Instructional. In other words, simply adjusting the task difficulty produced better student engagement and learning success without any other teacher-directed efforts. This effect was expected based on the science: providing students tasks they can perform accurately avoids the maladaptive consequences of errors, namely creating confusion, reducing reinforcement, reducing task engagement, and causing worsened learning. This finding has been replicated in dozens of studies and is the basis for selecting instructional-level tasks for which students are mostly accurately responding as the target of your instruction (Archer & Hughes, 2010; Haring & Eaton, 1978; Greenwood et al., 1991).

One of the most destructive consequences of discovery, inquiry, productive struggle, or other such “minimally guided instructional tactics” (Kirschner et al., 2006) during the acquisition stage of learning is that they promote the occurrence of errors, and errors during acquisition are deadly to learning.

Ways to Reduce Errors During Instruction

You might logically wonder how you can teach a student a new understanding without the student making lots of errors. The primary method of reducing errors during discrimination is to use prompts.

Use prompted responding. A prompt is a stimulus (think cue), usually external or not functionally related to the task, that is used in learning to facilitate student responding. For example, when teaching the phonemes in a word, the teacher may draw a line under the letters in a word when asking students to identify the starting phoneme. When prompts are used during instruction (as they typically are), transferring control of the response from the prompt (drawing the line under the relevant letters) to the task itself (simply the printed word) is an essential goal of the instruction.

Touchette and Howard (1981) explain that there is a trade-off and a point of diminishing returns in prompting. Introduce prompts effectively and you can rapidly establish correct student responding, but use them ineffectively and you will foster dependence on that prompt or even worsen future skill acquisition. When prompting or cuing student responding, the key is to remove the prompt when (not before and not long after) correct responding is established. Leave the prompt in place and you will potentially create or establish prompt dependence. Remove the prompt too soon and you may cause persistent incorrect response patterns that interfere with future skill acquisition.

Touchette and Howard (1981) go on to describe the importance of removing the prompt in a way that avoids the occurrence of errors, stating, “An ideal transition from prompted to unprompted responding will result in few or no errors” (p. 175). They specify that avoiding errors is important because the occurrence of errors can cause behaviors that interfere with learning (e.g., inattention, agitation). Terrace called this “discrimination learning without errors” and found that pigeons could learn to discriminate between two stimuli without making incorrect responses (i.e., responding to the targeted incorrect stimulus). He noted there was no “emotional response” to the incorrect stimulus. Touchette contrasted “trial and error” learning with a process he called “graduated stimulus change” on a simple discrimination task with six students with severe intellectual disabilities. All but one mastered the task (discriminated correctly) given his graduated stimulus change model, which is an early model of prompting in learning research. His description of stimulus presentation as a gradient was especially brilliant, as we can imagine how the changes in task occur so subtly that successful student responding can continue to occur. Touchette (1968) explained that “experimenters initiated training by reinforcing a stimulus-response relation which already existed or which was easily acquired. The controlling stimuli were then gradually changed to approximate more and more closely those appropriate to the discrimination to be taught. By maintaining appropriate stimulus-response relations throughout training, responses based on stimuli not directly related to reinforcement were eliminated” (p. 39), and prompting during instruction with humans was born.

In 1981, Touchette and Howard named this operant process “errorless learning.” Coincidentally, when popular thought leaders promote a “guide on the side” type of learning, they are misrepresenting the sophistication of the type of instruction Touchette and Howard described, which would be the ideal way to guide a student to correct and complete skill mastery. In fact, these procedures are central to models of direct instruction (Carnine & Engelmann, 1991) and explicit instruction (Archer & Hughes, 2010).

Some of Touchette and Howard’s language is technical and certainly inconsistent with contemporary language, but it is notable that scientific evidence dating back to 1966 illustrates the lack of efficacy and side effects of trial and error learning and demonstrates experimentally the benefit of a more controlled task presentation (e.g., stimulus gradients in learning) which frankly requires more adult skill to deliver. When teachers use discovery, inquiry, or productive struggle in acquisition instruction where discrimination is the goal, they are inviting trial and error learning, which is squarely at odds with how humans learn.

As student responding begins to occur without prompting, the rate of correct responding is now unbounded and provides a natural increase in opportunity to experience more reinforcement occasions. In other words, because the student does not have to wait for the prompt to respond, more responding occurs, this responding is likely to be accurate, and the student’s experience of success is denser during the instructional episode which is rewarding to both the student and the teacher.

Thus, transferring the control from prompt to task without errors is an essential goal of human learning, which is profoundly misunderstood and/or underappreciated by educational philosophies that encourage trial and error learning or instruction that begins with generalization tactics rather than acquisition tactics (e.g., productive struggle, discovery, inquiry, flipping the classroom).

Types of Prompting

Prompts during instruction include prompts associated with the task itself (“stimulus prompts”) which can involve movement or gestures (pointing to the greater quantity when teaching students to compare quantities), positional (moving the correct number card forward when teaching a student to solve for a missing number in a sequence), or within-stimulus prompts (making the correct answer larger, animated, or a given color and fading that element very gradually across subsequent learning trials until the correct response occurs with no prompt).

Response prompts include giving a visual cue (the written problem), verbal instructions (telling the student how to solve the problem), modeling correct problem solving, and physically guiding correct responding. When physically guiding the correct response, you can provide full assistance to the student such as placing your hand over the student’s hand to assist the student to write a “2” while saying “write the 2, like this”. You can provide partial assistance, for example, modeling writing a 2 and then assisting the student to place their pencil in the correct starting position. Your guidance can also be graduated, meaning you deliver full physical guidance and fade that to no guidance based on correct responding. Or your guidance can begin at the least intrusive step and you can increase guidance as needed to ensure correct responding.

Prompt scheduling is another dimension to consider. Highly effective tactics for prompting correct student responding include the use of constant time delay prompting (Wolery, 1992) where the teacher presents a task and after a very brief interval (e.g., 2 seconds) cues the correct answer, verbally, by gesturing to the correct response selection, or by writing the correct response. The teacher will continue to present each trial in that way, inserting a 2-second delay between the task presentation and the delivered prompt. The student will naturally begin to respond before the prompt occurs and the response has a very high probability of being correct. As students begin to respond correctly, the prompt delay between the task and the prompt can be further faded and then eliminated altogether. Prompts can also be scheduled in other ways: delivered immediately (no delay), after a constant delay, or after a graduated delay in which the delay between task and prompt is gradually increased as students continue to correctly respond.
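The three prompt schedules can be compared side by side in a short sketch. The function name, the 1-second step for the graduated schedule, and the starting delay of zero are assumptions chosen for illustration. Only the 2-second constant delay comes from the example above.

```python
from typing import List

def prompt_delays(schedule: str, correct: List[bool],
                  constant_s: float = 2.0, step_s: float = 1.0) -> List[float]:
    """Return the seconds between task presentation and prompt on each trial.

    correct[i] records whether the student responded correctly on trial i.
    step_s (the graduated increment) is a hypothetical value.
    """
    if schedule == "immediate":
        return [0.0 for _ in correct]
    if schedule == "constant":
        return [constant_s for _ in correct]
    if schedule == "graduated":
        delays, delay = [], 0.0
        for was_correct in correct:
            delays.append(delay)
            if was_correct:
                # Lengthen the wait only after a correct response.
                delay += step_s
        return delays
    raise ValueError(f"unknown schedule: {schedule}")

# Hypothetical record of four trials: correct, correct, error, correct.
outcomes = [True, True, False, True]
for name in ("immediate", "constant", "graduated"):
    print(name, prompt_delays(name, outcomes))
```

In the graduated case the delay holds steady after the error on the third trial, which reflects the idea that the wait grows only while the student continues to respond correctly.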

Providing Sufficient Stimulus and Response Exemplars

So readers might naturally wonder: if errors do not occur, how do students understand the conditions under which a trained response will be incorrect? The answer to this question is that the teacher must present non-examples during acquisition to establish discrimination. A non-example is a stimulus for which the trained response is not correct. Let’s imagine you are teaching someone to identify a “5” when presented with two numbers, one of which is a 5. First, you must present a second number (the non-example) so the student can learn that not every number will be a 5. So you might present a 5 and a 1. You could say, “point to the 5,” wait 2 seconds and then point to the 5. On the next trial, you would present the 5 next to a 3 and say again, “point to the 5” and wait 2 seconds between saying “point to the 5” and then pointing yourself. (Your pointing is the prompt.) As the student responds correctly, you will continue presenting the 5 with different non-example numbers and eventually remove the prompt altogether. Once that happens, you can teach a new number name.

A non-example can also be a response to a stimulus (e.g., a problem) that is not correct. So when teaching students to add 12 + 19, you might show that 11 ones cannot be written in the final sum in the ones position and demonstrate that one has to make a ten. You could do this by verbally asking students to add 9 + 2, which they have already mastered. When they say “11,” you will say “that’s right, but we cannot record that 11 here” (as you write the 11 underneath the ones column). You are demonstrating a non-example of responding when you do this and you will want to tie it up by modeling the correct response (e.g., regrouping).

Why Teaching Multiple Strategies Does Not Work During Acquisition

This process is not the same as “teaching multiple strategies” during acquisition instruction which is a popular tactic among math educators. The idea of teaching multiple strategies is that when teaching an operation like 2-digit by 2-digit multiplication, the teacher will show multiple methods to arrive at a correct answer in addition to teaching the standard algorithm and students will be required to use and demonstrate independent use of all the strategies to demonstrate their learning. Let’s unpack why this is bad instruction.

First, when someone is learning a new skill or understanding, the first order of business is discrimination. Discrimination requires clarity in the training and this clarity is mostly accomplished by effective task presentation. Excellent curriculum materials are those that are designed with this learning science in mind, but in math, such precise curriculum design is nearly nonexistent (Doabler et al., 2012). Instead, teachers are provided with materials and instructions to encourage a period of trial-and-error learning when it is known to be most harmful to learning.

When teachers present multiple strategies, in effect, they are reducing the odds of discrimination. Why? Because they are introducing too many complex concepts in the same moment, introducing too many different problem set-ups at the same time, and teaching different procedures to solve the same problem simultaneously. Interestingly, multiple strategies are multiple procedural ways to arrive at a correct answer, and what children end up attending to and attempting to learn is how to duplicate multiple procedures to arrive at a correct answer. So, in effect, discrimination is much more challenging to the student.

Logically, one can see that presenting a task using totally different problem formats or set-ups and then teaching the student two to four ways to solve a problem at the same time using novel stimuli presentations is likely to produce confusion. Students are likely to mix up the procedures, their probability of errors increases, and students will predictably struggle (their rate of reinforcement during learning goes down). The goal of the learning becomes using the multiple procedures and this is different from the goal of the learning being to understand (or as behavioral psychologists say, “discriminate”) the conditions under which a given response is correct and a different response is incorrect, which is the essence of learning.

Given our example of 12 + 19, we would first teach students to solve via regrouping. To help students understand the place value properties, you might demonstrate adding these quantities using expanded notation and you might map the operation on a number line. But you are not teaching students to use these strategies. You are teaching one strategy and that is the standard algorithm. Here, your use of a number line and expanded notation is only to make the algorithm make sense to the student. When children have correctly discriminated and can accurately and independently solve 2-digit addition problems with and without regrouping, then you might teach students other strategies for solving as students are entering the fluency-building stage of learning. For example, students might solve 12 + 19 as 12 + 20 – 1, which can be done in one’s head. Such strategies are introduced after discrimination/acquisition. Situating them in fluency-building increases the student’s rate of correct responding so their relative rate of reinforcement increases. Making the antecedent stimulus less discriminable (more varied) facilitates robust, enduring, flexible skill use (or generalization).
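Written out, the arithmetic behind the 12 + 19 example looks like this. The first chain is the expanded-notation demonstration that supports the regrouping step, and the last line is the mental shortcut reserved for fluency-building. The layout is one possible way to display the steps, not a prescribed classroom format.

```latex
\begin{align*}
12 + 19 &= (10 + 2) + (10 + 9) && \text{expanded notation} \\
        &= (10 + 10) + (2 + 9) \\
        &= 20 + 11             && \text{11 ones cannot sit in the ones place} \\
        &= 20 + 10 + 1         && \text{regroup 10 ones as 1 ten} \\
        &= 31 \\[6pt]
12 + 19 &= 12 + 20 - 1 = 32 - 1 = 31 && \text{mental strategy, taught after acquisition}
\end{align*}
```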

Let’s return to multiple strategies during acquisition since this tactic is so popular (and so misaligned with the science of learning). Imagine for a moment that someone is teaching you how to tie your shoes and they show you three ways to do it in the same session. You will likely not remember the steps of each method and not be able to demonstrate each way successfully after that session. A better way to teach someone how to tie their shoes is to observe how they naturally approach tying the laces and then select the method to teach them that is closest to their own present effort. The teacher would stick with that method, teaching step by step until you learn how to successfully tie your shoe. There is no profound or deeper understanding of shoe tying that occurs by teaching you two ways to tie your shoes at the same time and, in all actuality, there is little value in teaching you a second method when one method is sufficient. It would be better to teach you new useful skills that you have not learned yet, like buttoning a shirt.

Tying shoes is not too unlike a multi-step math problem: there are multiple steps that occur in a correct sequence to arrive at a correct answer. Yes, there can be more than one way to solve the problem. In tying shoes, you might train the response for which the child shows some starting skill. In math, you will teach using the method for which the learner has mastered the prerequisite understandings. As a simple example, when teaching multiplication or division with fractions to high school students, you can teach them to simplify the starting fractions. But with 5th graders, you would not teach simplification as the first step because they have not yet mastered simplification of fractions.

Complex multi-step math problems can be taught using forward and backward chaining, building proficiency solving each step, one step at a time, and then adding the next step. You will likely need to build in verbal and sometimes modeled responding at each step. For example, “The first step is to multiply the ones by the ones. Which numbers are the ones? That’s right, we are multiplying 2 ones x 4 ones. What does that give us? That’s right, 2 x 4 = 8.” On the next step if students cannot identify the tens quantity in 34, you can model decomposing the 34 into its tens and ones values (34 = 30 + 4 or 3 tens and 4 ones). Or, if students are more proficient with place value, you can simply point to the 3 after a short delay, then present another example like 54 and ask, “How many tens are in 54?” Ensure success with this step before continuing. You might need to pause and practice responding with students to ensure they have mastered this step.
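The chained steps in this kind of multiplication problem can be listed explicitly, one place-value product at a time. In the sketch below, the function name is hypothetical and so is the second factor: 12 is used only because its ones digit matches the “2 ones x 4 ones” prompt, while 34 comes from the example above.

```python
from typing import List, Tuple

def place_value_steps(a: int, b: int) -> List[Tuple[str, int]]:
    """List each digit-by-digit product for two 2-digit factors, with place value applied."""
    a_tens, a_ones = divmod(a, 10)
    b_tens, b_ones = divmod(b, 10)
    return [
        ("ones x ones", b_ones * a_ones),
        ("ones x tens", b_ones * a_tens * 10),
        ("tens x ones", b_tens * 10 * a_ones),
        ("tens x tens", b_tens * 10 * a_tens * 10),
    ]

# 34 is from the text; 12 is an assumed second factor for illustration.
steps = place_value_steps(34, 12)
for label, value in steps:
    print(f"{label}: {value}")
print("total:", sum(value for _, value in steps))
```

Each entry corresponds to one link in the chain that a teacher would prompt, confirm, and practice before adding the next step.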

Present Sufficient Stimulus Exemplars

It will also be important that you provide sufficient stimulus exemplars. This means you will present problem types that reflect all the variations that are included within this target skill, beginning with easier-to-solve problem types and progressing to more challenging-to-solve iterations (e.g., more challenging multiplication factor pairs, regrouping at various steps of the problem) as students experience success.

You will consider arrangements in your sample problems that could cause students to reach the wrong conclusions about how correct problem solving happens. Examples include always placing a factor of 1 in the same position (top or bottom) in the problem presentation, or never showing that regrouping may occur in generating each partial product and in adding those partial products for the final answer. You will want to present factors that contain zeros in all positions. In your choice of tasks and your dynamic flow of instruction, you will make explicit again and again the key understandings, how the operation is happening, and how the expected quantities are being attained at each step. Students will experience a high rate of success, and their responding at each step will be observable to you as the teacher so you can see where the missteps and misunderstandings occur.

Student understanding of the operation conceptually will advance in tandem with their capacity to successfully complete the procedural steps (Star, 2005). There is a common myth in math education that something called (but not well defined) “conceptual understanding” is attained before you teach actual problem solving. Research evidence tells us rather conclusively that both conceptual understanding (defined and measured) and procedural skill develop in tandem, in a bi-directional way (Rittle-Johnson, 2017). As a teacher, you build this understanding by reducing maladaptive consequences in instruction, namely errors, and using sufficient stimulus exemplars and response exemplars during acquisition until discrimination is attained, building fluency by increasing contact with natural consequences during instruction and making the antecedent stimuli less discriminable, and creating flexible, adaptable skill use (generalization) by making consequences less discriminable.

Importantly, when teachers use novel problem set-ups that avoid the standard algorithm and do not link directly to the standard algorithm in an effort to teach conceptual understanding (not well defined or measured), this work becomes a misapplication of learning science. Let’s return to the multiplication task just shown. I often use this as a non-example of effective acquisition instruction and it is called the “area-array model of multiplication.”