Shaping:
Getting Behavior A Little Bit At A Time
I believe that not truly knowing how to shape behavior effectively is the biggest reason people give up on positive
reinforcement and clicker-based training. Since shaping doesn’t work in their practice, they
resort to what does work, which may be corrections or luring, or they give up altogether. And
that makes total sense, even though shaping really can get them where they want to go. They
won’t really know that until they experience success with shaping for themselves.
Shaping
Shaping is defined as teaching new behaviors through differential reinforcement: systematically reinforcing successive approximations toward
the target behavior. (The target behavior is the behavior you want to train.) To shape a behavior, you develop a clear definition of the final behavior you want
to train, and then you look at the dog’s current behavior. What does he
do already that is even a little tiny bit like the final behavior?
A note on terminology: shaping is the correct term. Some people use the above definition but call it free shaping. The distinction they make is that free shaping means you don’t start with any previously learned or lured behavior and simply shape the whole thing, while shaping means you start with some previously learned behavior, lure, or prompt to get the behavior started. Actually, shaping is the term for the definition I gave above. What people often call free shaping is really just shaping. Shaping that is started by using some previously learned, lured, or targeted behavior is just shaping with something else going on in front of it. There’s only one term to remember: shaping.
Let’s say you want to teach your dog to roll over, but he rarely does this on his own, so capturing the behavior would require days of watching the dog. I recently did this with my Chinese Crested mix, Pan, so I’ll use that experience as an example. I decided that the target behavior I wanted was for Pan to quickly lie down on his side on the floor, roll over onto his back, finish rolling onto the opposite side, and then get up onto his feet.
But Pan didn’t routinely do this. So I watched him for a while and realized that he does frequently lie down on his chest. So, that’s the behavior I started with. As soon as Pan went down onto his chest, he got a click and treat. He repeated the behavior and I clicked it a few times. Lying on his chest is called a successive approximation: a behavior that gets me a tiny bit closer to the target behavior of rolling all the way over. But now what?
After I clicked that behavior just a few times, Pan dropped to his chest and looked at me. I didn’t do anything. I was waiting for him to give me a little bit more. What I was counting on at this point was for him to produce a bit of an extinction burst. An extinction burst often happens when reinforcement the learner has come to expect doesn’t arrive. In shaping we use this to our advantage because the animal’s behavior will start to get a little variable and intense as he tries to show you, “SEE? Look what I’m doing! Don’t you want to click?” Something in that variable behavior can usually be clicked to get you a little closer to the target behavior. Once you click it, you stop clicking the thing that was getting clicks before and let that old behavior fade away.
Pan lay on his chest and looked at me for a couple of seconds. Clearly he thought I must not have noticed that he had done this wonderful thing I liked so much before, so he hopped up and lay right back down. That didn’t work. So he wiggled to the side a little bit. CLICK! That did the trick. That was a new successive approximation. I let him revel in that success only a couple of times before holding out on him again. He flopped onto his side. This was a new approximation. As each new approximation took hold, the previous one stopped earning clicks and treats. The previous approximation was placed on extinction. The new approximation earned clicks and treats. That process is called differential reinforcement.
At this stage, Pan consistently flopped down onto his side and lay there. And as soon as he was readily doing this, it no longer earned a click… but he knew that trying new stuff was the ticket to treat-land, so he looked over his shoulder… CLICK!
To be perfectly honest,
at this stage I was expecting him to tip onto his back a little bit. This is
part of the art of shaping. You have to be ready to accept what he will give
you and that takes practice. Practice is easy.
Just do it. Remember that as he learns, your behavior is being shaped
as well. You learn what to look for and what to click, and you learn how to be a
better clicker trainer. If you miss that little flick of the head over the shoulder,
no biggie. Catch it next time. If
you accidentally click the wrong thing, no biggie. Give the treat and keep training.
You’re going to give him a LOT more chances to get it right.
Now, if Pan had stopped succeeding and gotten stuck, I would have simply backed up to the last approximation at which he had been successful and trained again from there.
From here, it wasn’t
long until he was rolling onto his back, and once he rolled onto his back he pretty quickly… and probably by accident…
rolled onto his other side. Click and Treat!
I had to be ready to take a bigger approximation, just as I had to be ready to come up with tiny approximations that
could be built into the target behavior.
So now I have
a dog who almost has a roll over. And here’s the place that becomes a dilemma
for many trainers. Should I keep training since things are going well, or should
I quit? I chose to quit. I had other
things to do, and he had been very successful. So, I quit for that session.
A side note: I had not touched Pan during that session
except in the delivery of reinforcers.
Do you always
have to quit on a success like that? Well, it can’t hurt. I don’t know that the research is solid enough to say, “Yes, you must always quit on a high
note!” But I do think it makes the trainer more excited to get started
the next day, and it might have the same effect on La Pooch. So, I prefer to
end my sessions when everyone’s feeling successful.
So the next day
we started where we left off, but Pan didn’t instantly go to rolling almost all the way over. Oh, well. No biggie.
I backed up and clicked flopping onto his side. Then I clicked tipping
onto his back. Then I clicked all the way over.
It didn’t take long for him to get back up to speed. Maybe 5-6 clicks
and we were where we left off the previous day.
So now what? He went over onto the opposite side and lay there and looked at me.
And twitched his head. And wiggled.
Finally he lifted his head up… CLICK! In just a few clicks, he got
up on his feet at the end of the roll over… that was my target behavior! Yeah!!! I gave him a huge jackpot. (Do jackpots
really work to strengthen behavior better than the usual size of treat? I don’t
know. They make me feel good. I
am part of the training team so my opinion of the process matters, too.)
Stimulus Control: Getting It on Cue
But the training wasn’t
done. Now I had to add the cue. I
wanted Pan not to just randomly roll over. I wanted him to roll over only when
I said “roll over”. So now when Pan was about to finish rolling over,
I said, “Roll over”, and gave the click and treat for the finished behavior as I had been doing. Then as he was halfway through the roll over, I gave the cue. Gradually
I said the cue earlier and earlier, and finally I could say the cue before he did the behavior and he would drop and roll. Yes. Now I had a dog who would roll over
when I said roll over.
Generalization
Or did I? I had a dog who would roll over when I said “roll over”
in front of the fireplace! When I sat with him in front of the TV in the family
room and said “roll over”, that wacky dog stood on his back legs and danced, he sat, he lay on his chest, he spun around in
a circle. He did everything but roll over.
I wasn’t surprised! Dogs can be very specific sometimes. This whole “roll over” thing was related to the fireplace area in the library as far as he
was concerned. I was doing the training there because the light was good for
videotaping. So now my poochikins needed to learn that “roll over”
called for the same behavior in the family room that it calls for in the library. And
that required backing up again and starting over. Of course, just as when we resumed after I had
stopped training for the night, this training session went much more quickly.
It will probably take mini shaping sessions in several rooms of the house before Pan realizes that the words “roll over” always call for the same behavior no matter where we are. When he responds to the cue “roll over” with the target behavior I’ve taught him, in new places where it has not been specifically trained, we will say his behavior has generalized.
Shaping is the heart and soul of successful clicker training… or of positive reinforcement training without
a clicker. It takes practice, but practice is cheap.