Goat Diaries – Clicker Training Day 2: These Goats Are Smart!

The goat palace is almost finished.  We were hoping to get it done yesterday afternoon, but we didn’t quite make it.  The three yearlings are feeling very squashed in the stall by the oldest female, Thanzi.  She is making it very clear that they are TO STAY IN YOUR CORNER.  I am glad we decided in our construction to use the entire space the lean-to provided and didn’t just settle for making a small goat pen.  They will have plenty of room to spread out.

So for this morning it is back to July and the Goat Diaries.  I had gotten as far as mid-morning of E and P’s second day of clicker training.

Training Rhythms

Good training begins to have a rhythm to it, especially in these early stages where you are asking for simple behaviors and keeping the rates of reinforcement high. It’s get the behavior – click and feed, get the behavior – click and feed, get the behavior – click and feed. It becomes a training loop. We’re looking for clean loops.

When a loop is clean you get to move on, and not only do you get to move on, you should move on. That’s the mantra of loopy training. Often people change criteria too fast, which ends up confusing the learners. Or they stay too long at one step, so they build a glass ceiling into their training.  To the learner, backing up means three steps and only three steps. If the handler asks for four, there’s frustration. The learner knows the behavior. It’s three steps and three steps only!

The mantra of loopy training helps you to know when to move on. It also helps you to know when you should pause for a moment to let your learner show you what he has learned. Canine trainer Kay Laurence refers to these pauses as puzzle moments.

In these early sessions with these goats I was beginning to establish some training loops. P in particular was such a fast learner, it was time to give him some puzzle moments to see what dots he was connecting.  If you aren’t sure what a puzzle moment looks like, P is about to show you.

Session 3: 11 am
I started with P out in the pen. He was ready, eager to touch a target, but my attention was elsewhere.  I was busy setting up the camera. I was very aware that I might be missing a window of opportunity. We began with a little targeting. He oriented to it, I clicked, fed, and then clicked and fed again while he was still out of my space. The jumping up on me to try to get the food that he had been doing in the previous session was almost completely gone.  My active use of food delivery was paying off.

Click for targeting. Feed where the perfect goat would be. The perfect goat would have all four feet on the ground. He would be looking straight ahead, and he would be outside my personal space.

After I clicked, I fed P so he had to take a step or two back to get the food. My concern here was that the food delivery caused him to curl his neck so his head was in the orientation it would be for butting with his horns. I didn’t want to trigger that behavior. But head butting is a forward moving behavior. Here he was moving back, so I hoped that his feet would keep his head from thinking he should be charging me.

Get them while they’re standing still.

I fed P so he had to back up a couple of steps to get to the treat in my hand. Before he could come forward again, click, I was giving him a treat – this time where he was standing. I wanted him to get the idea. Standing still, away from me, is a good thing. Click treat, click treat. I was tightening the training loop down to the tiny fraction of a second in which he was standing still looking straight ahead.

The neighbors were mowing the hill up above the barn. P kept turning his head to the side to check them out. His feet were still, but I didn’t want to make such a full head turn part of the behavior. I had to wait, hoping his feet would be still when he finally looked back in my direction. Click then treat.

When I clicked, I used my food delivery to move him back a couple of steps. I wanted to be able to click again while he was still standing back out of my space. I also wanted his head to be straight. If I clicked too many times when his head was turned, I was concerned that I would build that into the base behavior. So I had to wait to click until his feet were still AND he had his head straight. Asking for two criteria at once was pushing my luck. The first couple of times he was too quick for me. He straightened his head, but just as I began to click, he was shifting forward.

I moved him back again with the food delivery. He took his treat from my hand.  Before I could click again, he had come forward into my space.

I work hard to avoid putting my learners into a macro extinction process.  Here’s what that means: This behavior has been consistently working to get me to hand you treats. Only now suddenly, it’s not. You’re not going to be reinforced for this very successful behavior.

We all know how frustrating this can be. You put your money in the vending machine and nothing comes out. Time to shake the vending machine!

My training rhythm was broken and P didn’t yet have enough experience in the game to know what to do. His repertoire of behaviors was still too limited to offer me something I could reinforce. Instead he was trying to go directly to my pockets. I suspect by this point the small children he had grown up with would have dropped pretzels and peanuts all over the floor and everyone would be happy. The children would be giggling, and P would be gobbling up the goodies. Only this wasn’t how I played the game. How annoying!

P gave a little chuff of a sneeze. I had llamas years ago, so I recognized this sound as a sign of frustration. He tried both my pockets. Nothing. He gave a head toss which I dodged, and then I got lucky. He dropped his head away from me enough so that I could reinforce him. The food delivery moved him out of my space, and we were back on track building good behavior.

Training is not without moments of frustration. I was beginning to recognize what this looked like in a goat. A little tail wiggle, a snort, a head butting gesture – these all told me that P was struggling a bit to make sense of what was happening. Why wasn’t I just giving him treats! That’s what the children would have done. And if they didn’t give him treats, he’d just jump up on them, and that was sure to make them scatter their peanuts and pretzels on the ground!

But here this was different. He was clearly frustrated. Doing what had always worked in the past, namely crowding into me, didn’t work. Looking away, taking a step back, produced treats!  It made no sense to him, so while it produced treats it also produced a puzzled goat.  And a puzzled goat can very quickly become a frustrated goat.  Noted.

I was monitoring carefully. Always I am asking myself is this working? Is this the best strategy? How much frustration is too much? What should I change? Should I stop?

Puzzle solving!

There is a time to be clicking, and a time to just wait it out and let your learner work out the puzzle. Through the food delivery, I had shown P the answer. Back away and you get treats. Would he put the pieces of the puzzle together? I waited. The skill here is to be quiet, to remain as non-reactive as you can be and let him figure out the answer. A puzzle you solve for yourself is an answer you will own.

He could sniff at my pockets. I remained non-reactive. How frustrating! I was not playing the game fair. The children would have been flailing their arms about and pushing him away. Which meant they would also have been dropping treats. Push on the vending machine, and it scatters goodies over the ground, except not now.

His feet took him back a couple of steps. Click – treat. The next time the backing was even more definite.

He caught on fast and began to back away from me. When he came forward into my space, now I could wait. It was a puzzle moment. What would he do? I had shown him the answer through the food delivery. Would he find it now on his own?

The answer was yes! He backed up, not just a little, but multiple steps. And he backed with energy. Very neat!

P was definitely a quick study. He was beginning to understand that he could get the food by doing other things besides jumping up or bumping my pockets. It was a really fun session watching him catch on so fast. Though I got the impression that he was still very confused. Backing was clearly working, but it didn’t make sense to him. How could backing up get treats to appear? He was a very puzzled goat.

I sympathized. We’ve all been given sets of instructions that make no sense. Whatever is logical – do the opposite. How maddening is that! Especially when it works!

I would find out in the next session if P could reconcile himself to this new inside-out world order.

(Note: we had moved on in the treats. I was now using a mix of peanuts, peanut hulls, sunflower seeds and hay stretcher pellets as treats.)

Training time for this session: 6 minutes.

“A puzzle solved is a behavior owned.” P showed me he was making the connections – fast!

Summer Pleasures – Watermelon Parties and The Two Sides of Freedom

Watermelon Parties


Summer means watermelon parties for the horses.  They are always a surprise.  As I walk through the barn, bowl in hand, I’ll announce: “It’s party time!”

Watermelon parties are held outside. That was quick learning on my part. It’s astounding the amount of happy drool even a few pieces of watermelon can produce.

Robin and Fengur follow me outside.  While I pass out chunks of watermelon, they stand waiting, one on either side of me.  There’s no pushing, no trying to jump the queue, no grumbling at the other horse. We have a happy time together. The horses get to enjoy one of their favorite treats, and I get to enjoy their obvious pleasure.

Summer also means sharing an afternoon nap with Robin. I’ve just come in from mowing the lower pasture. It’s time for a cool down. I’m sitting in a chair in the barn aisle, cold drink by my side, computer on my lap, and Robin dozing beside me. Fengur has wandered off to the hay box to snack. He’ll join us in a little while.

The view from my chair – Robin’s lower lip droops while he naps beside me.

Why am I writing about these simple summer pleasures? My horses live in a world of yes. I’ve been thinking a lot lately about what this means. Living in a world of yes gives me the freedom to enjoy these simple pleasures. But the freedom isn’t one-sided. Living in a world of yes gives my horses just as much freedom.

We often think of training in terms of what we need from our animals. When I walk down the barn aisle, I need you, horse, to move out of my space. When the doorbell rings, I need you, dog, to go sit on your mat. I’ll teach these things using clicks and treats, but the behaviors are for my benefit more than my animal companions’. The freedom to ask is all on my side.

That’s not how things are in my barn. It’s set up to maximize choice for the horses. Doors are left open so they are free to go where they want. Right now what Robin wants is to nap in the barn aisle. I couldn’t give Robin this luxury of choice if I hadn’t also given him behaviors that let us share space amiably.

When I walk down the barn aisle, Robin will often pose. It’s a simple gesture, a slight arch of the neck is all that’s needed. If he thinks I’m not paying attention, he’ll give a low rumble of a nicker. I’ll click, and give him a treat. Often I’ll get a hug in return.  That’s good reinforcement for me.

The pose is a guaranteed way to get attention from me. If Robin wants to interact, he knows how to cue me. And I am under excellent stimulus control! That’s how cues should work. They create a give and take, a back and forth dialog. They erase hierarchy and create instead the three C’s of clicker training. Those three C’s lead in turn to the freedom my horses and I enjoy sharing the barn together.

Before I can tell you what the three C’s are, we have to go back a few steps to commands.  It’s not just in horse training that commands rule. They control most of our interactions from early childhood on.  Commands have a “do it or else” threat backing them up. Parents tell children what to do.  In school it is obey your teachers or face the penalties. In our communities it’s stop at red lights or get a ticket. Pay your taxes or go to jail. We all know the underlying threat is there. Stay within the rules and stay safe. Stray too far over the line and you risk punishment.

This is how we govern ourselves, so it is little wonder that it is also how we interact with our animals. With both horses and dogs, commands have been the norm. We tell our dogs to “sit”. When it is a true command, it is expected that the dog will obey – or else! The command is hierarchical, which means it is also unidirectional. A sergeant gives a command to a private. The private does what he’s told.  He doesn’t turn things around and give a command back to the sergeant.

We give commands to our horses, to our dogs – never the reverse. We expect our commands to be obeyed. We say “sit”, and the dog sits. I tell. You obey. Because they are hierarchical, commands exclude dialog. The conversation is all one-sided. Commands put us in a frame that keeps us from seeing deep into the intelligence and personality of the individual we’re directing.

Cues are different. Cues are taught with positive reinforcement. At first, this sounds like a huge difference, but for many handlers it represents a change in procedure and not yet a change of mindset. The handler may be using treats as reinforcement, but the cues are still taught with an element of coercion.  How can this be? It’s not until you scratch below the surface that you’ll begin to understand the ever widening gulf that the use of cues versus commands creates.

To help you see the coercive element, let’s look at how we were originally instructed to teach cues twenty-plus years ago.  You used your shaping skills to get a behavior to happen. It might be something as simple as touching a target. Cues evolve out of the shaping process. The appearance of the target quickly becomes the cue to orient to it.  But this cue is often not fully recognized by a novice handler.  We’re such a verbal species that this handler wants her animal to wait until she says “touch”.  As she understands it, that’s the cue.  So what does she do? She begins by saying “touch” and clicking and reinforcing her learner for orienting to the target.

This part is easy. Whether she had said anything or not, her learner was going to touch the target. She’s ready to make a discrimination. Now she presents the target, but she says nothing. What does her learner do? He orients to the target, just as he’s been doing in all the previous trials. He expects to hear the click and be given a treat, but nothing happens. His person just changed the rules which has plunged him into a frustrating puzzle.

He’s in an extinction process. He’s no longer being reinforced for a behavior that has worked for him in the past. He’ll go through the normal trajectory of an extinction process. That means he’ll try harder. He’ll try behaviors that worked in the past, and he’ll become frustrated, anxious, even angry, before he’ll give up for a moment. In that moment of giving up, his person will say “touch” and present the target again.

She wants him to learn the distinction. In the presence of the cue perform the behavior – click and treat. In the absence do nothing.

The problem with this approach is she never taught her learner what “do nothing” looks like. She stepped from the world of commands into what she thinks of as a kinder world of cues, but she didn’t entirely shed the mantle of “do it or else”. With cues the threat of punishment may not be there, but extinction is still an unpleasant and frustrating experience. Why is this key on my computer, which was just working, now locked up and frozen?!! Until you can find your way out of the puzzle, you can feel very trapped and helpless. A good trainer doesn’t leave her learner there very long. She’s looking for any hesitation that lets her explain to her learner the on-off nature of cues.

There’s another way to teach this that doesn’t put the learner into this extinction bind.  This other way recognizes that cues create a dialog, a back and forth conversation.  I want my learner to wait for a specific signal before moving towards the target.  Let’s begin by creating a base behavior, a starting point.  For my horses this is the behavior I refer to as: “the grown-ups are talking please don’t interrupt”.  I will reinforce my horse for standing beside me with his head looking forward.  He’ll earn lots of clicks and treats for this behavior.  And he’ll begin to associate a very specific stance that I’m in with this behavior.  When I am standing with my hands folded in front of me, it’s a good bet to try looking straight ahead – click and treat.

In separate sessions he’ll also be reinforced for orienting to a target.  When both behaviors are well established, I’ll combine them.  Now I’ll look for grown-ups.  I’ll fold my hands in front of me, knowing I’ll get the response I’m looking for.  Only now, instead of clicking and reinforcing him, I’ll hold out the target to touch.  Click the quick response and treat.

The message is so much more interesting than the one created by using an extinction procedure to introduce cues.  Cues have just become reinforcers which means they have become part of a conversation.  If you want to interact with the target, here’s an easy way to get me to produce it – just shift into grown-ups.  That will cue me to lift the target up.  A conversation has begun.  We’re at the very elementary stage of “See spot run”.  I’m teaching my horses the behaviors they can use to communicate with me, and I am showing them how the process works.  You can be heard.  You WILL be heard.  Let’s talk!

The conversation that emerges over time comes from looking more deeply at what cues really are. We can think of them as a softer form of commands, but that doesn’t oblige us to step out of our hierarchical mindset. It is still I give a signal. You – my animal companion – respond. Click and treat. Diagram this out. The arrows all point in one direction.

Signal from human leads to response from animal

Peel another layer of understanding about how cues work and you come to this:

It isn’t just that cues are taught with positive reinforcement. Cues can be given by anyone or anything. A curtain going up cues an actor to begin speaking his lines. We would never say the curtain commanded the actor.

If cues can be given by anyone or anything, that means they are not hierarchical. We cue our animals, and they cue us. Cues create a back and forth exchange. They lead to conversation – to a real listening to our animals. We adjust our behavior based on their response. Cues lead to the three C’s of clicker training which I can now say are: communication, choice, and connection. And in my barn that in turn creates opportunities for more freedom. It means doors can be left open. It means I can have watermelon parties and sit with my horses while we both enjoy the afternoon breeze through the barn aisle.

Let’s parse this some more.

The mindset that commands create is very much centered around stopping behavior. Other training options won’t make sense. They won’t work.

Cue-based training makes it easier for you to see your horse’s behavior as communication, as a bid for attention. That makes it easier for you to look for solutions that satisfy his needs.

Let’s see how these differences play out in a typical boarding barn scenario. Your horse is hungry. His initial whicker has been ignored. In frustration he’s escalated into banging on his stall door. His human caretakers see this as “demanding” hay. In a command-based frame, demanding hay equals rebellious behavior, which can’t be tolerated. The behavior must be stopped.

Within this frame the only training options you can think of are those centered around stopping the unwanted behavior. Other options don’t make sense and won’t work. The command-based frame narrows your field of view. It’s as though you have a tight beam focused on the problem behavior. Everything within that beam is crystal clear, but everything outside the beam might as well not exist. You can’t even begin to think about other solutions. You are targeted on the unwanted behavior.  Banging on the stall door must be addressed and addressed directly.

Now let’s look at the contrast that a cue-based frame creates. Your horse is hungry. His initial whicker to you is noticed and responded to. You appreciate his alerting you to the lack of hay. You have read how important gut fill is in preventing ulcers. You attend to your horse’s needs. Within this frame many options become available including hanging a slow feeder in his stall so he doesn’t have to become anxious about his hay.

Which training options make sense will depend upon which frame you are in. If you are a teacher and you want your instructions to be effective, you need to help your students open a frame that matches what you are trying to teach.

In her presentations Dr. Susan Friedman uses a graphic showing a hierarchy of behavior change procedures beginning with the most positive, least intrusive procedures.

You begin by looking at health and nutritional considerations and then move to antecedent arrangements. Hanging a hay net for our hungry horse would fit in here. Her graphic pictures a car moving along a highway. As you begin to approach more invasive procedures, there are speed bumps blocking the way. They are there to slow you down, to make you think about other approaches before you bring in the heavy guns of positive punishment. The hierarchy doesn’t exclude positive punishment as a possible solution, but it does say you would use this only when everything else has first been tried.

This hierarchy makes sense when you are looking at behavior from a cue-based perspective. From a command-based frame, the car enters not at the bottom of the roadway, but at the top.

The first intervention is positive punishment. The barriers are still there, but now they act to keep you from seeing other options. It is only when punishment fails that you are dragged, kicking and screaming, to consider other ways of changing behavior.  I’ve heard these stories so many times from people who are attending their first clicker training clinic. They’ve been brought there by “that horse” – the one who challenges everything they thought they knew about training. Nothing else worked, but then they tried, as a last resort, a bit of clicker training and everything changed! So here they are, ready to learn more.

They don’t yet know what an exciting world they are entering. Everything they have thought about training is about to be turned truly upside down and inside out. That’s all right. They have the fun of watermelon parties ahead of them.

If you want to learn more about living in a world of yes and the freedom that creates for both you and your animal companions, come join us in Milwaukee for the Training Thoughtfully conference.  https://www.trainingthoughtfullymilwaukee.com/

JOY FULL Horses: Understanding Extinction: Part 12

Mastering Micro: Building Unlikely Behaviors with Resurgence
Nothing is either all good or all bad.

We want to use positive reinforcement with our animals because we see it as being both effective and more humane.  But the associations created through positive reinforcement can create addictions to harmful behaviors.  Think about the way advertisers manipulate our behavior to encourage smoking or overeating.

Resurgence and regression can be very negative procedures, but they can also be used to produce what might otherwise be very difficult behaviors to obtain.

If you aren’t sure how you can turn what seems like a negative procedure into a positive teaching strategy, PORTL can once again help to illustrate how this works.

Here’s the set up:

The trainer sets a toy chair on the table for her learner to interact with. The goal is to get the learner to push the chair over the table the way she might push a toy car.

We’ll now observe quietly in the background while the learner begins to interact with the chair.  The trainer could get lucky.  The learner might begin offering the behavior she’s after within the first couple of clicks.  But with this learner there’s no sign of any chair pushing behavior. Why?

History matters.

The learner is going to draw on all of her previous repertoire of things she has done with chairs.  In this case we have a learner who was scolded as a child for pushing her chair over the floor, so she’s not very likely to offer this type of behavior with the toy chair.

A history of punishment has played a role in depressing chair pushing behavior for this learner, but pushing would also have been an unlikely behavior if the trainer had set down a die. The learner would have tossed the die or shaken it in her hand because that’s what you do with this kind of object. Pushing a die over the table like a toy car is not an obvious behavior to try.

Through a series of small approximations, the trainer could try shaping the behavior she wants.  Her first step would be reinforcing the learner for touching the chair.

The learner in this case is not particularly creative.  She offers simple touches, but nothing else.  Again, the trainer may be dealing with a history of punishment.  Her learner doesn’t have a lot of experience being reinforced for trying things.  In fact, quite the opposite – she may have been punished for stepping “outside the lines”.  She is like so many of our animal learners – hesitant, lacking in confidence, and not showing any outward signs of curiosity.  In her first few attempts she touches the chair, but she doesn’t try any other behaviors.  Getting her to push the chair is going to be hard.

So the trainer takes the chair away and sets out a toy car. Using an object that normally would be pushed makes it very easy to get the desired action.  The learner pushes the car over the table top. Click and treat.

This is repeated several times, and then the trainer takes the car away and sets the chair out.  The learner goes back to touching it.  The chair accidentally falls over – click and treat. The learner latches on to that, expanding her repertoire to two behaviors – touching the chair and knocking it over.

We see this so many times with our animal learners.  One click and suddenly you’ve locked in a behavior you don’t want.  With a creative learner this isn’t a problem.  You can quickly shift the behavior into something you want, but with these “one trick ponies” you have to be so very careful what you click.  In this case the learner persists in knocking the chair over even when she is no longer getting reinforced for the action.

Her trainer makes a quick decision to put everything but pushing the chair like a car on extinction.  Her learner is clearly becoming frustrated.  To avoid having her shut down completely, the trainer takes the chair away and sets the car out again.  The learner immediately starts pushing the car over the table top.  Click and treat.

To help with the generalization the trainer puts a third object out – a small block. The learner pushes the block.  Click and treat.  This is repeated several times, then the trainer takes the block away and sets out the car.  The car is pushed. Click and treat.

The trainer sets the chair out, and the learner pushes the chair.  Job done.

Resurgence and Dog “Yoga”
Using the car in this way is an elegant teaching strategy.  Often when we come up with these clever ways of helping our learner to be successful, we know that it works, but we don’t really have good explanations for why.   Understanding resurgence helps us with the why in this case.  And it helps us to be more deliberate in the use of this kind of teaching strategy.  Here’s another example.

One of Kay Laurence’s students taught her dog to step up with his hind legs onto a chair.  It was elegant training, a beautiful example of setting the learner up for success.  In his talk on extinction, Dr. Jesús Rosales-Ruiz helped us to see that it was also a great example of using resurgence.

Here’s the lesson: First, the dog learned to stand with one foot on each of four small plastic pods. This alone was impressive training.  The pods were the same ones physiotherapists use to help people improve their balance and proprioception. It took great coordination for the dog to stay balanced on the four pods. But that was only step 1.  Next he learned to keep his front feet on the floor while he maneuvered his hind feet up onto the brick ledge of a fireplace hearth.

Adding in the precision of the pods came next.  Now the dog wasn’t just standing with his front paws on the floor and his hind end up on the ledge.  He was also balancing on all four pods.

This was not done as a cute party trick.  The dog’s owner is a yoga teacher.  Her interest was very much the same as mine – helping her animal learner maintain a healthy spine.  In this orientation she could ask her dog for weight shifts that contribute to a flexible spine.

The last step was setting up a training session next to a chair. The handler withheld the click, putting the dog into an extinction process. With very little experimentation, the dog oriented himself so his hind end was to the chair.  He certainly demonstrated the flexibility of his spine by stepping up onto the chair with his hind legs so he was standing hind end up on the chair and front feet on the floor.

Generalization and Creativity
Jesús commented that if we didn’t know about resurgence we would simply be saying the dog generalized.  That’s not a sufficient explanation.  What we were seeing was a great example of resurgence. PORTL has given us a better understanding of how to encourage this kind of problem solving.  When we want to train for this type of generalization, knowing about the “why” of resurgence helps us to be more deliberate and efficient in our training.

It isn’t positive reinforcement by itself that creates a positive learning experience.  An eagerness for learning comes from being a successful puzzle solver.  That success in turn comes from the kind of efficient, clean training that the clever use of resurgence encourages.

These examples give us a great perspective on creativity.  When we’re training, we aren’t waiting and waiting for our animals to do something we can reinforce.  Instead we can “seed” the behaviors we want them to draw on.  Then we set up the conditions and let them have the pleasure of discovering for themselves new or unlikely combinations.

We have a procedure for setting up the creative process.  You give your learner the repertoire, the components that form more complex behaviors, and then you set a puzzle and let extinction be the catalyst for solving it.

JOY FULL Horses: Understanding Extinction: Part 9

Eureka Moments: What is Insight?

Using resurgence – Insight
Yesterday I shared several PORTL games developed by Dr. Jesús Rosales-Ruiz.   The games deliberately used extinction.  What was observed was this: when you have been consistently reinforcing behaviors as you establish them in repertoire, and you then remove all reinforcement for them, you get a resurgence of these previously reinforced behaviors.  They reoccur in the order in which they were trained.  

When you instead extinguish the individual behaviors during the teaching phase, you get a different result.  The student will go back to the most recently learned behavior.  If that doesn’t work, he’ll go a little further back, and then a little further back.

In resurgence the behaviors occur in the order in which they were taught, so the oldest behavior in the cluster occurs first.

In regression the order reverses.  The most recently taught behavior reappears first.

So how does this help us?  How can we use this understanding to shape behavior?  To get the ideas rolling Jesús shared several video examples where resurgence was used to train complex, creative behaviors.

The first video came from Robert Epstein’s work. Epstein was B.F. Skinner’s last graduate student.  Together they were exploring the concept of “insight”.  How do we solve puzzles?  Are we truly creating something that has not existed before, or is creativity a product of combining known components to solve a novel puzzle?

Bird Brains
To explore this question Epstein taught a pigeon three component behaviors: pecking a banana, climbing on a box, and pushing the box towards a target.

The pigeon was then put into a chamber with the box and the banana.  The banana was hung up out of reach.  The pigeon couldn’t peck the banana, so an extinction process began. There was a resurgence of previously trained behaviors.  The pigeon was able to push the box under the banana, get up on the box, and peck the banana.

How did the pigeon solve this puzzle so quickly?  What is insight? What really is creativity?  Skinner and Epstein would say the pigeon could solve the problem because it had in its existing repertoire the necessary components.  Pigeons that had no experience pushing the box or jumping up on the box failed to solve the puzzle.

What is Creativity?
Jesús gives us a very process-oriented way of thinking about this experiment.  This kind of complex puzzle solving was achieved through resurgence.  Set up the underlying components well, add in a bit of extinction, and “creativity” pops out.

If you leave out one of the components, the individual will struggle to solve the puzzle.  He will experience a much longer extinction process.  Macro extinction emotions will begin to surface, and you have to hope the subject has the persistence to become truly creative.

This is the kind of creativity that is truly stressful.  It’s much better to analyze the end goal – the complex behavior you want to train – break it down into all of its component tasks, and then train each of the components separately.  The result will be brilliant looking pigeons that solve in minutes what we might otherwise think would be an impossible puzzle for them.

Jesús’ comment was that there is “nothing new under the sun”. The behaviors you try are all built out of things you’ve done before.  All the components of what appears to be a novel behavior have been trained in the past. So let’s consider what happens when a group of people are presented with a challenging puzzle.  When they begin experimenting and find that the usual, familiar things aren’t working, some will give up quickly.

Others will persist.  They will experiment with novel combinations of what they already know, but again most will quit if they don’t come up with a solution fairly quickly.

A few will keep trying until they stumble across a novel combination that works.  We call these people inventors and creators because they are persistent enough to find these novel combinations.  The discovery process can be a painful one, but once the new combination has been found, it’s easy for everyone else to copy the results.

I can absolutely relate to this.  Give me a horse puzzle to solve, and I can be very persistent. My life experience has taught me that persistence pays off.  But put me in front of a computer that isn’t cooperating, and I shut down fast. There my experience has produced a different set of expectations. I’ve been in enough situations where errors in a software program have made a problem unsolvable, at least for my level of computer skills.  I don’t have the programming background that makes wrestling with a software issue fun.  Extinction has gone too far and been too uncomfortable.  So in one situation I can be very persistent and creative.  In another I’m the one going through the classic cycle of emotions that macro extinction produces.

I know first hand both how much fun the creative process can be when the expectation of success is there, and how painful and unpleasant the extinction process is when that expectation is missing.

What I want to create for my learners is a feeling of confidence.  Whether horse or human, I want them to KNOW they can solve whatever training puzzle I throw at them. Build this expectation in early before others have taught them hard lessons about failure, and you get brilliant, enthusiastic, joyful individuals.  They are the optimists of this world.  Whether horse or human, they are fun to be around.  That’s what an understanding of these concepts helps us to create.

JOY FULL Horses: Understanding Extinction: Part 8

Mastering Extinction
Extinction happens all the time.  When you withhold your click, you set up an extinction process.

If you are unclear about your criteria or clumsy in your handling skills, you could be setting up your learner for a macro extinction process with all of the painful emotions that go along with it.

Or you could be using a micro extinction strategy to help shape a more complex behavior.  In this case you are using extinction to your advantage.  Extinction doesn’t have to be something you avoid.  It can be something you actively use to create more complex behavior patterns.

In yesterday’s post I described the PORTL games that Dr. Jesús Rosales-Ruiz  uses to help his students understand principles of behavior.  In his talks he shares some fascinating PORTL experiments to illustrate the difference between resurgence and regression.

Experiment One: Resurgence
The learner was taught a series of behaviors:

Behavior 1: tapping a small block. Once that behavior was confirmed, the block was removed and a toy car was placed on the table.

Behavior 2 was rolling the toy car over the table top.  When the car was brought out for the first time, there was a small extinction burst of tapping the car, but the learner quickly shifted to pushing it.  Pushing a car is an easy guess for what you would do with this kind of object.

When that behavior appeared to be solid, the car was removed and a third object, a key, was placed on the table.  Behavior 3 was lifting the key.  Fingering a key is a normal response to this kind of object so it was easy to get the learner first to touch the key and then to lift it up off the table.  Once the learner was consistently lifting the key, that object was removed and a fourth one was introduced.

Behavior 4 involved the learner putting a wooden ring on her finger.  The learner quickly figured this out and began to consistently offer this behavior.

When each of these behaviors seemed solid – tapping the block, pushing the car, lifting the key, putting a ring on her finger – the trainer reviewed, one at a time, what the learner was to do with each of the objects.

The trainer then placed all four objects out on the table, but not in the order in which they had been taught.  The trainer observed the learner’s behavior.  She did not give any feedback or reinforcement of any kind.  The point was to see in what order the learner would interact with each object.

The result:  The learner went first to object 1/behavior 1, then moved to object 2/behavior 2, then object 3/behavior 3, and finally object 4/behavior 4.

So even though that wasn’t the left to right order in which the objects were set out, that was the order in which the learner interacted with them.

The conclusion: when you have not gone through an extinction process for the behaviors you are using, when you have instead reinforced them, and then you remove reinforcement, you get a resurgence of these previously reinforced behaviors.  They reoccur in the order in which they were trained.  

Now here’s the fun part.  When you instead extinguish the individual behaviors, you get the opposite result.  Now you see regression.  The individual will go back to the most recently learned behavior.  If that doesn’t work, he’ll go a little further back, and then a little further back – thus revealing his training history.

In resurgence the behaviors occur in the order in which they were taught, so the oldest behavior in the cluster occurs first.

In regression the order reverses.  The most recently taught behavior reappears first.

These differences are illustrated in the second experiment.

Experiment Two: Regression
After a series of behaviors has been learned, this experiment again puts the learner through an extinction process.  The difference is in the initial set up: each time the learner is moved on to a new task, an extinction process is used to eliminate the previous behavior.  Here’s the experiment:

The trainer sets out one item on the table.  The learner begins to manipulate it, trying to find out what is going to be clickable.  The trainer doesn’t click any of this creativity. She waits instead for it to extinguish and then clicks for one simple behavior – touching the object with one finger. That is the “hot” action.

The trainer clicks and reinforces for successful approximations until she has achieved a high degree of consistency in touching the object with one finger.

This was the set up for the experiment.  In the next phase she sets ten different objects out in a circle, including the one they had just been working with.  The learner begins by touching the familiar object.  That gets clicked and reinforced several times, then the trainer stops reinforcing for that object.  She is using extinction to eliminate that behavior.  The learner begins by experimenting, touching various objects, but she only gets clicked for touching the one that was immediately next to the previously hot object in a counter clockwise direction.

The learner switches over to this object and begins touching it consistently.

So now the handler stops reinforcing for this object and only reinforces for the next object on the circle.  The learner again experiments and then discovers that the only object that she gets paid for touching is the third one on the circle.

When this is consistent, the handler again stops reinforcing for touching this object.  The learner is catching on to the overall pattern. Now she moves more quickly to the fourth object and discovers that is the “hot” one to touch.

They continue counter clockwise around the circle until every object has been the “hot” one once and touching it has also been extinguished.

At this point the handler stops reinforcing altogether and simply observes the learner’s behavior.  The result: the learner quickly switches to moving clockwise around the circle, touching the objects in the reverse of the order in which she learned them.  So she learned them originally counter clockwise: object 1, then object 2, then object 3, then object 4, etc.

Now she was touching them clockwise: object 10 – object 9 – object 8 – object 7, etc.  She isn’t getting clicked for any of these touches, but the pattern is very persistent.

So again: in the first experiment where the behaviors were taught, but not extinguished, the learner went through them in the order in which they had originally been learned.

In the second experiment where behaviors were extinguished, the learner went through them in the reverse order.

You won’t find these distinctions in the scientific literature. These two extinction outcomes, resurgence versus regression, are something Jesús and his students have been revealing by playing PORTL games.

Mind Games
Again Play is the key here.  PORTL may have a serious purpose behind it, but these are games.  All the creativity that comes with play is woven into these experiments.  It may turn out that others playing with similar set ups will have different results.  That’s a good thing.  That simply raises more questions, more puzzles to solve.

Do you have a question about how something works? Great. Design an experiment, test it a few times to work out the kinks in the procedure, and then invite your friends over for a pizza and PORTL party.  In the course of an evening you could have enough data to write a paper!

I do like the new twist Jesús has given to this version of the training game.  As he has pointed out, we’ve been using lab rats to learn about human behavior.  Now we are using humans to model animal behavior. Turnabout is fair play.  Much better to frustrate an undergrad than some poor lab rat!

JOY FULL Horses: Understanding Extinction: Part 7

The Training Game
I’ve mentioned training games several times.  The original clicker training game was a close cousin to the children’s game “Hot and Cold”.  The learner was sent out of earshot while the rest of the group chose a goal behavior.  When the learner returned, the only instructions she was given were to offer behavior.  If she did something that her designated trainer liked, she would be clicked. She was then to go to her handler for a treat.

I’ve seen situations where the learner got the behavior seamlessly.  One easy click after another led the learner directly to the goal behavior.  I’ve seen other situations where the same behavior tripped people up completely.

When we train our animals, we want the first scenario – seamless, successful training.  That’s what we want for our equine learners.  But in the training game, we often learn the most when we experience clumsy shaping.  It can be frustrating to struggle through a session that lacks a clear training plan, but you do gain a great appreciation for what NOT to do.

Kay Laurence developed a different style of training game.  In this one trainer and learner are seated opposite one another at a table.  Instead of acting out the behavior like a game of charades, the learner manipulates objects which the trainer has set out on the table.


Kay always has great fun collecting objects for the table game.  She has small plastic fruits and cakes, toy cars, small cones, plastic insects of various varieties.  It’s a colourful mixture that she hands over to her trainers.  When I play the table game at clinics, I raid the host’s kitchen junk drawer.  My toys aren’t as much fun as Kay’s, but they serve the purpose just as well.

Kay calls her game Genabacab.  It has very few instructions and really only one rule: the only person who is allowed to talk is the learner. The trainer and spectators are not to give any verbal hints or to discuss what is going on until afterwards.

The table game lets you work out shaping plans BEFORE you go to your animal.  Do you want to learn how to attach a cue to a behavior and then change that cue to a new cue? You can work out the process playing the table game and spare your animals the frustration of your learning curve.

Kay has described workshops at her training center where someone arrives with a “how do I teach this?” type of question.  Maybe the handler wants to teach match to sample, or she wants to see if her dog can indicate which object is bigger or smaller.  Instead of going straight out to the dog and confusing it with missteps and false starts, everyone in the group will pull out their Genabacab games. Kay says people will often spend half the day happily absorbed in developing the best teaching strategies for their dogs.  The dogs spend the day relaxing while their people work away at the puzzle.  It’s only once the process is well understood that the dogs are brought in for training.

Dr. Jesús Rosales-Ruiz and his students at the University of North Texas have been using Genabacab to understand basic principles of behavior.  He wants to bring the game to the scientific community as a research tool, so he gave his version a new name:  PORTL – Portable Operant Research and Teaching Laboratory.   Kay still has her Genabacab for teaching her canine handlers and Jesús has PORTL for teaching behavior analysis.  On the surface they are similar games, but they serve different functions.

Animal studies are increasingly difficult to do because of ethical concerns and expense.  PORTL offers an alternative for research.  You can have a question about how a particular process works, design an experiment using the PORTL game, and in an hour’s time have gathered enough data to write a paper – all without frustrating a single lab rat. Now that’s progress!

Jesús’ students meet on a regular basis to play PORTL games. When they turned their attention to the extinction process, they made some interesting discoveries.

In one game, the learner was shaped to place one hand over the other – right hand over left, and then to reverse it – left hand over right.  The behavior was put on a fixed ratio of 5, meaning the learner was clicked and reinforced on every fifth swap of hands.

The second task was tapping a block.  Again the learner was put on a fixed ratio of 5. (The learner was to tap the block five times for each click and treat.)

The trainer then increased the ratio for the tapping to 30. The learner began to tap the block, but now there was no click and treat after 5 taps.  The learner kept going to about 13 taps.  At that point she began to experiment.  She reverted back to swapping hands.  Then she tried a few more taps, before going back to hand swaps.  She tapped the block a few more times.  The trainer was still keeping track so each of these taps was counting towards the count of 30 she was looking for.

In the twenties the learner began to be creative.  She tried different ways to move hand over hand.  She’d go back and forth between experimenting with hand swaps and tapping the block.  Finally she reached a count of 30 at which point her handler clicked and reinforced her.  All the extra gunk was also chained in.  Now as the handler kept reinforcing the tapping of the block, the frequency of the hand swapping also skyrocketed.  That behavior was no longer being intentionally reinforced, but it increased right along with the tapping.

Now you may be thinking:  “Well that’s just poor training.  No one is going to jump from a fixed ratio of 5 to one of 30.” My response would be to say that this can happen inadvertently.

Suppose a handler has had a behavior on a high rate of reinforcement. The horse is responding on a consistent basis, but then he’s distracted. He’s no longer offering the same consistent response.  Instead the handler is seeing a string of unwanted behaviors.  Sometimes the horse almost meets criterion, but not enough to click. And then he comes through with the right answer.  The handler captures that moment with a click and a treat.  The question is: what is the long term result of that click? Has the handler just identified a single clickable moment or has she chained in a long string of “junk” behavior?

The horse’s future responses will answer that particular question, but Jesús’ response in general is: if you want clean behavior, you need to train in clean loops.  Kay and I would add that you need to microshape.  You need to learn to set up your training so the behavior you want is the behavior you get.

Here’s a link to a great YouTube video of a PORTL game presented by Mary Hunter.   Many of you will know Mary from her StaleCheerios.com blogs. Mary is president of The Art and Science of Animal Training, the organization that puts on the annual conference of that same name in Dallas, TX.  She and Jesús will be presenting a program on PORTL at this year’s Clicker Expos.

JOY FULL Horses: Understanding Extinction: Part 6

Cues and Extinction
In Part 2 of the JOY FULL Horses posts I wrote at length about cues.  We went through the list of ten things you should know about cues.  That list took us from the basics of cues to some very elegant training concepts.  Cues also play a role in this discussion of extinction.  They have a lot to do with reducing the emotional effect of extinction.

Cues can tell an animal whether or not you’re engaged with him in training.  If your cues say “not now”, he knows he can go take a nap. Kay Laurence has very clear protocols for her training classes. If someone with a dog has a question for her, the handler is first to park the dog.  Parking means the handler anchors the dog to one spot by standing on the leash.  With her hands off the leash, she can now switch her attention away from her dog to Kay.  The dog quickly learns that a parked leash means he doesn’t need to watch his handler closely.  He can take a break from the training conversation.

Teaching “Chill”
With our horses we often forget to put this piece in.  We are usually training by ourselves.  The time in the barn is our time to relax and be with our horses.  It’s only when someone comes to visit that we discover the grown-ups really can’t talk.  Your horse wants to be part of the conversation, as well!  If you abruptly ignore him, that’s when you can get macro extinctions with all of the associated problems. The solution is to teach an equine version of “park”.

The bigger lesson is to become more aware of your body language and the attention your animal is giving to it.  If you see him surfing for answers, intercept the process.  Reset the conversation.  Turn it into a teaching opportunity that gives your learner a clearer idea of what is wanted so you can both avoid the frustration of macro extinctions.

