Thursday, July 29, 2010

Science VS tradition: Training with reinforcers

Prepare for a novel. I intended this post to be just short but of course once I got carried away it got really long, really quick. Get a snack and go to the bathroom first.

Reinforcement. It's something we all use, consciously or subconsciously, to teach horses, cats, dogs and even other humans things about the world and how it works according to our rules.

Essentially a reinforcer is an environmental change that increases the likelihood that an animal will give a particular response.

There are two major categories of learning: non-associative and associative. In non-associative learning the horse is exposed to a single stimulus, which it will become either habituated or sensitized to. In associative learning the animal is exposed to at least two stimuli and a relationship is established between them.

Habituation is the kind of non-associative learning we lean on most. The horse is exposed to something unpleasant that may provoke a fear response, and the exposure continues until he learns to behave passively rather than head for the hills or lash out in fear.
Normally the stimulus is introduced gradually. Introduce something too fast and the horse may go back to a fearful response. When this happens, exposure to the feared item or action must be brought back to a point where the horse is already habituated.
In horse training terms we call this desensitization.
Of course if a horse is likely to react to something fearfully, instead of trying to habituate him you could counter-condition him: teach him a response to offer before the stimulus that might provoke the fear arises. For example, you could teach a horse to step towards the mounting block before you mount, rather than having to pull him back once he has had a bad experience, thinks mounting is about moving away from you, and you have a struggle on your hands.

Sensitization. This is the opposite of habituation: the horse shows a heightened response after repeated presentations of the stimulus. For example, a horse walking into a paddock may slip in the mud a few times in a week, and from then on he may gallop full tilt through gates in fear he might slip again. Sensitization can override something that has already been habituated. An example of this would be a horse that was habituated to cars driving by on the road and showed no response to them, but after a couple of bad experiences begins spooking and reacting to the sound of a vehicle's motor.

When a horse makes an association between a stimulus and a response, or a cue and an outcome, this is associative learning.
There are two subcategories under associative learning: classical conditioning and operant conditioning.

Classical conditioning is what most people are familiar with. Classical conditioning is the acquisition of a new response to a new stimulus in association with an old stimulus.
Or for those of you that want me to speak English again, it is "Pavlov's dog". Ivan Pavlov conducted an experiment in which meat powder was puffed into a dog's mouth at the ring of a bell, causing the dog to salivate. Eventually the dogs would salivate just from hearing the bell, even in the absence of the meat powder. Classical conditioning enables the horse to associate events it has no control over, and thus it makes the environment it is in more predictable.
An example of this would be a horse seeing its owner lay out hay. The horse knows what the hay is and will come from across the pasture to eat. Seeing forage is an unconditioned stimulus for a horse. The horse seeks forage constantly. The urge to come in and eat that hay is an unconditioned response, because horses were designed to eat for many, many hours of the day.
The same horse is placed in a different environment where it first hears a creaky door being opened, then sees the owner placing out hay. Eventually the horse would come wandering across the pasture the moment it heard the creaky door, even if the owner did not put the hay out right away. The horse has learned to associate a conditioned stimulus, the creaky door, with the hay, and this created a conditioned response: coming in from the pasture even before the hay was set out.
When Pavlov made notes he noticed that dogs in his meat powder experiment would race ahead of the handler to get to the area where the experiments were conducted. They would try to put themselves into situations and perform actions that they knew led to rewards.

Our other subcategory is operant conditioning. An operant response is a voluntary action that brings about a reward. This can be taught to horses in many different ways, with or without food. Basically it gives the animal the option to put itself in that situation. An example of this would be a horse learning to let itself out of its stall. The horse sees the latch (the trigger), performs a response (opening the latch with his mouth) and gets a reward (freedom and possibly food, depending on where the hay and grain are kept). The reward strengthens the response; this is known as reinforcement. Operant conditioning allows a horse to associate events over which it has control. This increases its control over the environment, which is the big difference between classical and operant conditioning.

Operant conditioning has potential benefits for horses by improving choice. In classical conditioning rewards become associated with stimuli (remember the door creaking associated with hay), and in operant conditioning they become associated with a response (opening the latch to get freedom, food etc).

Now that we have the basic learning types down, there are four different types of consequence that we use every day with our horses, put into simple terms.

Positive reinforcement, negative reinforcement, positive punishment and negative punishment.

They go hand in hand and all of them are a vital part of a horse's training regime. Reinforcement, positive or negative, will always make a response or behaviour more likely to happen in the future. On the opposite end of the scale, punishment, positive or negative, will make a response less likely in the future.

First I will talk about negative reinforcement because it is the most widely used in the horse world. It sounds bad, doesn't it? Negative is a word we so often associate with unpleasant things that we think "Gee, if we use negative reinforcement on our horses we are going to be going all backwoods cowboy on them, right?" Not at all.

When we first get on our horses and squeeze our legs on their sides to get them to move, that was taught using negative reinforcement. Negative reinforcement is putting an undesired stimulus into a horse's environment, such as legs squeezing on the horse's sides, and then removing it the moment you get the desired result, the horse walking forward. Taking your legs off is the release, and the release is what reinforces the walking.

So I bet now you can think of a hundred other things we have taught horses to do using negative reinforcement. It is a lovely tool so long as you are being fair to the horse. By fair I mean you apply the stimulus and then take it away when the horse responds. Keeping it on is what makes a horse dull to your cues. You want to be quick and take the stimulus away within a second or two of the horse responding correctly.

Positive reinforcement is shaping a desired behaviour by putting something positive into the environment, such as hay, grain, treats etc.

I taught Indigo how to "smile" by reinforcing the behaviour with a verbal cue and treats.

But not all things you teach with positive reinforcement are tricks and sometimes they involve negative reinforcements too. Many behaviours can be taught or reinforced positively to attain more willing results.
For example, a year ago a young Morgan mare I was training was having a hard time learning how to set up for halter classes. She would stretch, but only if you really got after her and moved her hooves with your feet, and touching your horse is a big no-no in halter/showmanship. Once you stretched her by moving her feet manually she would hold it, but only for a short while before moving. I brought her out and every time I moved her feet I would praise her and offer a small tidbit of treat, and then walk her ahead instead of just re-stretching her time after time when she moved. I stretched her, treat, rinse, repeat. A light bulb went off in her head and all of a sudden that mare stretched out like she had been doing it all her life. Fast forward to a show a couple of months later when it came time to do showmanship. She stretched out when I cued her and stood there for 15 minutes without moving a muscle! No treats involved. The behaviour had been shaped positively and she knew the positive results it had brought before, so she offered it to me without a second thought. She stood quieter in the class than her brother, twice her age, right next to her.

Positive and negative punishment always make a behaviour less likely to happen in the future. They are both applied as a consequence to a behaviour and are operant conditioning.
Positive refers to something being added to the horse's world and negative refers to something being removed.
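For those who like to see the four terms laid out as a grid, here is a small Python sketch. It is only an illustration: the names `Change`, `Effect` and `quadrant` are made up for this post, and the example situations in the comments are simplified versions of the ones described here.

```python
from enum import Enum


class Change(Enum):
    """What happens to the horse's world right after the behaviour."""
    ADDED = "positive"      # something is put in (treat, tap, leg pressure)
    REMOVED = "negative"    # something is taken away (pressure, hay)


class Effect(Enum):
    """What happens to the behaviour in the future."""
    MORE_LIKELY = "reinforcement"
    LESS_LIKELY = "punishment"


def quadrant(change: Change, effect: Effect) -> str:
    """Name the quadrant from the two questions: added or removed? more or less likely?"""
    return f"{change.value} {effect.value}"


# Treat given when the mare holds her stretch, so she stretches more often.
print(quadrant(Change.ADDED, Effect.MORE_LIKELY))
# Leg pressure released when the horse walks on, so he walks on more readily.
print(quadrant(Change.REMOVED, Effect.MORE_LIKELY))
# Hay taken out of view when the horse paws, so he paws less.
print(quadrant(Change.REMOVED, Effect.LESS_LIKELY))
```

The point of the two separate questions is that "positive" and "negative" never mean good and bad here, only added and removed.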

Using positive punishment, a trainer might tap a horse with a rope or crop when he crowds into the trainer's space. The tap is added as a consequence of the crowding, so the next time the horse will be less likely to push into that space.
In negative punishment the horse behaves in a way he thinks will work and the thing he wants is removed. For example, a horse paws at his stall because he wants hay, and the hay is taken out of his view.
When I talked about not treating a horse when he begs while you are teaching him tricks, that was negative punishment. It stops the begging because the horse learns that no matter how many tricks he offers, there is no reward unless I ask.

Contiguity states that events close in time to a behaviour will become associated with it. So, for example, if you give a horse a cookie two minutes after he gave you his foot, he will not create a useful association. The cookie would have to arrive within seconds of the hoof being lifted in order to create that useful association.
Of course the time between events is not necessarily the most important factor in the association. Events far apart can become associated when there is a highly predictive link between the two. An example of this is a horse that colicked after eating a bitter plant. The horse would later relate the scent and taste of the plant to the sickness and avoid it.

Horses are also known to learn far quicker when given positive things to look forward to rather than negative. In studies where horses learn to navigate a maze, subjects taught to go through the maze using food as positive reinforcement made their choices quicker than horses taught using a shock as negative reinforcement. The horses being taught to avoid the shock with positive reinforcement made their choices quicker than the horses simply left to figure out the shocking on their own. Punishment can hinder creativity in horses, possibly because they have learned that in the presence of human handlers there can often be punishment when the handler does not know what the horse's behaviors mean. So keep in mind what you ask your horse to do. Always try to be fair and give the horse a chance to give you an answer before you go about using punishment.

Both reinforcement and punishment are used whether or not we know it, and hopefully now you all do. A lot of trainers try not to use the word "negative" for fear it will create the image that they are abusing horses, but in reality it is the complete opposite. They are both extremely useful training methods and it is up to us to know when and how to shape behaviors to attain a happy, healthy and sound relationship with a horse using these methods.

In the next Science VS tradition post I will elaborate on using positive reinforcement with treats to shape some behaviors.
So what have you used? I bet we can all think of many scenarios for each type of reinforcement that have helped us shape the horses we have today.


Michaela said...

Wow! That's a lot of info. I'm going to need to read it several times to digest it. Have you ever read The Truth About Horses by Andrew Mclean? It is a very interesting read as it goes into detail how horses learn. Another thing it touches on is another example of (operant conditioning? positive reinforcement? I'm still clueless, lol). Anyway, it is a Pavlov experiment concerning horses. You need a clicker, target (like a frisbee or supplement lid), and treats. It's really simple and you might have heard of it already but you touch the target to your horse's nose at the same time as clicking. Then you give the horse the treat. You have to repeat this several times but eventually when the horse hears the clicker he will search out the target and touch his nose to it in order to get a treat. I do it with my horse, Indigo, all the time. Now he knows what the clicker means though I always start a session with a little refresher course. It's really fun as it's a good way to spend time with your horse other than riding. Indigo will chase after me in the ring and even follow me over jumps just to get his treat.

And BTW, I have had a Dr. Cook's bitless bridle for some time now but at my old barn my trainer gave me a really hard time about that and keeping my horse barefoot. I left that barn almost a year ago and never bothered with the bitless again, but I randomly cleaned it and used it again the other day and was very impressed with the results (again). Indigo is very forward and will throw himself all over the place and it doesn't help that we only have a small ring to use. With the bitless he didn't have to run around like crazy to escape the pain from the bit. And it's not like I even used a harsh bit ever. I knew he had a sensitive mouth so I only ever used Happy Mouths. But the other day he was calmer, still quick, but easier to slow down. I popped him over a crossrail and just like Dr. Cook claims on his website, he was able to stride himself better. I couldn't resist raising the jump until we eventually ended on 2'9. He would have been willing to go higher but I decided enough is enough. I am going to use it again today and I hope maybe eventually he won't act out on the trails as much if he's wearing it.

Sydney_bitless said...

I do have that book, The Truth About Horses. It is very informative.

I have also done target training with my pony. I used cones, a clicker and carrot bits. I would have him touch the cone (sometimes he tried to cheat and wouldn't actually touch it before turning to see if I was going to click) he was super smart and would canter over to the cone, touch it and race back to get his carrot piece.

Michaela, your story does not surprise me. Many people have issues even with "mild" bits. The truth is the horse is responding to pain, not understanding. I am glad bitless is working out a second time for you!

Golden the Pony Girl said...

Nicely done! I think it is really important for trainers and riders to know what quadrant each technique they use falls under. I no longer care about training personalities or philosophies. I like to stick to the science of behaviorism. I am glad you are sharing your knowledge. This stuff is important for anyone working with an animal to understand.

Jen said...

Oh I love all the psychology stuff! I am a big proponent of using positive reinforcement as much as possible. I do punish if it is necessary, but have noticed that the horses do not "mind" punishment at all as long as it is administered fairly in a way they understand (and less is more there, definitely). I try too, to recreate the scenario right away (if possible) and give them the chance to offer the correct behavior.
Working in an elementary school has really helped my horsemanship; it's bizarre how many parallels there are (probably why kids - particularly girls - get along so well with horses). Most of the time, especially with my girls, all I have to do is say the name with the appropriate voice inflection to correct or stop a behavior. Like yesterday as we were trimming Sara's feet, Bella reached over the side of the stall intending to bite Sarabear on the boo-tay (the naughty girl). All it took was for me to say "Bella, don't you DARE"; she withdrew her head, looking quite abashed (which was for show as I know good and well she was only sorry she got busted *grin*).
I alternate mostly between treats and praise depending on what we're doing; the need to punish is actually pretty rare. Horses are, essentially, natural born "pleasers".
Great topic for discussion Sydney...hope I didn't ramble too much!

juliette said...

Sydney - I said it before of your comedic I will say it of your training insight...You are one smart cookie!!! Thanks for all the information. It will take me some time to process all of this. I have to figure out what I am doing treat-wise with the boys. I don't think I am using treats well all of the time. I have had some successes with Sovey during grooming and tacking up, but I do think I have created a treat monster. Very bad on my part.
