Self-Reward, Prey Drive, and the Behavioral System Behind Real Training Decisions
“For example, a Purely Positive advocate might post videos of his dog calling off rabbits on cue. He uses those videos to make his point, and they look impressive. But look closer: his dog isn’t particularly high-drive. It has never experienced the thrill or success of actually catching or chasing live prey. So for that dog, a guaranteed handler reward easily outweighs a reward it has never tasted. In high-level performance training, we do the opposite. We deliberately intensify the dog’s drive through the roof for the end reward. At that point, any reward I can offer from my hand is worth maybe ten percent of what the dog can earn by blasting through and self-rewarding on the prey. That’s the part many all-positive advocates never seem to consider. We can — and should — train with as much positive reinforcement as possible. But sooner or later, we have to set a clear boundary: self-reward is not an option in this moment. Because the thing the dog is trying to reward itself with is the absolute highest-value reward that exists in its world.” – Armin Winkler, Rivanna K9 Services
An Illustration
The morning sun was still low when the young working dog locked onto the rabbit. One second he was trotting at Heel like he’d done a thousand times in drills. The next, every muscle in his body fired at once. Ears forward, back straight, he launched across the field in a blur of drive and instinct. I gave the recall cue—clear, practiced, the same one we’d reinforced with toys and praise in the yard until he was rock-solid there. He never broke stride. The rabbit was already twenty yards ahead, and the dog was committed. The self-reward waiting at the end of that chase was worth more than anything I could offer from my hand or pocket.
This isn’t a failure of training. It’s the system doing exactly what it evolved to do. Behavior is not the action you see, whether that is the explosive chase or the clean call-off. Behavior is the organized activity of the entire organism across time, shaped by biology, history, current state, environmental opportunities, and the options available in that exact moment. What you witness at the field edge is only the final output of a much larger process.
I’ve watched this play out hundreds of times across different dogs, different drives, and different contexts. Some dogs turn away from prey without drama. Others treat the opportunity like it’s the only thing that matters in their world. The difference isn’t usually “obedience” or “disobedience.” It’s the interplay between the dog’s natural behavioral system, the load we place on it through training, and the conditions present when the decision point arrives.
What You See Might Not Be What You Get
Consider the videos you sometimes see online: dogs that appear flawless at calling off rabbits or squirrels using nothing but a marker and a treat. The dogs look happy, the handlers look confident, and the results look effortless. Those dogs are real. Their success is real. But look closer at the dogs themselves. Many are lower-drive animals that have never experienced the full payoff of a successful chase. They’ve never tasted the adrenaline surge of closing the gap, never felt the satisfaction of a live catch, never learned that blasting through produces its own profound reinforcement. In those cases, the handler-delivered reward still competes effectively because the alternative self-reward has never been fully realized or valued at the highest level. The outcome is a staged event, not a real-world encounter.
Now put that same scenario in front of a high-drive performance dog whose training has deliberately intensified the predatory sequence, or even a pet dog that has been allowed to chase live prey animals. With performance dogs, we build drive through structured prey work (tug, flirt pole, hidden sleeves, controlled chases) until the end reward becomes the absolute pinnacle of what exists in that dog’s world. With pets, we often build the same drive without meaning to, by allowing them to chase wildlife. Either way, the handler’s toy or food, no matter how excellent, registers at maybe ten percent of the value the dog can earn by ignoring the cue and committing to the chase.
At that point the math changes. Positive reinforcement still works beautifully in controlled, low-arousal settings. But sooner or later the real test arrives: the live rabbit appears, the drive is through the roof, and the dog has to choose between a familiar handler reward and the ultimate self-reward it has been conditioned to pursue at full intensity.
This is where the distinction between action and behavior matters most. Learning theory—operant and classical processes described by researchers like B.F. Skinner and Ivan Pavlov—explains how consequences change the probability of future actions. It is excellent at building habits and strengthening responses under consistent conditions. But learning theory operates at the level of actions, not the larger behavioral system. It does not create capacity when the system is under different load. It does not override state-dependent constraints. A dog can have a rock-solid recall in the backyard and still be unable to access that same action when arousal, opportunity, and the prospect of self-reward align in the field. The learning didn’t disappear. The conditions changed the system that makes the action possible.
Ethologists have understood this for decades. Konrad Lorenz, Niko Tinbergen, and others mapped the fixed action patterns and releasing stimuli that organize predatory behavior across species. In domestic dogs those patterns have been shaped by selective breeding, yet the underlying sequence (orient, stalk, chase, grab, kill, consume) remains intact in many lines. Raymond and Lorna Coppinger’s work on canine origins and working roles shows how different breeds have been sculpted for different pieces of that sequence, but the drive itself is still there, ready to be expressed when conditions allow. Ádám Miklósi’s research on dog cognition and evolution further demonstrates that dogs remain highly attuned to environmental opportunities even while living in close partnership with people. The predatory system doesn’t vanish because we teach Sit-stay or recall; it waits for the right combination of internal state and external trigger.
In high-level performance work we lean into that system rather than trying to suppress it. We intensify drive because we want the dog to work at the outer edge of its capacity, whether that’s protection sport, herding trials, or any other demanding role. We often do the same with our pets without realizing what is happening. The dog learns that the chase is the ultimate payoff. Some dogs don’t need to rehearse this at all; given the proper stimulus, the chase is on.
That is not a flaw in the training; it is the point. But once the dog understands that self-reward is available and extraordinarily valuable, the trainer’s job expands. We must also teach that self-reward is not always an option. That is the boundary. And boundaries, in practice, sometimes require a clear correction to enforce the rule at the exact moment the system is testing it.
This is not about labeling dogs “stubborn” or “dominant.” It is about reading the conditions correctly. The issue is not whether the dog knows the command. What matters is whether the dog can access the action under the current state of arousal and opportunity. When drive is sky-high and self-reward is on the table, the organized behavioral sequence narrows. Options collapse. The constrained output becomes the chase. A well-timed correction at that precise point is not punishment in the emotional sense; it is a consequence delivered at the moment the system is open to receiving it. It resets the sequence, prevents the self-reward from reinforcing the wrong choice, and keeps the larger behavioral structure intact.
In the field with a high-drive dog, the predatory arousal is adaptive activation—part of the dog’s natural regulatory system—until it exceeds the dog’s capacity to maintain organized behavior under handler guidance. At that edge, the system, if unchecked, can slide into breakdown, where sequencing degrades and control narrows further. The correction, used judiciously and timed correctly, helps prevent that slide by restoring structure before the system fully unravels.
This principle applies in the field with prey: conditions matter more than labels. This is why blanket statements about training methods miss the point. Positive reinforcement is a powerful tool. In many contexts it is sufficient, even ideal. But it does not address every layer of the behavioral system. When we intensify drive for performance, when we deliberately make the self-reward the highest value available, we create a situation where the dog’s natural behavioral tendencies are in full expression. At that point the trainer’s responsibility is to teach the boundary with equal clarity. The dog learns that some opportunities for self-reward are off-limits, no matter how tempting. That lesson protects the dog, protects the handler, and preserves the partnership. Not all prey are bunnies and squirrels; sometimes they are rattlesnakes, elk, or bears. Corrections are ethically justified to prevent injury to the dog, the handler, or other animals.
Owners of high-drive dogs often struggle here because the videos make it look simple. They see the low-drive dog call off effortlessly and assume their own dog is broken when it doesn’t. The issue is not the dog. The issue is that the conditions and the dog’s internal system are not the same. What matters is matching the training approach to the actual behavioral capacity present in the moment. For some dogs, in some contexts, positive methods alone get the job done. For others, especially those bred and trained for high performance, the boundary has to be set explicitly. Corrections, when used as part of a complete, fair, and well-structured program, are not cruelty—they are part of teaching the dog the full set of rules that govern its world.
In practice, the best trainers I’ve observed move fluidly between layers. They use positive reinforcement to build motivation and clarity. They use environmental management to control variables. They use timing and clear communication, including corrections when necessary, to maintain structure when the system is under load. They watch the whole sequence, not just the final action. They ask: Is the dog in a state where the desired action is accessible? Is the environment providing competing opportunities for self-reward? Is the load balanced with sufficient recovery? Those questions come from understanding behavior as a system rather than a checklist of commands.
The rabbit will always run. The high-drive dog will always feel that pull to chase prey. Our job is not to pretend the pull doesn’t exist or to shame the dog for feeling it. Our job is to understand the system that produces the pull, shape the conditions that make the right choice possible, and set the boundaries that keep the partnership intact. When we do that, training stops being a battle of wills and becomes something far more interesting: a genuine conversation between two species who have been figuring each other out for thousands of years.
References
- Breland, K., & Breland, M. (1961). The misbehavior of organisms. American Psychologist, 16(11), 681–684.
- Miklósi, Á. (2015). Dog behaviour, evolution, and cognition (2nd ed.). Oxford University Press.
- Coppinger, R., & Coppinger, L. (2001). Dogs: A new understanding of canine origin, behavior, and evolution. University of Chicago Press.
- Beerda, B., Schilder, M. B. H., van Hooff, J. A. R. A. M., & de Vries, H. W. (1999). Chronic stress in dogs subjected to social and spatial restriction. II. Hormonal and immunological responses. Physiology & Behavior, 66(2), 243–254.
- Hennessy, M. B., Willen, R. M., & Schiml, P. A. (2020). Psychological stress, its reduction, and long-term consequences: What studies with laboratory animals might teach us about life in the dog shelter. Animals, 10(11), 2061.
- Armin Winkler, Rivanna K9 Services
- Parts of this article involve the use of AI