understanding of operant conditioning

Operant Conditioning

Operant response - a response or behaviour of an organism that is voluntary and not associated with a particular stimulus. This response acts on or modifies the environment, e.g. a person picking up a book to read.

Example: A child voluntarily switching off the lights before leaving a room to save electricity, without being asked.

1-Reinforcer - the reward given for a response in order to strengthen it and increase the likelihood of the response occurring again.

Example: A fitness app sends a congratulatory badge and a discount code for completing a 30-day workout challenge, encouraging the user to maintain their exercise routine.

Example: A student gets a chance to sit in a special "comfort chair" in the classroom for consistently completing their homework on time.

Example: A gardening enthusiast sees their flowers blooming beautifully after consistently watering and caring for them, motivating them to continue.

Example: A person receives a surprise upgrade to first-class on a flight after frequently traveling with the same airline.

I-Positive reinforcer - a reward which strengthens a response by providing a pleasurable consequence such as praise or a chocolate bar.

Example: A teacher gives a student a “golden pass” to skip a test for achieving perfect attendance in the semester.

Example: A manager praises an employee in front of the entire team for coming up with an innovative idea during a meeting.

II-Negative reinforcer - a reward which strengthens a response by removing or reducing an unpleasant stimulus such as taking away a house chore or homework.

Example: A parent stops reminding a teenager to clean their room after the teenager starts doing it regularly.

Example: A company removes mandatory overtime for employees who consistently meet their project deadlines.

Schedules of Reinforcement

This refers to the frequency with which a response is reinforced in operant conditioning. There are different schedules of reinforcement within this type of learning.

1-Continuous reinforcement - when a satisfying response is reinforced every time.

Example: A dog gets a treat every time it sits on command.

Example: An app gives a notification every time a user completes a daily task, like a streak reminder.

2-Partial reinforcement - reinforcement which does not occur continuously. The reinforcement may be administered in the following ways:

I-Fixed ratio schedule - a satisfying response is reinforced after a set number of responses have been made.

Example: A factory worker gets paid after assembling 10 products.

Example: A coffee shop gives a free drink after every 5 purchases through a loyalty card.

II-Fixed interval schedule - a satisfying response is reinforced at regular time intervals.

Example: A TV show airs a new episode every Friday at 8 PM, reinforcing viewers to tune in weekly.

Example: A student gets a 10-minute break after every 2 hours of focused study during exams.

III-Variable ratio schedule - a satisfying response is reinforced after an unpredictable number of responses, but the average number of responses per reinforcement is fixed.

Example: A gamer earns a rare item in a video game after defeating a random number of enemies.

Example: A street performer receives tips from the audience at unpredictable times but continues performing because it averages out to consistent earnings.

IV-Variable interval schedule - a satisfying response is reinforced after unpredictable lengths of time, but the average length of the interval is fixed.

Example: A fisherman catches fish at random intervals, sometimes after 10 minutes, sometimes after 30 minutes.

Example: A quality control inspector checks products at random times during the day, ensuring workers maintain consistent quality.
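The schedules described above can be compared with a small simulation. The following is only a minimal sketch in Python: the function names are hypothetical, and drawing the variable amounts from a uniform range around the mean is an assumption made for illustration, not a rule taken from the definitions. Each function returns which responses (or response times) would be reinforced.

```python
import random


def fixed_ratio(n_responses, ratio):
    """Return the 1-based response numbers reinforced when every
    `ratio`-th response earns a reinforcer."""
    return [i for i in range(1, n_responses + 1) if i % ratio == 0]


def continuous(n_responses):
    """Continuous reinforcement is a fixed ratio of 1: every response counts."""
    return fixed_ratio(n_responses, 1)


def variable_ratio(n_responses, mean_ratio, rng):
    """Reinforce after an unpredictable number of responses.
    Assumption: the required count is drawn uniformly from
    1..(2 * mean_ratio - 1), so it averages `mean_ratio`."""
    reinforced = []
    needed = rng.randint(1, 2 * mean_ratio - 1)
    count = 0
    for i in range(1, n_responses + 1):
        count += 1
        if count == needed:
            reinforced.append(i)
            count = 0
            needed = rng.randint(1, 2 * mean_ratio - 1)
    return reinforced


def fixed_interval(response_times, interval):
    """Reinforce the first response made once `interval` time units
    have passed since the last reinforcer (times must be sorted)."""
    reinforced = []
    last = 0
    for t in response_times:
        if t - last >= interval:
            reinforced.append(t)
            last = t
    return reinforced


def variable_interval(response_times, mean_interval, rng):
    """Like fixed_interval, but the waiting time changes after each
    reinforcer. Assumption: waits are drawn uniformly from
    0..(2 * mean_interval), so they average `mean_interval`."""
    reinforced = []
    last = 0
    wait = rng.uniform(0, 2 * mean_interval)
    for t in response_times:
        if t - last >= wait:
            reinforced.append(t)
            last = t
            wait = rng.uniform(0, 2 * mean_interval)
    return reinforced


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so a run can be repeated
    print("continuous:       ", continuous(5))
    print("fixed ratio 5:    ", fixed_ratio(20, 5))
    print("variable ratio 5: ", variable_ratio(20, 5, rng))
    print("fixed interval 10:", fixed_interval([1, 3, 10, 11, 12, 25], 10))
    print("variable interval:", variable_interval(list(range(1, 41)), 10, rng))
```

In the coffee shop example, `fixed_ratio(20, 5)` marks purchases 5, 10, 15 and 20 as the ones that earn a free drink. The variable schedules reward at less predictable points, which is the only difference this sketch tries to show.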

Punishment

Punishment differs from negative reinforcement in that it aims to decrease the likelihood of the response occurring. Punishment is the introduction of an unpleasant stimulus such as a hit or yell, whereas negative reinforcement is the removal of an unpleasant stimulus to increase the probability of the response occurring.

Potential punishers are any consequences which might lead to a decrease in the response. Some consequences may be punishers for some people but not others.

Side-effects of punishment include aggression, frustration, avoidance learning, escape learning and learned helplessness. The punishment may not decrease the behaviour at all but teach the child to be aggressive or avoid the punisher. Sometimes the punishment ends up being positive reinforcement or only serves to satisfy the frustration of the punisher.

Effective punishment should address the person's actions and not the person's character. It should be related to the undesirable behaviour and it should consist of penalties or response cost (the removal of a reinforcer) rather than psychological or physical pain.

Negative Effects of Punishment

The side effects of punishment include aggression, frustration, avoidance learning, escape learning and learned helplessness.

Major Components of Operant Conditioning

The main principles of operant conditioning are as follows:

1. Extinction

    Extinction occurs when a behavior that was previously reinforced no longer receives reinforcement, causing the frequency of that behavior to gradually decrease. Over time, as the organism stops receiving rewards for the behavior, the behavior becomes less frequent and eventually fades away. Extinction highlights the importance of consistent reinforcement for maintaining behavior.

Example: A teacher used to praise a student every time they gave a correct answer in class. However, when the teacher stopped praising the student, their enthusiasm for participating in class decreased, and they started participating less. This is an example of extinction, where the behavior (participation) gradually fades away because reinforcement (praise) was stopped.

2. Stimulus Generalisation

       Stimulus generalisation happens when an organism responds to a new stimulus in the same way it responds to a previously reinforced stimulus. This occurs because the organism has learned to associate a specific response with a certain stimulus, and it extends this learned response to similar stimuli. Stimulus generalisation helps the organism apply learned behaviors in new, but similar, situations.

Example: A dog learned to sit when a child gave the command "sit", and it received a treat for doing so. Later, when a different family member gave the same command, the dog also sat and expected a treat. This is an example of stimulus generalisation, where the learned behavior (sitting) extends to a similar stimulus (the same command from a different person).

3. Stimulus Discrimination

     Stimulus discrimination occurs when an organism learns to respond to a specific stimulus and not to other similar stimuli. This process involves differentiating between stimuli that are associated with reinforcement and those that are not. The organism becomes selective, responding only to the stimulus that reliably signals reinforcement.

Example: A dog was trained to sit when it heard a specific bell sound, and it would receive a treat. However, when it heard other bell sounds, it did not respond. The dog learned that only a particular bell sound led to reinforcement. This is an example of stimulus discrimination, where the dog responded only to a specific stimulus (the particular bell sound) and ignored others.

4. Spontaneous Recovery

       Spontaneous recovery refers to the reappearance of a previously extinguished behavior after a period of time, without any reinforcement. Even though the behavior was initially suppressed or stopped, it may resurface unexpectedly. This indicates that the behavior was not fully forgotten, just temporarily inhibited, and can return under certain conditions.

Example: A child stopped putting away their toys after they no longer received rewards for doing so. However, after a few months, the child started picking up and putting away their toys again, even though no rewards were given. This is an example of spontaneous recovery, where the extinguished behavior (putting away toys) reappears after a period of time without reinforcement.

5. Shaping

     Shaping is a process used to teach complex behaviors by reinforcing successive approximations of the desired behavior. Instead of requiring the organism to perform the full behavior right away, the process involves rewarding small steps that gradually bring the organism closer to the final goal. Shaping allows for the step-by-step development of complex behaviors by reinforcing intermediate actions that are closer to the desired outcome.

Example: A dog trainer wanted to teach a dog to "roll over." First, the trainer rewarded the dog for simply lying down. Then, the trainer gradually guided the dog through the steps, rewarding it each time it moved closer to rolling over. Eventually, the dog learned the full "roll over" behavior. This is an example of shaping, where behavior is taught through gradual steps, reinforcing closer approximations of the desired behavior.
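The idea of reinforcing successive approximations can be sketched as a loop in which the criterion for a reward is raised a little after each success. This is only an illustrative Python sketch: the function name `shape`, the numeric "closeness" scores and the step size are all assumptions made for the example, not part of any training method described here.

```python
def shape(attempts, target, step):
    """Reinforce each attempt that meets the current criterion, then
    raise the criterion by `step` until it reaches `target`.

    `attempts` is a list of scores saying how close each attempt came
    to the full behavior (hypothetical scale: 0 = nothing, `target` =
    the complete behavior).
    """
    criterion = step  # start by rewarding a small first step
    reinforced = []
    for score in attempts:
        if score >= criterion:
            reinforced.append(score)
            # Ask for slightly more next time, but never more than the goal.
            criterion = min(target, criterion + step)
    return reinforced, criterion


if __name__ == "__main__":
    # Hypothetical scores: 1 = lies down, 2 = leans over, 3 = rolls halfway,
    # 4 = rolls over completely.
    rewarded, final_criterion = shape([1, 2, 1, 3, 4, 4], target=4, step=1)
    print(rewarded, final_criterion)
```

In this run the third attempt (a score of 1) is not rewarded, because by then the trainer already expects a score of at least 3. That is the core of shaping: earlier approximations stop earning rewards once the learner has moved past them.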

Operant Conditioning in Practice

Examples of operant conditioning in practice:

1-Animal Training

       The principle of shaping is often employed in animal training to teach animals specific behaviors. In this process, animals are reinforced for behaviors that progressively move closer to the desired goal.

For instance, during guide dog training, dogs are reinforced for small actions such as sitting, staying, or following directions, and gradually, these small behaviors build up to the complex tasks required for guiding people with visual impairments.

2-Behavior Modification

   Behavior modification involves using reinforcement (either positive or negative) to encourage desirable behaviors while withdrawing reinforcement to eliminate undesirable ones. This technique is commonly used in therapy and educational settings to help individuals modify their behaviors.

For example, a child might be positively reinforced with praise for completing homework on time, while inappropriate behaviors like interrupting others might be discouraged by the withdrawal of attention.

3-Token Economies

       A token economy is a system where individuals are rewarded for appropriate behaviors with tokens, which can later be exchanged for privileges or rewards. This method is often used in schools or therapeutic settings to encourage positive behavior.

For example, in a primary school, children might receive gold stars as tokens for good behavior, and these stars can later be exchanged for rewards like extra computer time, additional playtime, or other privileges, reinforcing the importance of following rules and positive actions.
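The bookkeeping behind a token economy can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the class name `TokenEconomy`, its method names and the prices used below are hypothetical and chosen only to mirror the gold star example.

```python
class TokenEconomy:
    """Track tokens earned for appropriate behavior and exchange them
    for rewards at set prices (hypothetical helper for illustration)."""

    def __init__(self, prices):
        # prices maps a reward name to its cost in tokens
        self.prices = dict(prices)
        self.balances = {}

    def award(self, person, tokens=1):
        """Give tokens to a person after an appropriate behavior."""
        self.balances[person] = self.balances.get(person, 0) + tokens

    def redeem(self, person, reward):
        """Exchange tokens for a reward. Returns True on success and
        False if the person has too few tokens."""
        price = self.prices[reward]
        if self.balances.get(person, 0) < price:
            return False
        self.balances[person] -= price
        return True


if __name__ == "__main__":
    # Assumed prices, in gold stars.
    classroom = TokenEconomy({"extra computer time": 3, "additional playtime": 5})
    classroom.award("Sam")          # one star for good behavior
    classroom.award("Sam", 2)       # two more stars later in the week
    print(classroom.redeem("Sam", "additional playtime"))   # not enough stars yet
    print(classroom.redeem("Sam", "extra computer time"))   # three stars spent
    print(classroom.balances["Sam"])
```

The tokens act as a bridge between the behavior and the eventual reward: the star is handed over immediately, while the privilege it buys can come later.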

Classical Conditioning vs Operant Conditioning

The differences between classical conditioning and operant conditioning are as follows:

Elements	Classical Conditioning	Operant Conditioning
Role of learner	Passive	Active
Timing of reinforcement	Reinforcement occurs before the response	Reinforcement occurs after the response
Nature of response	Automatic and involuntary; the response (salivation) depends on a reinforcement (meat powder) being presented	Voluntary; the reinforcement (food pellet) depends on a response (pressing a lever) being made

Learning Theory

This resource introduces Psychology students to some of the concepts of learning theory. Start by reading Learning Theory, an article by Bob Boakes, McCaughey Professor of Psychology and current Head of the Department of Psychology at the University of Cambridge, UK and at Harvard University, USA (located in the resources section).

His main research interests at present concern nausea-based conditioning in both rats and cancer patients receiving chemotherapy, as well as the history of psychology. His article will help you become familiar with the terms used by psychologists when discussing theories of learning.

Learning by Insight

Learning in humans appears to be more than a simple stimulus-response process. It involves cognition or the processing of knowledge.
Learning by insight results in a cognitive change which involves the recognition of previously unseen relationships. This can occur very quickly and the solution is not easily forgotten.
The main stages of insight learning are:

Preparation: This involves formulating the problem and gaining information about it.
Incubation: This is when you leave the problem for a while and consider other things. There is a pause in the learner's activity where the learner stops trying to complete the task.
Illumination: This involves insight into the problem. It is often referred to as the 'Ah-Ha experience' as the learner is suddenly able to carry out the task following the confident recognition of the solution. It is as if the light bulb is suddenly switched on.
Verification: This is when you test and evaluate possible solutions. If solutions do not work, you may go back to the incubation or preparation stages.

Research into Learning by Insight

The earliest research on this type of learning involved a chimpanzee called Sultan. Kohler (1925) locked Sultan in a cage and placed a banana outside the cage just out of Sultan's reach. Inside the cage were two hollow bamboo sticks, both too short to reach the banana.

Kohler observed Sultan trying to reach the banana, first with his arm then with each stick. He observed Sultan becoming more and more frustrated. After a while, Sultan began playing with the sticks and suddenly realised that they could be joined together to create one stick, long enough to reach the banana.

Kohler concluded that Sultan had a flash of insight which led him to the solution. Therefore, after a temporary period of confusion and frustration, the chimpanzee was able to recognise the solution and apply it to his problem. Kohler then referred to this type of learning as insight learning. It is also referred to as insight thinking.

Sultan fitting two sticks together

Learning Set

A learning set refers to the positive transfer of learning that occurs from one learning situation to another similar learning situation.
As a result of solving previous problems, rules and habits are established which help when tackling a new problem.
For example, a person learns how to play a particular card game which involves certain skills and then decides to try to learn a similar game. Once the first game is learned, the time taken to learn a new similar card game is faster as the person can transfer skills they already have.
