After reading this article you will learn about the process of learning by connections or associations.
The doctrine of associationism, or knowing the world by forming connections, can be traced back to Aristotle. According to him, every experience in life is learned and remembered due to the laws of association – the law of repetition, the law of contiguity and the law of similarity. Philosophers like Locke and Hume added to these laws the principles of ‘reward’ and ‘punishment’.
The first psychological research concerned with associative learning was conducted by E.L. Thorndike on animals. Later, Ivan Pavlov’s classical conditioning and Skinner’s operant conditioning became the pillars on which the connection model stood. Other theories like Guthrie’s contiguity theory, Hull’s drive-reduction theory and Lorenz’s imprinting theory, though less popular, gave a big boost to this model because of their potential for application.
Trial and Error Learning Theory:
According to Thorndike the most characteristic form of learning in both lower animals and man is trial-and-error learning. For instance, a person who is learning how to ride a bicycle will have to discard many wrong movements until he learns how to ride it perfectly. The principle underlying this process of learning is called trial-and-error.
Thorndike conducted experiments with a cat in a puzzle box (Fig. 9.2). A hungry cat was placed inside the box and a piece of fish was kept outside it, to make sure that the cat would be sufficiently motivated into action. The box was built in such a way that its door could be opened by pulling a latch with the mouth. Thorndike gave a detailed account of the cat’s activities.
The cat placed in such a situation becomes restless and makes all kinds of attempts to get out: clawing, biting the bars, shaking the parts of the box, and so on. This kind of random activity, according to Thorndike, characterizes the initial stages of trial-and-error learning. After some time, by sheer chance, the cat pulls the latch, which opens the door.
The cat came out and was allowed to eat a little of the fish. It was then put back into the box for a second attempt, during which it went through the same restless, random activity as before. The experiment was continued in this manner for several attempts.
Thorndike noticed that as the number of trials increased, unnecessary and irrelevant activities of the animal were reduced. Gradually, the errors decreased till at last after a number of trials the animal learned to operate the latch straightaway.
Thorndike conducted similar experiments with other animals and summarised the process as follows:
Firstly, the animal exhibits random activity. Secondly, there is a gradual reduction of useless movements. Finally the animal learns the trick or operates the latch successfully.
From such investigations, Thorndike formulated the law of effect and other laws of learning:
Law of Effect:
The activities of the cat suggested that it does not suddenly catch on to the method of escaping the box, but learns to operate the latch by a gradual stamping in of correct responses and a stamping out of incorrect responses.
Responses followed by a rewarding state of affairs, like food in this experiment, will be strengthened or stamped in as a permanent response to the particular situation, while responses which are unsuccessful will be weakened or stamped out. Thus, reward provides a mechanism for the selection of responses.
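This stamping in and stamping out can be pictured with a small numerical sketch. The following is only an illustrative sketch, not Thorndike’s own model: it assumes that each available response carries a numeric strength, that reward raises the strength of the successful response, and that failure lowers the strength of unsuccessful ones. All response names and parameter values are hypothetical.

```python
import random

# Toy model of the law of effect: each response has a numeric strength;
# reward stamps the successful response in, failure stamps responses out.
strengths = {"claw": 1.0, "bite": 1.0, "shake": 1.0, "pull_latch": 1.0}

def choose(strengths):
    # Pick a response with probability proportional to its current strength.
    total = sum(strengths.values())
    r = random.uniform(0, total)
    for response, s in strengths.items():
        r -= s
        if r <= 0:
            return response
    return response  # fallback for floating-point edge cases

for trial in range(1, 31):
    attempts = 0
    while True:
        attempts += 1
        response = choose(strengths)
        if response == "pull_latch":            # the door opens, fish follows
            strengths[response] += 0.5          # reward stamps it in
            break
        # an unsuccessful response is weakened (stamped out)
        strengths[response] = max(0.1, strengths[response] - 0.05)
    if trial in (1, 10, 20, 30):
        print(f"trial {trial:2d}: escaped after {attempts} attempts")
```

Run repeatedly, the number of attempts needed to escape tends to fall across trials, mirroring the gradual elimination of errors described above.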
Thorndike’s experiments on animals had a very profound influence upon the understanding of human learning. He was convinced that learning, to a very large extent, can be explained by a series of connections of stimuli and responses bound by the principle of effect. A comparison of the learning investigations on human subjects with animals led him to believe that the phenomena disclosed by animal learning are also fundamental to human learning.
Other Laws of Learning:
While the law of effect embodies the main principle explaining the process of learning, Thorndike also enunciated some other laws which spelt out the roles of certain other factors in the process of learning. The first among these was the law of exercise. This law states that the more often the two elements of a bond or connection, stimulus and response, occur together in a learning situation, the stronger the connection between the two becomes.
There is a direct relation between repetition and the strength of the bond. The law of exercise has two components: the law of use and the law of disuse. The law of use emphasises the positive role of repetition: the strength of the bond, and thereby its probability of recurrence, increases in proportion to the number of repetitions.
The other component of the law of exercise, the law of disuse, states that the strength of a bond or connection decreases proportionately with its non-use over a period of time. The less the ‘use’, the weaker the bond and, consequently, the lower the probability of recurrence. In addition, Thorndike also formulated another law bringing in ‘time’ as a variable.
This is known as the law of recency. According to this law, the time lapse between the present and the last occurrence of a particular ‘move’ or bond is also an important factor. Other things being equal, a move or bond which has occurred more recently than others has a greater probability of recurrence.
The number of times a particular move is repeated also bears a relation to the strength of the learning: other things being equal, the greater the number of repetitions, the greater the strength of learning. It may be seen that, to a certain extent, all these laws refer to the same factor, i.e. the amount of practice over time.
Sometimes the term law of exercise is employed to include both frequency and recency. Indirectly, the law of frequency relates to use and the law of recency to disuse: frequency is a matter of repeated movement, recency a matter of time.
Classical Conditioning Theory:
Classical conditioning was discovered almost by accident by Pavlov, who was investigating the digestive process.
Since animals salivate when food is placed in their mouths, Pavlov inserted tubes into the salivary glands of dogs so that he could measure the amount of saliva they produced when given meat powder. In the midst of this simple measuring experiment, Pavlov noticed that the dog salivated even before the meat powder was in its mouth. The mere sight of food made its mouth water.
In fact, so did the sound of the experimenter’s footsteps. This aroused Pavlov’s curiosity. His dogs always salivated when food was placed in their mouths; this was not acquired knowledge, for their mouths watered naturally.
However, what could not be easily explained was that the dogs salivated at the sound of the footsteps, even before any food appeared, or at the sound of a bell if it had been presented on a few occasions before the food. Pavlov, therefore, became interested in how dogs learned to salivate when food was not presented.
He planned an experiment in which he sounded a bell just before the food was brought into the room. A ringing bell does not ordinarily make a dog’s mouth water. But after hearing the bell, repeated each time before getting the food, Pavlov’s dog began to salivate as soon as the bell rang.
The dog had learned that the bell signalled the appearance of food, and its mouth watered at this signal even if no food followed. The dog had been conditioned to respond with salivation to a new stimulus, the bell, which had not caused salivation earlier.
Classical conditioning can be categorized into four basic elements. The first element is an unconditioned stimulus (UCS) like meat powder which ordinarily makes the animal react in a particular way, e.g. salivating in this case. The reaction of salivating is the second element, the unconditioned response (UCR) which occurs whenever the unconditioned stimulus is presented, i.e. when the dog is given meat powder its mouth waters.
The sound of the bell, originally a neutral stimulus which did not produce the salivary response but subsequently came to produce it, is the third element, the conditioned stimulus (CS). The sound of ringing may cause the dog to perk up its ears on the first trial, but it will not make the dog’s mouth water.
A certain number of trials or simultaneous presentations of the ringing sound and the food are necessary to bring about the desired response. Such a desired response or particular behaviour that the dog learnt to produce in response to the conditioned stimulus is called the conditioned response (CR). This CR, which is salivation in this case, is the fourth element.
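The course of acquisition can also be pictured numerically. The sketch below is not Pavlov’s own formalism; it simply assumes, in the spirit of later incremental models of conditioning, that each pairing of bell and food moves the bell’s associative strength a fixed fraction of the way toward the maximum the UCS will support. Both parameter values are hypothetical.

```python
# Illustrative sketch of acquisition: each CS-UCS pairing moves the
# conditioned stimulus's associative strength a fixed fraction of the way
# toward the asymptote set by the UCS. Parameter values are hypothetical.
LEARNING_RATE = 0.3   # fraction learned per pairing (assumed)
ASYMPTOTE = 1.0       # maximum strength the UCS (meat powder) supports

strength = 0.0        # the bell starts out as a neutral stimulus
for pairing in range(1, 11):
    strength += LEARNING_RATE * (ASYMPTOTE - strength)
    print(f"pairing {pairing:2d}: CR strength = {strength:.2f}")
# After a handful of pairings the bell alone evokes a strong salivary CR.
```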
Countless experiments have been conducted on animals and human subjects to study the process of classical conditioning. Lipsitt’s study on human babies shows how conditioning takes place: babies who are five to ten days old can learn to blink their eyes on hearing a buzzing sound.
They naturally blink when a puff of air is blown against their eyes. The puff of air is the unconditioned stimulus and blinking, a natural reaction, the unconditioned response. If a buzzing sound (the conditioned stimulus) is sounded immediately before blowing the puff of air, the babies soon begin to blink their eyes when they hear the buzzer alone (the conditioned response). They learn to associate the sound with the puff of air.
Operant Conditioning Theory:
This theory of learning was propounded by B.F. Skinner. He devised an apparatus referred to as the ‘Skinner box’, which provides learning situations for rats. Like Thorndike’s box, it contains a simple bar or lever which, when pressed by the rat, releases a pellet of food or water.
It is also equipped with devices for recording the number of operations made and the rate of pressing. Skinner did not feed the rat for 24 hours. He then put the hungry animal into the box and observed its activities. The animal began to explore until it finally hit the bar and a food pellet dropped into the cup.
The rat ate the food and continued its exploration. After hitting the bar two or three times, and having seen the food appear every time it hit the bar, it learned to get food by pressing the bar. The food strengthened or reinforced the bar-pressing response, and the rat pressed again and again.
According to Skinner the experiment demonstrates the fundamental principle involved in most learning activities. First the animal is motivated by hunger. The general exploratory activity which is the result of such a motivation, in turn, results in an accidental activity, which is instrumental in achieving the appropriate goal. Thus, the animal learns this particular response which has been instrumental in getting food.
At this stage Skinner also uses the concept of reinforcement which has been found to be useful in all kinds of conditioning experiments. When the hungry rat gets food, there is reinforcement of the response and only through this reinforcement does the animal learn to reproduce the response.
In other words, reinforcement is an event in this artificial environment or experimental situation which is contingent upon the occurrence of some specified response, and which then maintains the performance of that response. The response operates upon the environment and the environment, in turn, supplies the reinforcement which maintains behaviour.
Thus, the resulting behaviour is said to be operant because the animal performs an operation which releases the food. This is different from the salivary response in Pavlov’s experiment. It is also called instrumental learning because the response is instrumental in producing the reinforcing event.
Some of the points on which classical conditioning and operant conditioning theory agree and some on which they disagree are described below:
Points on Which They Agree:
(i) Extinction:
In Pavlov’s experiment, the dog which has learnt to salivate when it hears the bell, but repeatedly fails to get any meat powder, will eventually stop salivating at the sound of the bell (Fig. 9.3). Similarly, when Skinner’s rat is not reinforced with a pellet of food, the frequency of bar-pressing will decrease gradually and finally disappear. This is called ‘extinction’ (Fig. 9.5).
(ii) Spontaneous Recovery:
However, a few days after the occurrence of the above phenomenon, when Pavlov’s dog was taken to the laboratory, as soon as it heard the bell it began to salivate. The response that had been learnt and then extinguished reappeared on its own with no retraining. This phenomenon is termed ‘spontaneous recovery’.
The response which reappeared was only about half as strong as it was before extinction. Nevertheless, spontaneous recovery does indicate that learning is not permanently lost through extinction. Spontaneous recovery, the reappearance of original learning after it has been extinguished, has been demonstrated in operant conditioning too: rats whose bar-pressing behaviour had been extinguished spontaneously started pressing the bar when they were placed in the Skinner box again after a certain period of time.
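Continuing the acquisition sketch above, extinction and spontaneous recovery can be given the same illustrative treatment. The decay rule and the recovery fraction below are assumptions, the latter chosen to echo the roughly half-strength recovery reported above.

```python
# Continuing the acquisition sketch: extinction and spontaneous recovery.
# Assumptions (illustrative only): presenting the CS without the UCS decays
# strength toward zero; after a rest period, a fraction of the
# pre-extinction strength reappears without any retraining.
DECAY = 0.4
RECOVERY_FRACTION = 0.5   # the text reports roughly half-strength recovery

strength = 0.97           # value reached at the end of the acquisition sketch
pre_extinction = strength
for trial in range(1, 8):                 # bell sounded, but no food follows
    strength -= DECAY * strength
    print(f"extinction trial {trial}: CR strength = {strength:.2f}")

# ... a rest of a few days, with no training of any kind ...
strength = RECOVERY_FRACTION * pre_extinction
print(f"after rest: CR strength = {strength:.2f}  (spontaneous recovery)")
```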
(iii) Generalisation:
The phenomenon of generalisation includes stimulus generalisation and response generalisation. In the course of his investigations Pavlov found that, once a conditioned response to a specific stimulus had been established, stimuli similar to the conditioned stimulus, though not identical with it, could evoke the conditioned response without earlier exposure.
For example, a dog conditioned to one kind of bell might also respond to a bell of a different tone or even to a buzzer. This phenomenon is known as stimulus generalisation. Similar phenomena have been shown to occur in operant conditioning. For example, the skill one learns when playing tennis may be generalized to badminton.
However, such generalisations occur most readily for stimuli that are very similar and belong to the same sensory modality, such as two auditory or two visual stimuli: a change from a bell to a buzzer, say, rather than from a bell to a red coloured disc.
Response Generalisation:
Similarly, one can come across the phenomenon of response generalisation in both types of conditioning. An animal trained to press the lever with one foot will, if that foot is tied down or otherwise restrained, press the lever with another foot or perhaps with its head.
(iv) Discrimination:
Animals and human beings not only have the capacity to generalize but also the capacity to discriminate between various stimuli. The process of learning to distinguish among various stimuli and responding to a specific stimulus or event is called discrimination. In operant conditioning discrimination is taught by rewarding one response and not rewarding the other.
This way pigeons have been trained to peck at a red disc but not a green one. First the pigeon was taught to peck at a disc; then it was presented with two discs, one red and one green. The bird was given food when it pecked at the red one. Eventually the pigeon learnt to discriminate between the two. Similarly, in classical conditioning the dogs and rats were trained to discriminate between various sounds, lights, colours, etc.
(v) Higher Order Conditioning:
An interesting phenomenon relating to classical conditioning is higher order conditioning. Here, at the first stage, a conditioned salivary response is established with a bell as the conditioned stimulus. Subsequently, another signal, a patch of light, is paired with the bell and presented.
After a few trials, the dog exhibits the salivary response to the patch of light alone. It may be seen that the first conditioned stimulus, the bell, has served as the unconditioned stimulus. This procedure is known as second order conditioning, and such higher order conditioning can be carried on to a third and fourth level, and so on.
In third order conditioning, dogs which learned to salivate in response to the patch of light also salivate on hearing a click if the click has been paired with the patch of light. In operant conditioning this phenomenon of higher order conditioning has been demonstrated with chimpanzees. The animals were trained to insert tokens into a vending machine which automatically released a grapefruit. After having learnt the reward-value of the tokens, the chimpanzees were trained to perform simple instrumental or operant responses.
In one task, securing the tokens involved performing another task, like pulling in a small sliding tray by means of a cord, which, in turn, involved unhooking the cord from a hook, and so on. The animals were allowed to continue working at these tasks to acquire a number of tokens which could be exchanged for food later.
The similarities between classical and operant conditioning are sufficiently impressive to suggest that the two processes are basically the same. Indeed, it is virtually impossible to conduct an operant conditioning experiment without some classical conditioning principles overlapping.
The Points on Which They Disagree:
However, various investigators claim that, though they overlap, classical conditioning and operant conditioning differ on one or two significant points which make them two different processes. The points on which the two processes disagree are basically concerned with the type of response and the consequences of the response.
(i) The Stimulus Response Relations:
In classical conditioning a specific stimulus such as food, shock or puff of air is used to elicit a specific response, whereas in operant conditioning the response is not elicited by controlled stimulation but is spontaneously emitted by the subject, e.g. pressing the lever.
In addition, classically conditionable responses are involuntary responses involving the spinal cord or the autonomic nervous system. They are considered to be mere reflexes, like salivation, blinking, etc. On the other hand, operantly conditionable responses are voluntary responses involving the central nervous system.
The subject operates on the environment trying various responses which are learnt and are neither innate responses nor reflex actions as exemplified in the cases of chimpanzees performing a series of acts to obtain tokens. Thus, food is said to constitute reinforcement if it follows the act of bar pressing.
Due to the reinforcement the act of pressing the bar is likely to be repeated. In classical conditioning, the term reinforcement refers to any instance where a conditioned stimulus is followed by an unconditioned stimulus and the conditioned response is thereby strengthened. When the sound of the bell is followed by the presentation of food, the potential of the bell alone to elicit salivation is increased.
In short, the reinforcement follows the conditioned stimulus in Pavlovian conditioning and it follows the response in Skinnerian conditioning. Thus, it may be seen that classical conditioning involves the establishment and strengthening of connections between two stimuli, the unconditioned stimulus and the conditioned stimulus.
This is known as S-S learning. Instrumental or operant conditioning involves a connection between a stimulus and a response, S-R learning. In pure operant conditioning the connection involved is between a response and a reinforcement, the stimulus itself being indeterminate; hence it is known as R conditioning, the importance lying in the connection between response and reinforcement.
Consequences of the Responses:
The primary law of classical conditioning is the law of contiguity, where connections between stimuli get established due to the close proximity in time of the occurrence of the UCS and the CS. Thus, classical conditioning is learning by connecting stimuli with stimuli.
This is known as the S-S theory, whereas instrumental conditioning involves stimulus and response followed by reinforcement, referred to as the S-R theory. In the operant procedure, as its name implies, the subject operates on the environment to achieve some result, such as access to food or water, recognition by others, or escape from pain, discomfort or some other undesirable circumstance.
So this is referred to as R-R learning, the connection being between response and reinforcement. In classical conditioning, the behaviour of the organism is not instrumental in achieving any such results; the organism is unable to change the events of the experiment by its behaviour. The puff of air, electric shock or meat powder occur in accordance with the design of the experiment, and the subject’s behaviour does not influence their occurrence. In general, the organism in classical conditioning is passive, while in operant conditioning it is active and its behaviour is directed by the results of its responses.
(ii) The Concept of Reinforcement:
Many theorists tried to explain the mechanism which somehow connects a particular response to a particular stimulus and maintains it over a period of time. The explanation of this mechanism differs from theorist to theorist. Thorndike claims that a response will or will not be learned and maintained depending on the pleasant or unpleasant effect of that particular response.
In classical conditioning the mechanism which maintains a particular response or behaviour is ‘contiguity’. According to Hull, ‘drive reduction’ leads to the strengthening of a response: reinforcing events such as food or the termination of shock ultimately act as drive reducers, i.e. they reduce the hunger, pain or fear drives. Thus, reinforcement, according to Hull, invariably involves drive reduction.
Skinner, the most notable among the learning theorists, did a great deal of experimental work in the area of reinforcement and related principles, not only in maintaining but also in predicting and regulating responses or behaviour. Skinner’s meticulous and elaborate work gives a structure to the concept of reinforcement.
Today, reinforcement is considered both as a procedure and as a process. As a procedure, Skinner claims, reinforcement is an event which, either naturally in the environment or artificially by experimental arrangement, is contingent upon the occurrence of some specified response and which then maintains that response.
In this arrangement there is a two-way interaction between the individual and the environment: the response operates upon the environment, and the environment, in turn, supplies the reinforcing event which maintains the behaviour.
Most psychologists regard reinforcement as a process, something which happens in the individual’s nervous system to make learning occur when a response is followed by reinforcement. It is said to be a neural mechanism which connects a particular response to a particular stimulus.
Positive and Negative Reinforcers:
Procedurally, two kinds of reinforcers are distinguished. Positive reinforcers are those whose presentation maintains behaviour: food to a hungry person, or status, power and money to a worker, leads to the maintenance of particular responses or behaviour.
However, the threatened withdrawal or removal of such reinforcers has the same effect, and the factors which bring about this withdrawal are termed negative reinforcers. For instance, when the management in an industry threatens to withdraw a worker’s status and power, his behaviour is maintained just as it is by the presentation of these reinforcers, simply because he does not want to lose them.
Primary and Secondary Reinforcement:
Conceptually, two kinds of reinforcers are distinguished: primary reinforcers and secondary reinforcers. When a reinforcer requires no prior experience or associations it is said to be a primary reinforcer; here the unconditioned stimulus itself is established as a reinforcer within the context of a particular event. Thus, the primary reinforcers are food, water, sex, etc.
In secondary reinforcement a previously neutral stimulus is established as a reinforcer within the context of a particular event. Thus, a secondary reinforcer is one whose value has to be learned through association with the primary reinforcers. For example, money may be viewed as a secondary reinforcer.
By itself money is just metal or paper, but through its association with food, clothing and other such primary reinforcers it becomes a secondary reinforcer. Children come to value money only after they learn that it will buy chocolates (primary reinforcers); chimpanzees learn to work for tokens which they can insert into a vending machine to get a primary reinforcer like fruits or biscuits. Thus, a primary reinforcer is one which directly reduces a primary drive whereas a secondary reinforcer gets its value by association with a primary reinforcer.
Skinner’s principles of reinforcement have been used on a variety of behaviour problems encountered in situations such as industry, schools, hospitals, etc., and have proved to be highly effective and successful.
The two main principles of reinforcement are:
(a) Differential reinforcement, i.e. a routine or pattern of reinforcement which ensures that reinforcement is given only when the desired response occurs in the presence of a particular stimulus configuration and
(b) A schedule of reinforcement, i.e. a routine or a pattern followed in reinforcing the desired response only on certain occasions. Various types of schedules were identified and demonstrated by Skinner: the fixed-ratio schedule, the fixed-interval schedule, the variable-ratio schedule and the variable-interval schedule.
These schedules of reinforcement are briefly described below:
a. Fixed-Ratio Schedule:
This type of schedule involves the presentation of the reinforcement not after every correct response but after a certain number of correct responses, according to a fixed ratio. Under this schedule, reinforcement is presented, say, after every tenth or every fifth response. A person subjected to this schedule receives reinforcement for each bit of work he accomplishes rather than for every response. The piece-rate system in industry illustrates this schedule.
b. Fixed-Interval Schedule:
In this schedule reinforcement is presented not after a fixed number of responses but at fixed time intervals, say every tenth or every fifth minute. A rat learns to press a bar to get food, but it gets food only for the first correct response in any five-minute period. Skinner predicted that in such an experimental situation the rat would stop pressing the bar after getting its food, but would start pressing more frequently as the time for the next food approached.
For example, a person who is used to the newspaper being delivered at 7 o’clock will start checking to see if the paper has arrived shortly before 7 o’clock. In this schedule performance tends to fall immediately after the reinforcement and picks up again as the time for the next reinforcement draws near. The payment of salary or wages at regular intervals illustrates this.
c. Variable-Ratio Schedule:
In this schedule the number of responses necessary to gain reinforcement is not constant. The occurrence of reinforcement follows no order and is not continuous; it is unpredictable. Gambling illustrates this schedule very well: one can never know when reinforcement will occur.
There is always a chance of hitting a jackpot, so the temptation to keep playing continues. A person on a variable ratio schedule shows less of a tendency to be influenced by the contingencies of the reinforcement and will exhibit a high rate of response over a long period of time.
d. Variable-Interval Schedule:
In this schedule reinforcement of response is presented after varying lengths of time. One reinforcement may be given after six minutes, next after four minutes, next after two minutes and so on. Here the subject learns to give slow and steady responses so as not to miss the reinforcement which may come at any time.
For example, when examinations are conducted at specific intervals, like quarterly, midterm, and annual, students will study just before the examination and stop studying after the examination. But, if the teacher gives a surprise test or a slip test without announcing the specific time and makes it unpredictable, the student has to keep studying at a steady rate all the time because any day could be an examination day.
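Since the four schedules differ only in the rule that decides whether a given response is reinforced, the rules can be summarised compactly. The sketch below is illustrative: the function names, counters and numbers are assumptions made for exposition, not anything specified by Skinner.

```python
import random

# Sketch of the four schedules as decision rules. Each predicate answers:
# is the current response reinforced? Counters are assumed to reset to
# zero after every reinforcement; all numbers are arbitrary examples.

def fixed_ratio(responses_since_reward, n=10):
    # Reinforce every nth response (cf. piece-rate pay).
    return responses_since_reward >= n

def fixed_interval(seconds_since_reward, t=300):
    # Reinforce the first response made after a fixed time (cf. a salary).
    return seconds_since_reward >= t

def variable_ratio(responses_since_reward, required):
    # `required` is redrawn at random after each reward, so the payoff
    # point is unpredictable (cf. gambling).
    return responses_since_reward >= required

def variable_interval(seconds_since_reward, required):
    # `required` seconds is likewise redrawn after each reward
    # (cf. unannounced surprise tests).
    return seconds_since_reward >= required

# Redrawing the variable requirements, averaging 10 responses / 240 seconds:
next_ratio = random.randint(1, 19)
next_interval = random.uniform(0, 480)
```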
Theory of Avoidance Learning:
Learning involves not only acquiring a new response, an ability to do something new, but also learning not to do something which is not adaptive or socially approved. Thus, learning involves both the acquisition of a new response and the withholding of an old one.
This second type of learning, learning not to use a response is of two types:
(i) Avoidance learning and
(ii) Punishment learning.
In numerous experiments animals have, within a few trials, learnt to ‘escape’ from a negative reinforcement (e.g., a shock).
On the other hand, in certain experiments where the negative stimulus (e.g. a shock) followed a neutral stimulus or signal (e.g. a light being turned off), subjects learned to avoid the negative stimulus altogether by leaving the situation after the neutral signal, even before the shock was given.
In such an experiment there was no change in the shock condition: there was no shock before the response and no shock after the response. What, then, motivated the avoidance response? A theory of avoidance known as the two-factor or two-process theory was put forth to explain this phenomenon.
The processes emphasised in this theory are classical conditioning and operant conditioning. According to this theory, both these processes are necessary for avoidance responses to occur. One unconditioned response to shock is fear. Through classical conditioning, this fear response is transferred from the unconditioned stimulus, shock, to some conditioned stimulus (CS) – darkness, a stimulus that precedes the shock.
After a few trials, a subject would presumably respond to the darkness with fear. This conditioning of a fear response to an initially neutral stimulus is the first process of the theory. Besides being a response, fear also has various stimulus properties: the sensations that accompany a fear reaction are unpleasant.
A reduction in fear can serve as a reinforcement for any response that brings it about. This reinforcement is the second process in the two-process theory. The subject thus leaves the situation even before the actual shock, because the preceding darkness is the conditioned stimulus causing a fear response.
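The interplay of the two processes can be pictured with a toy simulation. For simplicity the sketch conditions fear on every trial and lets leaving reduce fear to zero; the update rules and values are illustrative assumptions, not a quantitative statement of the theory.

```python
# Toy sketch of the two processes in two-factor avoidance theory.
# Process 1 (classical): fear transfers from the shock to the darkness
# that precedes it. Process 2 (operant): leaving removes the feared
# signal, and that reduction in fear reinforces the leaving response.
fear_of_darkness = 0.0
leaving_strength = 0.0

for trial in range(1, 11):
    # Process 1: pairing darkness with shock conditions fear to darkness.
    fear_of_darkness += 0.3 * (1.0 - fear_of_darkness)

    # Process 2: leaving drops the fear aroused by darkness to zero;
    # the size of that drop reinforces the avoidance response.
    fear_reduction = fear_of_darkness
    leaving_strength += 0.2 * fear_reduction

    print(f"trial {trial:2d}: fear = {fear_of_darkness:.2f}, "
          f"avoidance strength = {leaving_strength:.2f}")
```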
However, in some avoidance experiments subjects learned to avoid a shock even in the absence of any conditioned stimulus that could become fear-provoking. This and other limitations of the two-factor theory led to the proposition of the one-factor theory.
In experiments where there was no signal preceding the shock, subjects were able to reduce the frequency of shocks by learning to press a lever, and showed no apparent fear response. It was thus concluded that animals can learn an avoidance response even in the absence of an external conditioned stimulus as a reliable signal for shock.
Both the theories, however, stress the role of the capacity of the organisms to solve the fundamental problems of prediction and anticipation that are the very definition of avoidance learning. Organisms can learn to avoid regularly scheduled aversive events.
Closing the windows of your house before leaving in anticipation of the regular summer afternoon shower, seeing a doctor or dentist at regular intervals, etc., are avoidance responses performed to prevent the un-signalled aversive consequences of failing to make them. Avoidance responses are extremely persistent because the response removes us from the situation and there is no way of determining whether the feared event will actually occur.
Effect of Punishment on Behaviour:
In general, punishment takes the form of some painful stimulation, or the threat of such stimulation, following some response. The most characteristic feature of a punishment operation is the reduction, at least temporarily, of the strength of the response that is being punished. Yet punishment is not simply the opposite of reward.
If a response is rewarded, it is apparent to the organism in a typical experiment that repetition of this response will be rewarding; but if a response is punished, it is not clear to the organism which of the other available responses will be rewarded.
In effect, punishment tells the organism what not to do, but it carries no information by itself which tells an organism what particular alternative course of behaviour should be followed. This is why punishment has not always been found to be effective.
The effect of punishment on learning has been explained by the two-factor theory. According to this theory, the response which is being punished, becomes associated with fear and the organism learns to avoid the response. Nevertheless, there are various factors which play an important role in the effectiveness of punishment.
If one’s goal is to obtain a large, permanent decrease in some behaviour, the punishment should be immediately introduced at its full intensity. Just as the most effective reinforcement is one that is delivered immediately after the operant response, the more immediate the punishment, the greater the decrease in responding.
The effectiveness of a punishment procedure is inversely related to the intensity of the subject’s motivation to respond. Combining a punishment for an undesirable response with a reinforcement for a different, more desirable response, has been found to be very effective.
Nevertheless, punishment can produce a number of undesirable side-effects. It can lead to a general suppression of all behaviours, not only the behaviour being punished. Another practical problem is that individuals may try to circumvent the rules or escape from the situation entirely. It can also lead to aggression.
The general point, however, is that punishment is much more effective when the subject is provided with an alternative way to obtain the reinforcement that has been maintaining some unwanted response. Otherwise, punishment by itself cannot be very effective in completely eliminating the undesirable response, much less in producing a desirable one.
Difference between Avoidance Learning and Punishment Learning:
In general terms, avoidance learning involves the systematic avoidance of certain situations in order to avoid some noxious stimulus or stimuli associated with them, whereas punishment learning is the result of a deliberate attempt to reduce the frequency of occurrence of an undesirable response.
Avoidance learning can occur in the absence of direct contact with the undesirable stimulus. Mere fear of encounter with the stimulus causes avoidance of the entire situation. One might avoid using a lift for fear of being locked up in it if there is a power-failure while he is in the lift. This can happen even if he has never used a lift before or heard of any incident to that effect.
The very possibility of the coincidence discourages him from taking the ‘risk’. Punishment, on the other hand, involves the direct effect of the stimulus on the individual. A child who is punished for a behaviour, or has seen a peer being punished for a similar behaviour, learns to avoid the undesirable behaviour, subject to certain other factors that influence the effectiveness of punishment.
Education and socialisation are processes directed towards avoidance learning rather than punishment learning. There is no certainty that punishment always produces the desired effect. In fact, punishment can often be counter-productive and result in aggressive behaviour, total withdrawal or even a fear reaction, which can become generalized.
At this juncture a distinction has to be made between learning through punishment and avoidance conditioning in terms of the consequences of a response. If a child learns to be careful and not to break dishes because he has been beaten for breaking one, this can be explained as learning through punishment. But if the child refuses to carry or even touch dishes for fear of breaking them and being punished, it is called avoidance conditioning.
Here the response is avoided totally. Though all these concepts are consequences of reinforcement, they differ in content and sometimes even in effect, depending upon the individual.
We have, in the above paragraphs, tried to present a brief view of Skinner’s elaborate theory which was not only a theory of learning but of behaviour in general. It may be seen that while Thorndike and Pavlov were primarily concerned with the laws of acquisition of specific responses, Skinner has attempted to explain complex variables like personality and even culture.
Hull’s Theory of Learning:
Clark Hull, like Skinner, tried to develop a very elaborate and systematic theory of behaviour rather than of learning. Both Skinner and Hull assumed that all behaviour is learnt and, therefore, did not make a distinction between the two.
However, the reader will find that there are considerable differences between their views: Skinner’s theory was based largely on laboratory experiments, while Hull’s rested on mathematical deduction verified by laboratory data.
Hull, while subscribing to the view that learning is essentially the formation of SR connections, claims that learning can take place only if the organism is in a state of drive or deprivation. He put forward his theory in the form of postulates. These postulates were presented in the form of a chain, linking one with the other.
A discussion of all his postulates is beyond the scope of this article; therefore only the main postulates which summarise the learning process are outlined here. Another interesting feature of his theory is that he expresses his postulates in the form of formulae. Let us take one formula and see how he explains the learning process.
SER = D × SHR × K - I
Where SER = Reaction potential
D = Drive
SHR = Habit strength
K = Incentive
I = Inhibitions
According to Hull, the organism at birth possesses neural connections of receptors and effectors (SUR), which have the capacity to be stimulated (S). Drive (D) is considered as a condition of deprivation which originates from organic conditions.
Combinations of SUR and D have the potentiality to arouse available responses and satisfy the condition or eliminate the state of deprivation. So it may be seen that the drive produced by an organic condition is the general activator of the response.
The stimulus-response connections which are innate, or generated from within the organism, Hull referred to as SUR; but when stimulus-response connections are acquired as a result of practice, they are referred to as habit (H).
A habit, with successive practice and reinforcement, summates in a manner which yields a stronger habit (the strength of that particular habit increases), and this is referred to as habit strength (SHR). Habit strength is considered an expression of associative intensity.
The term SER is usually explained as effective reaction potential. This refers to the possibility of a particular response occurring at a particular time, a possibility which varies depending on a number of factors. Some learned responses have a higher possibility of occurring than others; thus, as learning increases, correct responses come to have a higher SER than incorrect responses.
The SER depends on a number of factors like the level of drive, the habit strength of a response, the degree of attractiveness of a particular reward-incentive, inhibition and certain other factors. Inhibiting factors are the ones which have a tendency to decrease the potentialities of the response or suppress it totally. These concepts can be understood by an illustration.
Take, for instance, a child who is crying for an ice-cream cone, whose mouth is watering (SUR) and who is obviously in a state of deprivation (D). When it is taken to the ice-cream parlor and the ice-cream is placed before it, its reactions, like recognising that it is an ice-cream cone, stopping crying and the way it picks up the ice-cream, depend on the strength of habit (SHR). The attractiveness of the ice-cream, i.e. its flavour, colour, etc. (K), and the absence of inhibitions (I) play their role in determining the reaction potential.
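A worked example may make the formula concrete. The numbers below are entirely hypothetical, placed on a 0 to 1 scale for the ice-cream illustration above.

```python
# Hypothetical values, on a 0-1 scale, for the ice-cream illustration.
D = 0.9     # drive: the child badly wants the ice-cream
SHR = 0.7   # habit strength: well-practised ice-cream-eating responses
K = 0.8     # incentive: an attractive flavour and colour
I = 0.1     # inhibition: little delay, pleasant surroundings

SER = D * SHR * K - I      # reaction potential
print(f"SER = {SER:.2f}")  # 0.40: the response is very likely to occur

# With a low drive (D = 0.1) the same well-formed habit may fail to appear:
print(f"SER = {0.1 * SHR * K - I:.2f}")  # -0.04, below zero
```

The second line of output illustrates the point made below: a well-formed habit may still not be expressed when the drive level is low.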
Hull distinguishes between the two types of inhibitions:
(i) Reactive inhibition and
(ii) Conditioned inhibition.
Reactive inhibition essentially results from reduction in the strength of drive. Thus, if there is a long time-lag between the child’s request for an ice-cream cone and the actual time of presentation of the ice-cream this is likely to inhibit the response of eating it.
Fatigue is a physiological component of reactive inhibition. Reactive inhibition is, therefore, internal and, to a large extent, physiological and biochemical in nature and results in the reduction of drive level. The other type of inhibition known as conditioned inhibition, however, is a result of learning.
If the ice-cream parlor is dingy and located in unattractive surroundings it is likely to decrease the child’s enthusiasm for eating the ice cream. This type of inhibition is the result of learning and experience. The two types of inhibitions – reactive and conditioned – tend to summate and have the overall effect of decreasing the possibility of the occurrence of a particular learnt response. The combined effect of these two inhibitions is referred to as inhibitory potential.
Thus, drive strength, habit strength and attractiveness have the combined effect of facilitating a response, constituting what is known as excitatory potential. Excitatory potential and inhibitory potential act in opposite directions. Regarding this, Hull specifies that, if a response is to occur, the absence of inhibitors is as necessary as the presence of the other factors.
It may be seen that various factors work together, sometimes in opposite directions, to determine the occurrence of a response; at times a well-formed habit may not be expressed because of a low drive level, and vice versa. These concepts act as a chain which explains the operation of learning by completing the circuit between S and R. Though Hull’s theory is regarded as a stimulus-response association theory, in reality it is a little different: it is a Stimulus-Organism-Response (S-O-R) theory.
Guthrie’s Theory of Contiguity:
According to Guthrie, learning occurs completely in one trial. His basic postulate about learning is that a stimulus pattern gains its full associative strength on the occasion of its first pairing with a response. For example, when a child puts its fingers into the fire it will learn on the first trial to associate fire (stimulus) with taking away fingers from the fire (response).
It rarely requires a second trial to form a connection. Guthrie believed that learning occurs regardless of reinforcement, as long as the conditioned stimulus and the response occur together. What is learnt is based upon the principle of recency. Reward influences learning only indirectly, by changing the situation and terminating an act; it is not essential for learning to take place.
The Concept of Imprinting:
Imprinting is a special form of learning in which a specific stimulus-response connection is established during critical periods. These critical periods in the organism’s life are determined by certain hereditary and environmental factors, and are highly specific.
The connections established during such periods through imprinting have a long-lasting and far-reaching effect on behaviour. Konrad Lorenz was the first to term this type of rigid learning as imprinting. He conducted a series of experiments on animals and birds to study this process of learning.
Some of the characteristic features of the responses resulting from imprinting are:
1. An all-or-none form of learning.
2. Rigid learning.
3. Repetition is not an important consideration.
4. Resistance to extinction is high.
5. Independent of practice and continuous reinforcement.
6. Specificity of context, period and response.
One of the most obvious impacts of this is on social behaviour. We all know that animals and birds usually stay in groups, young animals and birds follow their mothers around and stay close to them. They seem to have preferences to stay with others who are like themselves.
All this is found under normal circumstances. For instance, ducklings and goslings follow the mother soon after hatching; perhaps, they are stimulated by her movements and the noises she makes. But what Lorenz did in his experiment was, as soon as ducklings were hatched, he exposed them to a floating ball.
They started following the floating ball in the pond. The young birds which learned to follow the ball thereafter never followed the mother; any amount of subsequent effort to teach the birds to follow their mother proved futile. Thus, once this strong connection, or imprinting, had occurred, the birds approached the object they had first responded to.
However strange the object may be, they will prefer to follow the object they first approached rather than one of their own kind. In one experiment a male duck was made to move around a circular runway fitted with a loudspeaker which emitted female sounds, “Gock, gock, gock”. During the critical learning period, which was less than one hour in duration, a duckling was allowed to follow the male duck.
At a later time, this duckling was placed in the presence of the male bird and a female bird which emitted the real sounds. It was found that the duckling followed the male bird throughout the test.
Testing with different groups of ducklings showed that the optimal time for learning to follow a model is thirteen to sixteen hours after hatching. Thus, for ducklings, the critical period occurs near the middle of the first day of life. A number of experiments have been conducted on other birds and animals to determine the critical periods, at various stages of development, for learning different types of behaviour.
Essentially, these learning theories tended to look upon learning as basically involving the formation of connections.
In the classical conditioning theory the connection is between stimulus and stimulus, in the Skinnerian theory it is between response and reinforcement, and in Guthrie’s theory it is between stimulus and response. According to these theories, learning essentially involves the formation of connections and the strengthening of the same, mediated by practice, contiguity, reinforcement, drive and other factors.
These theorists thus tend to view learning as a passive process essentially determined by contextual and environmental factors, without any reference to the organism and its capacities. According to these theories, an organism which has learnt a response and then lost it is no different from an organism which has never learnt the response at all. This is one of the questions raised by the cognitive theorists of learning.