Reinforcement: Difference between revisions
→History of the terms: removing this history section... there is already a history section at the beginning |
WhatamIdoing (talk | contribs) →Negative reinforcement: Copyedit |
||
(43 intermediate revisions by 15 users not shown) | |||
Line 4: | Line 4: | ||
[[File:Skinner box scheme 01.svg|right|thumb|upright=1.2|[[Operant conditioning chamber]] for reinforcement training]] |
||
In [[Behaviorism|behavioral psychology]], '''reinforcement''' refers to [[Operant conditioning#Tools and procedures of operant conditioning|consequences]] that increase the likelihood of an organism's future behavior, typically in the presence of a particular [[Antecedent (behavioral psychology)|antecedent stimulus]].<ref>[https://dictionary.apa.org/reinforcement Definition of reinforcement from the American Psychological Association] Retrieved on January 30th, 2024</ref> For example, a rat can be trained to push a lever to receive food whenever a light is turned on. In this example, the light is the antecedent stimulus, the lever pushing is the operant behavior, and the food is the reinforcer. Likewise, a student who receives attention and praise when answering a teacher's question will be more likely to answer future questions in class. The teacher's question is the antecedent, the student's response is the behavior, and the praise and attention are the reinforcers. |
||
Consequences that lead to appetitive behavior such as subjective [[incentive salience|"wanting" and "liking"]] (desire and pleasure) function as rewards or ''positive reinforcement''.<ref name=Schultz>{{cite journal | vauthors = Schultz W | title = Neuronal Reward and Decision Signals: From Theories to Data | journal = Physiological Reviews | volume = 95 | issue = 3 | pages = 853–951 | date = July 2015 | pmid = 26109341 | pmc = 4491543 | doi = 10.1152/physrev.00023.2014 | quote = Rewards in operant conditioning are positive reinforcers. ... Operant behavior gives a good definition for rewards. Anything that makes an individual come back for more is a positive reinforcer and therefore a reward. Although it provides a good definition, positive reinforcement is only one of several reward functions. ... Rewards are attractive. They are motivating and make us exert an effort. ... Rewards induce approach behavior, also called appetitive or preparatory behavior, and consummatory behavior. ... Thus any stimulus, object, event, activity, or situation that has the potential to make us approach and consume it is by definition a reward. ... Intrinsic rewards are activities that are pleasurable on their own and are undertaken for their own sake, without being the means for getting extrinsic rewards. ... Intrinsic rewards are genuine rewards in their own right, as they induce learning, approach, and pleasure, like perfectioning, playing, and enjoying the piano. Although they can serve to condition higher order rewards, they are not conditioned, higher order rewards, as attaining their reward properties does not require pairing with an unconditioned reward. }}</ref> There is also ''negative reinforcement'', which involves taking away an undesirable stimulus. An example of negative reinforcement would be taking an aspirin to relieve a headache. |
||
Line 11: | Line 11: | ||
==Terminology== |
||
{{addiction glossary}} |
|||
{{main|Reinforcement#Operant conditioning}} |
|||
{{addiction glossary}} |
|||
In the behavioral sciences, the terms "positive" and "negative", when used in their strict technical sense, refer to the nature of the action performed by the conditioner rather than to the responding operant's evaluation of that action and its consequence(s). "Positive" actions are those that add a factor, be it pleasant or unpleasant, to the environment, whereas "negative" actions are those that remove or withhold from the environment a factor of either type. In turn, the strict sense of "reinforcement" refers only to reward-based conditioning; the introduction of unpleasant factors and the removal or withholding of pleasant factors are instead referred to as "punishment", which, in its strict sense, thus stands in contrast to "reinforcement". Thus, "positive reinforcement" refers to the addition of a pleasant factor, "positive punishment" refers to the addition of an unpleasant factor, "negative reinforcement" refers to the removal or withholding of an unpleasant factor, and "negative punishment" refers to the removal or withholding of a pleasant factor. |
||
Line 26: | Line 25: | ||
The study of reinforcement has produced an enormous body of [[reliability (statistics)|reproducible]] experimental results. Reinforcement is the central concept and procedure in [[special education]], [[applied behavior analysis]], and the [[experimental analysis of behavior]] and is a core concept in some medical and [[psychopharmacology]] models, particularly [[addiction]], [[substance dependence|dependence]], and [[Compulsive behavior|compulsion]]. |
||
==History== |
||
Laboratory research on reinforcement is usually dated from the work of [[Edward Thorndike]], known for his experiments with cats escaping from puzzle boxes.<ref>{{cite journal | vauthors = Thorndike E | title = Some Experiments on Animal Intelligence | journal = Science | volume = 7 | issue = 181 | pages = 818–24 | date = June 1898 | pmid = 17769765 | doi = 10.1126/science.7.181.818 | url = https://zenodo.org/record/1448297 | bibcode = 1898Sci.....7..818T }}</ref> A number of others continued this research, notably B.F. Skinner, who published his seminal work on the topic in ''[[The Behavior of Organisms]]'' in 1938 and elaborated on this research in many subsequent publications.<ref>Skinner, B. F. "[https://books.google.com/books?id=S9WNCwAAQBAJ&q=Reinforcement The Behavior of Organisms: An Experimental Analysis]", 1938 New York: Appleton-Century-Crofts</ref> Notably, Skinner argued that positive reinforcement is superior to punishment in shaping behavior.<ref name=Walden>{{cite book| vauthors = Skinner BF |title=Walden Two| url = https://archive.org/details/waldentwo1948skin | url-access = registration |year=1948|publisher=The Macmillan Company|location=Toronto}}</ref> Though punishment may seem just the opposite of reinforcement, Skinner claimed that they differ immensely, saying that positive reinforcement results in lasting [[Behavior modification|behavioral modification]] (long-term) whereas punishment changes behavior only temporarily (short-term) and has many detrimental side effects. |
||
Line 32: | Line 31: | ||
A great many researchers subsequently expanded our understanding of reinforcement and challenged some of Skinner's conclusions. For example, Azrin and Holz defined punishment as a “consequence of behavior that reduces the future probability of that behavior,”<ref name=Honig>{{cite book|last=Honig|first=Werner | name-list-style = vanc |title=Operant Behavior: Areas of Research and Application|year=1966|publisher=Meredith Publishing Company|location=New York|page=381|url=http://psycnet.apa.org/record/1966-35017-000}}</ref> and some studies have shown that positive reinforcement and punishment are equally effective in modifying behavior.{{citation needed|date=January 2024}} Research on the effects of positive reinforcement, negative reinforcement and punishment continues today, as those concepts are fundamental to learning theory and to many practical applications of that theory. |
||
==Operant conditioning== |
||
{{Main|Operant conditioning}}{{OperantConditioning}} |
||
The term ''operant conditioning'' was introduced by Skinner to indicate that in his experimental paradigm, the organism is free to operate on the environment. In this paradigm, the experimenter cannot trigger the desired response; the experimenter waits for the response to occur (to be emitted by the organism) and then delivers a potential reinforcer. In the [[classical conditioning]] paradigm, the experimenter triggers (elicits) the desired response by presenting a reflex-eliciting stimulus, the ''unconditional stimulus'' (UCS), which is paired with (and preceded by) a neutral stimulus, the ''conditional stimulus'' (CS). |
||
''Reinforcement'' is a basic term in operant conditioning. For the punishment aspect of operant conditioning, see [[punishment (psychology)]]. |
||
===Positive reinforcement=== |
||
Positive reinforcement occurs when a [[reward system|desirable event or stimulus]] is presented as a consequence of a behavior and the chance that this behavior will manifest in similar environments increases.<ref name=Flora>{{cite book|last=Flora|first=Stephen | name-list-style = vanc |title=The Power of Reinforcement|year=2004|publisher=State University of New York Press|location=Albany}}</ref>{{rp|253}} |
|||
* Example: A father gives candy to his daughter when she tidies up her toys. If the frequency of picking up the toys increases, the candy is a positive reinforcer (to reinforce the behavior of cleaning up). |
|||
* Example: A company enacts a rewards program in which employees earn prizes dependent on the number of items sold. The prizes the employees receive are the positive reinforcement if they increase sales. |
|||
* Example: A supervisor attaches a monetary reward for the employee who exceeds expectations the most. The monetary reward is the positive reinforcement of the good behavior: exceeding expectations. |
|||
Positive reinforcement occurs when a [[reward system|desirable event or stimulus]] is presented as a consequence of a behavior and the chance that this behavior will manifest in similar environments increases.<ref name=Flora>{{cite book|last=Flora|first=Stephen | name-list-style = vanc |title=The Power of Reinforcement|year=2004|publisher=State University of New York Press|location=Albany}}</ref>{{rp|253}} For example, if reading a book is fun, then experiencing the fun positively reinforces the behavior of reading fun books. The person who receives the positive reinforcement (i.e., who has fun reading the book) will read more books to have more fun. |
|||
The [[high probability instruction]] (HPI) treatment is a [[behaviorist]] treatment based on the idea of positive reinforcement. |
|||
The [[high probability instruction]] (HPI) treatment is a [[behaviorist]] treatment based on the idea of positive reinforcement. |
|||
===Negative reinforcement=== |
||
Negative reinforcement increases the rate of a behavior to avoid or escape an [[aversives|aversive situation or stimulus]].<ref name=Flora/>{{rp|253}} Doing something unpleasant to people to prevent or remove a behavior from happening again is ''punishment'', not negative reinforcement. The difference is that reinforcement always increases the likelihood of a behavior whereas punishment always decreases it. |
|||
Negative reinforcement increases the rate of a behavior that avoids or escapes an [[aversives|aversive situation or stimulus]].<ref name=Flora/>{{rp|252–253}} That is, something unpleasant is already happening, and the behavior helps the person avoid or escape the unpleasantness. In contrast to positive reinforcement, which involves adding a pleasant stimulus, in negative reinforcement, the focus is on the removal of an unpleasant situation or stimulus. For example, if someone feels unhappy, then they might engage in a behavior (e.g., reading books) to escape from the aversive situation (e.g., their unhappy feelings).<ref name="Flora" />{{rp|253}} The success of that avoidant or escapist behavior in removing the unpleasant situation or stimulus reinforces the behavior. |
|||
* Example: A child cleans their room, and this behavior is followed by the parent stopping "nagging" or asking the child repeatedly to do so. Here, the nagging serves to negatively reinforce the behavior of cleaning because the child wants to remove that aversive stimulus of nagging. |
|||
* Example: A company has a policy that if an employee completes their assigned work by Friday, they can have Saturday off. Working Saturday is the aversive stimulus; the employees have incentive to increase productivity to avoid the aversive stimulus. |
|||
*Example: An individual leaves early for work to beat traffic and avoid arriving late. The behavior is leaving early for work, and the aversive stimulus the individual wishes to remove is being late to work. |
|||
Doing something unpleasant to people to prevent or remove a behavior from happening again is ''punishment'', not negative reinforcement.<ref name="Flora" />{{rp|252}} The main difference is that reinforcement always increases the likelihood of a behavior (e.g., [[channel surfing]] while bored temporarily alleviated boredom; therefore, there will be more channel surfing while bored), whereas punishment decreases it (e.g., [[hangovers]] are an unpleasant stimulus, so people learn to avoid the behavior that led to that unpleasant stimulus). |
|||
===Extinction=== |
|||
Extinction occurs when a given behavior is ignored (i.e. followed up with no consequence), where it will disappear over time if the behavior continuously receives no reinforcement. Behavior after extinction spikes first and then declines over time.<ref>Stajkovic, Alex (2019). Management and Leadership, What Can MBD Do in My Workday?. First Research Paradigms Applied</ref> Extinction does not have to be deliberate in order to have an effect on a subject's behavior. The following examples demonstrate scenarios in which it can be intentionally or unintentionally applied: |
|||
* Example (Intended): A young child ignores bullies making fun of them. The bullies do not get a reaction from the child and lose interest in bullying them. |
|||
* Example (Unintended): A worker does not receive any recognition for their above and beyond hard work. They then stop working as hard. |
|||
* Example (Intended): A cat keeps meowing for food in the night. The owners do not feed the cat, so the cat eventually stops meowing. |
|||
===Extinction=== |
|||
Extinction occurs when a given behavior is ignored (i.e., followed by no consequence). Behaviors disappear over time when they continuously receive no reinforcement. During a deliberate extinction, the targeted behavior spikes first (in an attempt to produce the expected, previously reinforced effects), and then declines over time. Neither reinforcement nor extinction needs to be deliberate in order to have an effect on a subject's behavior. For example, if a child reads books because they are fun, then the parents' decision to ignore the book reading will not remove the positive reinforcement (i.e., fun) the child receives from reading books. However, if a child engages in a behavior to get attention from the parents, then the parents' decision to ignore the behavior will cause the behavior to go extinct, and the child will find a different behavior to get their parents' attention. |
|||
===Reinforcement versus punishment=== |
||
Reinforcers serve to increase behaviors whereas punishers serve to decrease behaviors; thus, positive reinforcers are stimuli that the subject will work to attain, and negative reinforcers are stimuli that the subject will work to be rid of or to end.<ref name="D'Amato">{{cite book | vauthors = D'Amato MR |title=Learning Processes: Instrumental Conditioning|year=1969|publisher=The Macmillan Company|location=Toronto| veditors = Marx MH }}</ref> The table below illustrates the adding and subtracting of stimuli (pleasant or aversive) in relation to reinforcement vs. punishment. |
||
{| class="wikitable" |
{| class="wikitable" |
||
|+Comparison chart |
|||
|- |
|- |
||
! !! [[Reward system|Rewarding]] (pleasant) stimulus |
! !! [[Reward system|Rewarding]] (pleasant) stimulus |
||
Line 69: | Line 61: | ||
|- |
|- |
||
! Adding/presenting |
! Adding/presenting |
||
| Positive reinforcement |
| Positive reinforcement<blockquote>Example: Reading a book because it is fun and interesting</blockquote> |
||
| Positive punishment<blockquote>Example: [[Corporal punishment]], such as [[Spanking|spanking a child]]</blockquote> |
|||
| Positive punishment |
|||
|- |
|- |
||
! Removing/taking away |
! Removing/taking away |
||
| Negative punishment<blockquote>Example: Loss of privileges (e.g., [[screen time]] or permission to attend a desired event) if a rule is broken</blockquote> |
|||
| Negative punishment |
|||
| Negative reinforcement<blockquote>Example: Reading a book because it allows the reader to escape feelings of boredom or unhappiness</blockquote> |
|||
| Negative reinforcement |
|||
|- |
|||
|} |
|} |
||
For example, offering a child candy if he cleans his room is positive reinforcement. Spanking a child if he breaks a window is positive punishment. Taking away a child's toys for misbehaving is negative punishment. Giving a child a break from his chores if he performs well on a test is negative reinforcement. "Positive and negative" do not carry the meaning of "good and bad" in this usage. |
|||
===Further ideas and concepts=== |
||
* Distinguishing between positive and negative reinforcement can be difficult and may not always be necessary. Focusing on what is being removed or added and how it affects behavior can be more helpful. |
||
* An event that punishes behavior for some may reinforce behavior for others |
* An event that punishes behavior for some may reinforce behavior for others. |
||
* Some reinforcement can include both positive and negative features, such as a drug addict taking drugs for the added euphoria (positive reinforcement) and also to eliminate withdrawal symptoms (negative reinforcement). |
||
* Reinforcement in the business world is essential in driving productivity. Employees are motivated by the prospect of receiving a positive stimulus, such as a promotion or a bonus. Employees are also driven by negative reinforcement, such as the elimination of unpleasant tasks. |
||
Line 88: | Line 79: | ||
===Primary and secondary reinforcers{{anchor|Conditioned reinforcer}}=== |
||
A '''primary reinforcer''', sometimes called an ''' unconditioned reinforcer''', is a stimulus that does not require [[associative learning|pairing with a different stimulus]] in order to function as a reinforcer and most likely has obtained this function through the evolution and its role in species' survival.<ref>Skinner, B.F. (1974). About Behaviorism</ref> Examples of primary reinforcers include food, water, and sex. Some primary reinforcers, such as certain drugs, may mimic the effects of other primary reinforcers. While these primary reinforcers are fairly stable through life and across individuals, the reinforcing value of different primary reinforcers varies due to multiple factors (e.g., genetics, experience). Thus, one person may prefer one type of food while another avoids it. Or one person may eat much food while another eats very little. So even though food is a primary reinforcer for both individuals, the value of food as a reinforcer differs between them. |
|||
A ''primary reinforcer'', sometimes called an ''unconditioned reinforcer'', is a stimulus that does not require [[associative learning|pairing with a different stimulus]] in order to function as a reinforcer and most likely has obtained this function through evolution and its role in species' survival.<ref>Skinner, B.F. (1974). About Behaviorism</ref> Examples of primary reinforcers include food, water, and sex. Some primary reinforcers, such as certain drugs, may mimic the effects of other primary reinforcers. While these primary reinforcers are fairly stable through life and across individuals, the reinforcing value of different primary reinforcers varies due to multiple factors (e.g., genetics, experience). Thus, one person may prefer one type of food while another avoids it, or one person may eat a lot of food while another eats very little. So even though food is a primary reinforcer for both individuals, the value of food as a reinforcer differs between them. |
||
A ''secondary reinforcer'', sometimes called a '''conditioned reinforcer'''<!--Redirected here - bolded per MOS:BOLD-->, is a stimulus or situation that has acquired its function as a reinforcer after [[associative learning|pairing with a stimulus]] that functions as a reinforcer. This stimulus may be a primary reinforcer or another conditioned reinforcer (such as money). |
|||
When trying to distinguish primary and secondary reinforcers in human examples, use the "caveman test." If the stimulus is something that a caveman would naturally find desirable (e.g., candy), then it is a primary reinforcer. If, on the other hand, the caveman would not react to it (e.g., a dollar bill), it is a secondary reinforcer. As with primary reinforcers, an organism can experience satisfaction and deprivation with secondary reinforcers. |
||
===Other reinforcement terms=== |
||
* A generalized reinforcer is a conditioned reinforcer that has obtained the reinforcing function by pairing with many other reinforcers and functions as a reinforcer under a wide variety of [[motivating operation]]s. (One example is money, because it is paired with many other reinforcers.)<ref name=Miltenberger>Miltenberger, R. G. "Behavioral Modification: Principles and Procedures". [[Thomson/Wadsworth]], 2008.</ref>{{rp|83}} |
||
* In reinforcer sampling, a potentially reinforcing but unfamiliar stimulus is presented to an organism without regard to any prior behavior. |
||
Line 104: | Line 97: | ||
* Noncontingent reinforcement refers to response-independent delivery of stimuli identified as reinforcers for some behaviors of that organism. However, this typically entails time-based delivery of stimuli identified as maintaining aberrant behavior, which decreases the rate of the target behavior.<ref>{{cite journal | vauthors = Tucker M, Sigafoos J, Bushell H | title = Use of noncontingent reinforcement in the treatment of challenging behavior. A review and clinical guide | journal = Behavior Modification | volume = 22 | issue = 4 | pages = 529–47 | date = October 1998 | pmid = 9755650 | doi = 10.1177/01454455980224005 | s2cid = 21542125 }}</ref> As no measured behavior is identified as being strengthened, there is controversy surrounding the use of the term noncontingent "reinforcement".<ref>{{cite book | vauthors = Droleskey RE, Andrews K, Chiarantini L, DeLoach JR | chapter = Use of fluorescent probes for describing the process of encapsulation by hypotonic dialysis | series = Advances in Experimental Medicine and Biology| volume = 326 | pages = 73–80 |doi=10.1007/978-1-4615-3030-5_9| pmid = 1284187 | title = The Use of Resealed Erythrocytes as Carriers and Bioreactors | year = 1992 | isbn = 978-1-4613-6321-7 }}</ref> |
||
==Natural and artificial== |
==Natural and artificial reinforcement== |
||
In his 1967 paper, ''Arbitrary and Natural Reinforcement'', [[Charles Ferster]] proposed classifying reinforcement into events that increase the frequency of an operant behavior as a natural consequence of the behavior itself, and events that affect frequency by their requirement of human mediation, such as in a [[token economy]] where subjects are rewarded for certain behavior by the therapist. |
In his 1967 paper, ''Arbitrary and Natural Reinforcement'', [[Charles Ferster]] proposed classifying reinforcement into events that increase the frequency of an operant behavior as a natural consequence of the behavior itself, and events that affect frequency by their requirement of human mediation, such as in a [[token economy]] where subjects are rewarded for certain behavior by the therapist. |
||
In 1970, Baer and Wolf developed the concept of " |
In 1970, Baer and Wolf developed the concept of "behavioral traps."<ref>{{cite book | first1 = Donald M. | last1 = Baer | first2 = Montrose M. | last2 = Wolf | chapter = The entry into natural communities of reinforcement | veditors = Ulrich R, Stachnik T, Mabry J | title = Control of human behavior | volume = 2 | pages = 319–24 | location = Glenview, IL | publisher = Scott Foresman }}</ref> A behavioral trap requires only a simple response to enter the trap, yet once entered, the trap cannot be resisted in creating general behavior change. It is the use of a behavioral trap that increases a person's repertoire, by exposing them to the naturally occurring reinforcement of that behavior. Behavioral traps have four characteristics: |
||
* They are "baited" with desirable reinforcers that "lure" the student into the trap. |
* They are "baited" with desirable reinforcers that "lure" the student into the trap. |
||
* Only a low-effort response already in the repertoire is necessary to enter the trap. |
* Only a low-effort response already in the repertoire is necessary to enter the trap. |
||
Line 113: | Line 107: | ||
* They can remain effective for long periods of time because the person shows few, if any, satiation effects. |
* They can remain effective for long periods of time because the person shows few, if any, satiation effects. |
||
Thus, artificial reinforcement can be used to build or develop generalizable skills, eventually transitioning to naturally occurring reinforcement to maintain or increase the behavior. Another example is a social situation that will generally result from a specific behavior once it has met a certain criterion |
Thus, artificial reinforcement can be used to build or develop generalizable skills, eventually transitioning to naturally occurring reinforcement to maintain or increase the behavior. Another example is a social situation that will generally result from a specific behavior once it has met a certain criterion. |
||
== Intermittent reinforcement schedules == |
== Intermittent reinforcement schedules == |
||
Behavior is not always reinforced every time it is emitted, and the pattern of reinforcement strongly affects how fast an operant response is learned, what its rate is at any given time, and how long it continues when reinforcement ceases. The simplest rules controlling reinforcement are continuous reinforcement, where every response is reinforced, and extinction, where no response is reinforced. Between these extremes, more complex ''schedules of reinforcement'' specify the rules that determine how and when a response will be followed by a reinforcer. |
Behavior is not always reinforced every time it is emitted, and the pattern of reinforcement strongly affects how fast an operant response is learned, what its rate is at any given time, and how long it continues when reinforcement ceases. The simplest rules controlling reinforcement are continuous reinforcement, where every response is reinforced, and extinction, where no response is reinforced. Between these extremes, more complex ''schedules of reinforcement'' specify the rules that determine how and when a response will be followed by a reinforcer. |
||
Line 121: | Line 116: | ||
===Simple schedules=== |
===Simple schedules=== |
||
[[File:Schedule of reinforcement.png|thumb|275px|right|A chart demonstrating the different response rates of the four simple schedules of reinforcement; each hatch mark designates a reinforcer being given]] |
|||
[[File:Schedule of reinforcement.png|thumb|right|A chart demonstrating the different response rates of the four simple schedules of reinforcement; each hatch mark designates a reinforcer being given]] |
|||
* '''Ratio schedule''' – the reinforcement depends only on the number of responses the organism has performed. |
* '''Ratio schedule''' – the reinforcement depends only on the number of responses the organism has performed. |
||
* '''Continuous reinforcement (CRF)''' – a schedule of reinforcement in which every occurrence of the instrumental response (desired response) is followed by the reinforcer.<ref name=Miltenberger/>{{rp|86}} |
* '''Continuous reinforcement (CRF)''' – a schedule of reinforcement in which every occurrence of the instrumental response (desired response) is followed by the reinforcer.<ref name=Miltenberger/>{{rp|86}} |
||
** Lab example: each time a rat presses a bar it gets a pellet of food. |
|||
** Real-world example: each time a dog defecates outside its owner gives it a treat; each time a person puts $1 in a candy machine and presses the buttons they receive a candy bar. |
|||
Simple schedules have a single rule to determine when a single type of reinforcer is delivered for a specific response. |
Simple schedules have a single rule to determine when a single type of reinforcer is delivered for a specific response. |
||
* |
* ''Fixed ratio'' (FR) – schedules deliver reinforcement after every ''n''th response.<ref name=Miltenberger/>{{rp|88}} An FR 1 schedule is synonymous with a CRF schedule. |
||
* ''Variable ratio schedule'' (VR) – reinforced on average every ''n''th response, but not always on the ''n''th response.<ref name=Miltenberger/>{{rp|88}} |
|||
** Example: FR 2 = every second desired response the subject makes is reinforced. |
|||
* ''Fixed interval'' (FI) – reinforced after ''n'' amount of time. |
|||
* ''Variable interval'' (VI) – reinforced on an average of ''n'' amount of time, but not always exactly ''n'' amount of time.<ref name=Miltenberger/>{{rp|89}} |
|||
** Real-world example: FR 10 = Used car dealer gets a $1000 bonus for each 10 cars sold on the lot. |
|||
* ''Fixed time'' (FT) – Provides a reinforcing stimulus at a fixed time since the last reinforcement delivery, regardless of whether the subject has responded or not. In other words, it is a non-contingent schedule. |
|||
* '''Variable ratio schedule''' (VR) – reinforced on average every ''n''th response, but not always on the ''n''th response.<ref name=Miltenberger/>{{rp|88}} |
|||
* ''Variable time'' (VT) – Provides reinforcement at an average variable time since last reinforcement, regardless of whether the subject has responded or not. |
|||
** Lab example: VR 4 = first pellet delivered on 2 bar presses, second pellet delivered on 6 bar presses, third pellet delivered on 4 bar presses (2 + 6 + 4 = 12; 12 / 3 = 4 bar presses on average to receive a pellet). |
|||
** Real-world example: slot machines (because, though the probability of hitting the jackpot is constant, the number of lever presses needed to hit the jackpot is variable). |
|||
* '''Fixed interval''' (FI) – reinforced after ''n'' amount of time. |
|||
** Example: FI 1-s = reinforcement provided for the first response after 1 second. |
|||
** Lab example: FI 15-s = rat's bar-pressing behavior is reinforced for the first bar press after 15 seconds passes since the last reinforcement. |
|||
** Real-world example: FI 30-min = a 30-minute washing machine cycle. |
|||
* '''Variable interval''' (VI) – reinforced on an average of ''n'' amount of time, but not always exactly ''n'' amount of time.<ref name=Miltenberger/>{{rp|89}} |
|||
** Example: VI 4-min = first pellet delivered after 2 minutes, second delivered after 6 minutes, third is delivered after 4 minutes (2 + 6 + 4 = 12; 12 / 3 = 4). Reinforcement is delivered on the average after 4 minutes. |
|||
** Lab example: VI 10-s = a rat's bar-pressing behavior is reinforced for the first bar press after an average of 10 seconds passes since the last reinforcement. |
|||
** Real-world example: VI 30-min = Going fishing—you might catch a fish after 10 minutes, then have to wait an hour, then have to wait 20 minutes. |
|||
* '''Fixed time''' (FT) – Provides a reinforcing stimulus at a fixed time since the last reinforcement delivery, regardless of whether the subject has responded or not. In other words, it is a non-contingent schedule. |
|||
** Lab example: FT 5-s = rat gets food every 5 seconds regardless of the behavior. |
|||
** Real-world example: FT 30-d = a person gets an annuity check every month regardless of behavior between checks. |
|||
* '''Variable time''' (VT) – Provides reinforcement at an average variable time since last reinforcement, regardless of whether the subject has responded or not. |
|||
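The simple schedules listed above can be expressed as short rules. The following Python sketch is illustrative only and is not taken from the cited sources; the function names are hypothetical, time is measured in seconds from the start of a session, and an interval is assumed to be satisfied when the elapsed time is at least ''n''.

```python
from itertools import accumulate


def fixed_ratio(n, total_responses):
    """FR n: every n-th response (counted from 1) is reinforced."""
    return [r for r in range(1, total_responses + 1) if r % n == 0]


def variable_ratio(requirements):
    """VR: each reinforcer needs a different number of responses.

    Returns the cumulative response counts at which reinforcers arrive
    and the mean requirement, which names the schedule (e.g. VR 4).
    """
    return list(accumulate(requirements)), sum(requirements) / len(requirements)


def fixed_interval(interval, response_times):
    """FI: the first response made at least `interval` seconds after the
    last reinforcement is reinforced; earlier responses earn nothing."""
    reinforced = []
    last_reinforcement = 0.0  # assumption: the clock starts with the session
    for t in response_times:
        if t - last_reinforcement >= interval:
            reinforced.append(t)
            last_reinforcement = t
    return reinforced


def fixed_time(interval, session_length):
    """FT: reinforcers arrive on the clock, whether or not the subject responds."""
    return [interval * k for k in range(1, int(session_length // interval) + 1)]
```

For instance, `variable_ratio([2, 6, 4])` reproduces the VR 4 lab example: pellets arrive on the 2nd, 8th and 12th bar presses, and the mean requirement is 4. A variable-interval or variable-time schedule can be sketched the same way by replacing the fixed `interval` with a list of varying intervals.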
Simple schedules are utilized in many differential reinforcement<ref>{{cite journal | vauthors = Vollmer TR, Iwata BA | title = Differential reinforcement as treatment for behavior disorders: procedural and functional variations | journal = Research in Developmental Disabilities | volume = 13 | issue = 4 | pages = 393–417 | date = 1992 | pmid = 1509180 | doi=10.1016/0891-4222(92)90013-v}}</ref> procedures: |
Simple schedules are utilized in many differential reinforcement<ref>{{cite journal | vauthors = Vollmer TR, Iwata BA | title = Differential reinforcement as treatment for behavior disorders: procedural and functional variations | journal = Research in Developmental Disabilities | volume = 13 | issue = 4 | pages = 393–417 | date = 1992 | pmid = 1509180 | doi=10.1016/0891-4222(92)90013-v}}</ref> procedures: |
||
* |
* ''Differential reinforcement of alternative behavior'' (DRA) – A conditioning procedure in which an undesired response is decreased by placing it on [[Extinction (psychology)|extinction]] or, less commonly, providing contingent punishment, while simultaneously providing reinforcement contingent on a desirable response. An example would be a teacher attending to a student only when they raise their hand, while ignoring the student when they call out. |
||
* |
* ''Differential reinforcement of other behavior'' (DRO) – Also known as omission training procedures, an instrumental conditioning procedure in which a positive reinforcer is periodically delivered only if the participant does something other than the target response. An example would be reinforcing any hand action other than nose picking.<ref name="Miltenberger" />{{rp|338}} |
||
* |
* ''Differential reinforcement of incompatible behavior'' (DRI) – Used to reduce a frequent behavior without [[punishment (psychology)|punishing]] it by reinforcing an incompatible response. An example would be reinforcing clapping to reduce nose picking. |
||
* |
* ''Differential reinforcement of low response rate'' (DRL) – Used to encourage low rates of responding. It is like an interval schedule, except that premature responses reset the time required between behavior. |
||
* ''Differential reinforcement of high rate'' (DRH) – Used to increase high rates of responding. It is like an interval schedule, except that a minimum number of responses are required in the interval in order to receive reinforcement. |
|||
** Lab example: DRL 10-s = a rat is reinforced for the first response after 10 seconds, but if the rat responds earlier than 10 seconds there is no reinforcement and the rat has to wait 10 seconds from that premature response without another response before bar pressing will lead to reinforcement. |
|||
** Real-world example: "If you ask me for a potato chip no more than once every 10 minutes, I will give it to you. If you ask more often, I will give you none." |
|||
* '''Differential reinforcement of high rate''' (DRH) – Used to increase high rates of responding. It is like an interval schedule, except that a minimum number of responses are required in the interval in order to receive reinforcement. |
|||
** Lab example: DRH 10-s/FR 15 = a rat must press a bar 15 times within a 10-second increment to get reinforced. |
|||
** Real-world example: "If [[Lance Armstrong]] is going to win the [[Tour de France]] he has to pedal ''x'' number of times during the ''y''-hour race." |
|||
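The DRL and DRH rules described above differ from plain interval schedules in how they treat the spacing of responses. The sketch below is a hedged illustration rather than a standard implementation: the function names are hypothetical, the DRL timer is assumed to start at the beginning of the session, and DRH is modeled with a sliding time window, which is only one possible reading of "within a 10-second increment".

```python
from collections import deque


def drl(min_gap, response_times):
    """DRL: a response is reinforced only if at least `min_gap` seconds
    have passed since the previous response. Every response, reinforced
    or not, restarts the timer, so premature responding delays payoff."""
    reinforced = []
    last_response = 0.0  # assumption: timer starts with the session
    for t in response_times:
        if t - last_response >= min_gap:
            reinforced.append(t)
        last_response = t
    return reinforced


def drh(window, required, response_times):
    """DRH: reinforce when `required` responses fall within `window` seconds.
    The count starts over after each reinforcer (an assumption)."""
    recent = deque()
    reinforced = []
    for t in response_times:
        recent.append(t)
        # Drop responses that are too old to count toward the requirement.
        while t - recent[0] > window:
            recent.popleft()
        if len(recent) >= required:
            reinforced.append(t)
            recent.clear()
    return reinforced
```

With `drl(10, [4, 9, 20, 25, 36])`, only the responses at 20 s and 36 s are reinforced, because each of the others follows the previous response by less than 10 seconds.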
====Effects of different types of simple schedules==== |
====Effects of different types of simple schedules==== |
||
* Fixed ratio: activity slows after reinforcer is delivered, then response rates increase until the next reinforcer delivery (post-reinforcement pause). |
* Fixed ratio: activity slows after reinforcer is delivered, then response rates increase until the next reinforcer delivery (post-reinforcement pause). |
||
* Variable ratio: rapid, steady rate of responding; most resistant to [[Extinction (psychology)|extinction]]. |
* Variable ratio: rapid, steady rate of responding; most resistant to [[Extinction (psychology)|extinction]]. |
||
Line 178: | Line 156: | ||
===Compound schedules=== |
===Compound schedules=== |
||
<!--Do these refer to the same behavior, the same reinforcers? Yes, they refer to the same behavior and same reinforcer except given at different schedules--> |
<!--Do these refer to the same behavior, the same reinforcers? Yes, they refer to the same behavior and same reinforcer except given at different schedules--> |
||
Compound schedules combine two or more different simple schedules in some way using the same reinforcer for the same behavior. There are many possibilities; among those most often used are: |
Compound schedules combine two or more different simple schedules in some way using the same reinforcer for the same behavior. There are many possibilities; among those most often used are: |
||
* |
* ''Alternative schedules'' – A type of compound schedule where two or more simple schedules are in effect and whichever schedule is completed first results in reinforcement.<ref>{{cite book | vauthors = Iversen IH, Lattal KA | url = https://books.google.com/books?id=uVYJAwAAQBAJ | title = Experimental Analysis of Behavior | date = 1991 | publisher = Elsevier | location = Amsterdam |isbn = 9781483291260}}</ref> |
||
* |
* ''Conjunctive schedules'' – A complex schedule of reinforcement where two or more simple schedules are in effect independently of each other, and requirements on all of the simple schedules must be met for reinforcement. |
||
* |
* ''Multiple schedules'' – Two or more schedules alternate over time, with a stimulus indicating which is in force. Reinforcement is delivered if the response requirement is met while a schedule is in effect. |
||
* ''Mixed schedules'' – Either of two, or more, schedules may occur with no stimulus indicating which is in force. Reinforcement is delivered if the response requirement is met while a schedule is in effect. |
|||
** Example: FR4 when given a whistle and FI6 when given a bell ring. |
|||
*[[File:Operant_Conditioning_Involves_Choice.png|thumb|Administering two reinforcement schedules at the same time]]''Concurrent schedules'' – A complex reinforcement procedure in which the participant can choose any one of two or more simple reinforcement schedules that are available simultaneously. Organisms are free to change back and forth between the response alternatives at any time. |
|||
* '''Mixed schedules''' – Either of two, or more, schedules may occur with no stimulus indicating which is in force. Reinforcement is delivered if the response requirement is met while a schedule is in effect. |
|||
* ''Concurrent-chain schedule of reinforcement'' – A complex reinforcement procedure in which the participant is permitted to choose during the first link which of several simple reinforcement schedules will be in effect in the second link. Once a choice has been made, the rejected alternatives become unavailable until the start of the next trial. |
|||
** Example: FI6 and then VR3 without any stimulus warning of the change in schedule. |
|||
* ''Interlocking schedules'' – A single schedule with two components where progress in one component affects progress in the other component. In an interlocking FR 60 FI 120-s schedule, for example, each response subtracts time from the interval component such that each response is "equal" to removing two seconds from the FI schedule. |
|||
*[[File:Operant_Conditioning_Involves_Choice.png|thumb|263x263px|Administering two reinforcement schedules at the same time]]'''Concurrent schedules''' – A complex reinforcement procedure in which the participant can choose any one of two or more simple reinforcement schedules that are available simultaneously. Organisms are free to change back and forth between the response alternatives at any time. |
|||
* ''Chained schedules'' – Reinforcement occurs after two or more successive schedules have been completed, with a stimulus indicating when one schedule has been completed and the next has started. |
|||
** Real-world example: changing channels on a television. |
|||
* ''Tandem schedules'' – Reinforcement occurs when two or more successive schedule requirements have been completed, with no stimulus indicating when a schedule has been completed and the next has started. |
|||
* '''Concurrent-chain schedule of reinforcement''' – A complex reinforcement procedure in which the participant is permitted to choose during the first link which of several simple reinforcement schedules will be in effect in the second link. Once a choice has been made, the rejected alternatives become unavailable until the start of the next trial. |
|||
* ''Higher-order schedules'' – completion of one schedule is reinforced according to a second schedule; e.g. in FR2 (FI10 secs), two successive fixed interval schedules require completion before a response is reinforced. |
|||
* '''Interlocking schedules''' – A single schedule with two components where progress in one component affects progress in the other component. In an interlocking FR 60 FI 120-s schedule, for example, each response subtracts time from the interval component such that each response is "equal" to removing two seconds from the FI schedule. |
|||
* '''Chained schedules''' – Reinforcement occurs after two or more successive schedules have been completed, with a stimulus indicating when one schedule has been completed and the next has started. |
|||
** Example: On an FR 10 schedule in the presence of a red light, a pigeon pecks a green disc 10 times; then, a yellow light indicates an FR 3 schedule is active; after the pigeon pecks a yellow disc 3 times, a green light indicates a VI 6-s schedule is in effect; if this were the final schedule in the chain, the pigeon would be reinforced for pecking a green disc on a VI 6-s schedule; however, all schedule requirements in the chain must be met before a reinforcer is provided. |
|||
* '''Tandem schedules''' – Reinforcement occurs when two or more successive schedule requirements have been completed, with no stimulus indicating when a schedule has been completed and the next has started. |
|||
** Example: VR 10, after it is completed the schedule is changed without warning to FR 10, after that it is changed without warning to FR 16, etc. At the end of the series of schedules, a reinforcer is finally given. |
|||
* '''Higher-order schedules''' – completion of one schedule is reinforced according to a second schedule; e.g. in FR2 (FI10 secs), two successive fixed interval schedules require completion before a response is reinforced. |
|||
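The difference between alternative and conjunctive schedules, and the arithmetic of the interlocking FR 60 FI 120-s example, can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a description of any laboratory software: the function names are hypothetical, one component is a fixed ratio and the other a fixed interval, and both components are assumed to reset after each reinforcer.

```python
def compound_fr_fi(ratio, interval, response_times, mode):
    """Combine FR `ratio` and FI `interval` on one response.

    mode="alternative": reinforce when either requirement is met first.
    mode="conjunctive": reinforce only when both requirements are met.
    """
    if mode not in ("alternative", "conjunctive"):
        raise ValueError("mode must be 'alternative' or 'conjunctive'")
    reinforced = []
    count = 0
    last_reinforcement = 0.0  # assumption: the clock starts with the session
    for t in response_times:
        count += 1
        ratio_met = count >= ratio
        interval_met = t - last_reinforcement >= interval
        if mode == "alternative":
            met = ratio_met or interval_met
        else:
            met = ratio_met and interval_met
        if met:
            reinforced.append(t)
            count = 0  # assumption: both components reset together
            last_reinforcement = t
    return reinforced


def interlocking_remaining(ratio, interval, responses):
    """Interlocking FR/FI: each response removes interval/ratio seconds
    from the time still required (2 s per response for FR 60 FI 120-s)."""
    return max(0.0, interval - responses * (interval / ratio))
```

For the same response times `[1, 2, 3, 4, 12, 13, 30]` with FR 3 and FI 10-s, the alternative rule reinforces the responses at 3 s, 13 s and 30 s, whereas the conjunctive rule reinforces only the response at 12 s. In the interlocking case, `interlocking_remaining(60, 120, 15)` gives 90 seconds left after 15 responses.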
===Superimposed schedules=== |
===Superimposed schedules=== |
||
{{cleanup section|reason=convert Author (Year) citations to wiki style|date=January 2024}} |
{{cleanup section|reason=convert Author (Year) citations to wiki style|date=January 2024}} |
||
The [[psychology]] term ''superimposed schedules of reinforcement'' refers to a structure of rewards where two or more simple schedules of reinforcement operate simultaneously. Reinforcers can be positive, negative, or both. An example is a person who comes home after a long day at work. The behavior of opening the front door is rewarded by a big kiss on the lips by the person's spouse and a rip in the pants from the family dog jumping enthusiastically. Another example of superimposed schedules of reinforcement is a pigeon in an experimental cage pecking at a button. The pecks deliver a hopper of grain every 20th peck, and access to water after every 200 pecks. |
The [[psychology]] term ''superimposed schedules of reinforcement'' refers to a structure of rewards where two or more simple schedules of reinforcement operate simultaneously. Reinforcers can be positive, negative, or both. An example is a person who comes home after a long day at work. The behavior of opening the front door is rewarded by a big kiss on the lips by the person's spouse and a rip in the pants from the family dog jumping enthusiastically. Another example of superimposed schedules of reinforcement is a pigeon in an experimental cage pecking at a button. The pecks deliver a hopper of grain every 20th peck, and access to water after every 200 pecks. |
||
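The pigeon example can be written out as two fixed-ratio rules running on the same stream of pecks. The Python sketch below is only an illustration; the function name and its parameters are hypothetical, and the two ratios are taken from the example above.

```python
def superimposed_pecks(total_pecks, grain_every=20, water_every=200):
    """List the pecks that earn a reinforcer when two fixed-ratio
    schedules operate simultaneously on the same response."""
    events = []
    for peck in range(1, total_pecks + 1):
        earned = []
        if peck % grain_every == 0:
            earned.append("grain")
        if peck % water_every == 0:
            earned.append("water")
        if earned:
            events.append((peck, earned))
    return events
```

Every 200th peck satisfies both schedules at once, so that single response produces both grain and water.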
Line 211: | Line 186: | ||
===Concurrent schedules=== |
===Concurrent schedules=== |
||
In [[operant conditioning]], concurrent schedules of reinforcement are schedules of reinforcement that are simultaneously available to an animal subject or human participant, so that the subject or participant can respond on either schedule. For example, in a [[two-alternative forced choice]] task, a [[pigeon]] in a [[Skinner box]] is faced with two pecking keys; pecking responses can be made on either, and food reinforcement might follow a peck on either. The schedules of reinforcement arranged for pecks on the two keys can be different. They may be independent, or they may be linked so that behavior on one key affects the likelihood of reinforcement on the other. |
In [[operant conditioning]], concurrent schedules of reinforcement are schedules of reinforcement that are simultaneously available to an animal subject or human participant, so that the subject or participant can respond on either schedule. For example, in a [[two-alternative forced choice]] task, a [[pigeon]] in a [[Skinner box]] is faced with two pecking keys; pecking responses can be made on either, and food reinforcement might follow a peck on either. The schedules of reinforcement arranged for pecks on the two keys can be different. They may be independent, or they may be linked so that behavior on one key affects the likelihood of reinforcement on the other. |
||
Line 220: | Line 196: | ||
==Shaping== |
==Shaping== |
||
{{Main|Shaping (psychology)}} |
{{Main|Shaping (psychology)}} |
||
Shaping is the reinforcement of successive approximations to a desired instrumental response. In training a rat to press a lever, for example, simply turning toward the lever is reinforced at first. Then, only turning and stepping toward it is reinforced. Eventually the rat will be reinforced for pressing the lever. The successful attainment of one behavior starts the shaping process for the next. As training progresses, the response becomes progressively more like the desired behavior, with each subsequent behavior becoming a closer approximation of the final behavior.<ref>{{cite book | vauthors = Schacter DL, Gilbert DT, Wegner DM | chapter = Chapter 7: Learning | title = Psychology | edition = 2nd | publisher = Worth Publishers | location = New York | year = 2011 | pages = [https://archive.org/details/psychology0000scha/page/284 284–85] | isbn = 978-1-4292-3719-2 | chapter-url = https://archive.org/details/psychology0000scha/page/284 }}</ref> |
Shaping is the reinforcement of successive approximations to a desired instrumental response. In training a rat to press a lever, for example, simply turning toward the lever is reinforced at first. Then, only turning and stepping toward it is reinforced. Eventually the rat will be reinforced for pressing the lever. The successful attainment of one behavior starts the shaping process for the next. As training progresses, the response becomes progressively more like the desired behavior, with each subsequent behavior becoming a closer approximation of the final behavior.<ref>{{cite book | vauthors = Schacter DL, Gilbert DT, Wegner DM | chapter = Chapter 7: Learning | title = Psychology | edition = 2nd | publisher = Worth Publishers | location = New York | year = 2011 | pages = [https://archive.org/details/psychology0000scha/page/284 284–85] | isbn = 978-1-4292-3719-2 | chapter-url = https://archive.org/details/psychology0000scha/page/284 }}</ref> |
||
Line 228: | Line 205: | ||
==Chaining== |
==Chaining== |
||
{{Main|Chaining}} |
{{Main|Chaining}} |
||
Chaining involves linking discrete behaviors together in a series, such that the consequence of each behavior is both the reinforcement for the previous behavior, and the antecedent stimulus for the next behavior. There are many ways to teach chaining, such as forward chaining (starting from the first behavior in the chain), backwards chaining (starting from the last behavior) and total task chaining (teaching each behavior in the chain simultaneously). People's morning routines are a typical chain, with a series of behaviors (e.g. showering, drying off, getting dressed) occurring in sequence as a well learned habit. |
Chaining involves linking discrete behaviors together in a series, such that the consequence of each behavior is both the reinforcement for the previous behavior, and the antecedent stimulus for the next behavior. There are many ways to teach chaining, such as forward chaining (starting from the first behavior in the chain), backwards chaining (starting from the last behavior) and total task chaining (teaching each behavior in the chain simultaneously). People's morning routines are a typical chain, with a series of behaviors (e.g. showering, drying off, getting dressed) occurring in sequence as a well learned habit. |
||
Line 233: | Line 211: | ||
Challenging behaviors seen in individuals with autism and other related disabilities have been successfully managed and maintained in studies using a schedule of chained reinforcements.<ref>{{Cite journal |date=2020-07-24 |title=CORRIGENDUM to "Further Evaluations of Functional Communication Training and Chained Schedules of Reinforcement to Treat Multiple Functions of Challenging Behavior" |journal=Behavior Modification |volume=46 |issue=1 |pages=254 |doi=10.1177/0145445520945810 |pmid=32706269 |s2cid=241136859 |issn=0145-4455|doi-access=free }}</ref> Functional communication training is an intervention that often uses chained schedules of reinforcement to effectively promote the appropriate and desired functional communication response.<ref>{{Cite journal |last1=Falcomata |first1=Terry S. |last2=Roane |first2=Henry S. |last3=Muething |first3=Colin S. |last4=Stephenson |first4=Kasey M. |last5=Ing |first5=Anna D. |date=2012-02-09 |title=Functional Communication Training and Chained Schedules of Reinforcement to Treat Challenging Behavior Maintained by Terminations of Activity Interruptions |url=http://dx.doi.org/10.1177/0145445511433821 |journal=Behavior Modification |volume=36 |issue=5 |pages=630–649 |doi=10.1177/0145445511433821 |pmid=22327267 |s2cid=29108702 |issn=0145-4455}}</ref> |
Challenging behaviors seen in individuals with autism and other related disabilities have been successfully managed and maintained in studies using a schedule of chained reinforcements.<ref>{{Cite journal |date=2020-07-24 |title=CORRIGENDUM to "Further Evaluations of Functional Communication Training and Chained Schedules of Reinforcement to Treat Multiple Functions of Challenging Behavior" |journal=Behavior Modification |volume=46 |issue=1 |pages=254 |doi=10.1177/0145445520945810 |pmid=32706269 |s2cid=241136859 |issn=0145-4455|doi-access=free }}</ref> Functional communication training is an intervention that often uses chained schedules of reinforcement to effectively promote the appropriate and desired functional communication response.<ref>{{Cite journal |last1=Falcomata |first1=Terry S. |last2=Roane |first2=Henry S. |last3=Muething |first3=Colin S. |last4=Stephenson |first4=Kasey M. |last5=Ing |first5=Anna D. |date=2012-02-09 |title=Functional Communication Training and Chained Schedules of Reinforcement to Treat Challenging Behavior Maintained by Terminations of Activity Interruptions |url=http://dx.doi.org/10.1177/0145445511433821 |journal=Behavior Modification |volume=36 |issue=5 |pages=630–649 |doi=10.1177/0145445511433821 |pmid=22327267 |s2cid=29108702 |issn=0145-4455}}</ref> |
||
==Mathematical models== |
==Mathematical models== |
||
{{expand section|date=February 2024}} |
|||
There has been research on building a mathematical model of reinforcement. This model is known as MPR, which is short for [[mathematical principles of reinforcement]]. Peter Killeen has made key discoveries in the field with his research on pigeons.<ref>{{cite journal |last1=Killeen |first1=Peter R. | name-list-style = vanc |title=Mathematical principles of reinforcement |journal=Behavioral and Brain Sciences |date=4 February 2010 |volume=17 |issue=1 |pages=105–135 |doi=10.1017/S0140525X00033628 | url = http://cogprints.org/591/1/199802001.html }}</ref> |
There has been research on building a mathematical model of reinforcement. This model is known as MPR, which is short for [[mathematical principles of reinforcement]]. Peter Killeen has made key discoveries in the field with his research on pigeons.<ref>{{cite journal |last1=Killeen |first1=Peter R. | name-list-style = vanc |title=Mathematical principles of reinforcement |journal=Behavioral and Brain Sciences |date=4 February 2010 |volume=17 |issue=1 |pages=105–135 |doi=10.1017/S0140525X00033628 | url = http://cogprints.org/591/1/199802001.html }}</ref> |
||
==Criticisms== |
|||
The standard definition of behavioral reinforcement has been criticized as [[circular definition|circular]], since it appears to argue that response strength is increased by reinforcement, and defines reinforcement as something that increases response strength (i.e., response strength is increased by things that increase response strength). However, the correct usage<ref>{{cite book |vauthors=Skinner BF |veditors=Epstein R |title=Skinner for the classroom : selected papers |date=1982 |publisher=Research Press |location=Champaign, Ill. |isbn=978-0-87822-261-2 |url-access=registration |url=https://archive.org/details/skinnerforclassr00skin }}</ref> of reinforcement is that something is a reinforcer ''because'' of its effect on behavior, and not the other way around. It becomes circular if one says that a particular stimulus strengthens behavior because it is a reinforcer, and does not explain why a stimulus is producing that effect on the behavior. Other definitions have been proposed, such as F.D. Sheffield's "consummatory behavior contingent on a response", but these are not broadly used in psychology.<ref>{{cite book | first1 = Franco J. | last1 = Vaccarino | first2 = Bernard B. | last2 = Schiff | first3 = Stephen E. | last3 = Glickman | editor-last1=Mowrer |editor-first1=Robert R. |editor-last2=Klein |editor-first2=Stephen B. | name-list-style = vanc |title=Contemporary learning theories |date=1989 |publisher=Lawrence Erlbaum Associates |location=Hillsdale, N.J. |isbn=978-0-89859-915-2}}</ref> |
|||
Increasingly, understanding of the role reinforcers play is moving away from a "strengthening" effect to a "signalling" effect.<ref>{{cite journal | vauthors = Cowie S, Davison M, Elliffe D | title = Reinforcement: food signals the time and location of future food | journal = Journal of the Experimental Analysis of Behavior| volume = 96 | issue = 1 | pages = 63–86 | date = July 2011 | pmid = 21765546 | pmc = 3136894 | doi = 10.1901/jeab.2011.96-63 }}</ref> On this view, reinforcers increase responding because they signal the behaviors that are likely to result in reinforcement. While in most practical applications the effect of any given reinforcer will be the same regardless of whether the reinforcer is signalling or strengthening, this approach helps to explain a number of behavioral phenomena, including patterns of responding on intermittent reinforcement schedules (fixed interval scallops) and the [[differential outcomes effect]].<ref>{{cite journal |last1=McCormack |first1=Jessica |last2=Arnold-Saritepe |first2=Angela |last3=Elliffe |first3=Douglas | name-list-style = vanc |date= June 2017 | title=The differential outcomes effect in children with autism |journal=Behavioral Interventions |volume=32 |issue=4 |pages=357–369 |doi=10.1002/bin.1489 }}</ref> <!--There's more controversies that can be added here.--> |
|||
== Applications == |
== Applications == |
||
Line 252: | Line 227: | ||
===Animal training=== |
===Animal training=== |
||
{{main|Animal training}} |
{{main|Animal training}} |
||
[[File:Chicken on a skateboard.JPG|right|thumb |
[[File:Chicken on a skateboard.JPG|right|thumb|A chicken riding a skateboard]] |
||
Animal trainers and pet owners were applying the principles and practices of operant conditioning long before these ideas were named and studied, and animal training still provides one of the clearest and most convincing examples of operant control. Of the concepts and procedures described in this article, a few of the most salient are: availability of immediate reinforcement (e.g. the ever-present bag of dog yummies); contingency, assuring that reinforcement follows the desired behavior and not something else; the use of secondary reinforcement, as in sounding a clicker immediately after a desired response; shaping, as in gradually getting a dog to jump higher and higher; intermittent reinforcement, reducing the frequency of those yummies to induce persistent behavior without satiation; chaining, where a complex behavior is gradually put together.<ref>{{cite book | vauthors = McGreevy PD, Boakes RA |title=Carrots and sticks: principles of animal training |date=2007 |publisher=Cambridge University Press |location=Cambridge |isbn=978-0-521-68691-4}}</ref> |
Animal trainers and pet owners were applying the principles and practices of operant conditioning long before these ideas were named and studied, and animal training still provides one of the clearest and most convincing examples of operant control. Of the concepts and procedures described in this article, a few of the most salient are: availability of immediate reinforcement (e.g. the ever-present bag of dog yummies); contingency, assuring that reinforcement follows the desired behavior and not something else; the use of secondary reinforcement, as in sounding a clicker immediately after a desired response; shaping, as in gradually getting a dog to jump higher and higher; intermittent reinforcement, reducing the frequency of those yummies to induce persistent behavior without satiation; chaining, where a complex behavior is gradually put together.<ref>{{cite book | vauthors = McGreevy PD, Boakes RA |title=Carrots and sticks: principles of animal training |date=2007 |publisher=Cambridge University Press |location=Cambridge |isbn=978-0-521-68691-4}}</ref> |
||
=== Child behavior – parent management training === |
=== Child behavior – parent management training === |
||
{{Main|Parent management training}} |
{{Main|Parent management training}} |
||
Providing positive reinforcement for appropriate child behaviors is a major focus of parent management training. Typically, parents learn to reward appropriate behavior through social rewards (such as praise, smiles, and hugs) as well as concrete rewards (such as stickers or points towards a larger reward as part of an incentive system created collaboratively with the child).<ref name=Kazdin2010>Kazdin AE (2010). Problem-solving skills training and parent management training for oppositional defiant disorder and conduct disorder. ''[https://books.google.com/books?id=QLzBv53CU2UC&q=Reinforcement Evidence-based psychotherapies for children and adolescents (2nd ed.)],'' 211–226. New York: Guilford Press.</ref> In addition, parents learn to select simple behaviors as an initial focus and reward each of the small steps that their child achieves towards reaching a larger goal (this concept is called "successive approximations").<ref name=Kazdin2010/><ref name=PMTO>Forgatch MS, Patterson GR (2010). Parent management training — Oregon model: An intervention for antisocial behavior in children and adolescents. ''[https://books.google.com/books?id=QLzBv53CU2UC&q=Reinforcement Evidence-based psychotherapies for children and adolescents (2nd ed.)],'' 159–78. New York: Guilford Press.</ref> They may also use indirect rewards, such as through [[progress chart]]s. Providing positive reinforcement in the classroom can be beneficial to student success. When applying positive reinforcement to students, it is crucial to individualize it to each student's needs. This way, the student understands why they are receiving the praise, can accept it, and eventually learns to continue the action that earned the positive reinforcement. For example, rewards or extra recess time might work better for some students, whereas others might respond better to stickers or check marks indicating praise. |
Providing positive reinforcement for appropriate child behaviors is a major focus of parent management training. Typically, parents learn to reward appropriate behavior through social rewards (such as praise, smiles, and hugs) as well as concrete rewards (such as stickers or points towards a larger reward as part of an incentive system created collaboratively with the child).<ref name=Kazdin2010>Kazdin AE (2010). Problem-solving skills training and parent management training for oppositional defiant disorder and conduct disorder. ''[https://books.google.com/books?id=QLzBv53CU2UC&q=Reinforcement Evidence-based psychotherapies for children and adolescents (2nd ed.)],'' 211–226. New York: Guilford Press.</ref> In addition, parents learn to select simple behaviors as an initial focus and reward each of the small steps that their child achieves towards reaching a larger goal (this concept is called "successive approximations").<ref name=Kazdin2010/><ref name=PMTO>Forgatch MS, Patterson GR (2010). Parent management training — Oregon model: An intervention for antisocial behavior in children and adolescents. ''[https://books.google.com/books?id=QLzBv53CU2UC&q=Reinforcement Evidence-based psychotherapies for children and adolescents (2nd ed.)],'' 159–78. New York: Guilford Press.</ref> They may also use indirect rewards, such as through [[progress chart]]s. Providing positive reinforcement in the classroom can be beneficial to student success. When applying positive reinforcement to students, it is crucial to individualize it to each student's needs. This way, the student understands why they are receiving the praise, can accept it, and eventually learns to continue the action that earned the positive reinforcement. For example, rewards or extra recess time might work better for some students, whereas others might respond better to stickers or check marks indicating praise. |
||
===Economics=== |
===Economics=== |
||
{{main|Behavioral economics}} |
{{main|Behavioral economics}} |
||
{{further|Consumer demand tests (animals)}} |
{{further|Consumer demand tests (animals)}} |
||
Line 267: | Line 245: | ||
=== Gambling – variable ratio scheduling === |
=== Gambling – variable ratio scheduling === |
||
{{Main|Gambling}} |
{{Main|Gambling}} |
||
As stated earlier in this article, a variable ratio schedule yields reinforcement after the emission of an unpredictable number of responses. This schedule typically generates rapid, persistent responding. Slot machines pay off on a variable ratio schedule, and they produce just this sort of persistent lever-pulling behavior in gamblers. Because the machines are programmed to pay out less money than they take in, the persistent slot-machine user invariably loses in the long run. Slot machines, and thus variable ratio reinforcement, have often been blamed as a factor underlying gambling addiction.<ref>{{cite journal | vauthors = Lozano Bleda JH, Pérez Nieto MA | title = Impulsivity, intelligence, and discriminating reinforcement contingencies in a fixed-ratio 3 schedule | journal = The Spanish Journal of Psychology | volume = 15 | issue = 3 | pages = 922–9 | date = November 2012 | pmid = 23156902 | doi=10.5209/rev_sjop.2012.v15.n3.39384| s2cid = 144193503 }}</ref> |
As stated earlier in this article, a variable ratio schedule yields reinforcement after the emission of an unpredictable number of responses. This schedule typically generates rapid, persistent responding. Slot machines pay off on a variable ratio schedule, and they produce just this sort of persistent lever-pulling behavior in gamblers. Because the machines are programmed to pay out less money than they take in, the persistent slot-machine user invariably loses in the long run. Slot machines, and thus variable ratio reinforcement, have often been blamed as a factor underlying gambling addiction.<ref>{{cite journal | vauthors = Lozano Bleda JH, Pérez Nieto MA | title = Impulsivity, intelligence, and discriminating reinforcement contingencies in a fixed-ratio 3 schedule | journal = The Spanish Journal of Psychology | volume = 15 | issue = 3 | pages = 922–9 | date = November 2012 | pmid = 23156902 | doi=10.5209/rev_sjop.2012.v15.n3.39384| s2cid = 144193503 }}</ref> |
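The arithmetic behind the long-run loss can be illustrated with a minimal sketch. It is not drawn from the cited sources: the function names, the payout of 8 units, the cost of 1 unit per pull, and the mean ratio of 10 are all hypothetical, and a constant win probability per pull (sometimes called a random ratio schedule) is assumed as a simple stand-in for a variable ratio schedule.

```python
import random


def expected_net_per_pull(mean_ratio=10, payout=8.0, cost=1.0):
    """Average gain per pull for a hypothetical machine.

    A win arrives on average once every `mean_ratio` pulls, so the
    expected winnings per pull are payout / mean_ratio, minus the stake.
    """
    return payout / mean_ratio - cost


def simulate_vr_slot(pulls, mean_ratio=10, payout=8.0, cost=1.0, seed=0):
    """Simulate `pulls` lever pulls; return (wins, net winnings).

    Each pull wins with probability 1 / mean_ratio, so the number of
    pulls between wins is unpredictable but averages `mean_ratio`.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    wins = 0
    net = 0.0
    for _ in range(pulls):
        net -= cost  # the stake is paid on every pull
        if rng.random() < 1.0 / mean_ratio:
            wins += 1
            net += payout  # reinforcement arrives unpredictably
    return wins, net


if __name__ == "__main__":
    wins, net = simulate_vr_slot(10_000)
    print("expected net per pull:", expected_net_per_pull())
    print("wins:", wins, "net:", net)
```

Under these assumed numbers each pull loses about 0.2 units on average, so individual wins remain unpredictable while the cumulative balance drifts downward as the number of pulls grows.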
||
=== Managing behavior in organizations === |
|||
{{Main|Managing behavior in organizations}} |
|||
An alternative to traditional pay-for-performance incentive schemes that is rooted in reinforcement theory, known as the O.B. Mod. approach, has been proposed as a practical way of managing the performance-related behaviors of an organization's members. O.B. Mod. and its "reinforce-for-performance" basis have been shown empirically to yield performance improvements in both manufacturing and service organizations, though improvements varied by type of reinforcer in both contexts.<ref>{{cite journal | vauthors = Luthans F, Stajkovic AD | title = Reinforce for performance: The need to go beyond pay and even rewards | journal = Academy of Management Executive | volume = 13 | issue = 2 | pages = 49–57 | date = 1999 |doi=10.5465/AME.1999.1899548| url = https://digitalcommons.unl.edu/managementfacpub/169 }}</ref> |
|||
=== Nudge theory === |
|||
{{Main|Nudge theory}} |
|||
Nudge theory (or nudge) is a concept in [[behavioral science]], [[political theory]] and [[economics]] which argues that positive reinforcement and indirect suggestions to try to achieve non-forced [[Compliance (psychology)|compliance]] can [[Social influence|influence]] the motives, incentives and [[decision making]] of groups and individuals, at least as effectively – if not more effectively – than direct instruction, legislation, or enforcement. |
|||
===Praise=== |
===Praise=== |
||
{{Main|Praise}} |
{{Main|Praise}} |
||
The concept of praise as a means of behavioral reinforcement in humans is rooted in B.F. Skinner's model of operant conditioning. Through this lens, praise has been viewed as a means of positive reinforcement, wherein an observed behavior is made more likely to occur by contingently praising said behavior.<ref>{{cite book|last1=Kazdin|first1=Alan|title=History of behavior modification: Experimental foundations of contemporary research|url=https://archive.org/details/historyofbehavio0000kazd|url-access=registration|date=1978|publisher=University Park Press|location=Baltimore|isbn=9780839112051}}</ref> Hundreds of studies have demonstrated the effectiveness of praise in promoting positive behaviors, notably in the study of teacher and parent use of praise with children to promote improved behavior and academic performance,<ref>{{cite journal | vauthors = Baker GL, Barnes HJ | title = Superior vena cava syndrome: etiology, diagnosis, and treatment | journal = American Journal of Critical Care | volume = 1 | issue = 1 | pages = 54–64 |pmid=1307879 | year = 1992 | doi = 10.4037/ajcc1992.1.1.54 }}</ref><ref name="Garland et al. 2008"/> but also in the study of work performance.<ref>{{cite journal | vauthors = Crowell CR, Anderson DC, Abel DM, Sergio JP | title = Task clarification, performance feedback, and social praise: Procedures for improving the customer service of bank tellers | journal = Journal of Applied Behavior Analysis| volume = 21 | issue = 1 | pages = 65–71 | date = 1988 | pmid = 16795713 | pmc = 1286094 | doi = 10.1901/jaba.1988.21-65 }}</ref> Praise has also been demonstrated to reinforce positive behaviors in non-praised adjacent individuals (such as a classmate of the praise recipient) through vicarious reinforcement.<ref name="Kazdin, 1973">{{cite journal | vauthors = Goldman NC | title = Adenoid cystic carcinoma of the external auditory canal | journal = Otolaryngology–Head and Neck Surgery | volume = 106 | issue = 2 | pages = 214–5 |pmid=1310808| year = 1992 | doi = 10.1177/019459989210600211 | s2cid = 23782303 }}</ref> Praise may be more or less effective in changing behavior depending on its form, content and delivery. In order for praise to effect positive behavior change, it must be contingent on the positive behavior (i.e., only administered after the targeted behavior is enacted), must specify the particulars of the behavior that is to be reinforced, and must be delivered sincerely and credibly.<ref name="Brophy, 1981">{{cite journal|last1=Brophy|first1=Jere | name-list-style = vanc |title=On praising effectively|journal=The Elementary School Journal|date=1981|volume=81|issue=5|pages=269–278 |jstor=1001606|doi=10.1086/461229 |s2cid=144444174 }}</ref> |
The concept of praise as a means of behavioral reinforcement in humans is rooted in B.F. Skinner's model of operant conditioning. Through this lens, praise has been viewed as a means of positive reinforcement, wherein an observed behavior is made more likely to occur by contingently praising said behavior.<ref>{{cite book|last1=Kazdin|first1=Alan|title=History of behavior modification: Experimental foundations of contemporary research|url=https://archive.org/details/historyofbehavio0000kazd|url-access=registration|date=1978|publisher=University Park Press|location=Baltimore|isbn=9780839112051}}</ref> Hundreds of studies have demonstrated the effectiveness of praise in promoting positive behaviors, notably in the study of teacher and parent use of praise with children to promote improved behavior and academic performance,<ref>{{cite journal | vauthors = Baker GL, Barnes HJ | title = Superior vena cava syndrome: etiology, diagnosis, and treatment | journal = American Journal of Critical Care | volume = 1 | issue = 1 | pages = 54–64 |pmid=1307879 | year = 1992 | doi = 10.4037/ajcc1992.1.1.54 }}</ref><ref name="Garland et al. 2008"/> but also in the study of work performance.<ref>{{cite journal | vauthors = Crowell CR, Anderson DC, Abel DM, Sergio JP | title = Task clarification, performance feedback, and social praise: Procedures for improving the customer service of bank tellers | journal = Journal of Applied Behavior Analysis| volume = 21 | issue = 1 | pages = 65–71 | date = 1988 | pmid = 16795713 | pmc = 1286094 | doi = 10.1901/jaba.1988.21-65 }}</ref> Praise has also been demonstrated to reinforce positive behaviors in non-praised adjacent individuals (such as a classmate of the praise recipient) through vicarious reinforcement.<ref name="Kazdin, 1973">{{cite journal | vauthors = Goldman NC | title = Adenoid cystic carcinoma of the external auditory canal | journal = Otolaryngology–Head and Neck Surgery | volume = 106 | issue = 2 | pages = 214–5 |pmid=1310808| year = 1992 | doi = 10.1177/019459989210600211 | s2cid = 23782303 }}</ref> Praise may be more or less effective in changing behavior depending on its form, content and delivery. In order for praise to effect positive behavior change, it must be contingent on the positive behavior (i.e., only administered after the targeted behavior is enacted), must specify the particulars of the behavior that is to be reinforced, and must be delivered sincerely and credibly.<ref name="Brophy, 1981">{{cite journal|last1=Brophy|first1=Jere | name-list-style = vanc |title=On praising effectively|journal=The Elementary School Journal|date=1981|volume=81|issue=5|pages=269–278 |jstor=1001606|doi=10.1086/461229 |s2cid=144444174 }}</ref> |
||
Acknowledging the effect of praise as a positive reinforcement strategy, numerous behavioral and cognitive behavioral interventions have incorporated the use of praise in their protocols.<ref name="Simonsen et al 2008">{{cite journal|last1=Simonsen|first1=Brandi|last2=Fairbanks|first2=Sarah|last3=Briesch|first3=Amy|last4=Myers|first4=Diane|last5=Sugai|first5=George | name-list-style = vanc |title=Evidence-based Practices in Classroom Management: Considerations for Research to Practice|journal=Education and Treatment of Children|date=2008|volume=31|issue=1|pages=351–380|doi=10.1353/etc.0.0007|s2cid=145087451}}</ref><ref name="Weisz & Kazdin, 2010">{{cite book|last1=Weisz|first1=John R.|last2=Kazdin|first2=Alan E. | name-list-style = vanc |title=Evidence-based psychotherapies for children and adolescents|date=2010|publisher=Guilford Press|url=https://books.google.com/books?id=QLzBv53CU2UC|isbn=9781606235256}}</ref> The strategic use of praise is recognized as an evidence-based practice in both classroom management<ref name="Simonsen et al 2008" /> and parenting training interventions,<ref name="Garland et al. 2008">{{cite journal | vauthors = Garland AF, Hawley KM, Brookman-Frazee L, Hurlburt MS | title = Identifying common elements of evidence-based psychosocial treatments for children's disruptive behavior problems | journal = Journal of the American Academy of Child and Adolescent Psychiatry | volume = 47 | issue = 5 | pages = 505–14 | date = May 2008 | pmid = 18356768 | doi = 10.1097/CHI.0b013e31816765c2 }}</ref> though praise is often subsumed in intervention research into a larger category of positive reinforcement, which includes strategies such as strategic attention and behavioral rewards. |
Acknowledging the effect of praise as a positive reinforcement strategy, numerous behavioral and cognitive behavioral interventions have incorporated the use of praise in their protocols.<ref name="Simonsen et al 2008">{{cite journal|last1=Simonsen|first1=Brandi|last2=Fairbanks|first2=Sarah|last3=Briesch|first3=Amy|last4=Myers|first4=Diane|last5=Sugai|first5=George | name-list-style = vanc |title=Evidence-based Practices in Classroom Management: Considerations for Research to Practice|journal=Education and Treatment of Children|date=2008|volume=31|issue=1|pages=351–380|doi=10.1353/etc.0.0007|s2cid=145087451}}</ref><ref name="Weisz & Kazdin, 2010">{{cite book|last1=Weisz|first1=John R.|last2=Kazdin|first2=Alan E. | name-list-style = vanc |title=Evidence-based psychotherapies for children and adolescents|date=2010|publisher=Guilford Press|url=https://books.google.com/books?id=QLzBv53CU2UC|isbn=9781606235256}}</ref> The strategic use of praise is recognized as an evidence-based practice in both classroom management<ref name="Simonsen et al 2008" /> and parenting training interventions,<ref name="Garland et al. 2008">{{cite journal | vauthors = Garland AF, Hawley KM, Brookman-Frazee L, Hurlburt MS | title = Identifying common elements of evidence-based psychosocial treatments for children's disruptive behavior problems | journal = Journal of the American Academy of Child and Adolescent Psychiatry | volume = 47 | issue = 5 | pages = 505–14 | date = May 2008 | pmid = 18356768 | doi = 10.1097/CHI.0b013e31816765c2 }}</ref> though praise is often subsumed in intervention research into a larger category of positive reinforcement, which includes strategies such as strategic attention and behavioral rewards. |
||
===Manipulation=== |
|||
Braiker identified the following ways that manipulators [[Abusive power and control|control]] their victims:<ref name=braiker>{{Cite book|title=Who's Pulling Your Strings ? How to Break The Cycle of Manipulation |first=Harriet B.|last=Braiker | name-list-style = vanc |year=2004 |isbn=0-07-144672-9}}</ref> |
|||
* [[Positive reinforcement]]: includes praise, [[superficial charm]], superficial [[sympathy]] ([[crocodile tears]]), excessive apologizing, money, approval, gifts, attention, facial expressions such as a forced laugh or [[smile]], and public recognition. |
|||
* [[Negative reinforcement]]: may involve removing a person from a negative situation. |
|||
* [[#Intermittent reinforcement; schedules|Intermittent or partial reinforcement]]: Partial or intermittent negative reinforcement can create an effective [[climate of fear]] and doubt. Partial or intermittent positive reinforcement can encourage the victim to persist – for example in most forms of gambling, the gambler is likely to win now and again but still lose money overall. |
|||
* [[Punishment (psychology)|Punishment]]: includes [[nagging]], yelling, the [[silent treatment]], [[intimidation]], threats, [[profanity|swearing]], [[emotional blackmail]], the [[guilt trip]], sulking, crying, and [[playing the victim]]. |
|||
* Traumatic one-trial learning: using [[verbal abuse]], explosive anger, or other intimidating behavior to establish dominance or superiority; even one incident of such behavior can [[classical conditioning|condition]] or train victims to avoid upsetting, confronting or contradicting the manipulator. |
|||
===Traumatic bonding=== |
===Traumatic bonding=== |
||
{{Main|Traumatic bonding}} |
{{Main|Traumatic bonding}} |
||
Traumatic bonding occurs as the result of ongoing [[cycle of abuse|cycles of abuse]] in which the intermittent reinforcement of reward and [[Punishment (psychology)|punishment]] creates powerful emotional bonds that are resistant to change.<ref>{{Cite journal|title = Traumatic Bonding: The development of emotional attachments in battered women and other relationships of intermittent abuse|last1 = Dutton|date = 1981|journal = Victimology |last2 = Painter|issue = 7}}</ref><ref name="Sanderson2008">Chrissie Sanderson. ''[https://books.google.com/books?id=5vA42Opyx9cC&pg=PA84 Counselling Survivors of Domestic Abuse]''. Jessica Kingsley Publishers; 15 June 2008. {{ISBN|978-1-84642-811-1}}. p. 84.</ref> |
Traumatic bonding occurs as the result of ongoing [[cycle of abuse|cycles of abuse]] in which the intermittent reinforcement of reward and [[Punishment (psychology)|punishment]] creates powerful emotional bonds that are resistant to change.<ref>{{Cite journal|title = Traumatic Bonding: The development of emotional attachments in battered women and other relationships of intermittent abuse|last1 = Dutton|date = 1981|journal = Victimology |last2 = Painter|issue = 7}}</ref><ref name="Sanderson2008">Chrissie Sanderson. ''[https://books.google.com/books?id=5vA42Opyx9cC&pg=PA84 Counselling Survivors of Domestic Abuse]''. Jessica Kingsley Publishers; 15 June 2008. {{ISBN|978-1-84642-811-1}}. p. 84.</ref> |
||
Line 302: | Line 266: | ||
===Video games=== |
===Video games=== |
||
{{Main|Compulsion loop}} |
{{Main|Compulsion loop}} |
||
Most video games are designed around some type of compulsion loop, adding a type of positive reinforcement through a variable ratio schedule to keep the player playing the game, though this can also lead to [[video game addiction]].<ref>{{cite web | first = John | last = Hopson | name-list-style = vanc | url = http://www.gamasutra.com/view/feature/131494/behavioral_game_design.php | title = Behavioral Game Design | work = [[Gamasutra]] | date = 27 April 2001 }}</ref> |
Most video games are designed around some type of compulsion loop, adding a type of positive reinforcement through a variable ratio schedule to keep the player playing the game, though this can also lead to [[video game addiction]].<ref>{{cite web | first = John | last = Hopson | name-list-style = vanc | url = http://www.gamasutra.com/view/feature/131494/behavioral_game_design.php | title = Behavioral Game Design | work = [[Gamasutra]] | date = 27 April 2001 }}</ref> |
||
Line 308: | Line 273: | ||
As part of a trend in the [[video game monetization|monetization of video games]] in the 2010s, some games offered "loot boxes", given as rewards or purchasable with real-world funds, that contained a random selection of in-game items distributed by rarity. The practice has been tied to the same methods by which slot machines and other gambling devices dole out rewards, as it follows a variable ratio schedule. While the general perception is that loot boxes are a form of gambling, the practice is classified as gambling in only a few countries and is otherwise legal. However, methods of using those items as virtual currency for online gambling or trading them for real-world money have created a [[skin gambling]] market that is under legal evaluation.<ref name="eg pegi">{{cite web | url = http://www.eurogamer.net/articles/2017-10-11-are-loot-boxes-gambling | title = Are loot boxes gambling? | first = Vic | last = Hood | name-list-style = vanc | date = October 12, 2017 | access-date = October 12, 2017 | work = [[Eurogamer]] }}</ref> |
As part of a trend in the [[video game monetization|monetization of video games]] in the 2010s, some games offered "loot boxes", given as rewards or purchasable with real-world funds, that contained a random selection of in-game items distributed by rarity. The practice has been tied to the same methods by which slot machines and other gambling devices dole out rewards, as it follows a variable ratio schedule. While the general perception is that loot boxes are a form of gambling, the practice is classified as gambling in only a few countries and is otherwise legal. However, methods of using those items as virtual currency for online gambling or trading them for real-world money have created a [[skin gambling]] market that is under legal evaluation.<ref name="eg pegi">{{cite web | url = http://www.eurogamer.net/articles/2017-10-11-are-loot-boxes-gambling | title = Are loot boxes gambling? | first = Vic | last = Hood | name-list-style = vanc | date = October 12, 2017 | access-date = October 12, 2017 | work = [[Eurogamer]] }}</ref> |
||
==Criticisms== |
|||
=== Workplace culture of fear === |
|||
{{Main|Culture of fear|Organizational culture|Toxic workplace|Workplace bullying}} |
|||
Ashforth discussed potentially destructive sides of [[leadership]] and identified what he referred to as [[petty tyrants]]: leaders who exercise a tyrannical style of management, resulting in a climate of fear in the workplace.<ref name=ashforth>{{cite journal | title = Petty tyranny in organizations | last = Ashforth | first = Blake | name-list-style = vanc | journal = Human Relations | volume = 47 | issue = 7 | pages = 755–778 | date = 1994 | doi=10.1177/001872679404700701| s2cid = 145699243 }}</ref> Partial or intermittent [[negative reinforcement]] can create an effective climate of fear and [[doubt]].<ref name="braiker"/> When employees get the sense that bullies are tolerated, a climate of fear may be the result.<ref name=Organisational>{{cite book | vauthors = Helge H, Sheehan MJ, Cooper CL, Einarsen S | veditors = Einarsen S, Hoel H, Zapf D, Cooper C | chapter = Organisational Effects of Workplace Bullying | title = Bullying and Harassment in the Workplace: Developments in Theory, Research, and Practice | date = 2010 | publisher = CRC Press | location = Boca Raton, FL | isbn = 978-1-4398-0489-6 | edition = 2nd }}</ref> |
|||
The standard definition of behavioral reinforcement has been criticized as [[circular definition|circular]], since it appears to argue that response strength is increased by reinforcement, and defines reinforcement as something that increases response strength (i.e., response strength is increased by things that increase response strength). However, the correct usage<ref>{{cite book |vauthors=Skinner BF |veditors=Epstein R |title=Skinner for the classroom : selected papers |date=1982 |publisher=Research Press |location=Champaign, Ill. |isbn=978-0-87822-261-2 |url-access=registration |url=https://archive.org/details/skinnerforclassr00skin }}</ref> of reinforcement is that something is a reinforcer ''because'' of its effect on behavior, and not the other way around. It becomes circular if one says that a particular stimulus strengthens behavior because it is a reinforcer, and does not explain why a stimulus is producing that effect on the behavior. Other definitions have been proposed, such as F.D. Sheffield's "consummatory behavior contingent on a response", but these are not broadly used in psychology.<ref>{{cite book | first1 = Franco J. | last1 = Vaccarino | first2 = Bernard B. | last2 = Schiff | first3 = Stephen E. | last3 = Glickman | editor-last1=Mowrer |editor-first1=Robert R. |editor-last2=Klein |editor-first2=Stephen B. | name-list-style = vanc |title=Contemporary learning theories |date=1989 |publisher=Lawrence Erlbaum Associates |location=Hillsdale, N.J. |isbn=978-0-89859-915-2}}</ref> |
|||
Individual differences in sensitivity to [[Reward system|reward]], [[Punishment (psychology)|punishment]], and [[motivation]] have been studied under the premises of [[reinforcement sensitivity theory]] and have also been [[Reinforcement sensitivity theory#Workplace performance|applied to workplace performance]]. |
|||
Increasingly, the understanding of the role reinforcers play is moving away from a "strengthening" effect toward a "signalling" effect.<ref>{{cite journal | vauthors = Cowie S, Davison M, Elliffe D | title = Reinforcement: food signals the time and location of future food | journal = Journal of the Experimental Analysis of Behavior| volume = 96 | issue = 1 | pages = 63–86 | date = July 2011 | pmid = 21765546 | pmc = 3136894 | doi = 10.1901/jeab.2011.96-63 }}</ref> On this view, reinforcers increase responding because they signal the behaviors that are likely to result in reinforcement. Although in most practical applications the effect of a given reinforcer is the same whether it is signalling or strengthening, this approach helps to explain a number of behavioral phenomena, including patterns of responding on intermittent reinforcement schedules (fixed interval scallops) and the [[differential outcomes effect]].<ref>{{cite journal |last1=McCormack |first1=Jessica |last2=Arnold-Saritepe |first2=Angela |last3=Elliffe |first3=Douglas | name-list-style = vanc |date= June 2017 | title=The differential outcomes effect in children with autism |journal=Behavioral Interventions |volume=32 |issue=4 |pages=357–369 |doi=10.1002/bin.1489 }}</ref> <!--There's more controversies that can be added here.--> |
|||
== See also == |
== See also == |
||
{{columns-list|colwidth=30em| |
{{columns-list|colwidth=30em| |
||
* [[Abusive power and control]] |
|||
* [[Applied behavior analysis]] |
* [[Applied behavior analysis]] |
||
* [[Behavioral cusp]] |
* [[Behavioral cusp]] |
||
* [[Carrot and stick]] |
|||
* [[Child grooming]] |
|||
* [[Dog training]] |
* [[Dog training]] |
||
* [[Idealisation]] |
|||
* [[Learned industriousness]] |
* [[Learned industriousness]] |
||
* [[Overjustification effect]] |
* [[Overjustification effect]] |
||
* [[Pavlovian-instrumental transfer]] |
* [[Pavlovian-instrumental transfer]] |
||
* [[Punishment]] |
* [[Punishment]] |
||
* [[Reinforcement learning]] |
|||
* [[Reinforcement sensitivity theory]] |
* [[Reinforcement sensitivity theory]] |
||
* [[Reward system]] |
* [[Reward system]] |
||
* [[Society for Quantitative Analysis of Behavior]] |
|||
* [[Token economy]] |
* [[Token economy]] |
||
}} |
}} |
||
== References == |
== References == |
||
{{Reflist|32em}}<ref>{{Cite journal |last1=Burdon |first1=William M. |last2=St. De Lore |first2=Jef |last3=Prendergast |first3=Michael L. |date=September 7, 2011 |title=Developing and Implementing a Positive Behavioral Reinforcement Intervention in Prison-Based Drug Treatment: Project BRITE |journal=Journal of Psychoactive Drugs |language=en |volume=43 |issue=sup1 |pages=40–50 |doi=10.1080/02791072.2011.601990 |pmid=22185038 |issn=0279-1072|pmc=3429341 }}</ref> |
|||
{{Reflist|32em}} |
|||
== |
==Further reading == |
||
{{refbegin|32em}} |
{{refbegin|32em}} |
||
* {{cite thesis | vauthors = Brechner KC | date = 1974 | title = An experimental analysis of social traps. | degree = PhD | publisher = [[Arizona State University]] }} |
* {{cite thesis | vauthors = Brechner KC | date = 1974 | title = An experimental analysis of social traps. | degree = PhD | publisher = [[Arizona State University]] }} |
||
Line 359: | Line 319: | ||
* [http://psych.athabascau.ca/html/prtut/reinpair.htm An On-Line Positive Reinforcement Tutorial] |
* [http://psych.athabascau.ca/html/prtut/reinpair.htm An On-Line Positive Reinforcement Tutorial] |
||
* [http://www.scholarpedia.org/article/Reinforcement Scholarpedia Reinforcement] |
* [http://www.scholarpedia.org/article/Reinforcement Scholarpedia Reinforcement] |
||
* [http://www.scienceofbehavior.com/lms/mod/glossary/view.php?id=408 scienceofbehavior.com] |
* [http://www.scienceofbehavior.com/lms/mod/glossary/view.php?id=408 scienceofbehavior.com] {{Webarchive|url=https://web.archive.org/web/20111002023421/http://www.scienceofbehavior.com/lms/mod/glossary/view.php?id=408 |date=2 October 2011 }} |
||
{{Reinforcement disorders}} |
{{Reinforcement disorders}} |
||
{{Authority control}} |
{{Authority control}} |
||
Latest revision as of 06:44, 8 October 2024
In behavioral psychology, reinforcement refers to consequences that increase the likelihood of an organism's future behavior, typically in the presence of a particular antecedent stimulus.[1] For example, a rat can be trained to push a lever to receive food whenever a light is turned on. In this example, the light is the antecedent stimulus, the lever pushing is the operant behavior, and the food is the reinforcer. Likewise, a student that receives attention and praise when answering a teacher's question will be more likely to answer future questions in class. The teacher's question is the antecedent, the student's response is the behavior, and the praise and attention are the reinforcements.
Consequences that lead to appetitive behavior such as subjective "wanting" and "liking" (desire and pleasure) function as rewards or positive reinforcement.[2] There is also negative reinforcement, which involves taking away an undesirable stimulus. An example of negative reinforcement would be taking an aspirin to relieve a headache.
Reinforcement is an important component of operant conditioning and behavior modification. The concept has been applied in a variety of practical areas, including parenting, coaching, therapy, self-help, education, and management.
Terminology
In the behavioral sciences, the terms "positive" and "negative" refer when used in their strict technical sense to the nature of the action performed by the conditioner rather than to the responding operant's evaluation of that action and its consequence(s). "Positive" actions are those that add a factor, be it pleasant or unpleasant, to the environment, whereas "negative" actions are those that remove or withhold from the environment a factor of either type. In turn, the strict sense of "reinforcement" refers only to reward-based conditioning; the introduction of unpleasant factors and the removal or withholding of pleasant factors are instead referred to as "punishment", which when used in its strict sense thus stands in contradistinction to "reinforcement". Thus, "positive reinforcement" refers to the addition of a pleasant factor, "positive punishment" refers to the addition of an unpleasant factor, "negative reinforcement" refers to the removal or withholding of an unpleasant factor, and "negative punishment" refers to the removal or withholding of a pleasant factor.
This usage is at odds with some non-technical usages of the four term combinations, especially in the case of the term "negative reinforcement", which is often used to denote what technical parlance would describe as "positive punishment" in that the non-technical usage interprets "reinforcement" as subsuming both reward and punishment and "negative" as referring to the responding operant's evaluation of the factor being introduced. By contrast, technical parlance would use the term "negative reinforcement" to describe encouragement of a given behavior by creating a scenario in which an unpleasant factor is or will be present but engaging in the behavior results in either escaping from that factor or preventing its occurrence, as in Martin Seligman’s experiments involving dogs learning to avoid electric shocks.
Introduction
B.F. Skinner was a well-known and influential researcher who articulated many of the theoretical constructs of reinforcement and behaviorism. Skinner defined reinforcers according to the change in response strength (response rate) rather than to more subjective criteria, such as what is pleasurable or valuable to someone. Accordingly, activities, foods or items considered pleasant or enjoyable may not necessarily be reinforcing (because they produce no increase in the response preceding them). Stimuli, settings, and activities only fit the definition of reinforcers if the behavior that immediately precedes the potential reinforcer increases in similar situations in the future. Consider, for example, a child who receives a cookie when he or she asks for one. If the frequency of "cookie-requesting behavior" increases, the cookie can be seen as reinforcing "cookie-requesting behavior". If, however, "cookie-requesting behavior" does not increase, the cookie cannot be considered reinforcing.
The sole criterion that determines if a stimulus is reinforcing is the change in probability of a behavior after administration of that potential reinforcer. Other theories may focus on additional factors such as whether the person expected a behavior to produce a given outcome, but in the behavioral theory, reinforcement is defined by an increased probability of a response.
The study of reinforcement has produced an enormous body of reproducible experimental results. Reinforcement is the central concept and procedure in special education, applied behavior analysis, and the experimental analysis of behavior and is a core concept in some medical and psychopharmacology models, particularly addiction, dependence, and compulsion.
History
Laboratory research on reinforcement is usually dated from the work of Edward Thorndike, known for his experiments with cats escaping from puzzle boxes.[6] A number of others continued this research, notably B.F. Skinner, who published his seminal work on the topic in The Behavior of Organisms, in 1938, and elaborated this research in many subsequent publications.[7] Notably, Skinner argued that positive reinforcement is superior to punishment in shaping behavior.[8] Though punishment may seem just the opposite of reinforcement, Skinner claimed that they differ immensely, saying that positive reinforcement results in lasting behavioral modification (long-term) whereas punishment changes behavior only temporarily (short-term) and has many detrimental side-effects.
A great many researchers subsequently expanded our understanding of reinforcement and challenged some of Skinner's conclusions. For example, Azrin and Holz defined punishment as a “consequence of behavior that reduces the future probability of that behavior,”[9] and some studies have shown that positive reinforcement and punishment are equally effective in modifying behavior.[citation needed] Research on the effects of positive reinforcement, negative reinforcement and punishment continues today, as those concepts are fundamental to learning theory and apply to many practical applications of that theory.
Operant conditioning
Operant conditioning comprises the following procedures:
- Reinforcement (increase behavior)
  - Positive reinforcement: add appetitive stimulus following correct behavior
  - Negative reinforcement
    - Escape: remove noxious stimulus following correct behavior
    - Active avoidance: behavior avoids noxious stimulus
- Punishment (decrease behavior)
  - Positive punishment: add noxious stimulus following behavior
  - Negative punishment: remove appetitive stimulus following behavior
- Extinction
The term operant conditioning was introduced by Skinner to indicate that in his experimental paradigm, the organism is free to operate on the environment. In this paradigm, the experimenter cannot trigger the desired response; the experimenter waits for the response to occur (to be emitted by the organism) and then a potential reinforcer is delivered. In the classical conditioning paradigm, by contrast, the experimenter triggers (elicits) the desired response by presenting a reflex-eliciting stimulus, the unconditional stimulus (UCS), which is paired with (and preceded by) a neutral stimulus, the conditional stimulus (CS).
Reinforcement is a basic term in operant conditioning. For the punishment aspect of operant conditioning, see punishment (psychology).
Positive reinforcement
Positive reinforcement occurs when a desirable event or stimulus is presented as a consequence of a behavior and the chance that this behavior will manifest in similar environments increases.[10]: 253 For example, if reading a book is fun, then experiencing the fun positively reinforces the behavior of reading fun books. The person who receives the positive reinforcement (i.e., who has fun reading the book) will read more books to have more fun.
The high probability instruction (HPI) treatment is a behaviorist treatment based on the idea of positive reinforcement.
Negative reinforcement
Negative reinforcement increases the rate of a behavior that avoids or escapes an aversive situation or stimulus.[10]: 252–253 That is, something unpleasant is already happening, and the behavior helps the person avoid or escape the unpleasantness. In contrast to positive reinforcement, which involves adding a pleasant stimulus, in negative reinforcement, the focus is on the removal of an unpleasant situation or stimulus. For example, if someone feels unhappy, then they might engage in a behavior (e.g., reading books) to escape from the aversive situation (e.g., their unhappy feelings).[10]: 253 The success of that avoidant or escapist behavior in removing the unpleasant situation or stimulus reinforces the behavior.
Doing something unpleasant to people to prevent or remove a behavior from happening again is punishment, not negative reinforcement.[10]: 252 The main difference is that reinforcement always increases the likelihood of a behavior (e.g., channel surfing while bored temporarily alleviated boredom; therefore, there will be more channel surfing while bored), whereas punishment decreases it (e.g., hangovers are an unpleasant stimulus, so people learn to avoid the behavior that led to that unpleasant stimulus).
Extinction
Extinction occurs when a given behavior is ignored (i.e. followed up with no consequence). Behaviors disappear over time when they continuously receive no reinforcement. During a deliberate extinction, the targeted behavior spikes first (in an attempt to produce the expected, previously reinforced effects), and then declines over time. Neither reinforcement nor extinction needs to be deliberate in order to have an effect on a subject's behavior. For example, if a child reads books because they are fun, then the parents' decision to ignore the book reading will not remove the positive reinforcement (i.e., fun) the child receives from reading books. However, if a child engages in a behavior to get attention from the parents, then the parents' decision to ignore the behavior will cause the behavior to go extinct, and the child will find a different behavior to get their parents' attention.
Reinforcement versus punishment
Reinforcers serve to increase behaviors whereas punishers serve to decrease behaviors; thus, positive reinforcers are stimuli that the subject will work to attain, and negative reinforcers are stimuli that the subject will work to be rid of or to end.[11] The table below illustrates the adding and subtracting of stimuli (pleasant or aversive) in relation to reinforcement vs. punishment.
| | Rewarding (pleasant) stimulus | Aversive (unpleasant) stimulus |
|---|---|---|
| Adding/presenting | Positive reinforcement | Positive punishment |
| Removing/taking away | Negative punishment | Negative reinforcement |
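The four-way classification can also be written as a small lookup. The following is a minimal sketch only; the function name `classify` and its string arguments are hypothetical labels chosen for illustration. It follows the functional definition used in this article: whether a stimulus is added or removed determines "positive" versus "negative", and whether the behavior subsequently increases or decreases determines "reinforcement" versus "punishment".

```python
def classify(stimulus_change, behavior_change):
    """Name the operant procedure for one consequence.

    stimulus_change: "add" or "remove" (what happens to the stimulus)
    behavior_change: "increase" or "decrease" (what happens to the behavior)
    Both argument vocabularies are illustrative assumptions.
    """
    sign = {"add": "positive", "remove": "negative"}[stimulus_change]
    kind = {"increase": "reinforcement", "decrease": "punishment"}[behavior_change]
    return f"{sign} {kind}"


# Taking an aspirin removes a headache, and aspirin-taking becomes more likely.
print(classify("remove", "increase"))  # negative reinforcement
```

Classifying by the observed change in behavior, rather than by whether the stimulus seems pleasant, matches the point made above that an event that punishes behavior for some may reinforce it for others.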
Further ideas and concepts
- Distinguishing between positive and negative reinforcement can be difficult and may not always be necessary. Focusing on what is being removed or added and how it affects behavior can be more helpful.
- An event that punishes behavior for some may reinforce behavior for others.
- Some reinforcement can include both positive and negative features, such as a drug addict taking drugs for the added euphoria (positive reinforcement) and also to eliminate withdrawal symptoms (negative reinforcement).
- Reinforcement in the business world is essential in driving productivity. Employees are constantly motivated by the ability to receive a positive stimulus, such as a promotion or a bonus. Employees are also driven by negative reinforcement, such as by eliminating unpleasant tasks.
- Though negative reinforcement has a positive effect in the short term for a workplace (i.e. it encourages a financially beneficial action), over-reliance on negative reinforcement hinders workers' ability to act in the creative, engaged way that creates growth in the long term.[12]
Primary and secondary reinforcers
A primary reinforcer, sometimes called an unconditioned reinforcer, is a stimulus that does not require pairing with a different stimulus in order to function as a reinforcer and most likely has obtained this function through evolution and its role in the species' survival.[13] Examples of primary reinforcers include food, water, and sex. Some primary reinforcers, such as certain drugs, may mimic the effects of other primary reinforcers. While these primary reinforcers are fairly stable through life and across individuals, the reinforcing value of different primary reinforcers varies due to multiple factors (e.g., genetics, experience). Thus, one person may prefer one type of food while another avoids it. Or one person may eat much food while another eats very little. So even though food is a primary reinforcer for both individuals, the value of food as a reinforcer differs between them.
A secondary reinforcer, sometimes called a conditioned reinforcer, is a stimulus or situation that has acquired its function as a reinforcer after pairing with a stimulus that functions as a reinforcer. This stimulus may be a primary reinforcer or another conditioned reinforcer (such as money).
When trying to distinguish primary and secondary reinforcers in human examples, use the "caveman test." If the stimulus is something that a caveman would naturally find desirable (e.g. candy) then it is a primary reinforcer. If, on the other hand, the caveman would not react to it (e.g. a dollar bill), it is a secondary reinforcer. As with primary reinforcers, an organism can experience satisfaction and deprivation with secondary reinforcers.
Other reinforcement terms
- A generalized reinforcer is a conditioned reinforcer that has obtained the reinforcing function by pairing with many other reinforcers and functions as a reinforcer under a wide variety of motivating operations. (One example of this is money, because it is paired with many other reinforcers.)[14]: 83
- In reinforcer sampling, a potentially reinforcing but unfamiliar stimulus is presented to an organism without regard to any prior behavior.
- Socially-mediated reinforcement involves the delivery of reinforcement that requires the behavior of another organism. For example, another person is providing the reinforcement.
- The Premack principle is a special case of reinforcement elaborated by David Premack, which states that a highly preferred activity can be used effectively as a reinforcer for a less-preferred activity.[14]: 123
- Reinforcement hierarchy is a list of actions, rank-ordering the most desirable to least desirable consequences that may serve as a reinforcer. A reinforcement hierarchy can be used to determine the relative frequency and desirability of different activities, and is often employed when applying the Premack principle.[citation needed]
- Contingent outcomes are more likely to reinforce behavior than non-contingent outcomes. Contingent outcomes are those directly linked to a causal behavior, such as a light turning on being contingent on flipping a switch. Note that contingent outcomes are not necessary to demonstrate reinforcement, but perceived contingency may increase learning.
- Contiguous stimuli are stimuli closely associated by time and space with specific behaviors. They reduce the amount of time needed to learn a behavior while increasing its resistance to extinction. [citation needed] Giving a dog a piece of food immediately after sitting is more contiguous with (and therefore more likely to reinforce) the behavior than a several minute delay in food delivery following the behavior.
- Noncontingent reinforcement refers to response-independent delivery of stimuli identified as reinforcers for some behaviors of that organism. However, this typically entails time-based delivery of stimuli identified as maintaining aberrant behavior, which decreases the rate of the target behavior.[15] As no measured behavior is identified as being strengthened, there is controversy surrounding the use of the term noncontingent "reinforcement".[16]
Natural and artificial reinforcement
In his 1967 paper, Arbitrary and Natural Reinforcement, Charles Ferster proposed classifying reinforcement into events that increase the frequency of an operant behavior as a natural consequence of the behavior itself, and events that affect frequency by their requirement of human mediation, such as in a token economy where subjects are rewarded for certain behavior by the therapist.
In 1970, Baer and Wolf developed the concept of "behavioral traps."[17] A behavioral trap requires only a simple response to enter the trap, yet once entered, the trap cannot be resisted in creating general behavior change. It is the use of a behavioral trap that increases a person's repertoire, by exposing them to the naturally occurring reinforcement of that behavior. Behavioral traps have four characteristics:
- They are "baited" with desirable reinforcers that "lure" the student into the trap.
- Only a low-effort response already in the repertoire is necessary to enter the trap.
- Interrelated contingencies of reinforcement inside the trap motivate the person to acquire, extend, and maintain targeted skills.[18]
- They can remain effective for long periods of time because the person shows few, if any, satiation effects.
Thus, artificial reinforcement can be used to build or develop generalizable skills, eventually transitioning to naturally occurring reinforcement to maintain or increase the behavior. Another example is a social situation that will generally result from a specific behavior once it has met a certain criterion.
Intermittent reinforcement schedules
Behavior is not always reinforced every time it is emitted, and the pattern of reinforcement strongly affects how fast an operant response is learned, what its rate is at any given time, and how long it continues when reinforcement ceases. The simplest rules controlling reinforcement are continuous reinforcement, where every response is reinforced, and extinction, where no response is reinforced. Between these extremes, more complex schedules of reinforcement specify the rules that determine how and when a response will be followed by a reinforcer.
Specific schedules of reinforcement reliably induce specific patterns of response, and these rules apply across many different species. The varying consistency and predictability of reinforcement is an important influence on how the different schedules operate. Many simple and complex schedules were investigated at great length by B.F. Skinner using pigeons.
Simple schedules
- Ratio schedule – the reinforcement depends only on the number of responses the organism has performed.
- Continuous reinforcement (CRF) – a schedule of reinforcement in which every occurrence of the instrumental response (desired response) is followed by the reinforcer.[14]: 86
Simple schedules have a single rule to determine when a single type of reinforcer is delivered for a specific response.
- Fixed ratio (FR) – schedules deliver reinforcement after every nth response.[14]: 88 An FR 1 schedule is synonymous with a CRF schedule.
- Variable ratio schedule (VR) – reinforced on average every nth response, but not always on the nth response.[14]: 88
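- Fixed interval (FI) – the first response made after n amount of time has elapsed is reinforced.
- Variable interval (VI) – the first response made after an interval averaging n amount of time is reinforced; the interval varies and is not always exactly n amount of time.[14]: 89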
- Fixed time (FT) – Provides a reinforcing stimulus at a fixed time since the last reinforcement delivery, regardless of whether the subject has responded or not. In other words, it is a non-contingent schedule.
- Variable time (VT) – Provides reinforcement at an average variable time since last reinforcement, regardless of whether the subject has responded or not.
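The response-contingent rules above (FR, VR, FI, VI) can be sketched as small decision procedures. The following is a minimal illustration, not a model taken from the literature: the class names, the `respond` method, and the helper `reinforced_times` are hypothetical, and the uniform distributions used for the variable schedules are an assumption chosen only because their mean equals n.

```python
import random


class FixedRatio:
    """FR n: every n-th response is reinforced."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """VR n: the required number of responses varies but averages n."""

    def __init__(self, n, rng=None):
        self.n = n
        self.rng = rng if rng is not None else random.Random()
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform on 1..2n-1, which has mean n.
        return self.rng.randint(1, 2 * self.n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """FI n: the first response at least n time units after the last
    reinforcer is reinforced (the clock starts at time 0)."""

    def __init__(self, n):
        self.n = n
        self.last = 0.0

    def respond(self, t):
        if t - self.last >= self.n:
            self.last = t
            return True
        return False


class VariableInterval:
    """VI n: like FI, but the required wait varies and averages n."""

    def __init__(self, n, rng=None):
        self.n = n
        self.rng = rng if rng is not None else random.Random()
        self.last = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: uniform on [0, 2n], which has mean n.
        return self.rng.uniform(0, 2 * self.n)

    def respond(self, t):
        if t - self.last >= self.wait:
            self.last = t
            self.wait = self._draw()
            return True
        return False


def reinforced_times(schedule, response_times):
    """Return the response times at which a reinforcer was delivered."""
    return [t for t in response_times if schedule.respond(t)]


print(reinforced_times(FixedRatio(3), range(1, 10)))      # [3, 6, 9]
print(reinforced_times(FixedInterval(10), range(1, 26)))  # [10, 20]
```

In this sketch the ratio schedules ignore the time argument and the interval schedules ignore the response count, which is the distinction the list draws. FT and VT are omitted because they deliver the stimulus regardless of responding.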
Simple schedules are utilized in many differential reinforcement[19] procedures:
- Differential reinforcement of alternative behavior (DRA) - A conditioning procedure in which an undesired response is decreased by placing it on extinction or, less commonly, providing contingent punishment, while simultaneously providing reinforcement contingent on a desirable response. An example would be a teacher attending to a student only when they raise their hand, while ignoring the student when he or she calls out.
- Differential reinforcement of other behavior (DRO) – Also known as omission training procedures, an instrumental conditioning procedure in which a positive reinforcer is periodically delivered only if the participant does something other than the target response. An example would be reinforcing any hand action other than nose picking.[14]: 338
- Differential reinforcement of incompatible behavior (DRI) – Used to reduce a frequent behavior without punishing it by reinforcing an incompatible response. An example would be reinforcing clapping to reduce nose picking.
- Differential reinforcement of low response rate (DRL) – Used to encourage low rates of responding. It is like an interval schedule, except that premature responses reset the time required between behavior.
- Differential reinforcement of high rate (DRH) – Used to increase high rates of responding. It is like an interval schedule, except that a minimum number of responses are required in the interval in order to receive reinforcement.
Effects of different types of simple schedules
- Fixed ratio: activity slows after reinforcer is delivered, then response rates increase until the next reinforcer delivery (post-reinforcement pause).
- Variable ratio: rapid, steady rate of responding; most resistant to extinction.
- Fixed interval: responding increases towards the end of the interval; poor resistance to extinction.
- Variable interval: steady activity results, good resistance to extinction.
- Ratio schedules produce higher rates of responding than interval schedules, when the rates of reinforcement are otherwise similar.
- Variable schedules produce higher rates and greater resistance to extinction than most fixed schedules. This is also known as the Partial Reinforcement Extinction Effect (PREE).
- The variable ratio schedule produces both the highest rate of responding and the greatest resistance to extinction (for example, the behavior of gamblers at slot machines).
- Fixed schedules produce "post-reinforcement pauses" (PRP), where responses will briefly cease immediately following reinforcement, though the pause is a function of the upcoming response requirement rather than the prior reinforcement.[20]
- The PRP of a fixed interval schedule is frequently followed by a "scallop-shaped" accelerating rate of response, while fixed ratio schedules produce a more "angular" response.
- Fixed interval scallop: the pattern of responding that develops under a fixed interval reinforcement schedule. Performance on a fixed interval schedule reflects the subject's accuracy in telling time.
- Organisms whose schedules of reinforcement are "thinned" (that is, requiring more responses or a greater wait before reinforcement) may experience "ratio strain" if thinned too quickly. This produces behavior similar to that seen during extinction.
- Ratio strain: the disruption of responding that occurs when a fixed ratio response requirement is increased too rapidly.
- Ratio run: high and steady rate of responding that completes each ratio requirement. Usually higher ratio requirement causes longer post-reinforcement pauses to occur.
- Partial reinforcement schedules are more resistant to extinction than continuous reinforcement schedules.
- Ratio schedules are more resistant than interval schedules and variable schedules more resistant than fixed ones.
- Momentary changes in reinforcement value lead to dynamic changes in behavior.[21]
Compound schedules
Compound schedules combine two or more different simple schedules in some way using the same reinforcer for the same behavior. There are many possibilities; among those most often used are:
- Alternative schedules – A type of compound schedule where two or more simple schedules are in effect and whichever schedule is completed first results in reinforcement.[22]
- Conjunctive schedules – A complex schedule of reinforcement where two or more simple schedules are in effect independently of each other, and requirements on all of the simple schedules must be met for reinforcement.
- Multiple schedules – Two or more schedules alternate over time, with a stimulus indicating which is in force. Reinforcement is delivered if the response requirement is met while a schedule is in effect.
- Mixed schedules – Either of two, or more, schedules may occur with no stimulus indicating which is in force. Reinforcement is delivered if the response requirement is met while a schedule is in effect.
- Concurrent schedules – A complex reinforcement procedure in which the participant can choose any one of two or more simple reinforcement schedules that are available simultaneously. Organisms are free to change back and forth between the response alternatives at any time.
- Concurrent-chain schedule of reinforcement – A complex reinforcement procedure in which the participant is permitted to choose during the first link which of several simple reinforcement schedules will be in effect in the second link. Once a choice has been made, the rejected alternatives become unavailable until the start of the next trial.
- Interlocking schedules – A single schedule with two components where progress in one component affects progress in the other component. In an interlocking FR 60 FI 120-s schedule, for example, each response subtracts time from the interval component such that each response is "equal" to removing two seconds from the FI schedule.
- Chained schedules – Reinforcement occurs after two or more successive schedules have been completed, with a stimulus indicating when one schedule has been completed and the next has started.
- Tandem schedules – Reinforcement occurs when two or more successive schedule requirements have been completed, with no stimulus indicating when a schedule has been completed and the next has started.
- Higher-order schedules – completion of one schedule is reinforced according to a second schedule; e.g. in FR2 (FI10 secs), two successive fixed interval schedules require completion before a response is reinforced.
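Some of the combination rules in this list reduce to simple logic or arithmetic. The sketch below is illustrative only, and all function names are hypothetical. It treats an alternative schedule as an "any requirement met" rule and a conjunctive schedule as an "all requirements met" rule, and it works through the interlocking FR 60 FI 120-s example under the reading given above, in which each response removes 120 / 60 = 2 seconds from the interval.

```python
def alternative_met(requirements_met):
    """Alternative schedule: reinforce as soon as any component is completed."""
    return any(requirements_met)


def conjunctive_met(requirements_met):
    """Conjunctive schedule: reinforce only when every component is completed."""
    return all(requirements_met)


def interlocking_remaining(responses, elapsed_s, fr=60, fi_s=120.0):
    """Seconds left on the interval component of an interlocking FR/FI schedule.

    Assumes each response is worth fi_s / fr seconds (2 s for FR 60 FI 120-s),
    so both elapsed time and responding shorten the remaining interval.
    """
    seconds_per_response = fi_s / fr
    return max(0.0, fi_s - elapsed_s - responses * seconds_per_response)


print(interlocking_remaining(0, 0))    # 120.0 (nothing done yet)
print(interlocking_remaining(10, 20))  # 80.0  (20 s elapsed, 10 responses)
print(interlocking_remaining(60, 0))   # 0.0   (60 responses complete it at once)
```

On this reading, an organism that responds quickly needs fewer seconds, and one that waits needs fewer responses, which is what "progress in one component affects progress in the other" describes.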
Superimposed schedules
The psychology term superimposed schedules of reinforcement refers to a structure of rewards where two or more simple schedules of reinforcement operate simultaneously. Reinforcers can be positive, negative, or both. An example is a person who comes home after a long day at work. The behavior of opening the front door is rewarded by a big kiss on the lips by the person's spouse and a rip in the pants from the family dog jumping enthusiastically. Another example of superimposed schedules of reinforcement is a pigeon in an experimental cage pecking at a button. The pecks deliver a hopper of grain every 20th peck, and access to water after every 200 pecks.
Superimposed schedules of reinforcement are a type of compound schedule that evolved from the initial work on simple schedules of reinforcement by B.F. Skinner and his colleagues (Skinner and Ferster, 1957). They demonstrated that reinforcers could be delivered on schedules, and further that organisms behaved differently under different schedules. Rather than a reinforcer, such as food or water, being delivered every time as a consequence of some behavior, a reinforcer could be delivered after more than one instance of the behavior. For example, a pigeon may be required to peck a button switch ten times before food appears. This is a "ratio schedule". Also, a reinforcer could be delivered after an interval of time passed following a target behavior. An example is a rat that is given a food pellet immediately following the first response that occurs after two minutes has elapsed since the last lever press. This is called an "interval schedule".
In addition, ratio schedules can deliver reinforcement following fixed or variable number of behaviors by the individual organism. Likewise, interval schedules can deliver reinforcement following fixed or variable intervals of time following a single response by the organism. Individual behaviors tend to generate response rates that differ based upon how the reinforcement schedule is created. Much subsequent research in many labs examined the effects on behaviors of scheduling reinforcers.
If an organism is offered the opportunity to choose between or among two or more simple schedules of reinforcement at the same time, the reinforcement structure is called a "concurrent schedule of reinforcement". Brechner (1974, 1977) introduced the concept of superimposed schedules of reinforcement in an attempt to create a laboratory analogy of social traps, such as when humans overharvest their fisheries or tear down their rainforests. Brechner created a situation where simple reinforcement schedules were superimposed upon each other. In other words, a single response or group of responses by an organism led to multiple consequences. Concurrent schedules of reinforcement can be thought of as "or" schedules, and superimposed schedules of reinforcement can be thought of as "and" schedules. Brechner and Linder (1981) and Brechner (1987) expanded the concept to describe how superimposed schedules and the social trap analogy could be used to analyze the way energy flows through systems.
Superimposed schedules of reinforcement have many real-world applications in addition to generating social traps. Many different human individual and social situations can be created by superimposing simple reinforcement schedules. For example, a human being could have simultaneous tobacco and alcohol addictions. Even more complex situations can be created or simulated by superimposing two or more concurrent schedules. For example, a high school senior could have a choice between going to Stanford University or UCLA, and at the same time have the choice of going into the Army or the Air Force, and simultaneously the choice of taking a job with an internet company or a job with a software company. That is a reinforcement structure of three superimposed concurrent schedules of reinforcement.
Superimposed schedules of reinforcement can create the three classic conflict situations (approach–approach conflict, approach–avoidance conflict, and avoidance–avoidance conflict) described by Kurt Lewin (1935) and can operationalize other Lewinian situations analyzed by his force field analysis. Other examples of the use of superimposed schedules of reinforcement as an analytical tool are its application to the contingencies of rent control (Brechner, 2003) and problem of toxic waste dumping in the Los Angeles County storm drain system (Brechner, 2010).
Concurrent schedules
In operant conditioning, concurrent schedules of reinforcement are schedules of reinforcement that are simultaneously available to an animal subject or human participant, so that the subject or participant can respond on either schedule. For example, in a two-alternative forced choice task, a pigeon in a Skinner box is faced with two pecking keys; pecking responses can be made on either, and food reinforcement might follow a peck on either. The schedules of reinforcement arranged for pecks on the two keys can be different. They may be independent, or they may be linked so that behavior on one key affects the likelihood of reinforcement on the other.
It is not necessary for responses on the two schedules to be physically distinct. In an alternate way of arranging concurrent schedules, introduced by Findley in 1958, both schedules are arranged on a single key or other response device, and the subject can respond on a second key to change between the schedules. In such a "Findley concurrent" procedure, a stimulus (e.g., the color of the main key) signals which schedule is in effect.
Concurrent schedules often induce rapid alternation between the keys. To prevent this, a "changeover delay" is commonly introduced: each schedule is inactivated for a brief period after the subject switches to it.
When both of the concurrent schedules are variable intervals, a quantitative relationship known as the matching law is found between relative response rates in the two schedules and the relative reinforcement rates they deliver; this was first observed by R.J. Herrnstein in 1961. The matching law is a rule for instrumental behavior which states that the relative rate of responding on a particular response alternative equals the relative rate of reinforcement for that response (relative rate of behavior = relative rate of reinforcement). Animals and humans have a tendency to prefer choice in schedules.[23]
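The matching relation described above can be illustrated with a short calculation. The following is a minimal sketch of strict matching for two alternatives; the function name and the reinforcement rates are hypothetical and chosen only for illustration, not taken from Herrnstein's data.

```python
def matching_law_share(reinforcers_a: float, reinforcers_b: float) -> float:
    """Predicted share of responses on alternative A under strict matching.

    Strict matching assumes B_a / (B_a + B_b) = R_a / (R_a + R_b), where B is
    the response rate and R is the obtained reinforcement rate.
    """
    if reinforcers_a < 0 or reinforcers_b < 0:
        raise ValueError("reinforcement rates cannot be negative")
    total = reinforcers_a + reinforcers_b
    if total == 0:
        raise ValueError("at least one alternative must deliver reinforcement")
    return reinforcers_a / total


# Hypothetical example: key A yields 40 reinforcers per hour, key B yields 20.
share_a = matching_law_share(40, 20)
print(f"Predicted share of pecks on key A: {share_a:.2f}")
```

Under these assumed rates, strict matching predicts that about two thirds of the pecks are made on key A.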
Shaping
Shaping is the reinforcement of successive approximations to a desired instrumental response. In training a rat to press a lever, for example, simply turning toward the lever is reinforced at first. Then, only turning and stepping toward it is reinforced. Eventually the rat will be reinforced for pressing the lever. The successful attainment of one behavior starts the shaping process for the next. As training progresses, the response becomes progressively more like the desired behavior, with each subsequent behavior becoming a closer approximation of the final behavior.[24]
The intervention of shaping is used in many training situations, and also for individuals with autism as well as other developmental disabilities. When shaping is combined with other evidence-based practices such as Functional Communication Training (FCT),[25] it can yield positive outcomes for human behavior. Shaping typically uses continuous reinforcement, but the response can later be shifted to an intermittent reinforcement schedule.
Shaping is also used to treat food refusal.[26] Food refusal is when an individual has a partial or total aversion to food items. It can range from mild picky eating to refusal so severe that it affects the individual's health. Shaping has been reported to have a high success rate in improving food acceptance.[27]
Chaining
Chaining involves linking discrete behaviors together in a series, such that the consequence of each behavior is both the reinforcement for the previous behavior, and the antecedent stimulus for the next behavior. There are many ways to teach chaining, such as forward chaining (starting from the first behavior in the chain), backwards chaining (starting from the last behavior) and total task chaining (teaching each behavior in the chain simultaneously). People's morning routines are a typical chain, with a series of behaviors (e.g. showering, drying off, getting dressed) occurring in sequence as a well learned habit.
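The three teaching orders can be sketched as lists of training stages. This is an illustrative sketch only: the function names are hypothetical, and it assumes that each stage practices the steps already learned plus one new step, which is one common way of arranging forward and backward chaining.

```python
def forward_chaining_stages(chain):
    """Stages that start from the first behavior and add one step at a time."""
    steps = list(chain)
    return [steps[:i] for i in range(1, len(steps) + 1)]


def backward_chaining_stages(chain):
    """Stages that start from the last behavior and add earlier steps."""
    steps = list(chain)
    return [steps[-i:] for i in range(1, len(steps) + 1)]


def total_task_stages(chain):
    """A single stage in which every behavior in the chain is taught together."""
    return [list(chain)]


# The morning routine used as an example in the text above.
routine = ["shower", "dry off", "get dressed"]
for stage in backward_chaining_stages(routine):
    print(" -> ".join(stage))
```

For the morning routine, backward chaining in this sketch first trains "get dressed" alone, then "dry off" followed by "get dressed", and finally the whole sequence.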
Challenging behaviors seen in individuals with autism and other related disabilities have been successfully managed in studies using chained schedules of reinforcement.[28] Functional communication training is an intervention that often uses chained schedules of reinforcement to effectively promote the appropriate and desired functional communication response.[29]
Mathematical models
There has been research on building a mathematical model of reinforcement. This model is known as MPR, which is short for mathematical principles of reinforcement. Peter Killeen has made key discoveries in the field with his research on pigeons.[30]
Applications
Reinforcement and punishment are ubiquitous in human social interactions, and a great many applications of operant principles have been suggested and implemented. Following are a few examples.
Addiction and dependence
Positive and negative reinforcement play central roles in the development and maintenance of addiction and drug dependence. An addictive drug is intrinsically rewarding; that is, it functions as a primary positive reinforcer of drug use. The brain's reward system assigns it incentive salience (i.e., it is "wanted" or "desired"),[31][32][33] so as an addiction develops, deprivation of the drug leads to craving. In addition, stimuli associated with drug use – e.g., the sight of a syringe, and the location of use – become associated with the intense reinforcement induced by the drug.[31][32][33] These previously neutral stimuli acquire several properties: their appearance can induce craving, and they can become conditioned positive reinforcers of continued use.[31][32][33] Thus, if an addicted individual encounters one of these drug cues, a craving for the associated drug may reappear. For example, anti-drug agencies previously used posters with images of drug paraphernalia as an attempt to show the dangers of drug use. However, such posters are no longer used because of the effects of incentive salience in causing relapse upon sight of the stimuli illustrated in the posters.
In drug dependent individuals, negative reinforcement occurs when a drug is self-administered in order to alleviate or "escape" the symptoms of physical dependence (e.g., tremors and sweating) and/or psychological dependence (e.g., anhedonia, restlessness, irritability, and anxiety) that arise during the state of drug withdrawal.[31]
Animal training
Animal trainers and pet owners were applying the principles and practices of operant conditioning long before these ideas were named and studied, and animal training still provides one of the clearest and most convincing examples of operant control. Of the concepts and procedures described in this article, a few of the most salient are: availability of immediate reinforcement (e.g. the ever-present bag of dog yummies); contingency, assuring that reinforcement follows the desired behavior and not something else; the use of secondary reinforcement, as in sounding a clicker immediately after a desired response; shaping, as in gradually getting a dog to jump higher and higher; intermittent reinforcement, reducing the frequency of those yummies to induce persistent behavior without satiation; chaining, where a complex behavior is gradually put together.[34]
Child behavior – parent management training
Providing positive reinforcement for appropriate child behaviors is a major focus of parent management training. Typically, parents learn to reward appropriate behavior through social rewards (such as praise, smiles, and hugs) as well as concrete rewards (such as stickers or points towards a larger reward as part of an incentive system created collaboratively with the child).[35] In addition, parents learn to select simple behaviors as an initial focus and reward each of the small steps that their child achieves towards reaching a larger goal (this concept is called "successive approximations").[35][36] They may also use indirect rewards such as progress charts. Providing positive reinforcement in the classroom can be beneficial to student success. When applying positive reinforcement to students, it is crucial to individualize it to each student's needs. This way, the student understands why they are receiving the praise, can accept it, and eventually learns to continue the action that earned the positive reinforcement. For example, rewards or extra recess time might work better for some students, whereas others might respond better to stickers or check marks indicating praise.
Economics
Both psychologists and economists have become interested in applying operant concepts and findings to the behavior of humans in the marketplace. An example is the analysis of consumer demand, as indexed by the amount of a commodity that is purchased. In economics, the degree to which price influences consumption is called "the price elasticity of demand." Certain commodities are more elastic than others; for example, a change in price of certain foods may have a large effect on the amount bought, while gasoline and other essentials may be less affected by price changes. In terms of operant analysis, such effects may be interpreted in terms of motivations of consumers and the relative value of the commodities as reinforcers.[37]
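Price elasticity of demand can be made concrete with a small calculation. The sketch below uses the simple percentage-change definition (percentage change in quantity divided by percentage change in price); the function name and all prices and quantities are hypothetical and serve only to contrast an elastic commodity with an inelastic one.

```python
def price_elasticity_of_demand(q_old, q_new, p_old, p_new):
    """Simple (non-midpoint) elasticity: % change in quantity / % change in price."""
    if q_old == 0 or p_old == 0:
        raise ValueError("starting quantity and price must be non-zero")
    pct_quantity = (q_new - q_old) / q_old
    pct_price = (p_new - p_old) / p_old
    if pct_price == 0:
        raise ValueError("price must change to compute an elasticity")
    return pct_quantity / pct_price


# Hypothetical figures: both prices rise by 10% (from 10 to 11).
snack_food = price_elasticity_of_demand(q_old=100, q_new=80, p_old=10, p_new=11)
gasoline = price_elasticity_of_demand(q_old=100, q_new=95, p_old=10, p_new=11)

print(f"Snack food elasticity: {snack_food:.1f}")  # larger magnitude: more elastic
print(f"Gasoline elasticity:   {gasoline:.1f}")    # smaller magnitude: less elastic
```

In these assumed figures, the same price increase reduces snack food purchases by 20% but gasoline purchases by only 5%, so the snack food is the more elastic commodity.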
Gambling – variable ratio scheduling
As stated earlier in this article, a variable ratio schedule yields reinforcement after the emission of an unpredictable number of responses. This schedule typically generates rapid, persistent responding. Slot machines pay off on a variable ratio schedule, and they produce just this sort of persistent lever-pulling behavior in gamblers. Because the machines are programmed to pay out less money than they take in, the persistent slot-machine user invariably loses in the long run. Slot machines, and thus variable ratio reinforcement, have often been blamed as a factor underlying gambling addiction.[38]
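Both points in this paragraph, the unpredictable response requirement and the long-run loss, can be sketched in a few lines. This is an illustrative model, not a description of how any real slot machine is programmed: the function names, the uniform distribution of requirements, and the win probability, payout, and stake are all assumptions.

```python
import random


def variable_ratio_requirements(mean_ratio, count, rng):
    """Draw `count` response requirements for a variable ratio schedule.

    Assumption: each requirement is uniform on 1..(2 * mean_ratio - 1), so the
    average requirement equals `mean_ratio` while any single one is unpredictable.
    """
    if mean_ratio < 1:
        raise ValueError("mean_ratio must be at least 1")
    return [rng.randint(1, 2 * mean_ratio - 1) for _ in range(count)]


def expected_net_per_play(win_probability, payout, stake):
    """Average gain (negative means loss) per play for a fixed-odds machine."""
    return win_probability * payout - stake


rng = random.Random(0)  # seeded so the sketch is repeatable
requirements = variable_ratio_requirements(mean_ratio=10, count=5, rng=rng)
print("Responses required before each payoff:", requirements)

# Hypothetical machine: 10% chance of winning 9 units on a 1-unit stake.
net = expected_net_per_play(win_probability=0.10, payout=9.0, stake=1.0)
print(f"Expected net result per play: {net:.2f} units")
```

Because the assumed machine returns less on average than the stake, the expected result per play is negative, and the loss grows with the number of plays even though individual payoffs arrive unpredictably.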
Praise
The concept of praise as a means of behavioral reinforcement in humans is rooted in B.F. Skinner's model of operant conditioning. Through this lens, praise has been viewed as a means of positive reinforcement, wherein an observed behavior is made more likely to occur by contingently praising said behavior.[39] Hundreds of studies have demonstrated the effectiveness of praise in promoting positive behaviors, notably in the study of teacher and parent use of praise with children to promote improved behavior and academic performance,[40][41] but also in the study of work performance.[42] Praise has also been demonstrated to reinforce positive behaviors in non-praised adjacent individuals (such as a classmate of the praise recipient) through vicarious reinforcement.[43] Praise may be more or less effective in changing behavior depending on its form, content and delivery. In order for praise to effect positive behavior change, it must be contingent on the positive behavior (i.e., only administered after the targeted behavior is enacted), must specify the particulars of the behavior that is to be reinforced, and must be delivered sincerely and credibly.[44]
Acknowledging the effect of praise as a positive reinforcement strategy, numerous behavioral and cognitive behavioral interventions have incorporated the use of praise in their protocols.[45][46] The strategic use of praise is recognized as an evidence-based practice in both classroom management[45] and parenting training interventions,[41] though praise is often subsumed in intervention research into a larger category of positive reinforcement, which includes strategies such as strategic attention and behavioral rewards.
Traumatic bonding
Traumatic bonding occurs as the result of ongoing cycles of abuse in which the intermittent reinforcement of reward and punishment creates powerful emotional bonds that are resistant to change.[47][48]
According to another source,[49] 'The necessary conditions for traumatic bonding are that one person must dominate the other and that the level of abuse chronically spikes and then subsides. The relationship is characterized by periods of permissive, compassionate, and even affectionate behavior from the dominant person, punctuated by intermittent episodes of intense abuse. To maintain the upper hand, the victimizer manipulates the behavior of the victim and limits the victim's options so as to perpetuate the power imbalance. Any threat to the balance of dominance and submission may be met with an escalating cycle of punishment ranging from seething intimidation to intensely violent outbursts. The victimizer also isolates the victim from other sources of support, which reduces the likelihood of detection and intervention, impairs the victim's ability to receive countervailing self-referent feedback, and strengthens the sense of unilateral dependency ... The traumatic effects of these abusive relationships may include the impairment of the victim's capacity for accurate self-appraisal, leading to a sense of personal inadequacy and a subordinate sense of dependence upon the dominating person. Victims also may encounter a variety of unpleasant social and legal consequences of their emotional and behavioral affiliation with someone who perpetrated aggressive acts, even if they themselves were the recipients of the aggression.'
Video games
Most video games are designed around some type of compulsion loop, adding a type of positive reinforcement through a variable ratio schedule to keep the player playing the game, though this can also lead to video game addiction.[50]
As part of a trend in the monetization of video games in the 2010s, some games offered "loot boxes", either as rewards or for purchase with real-world funds, that contained a random selection of in-game items distributed by rarity. The practice has been tied to the same methods by which slot machines and other gambling devices dole out rewards, as it follows a variable ratio schedule. While loot boxes are generally perceived as a form of gambling, the practice is classified as gambling in only a few countries and is otherwise legal. However, methods of using those items as virtual currency for online gambling, or of trading them for real-world money, have created a skin gambling market that is under legal evaluation.[51]
Criticisms
The standard definition of behavioral reinforcement has been criticized as circular, since it appears to argue that response strength is increased by reinforcement, and defines reinforcement as something that increases response strength (i.e., response strength is increased by things that increase response strength). However, the correct usage[52] of reinforcement is that something is a reinforcer because of its effect on behavior, and not the other way around. It becomes circular if one says that a particular stimulus strengthens behavior because it is a reinforcer, and does not explain why a stimulus is producing that effect on the behavior. Other definitions have been proposed, such as F.D. Sheffield's "consummatory behavior contingent on a response", but these are not broadly used in psychology.[53]
Increasingly, understanding of the role reinforcers play is moving away from a "strengthening" effect to a "signalling" effect.[54] On this view, reinforcers increase responding because they signal the behaviors that are likely to result in reinforcement. While in most practical applications the effect of any given reinforcer will be the same regardless of whether the reinforcer is signalling or strengthening, this approach helps to explain a number of behavioral phenomena, including patterns of responding on intermittent reinforcement schedules (fixed interval scallops) and the differential outcomes effect.[55]
References
- ^ Definition of reinforcement from the American Psychological Association. Retrieved on January 30th, 2024
- ^ Schultz W (July 2015). "Neuronal Reward and Decision Signals: From Theories to Data". Physiological Reviews. 95 (3): 853–951. doi:10.1152/physrev.00023.2014. PMC 4491543. PMID 26109341.
Rewards in operant conditioning are positive reinforcers. ... Operant behavior gives a good definition for rewards. Anything that makes an individual come back for more is a positive reinforcer and therefore a reward. Although it provides a good definition, positive reinforcement is only one of several reward functions. ... Rewards are attractive. They are motivating and make us exert an effort. ... Rewards induce approach behavior, also called appetitive or preparatory behavior, and consummatory behavior. ... Thus any stimulus, object, event, activity, or situation that has the potential to make us approach and consume it is by definition a reward. ... Intrinsic rewards are activities that are pleasurable on their own and are undertaken for their own sake, without being the means for getting extrinsic rewards. ... Intrinsic rewards are genuine rewards in their own right, as they induce learning, approach, and pleasure, like perfectioning, playing, and enjoying the piano. Although they can serve to condition higher order rewards, they are not conditioned, higher order rewards, as attaining their reward properties does not require pairing with an unconditioned reward.
- ^ Malenka RC, Nestler EJ, Hyman SE (2009). "Chapter 15: Reinforcement and Addictive Disorders". In Sydor A, Brown RY (eds.). Molecular Neuropharmacology: A Foundation for Clinical Neuroscience (2nd ed.). New York: McGraw-Hill Medical. pp. 364–375. ISBN 9780071481274.
- ^ Nestler EJ (December 2013). "Cellular basis of memory for addiction". Dialogues in Clinical Neuroscience. 15 (4): 431–443. PMC 3898681. PMID 24459410.
Despite the importance of numerous psychosocial factors, at its core, drug addiction involves a biological process: the ability of repeated exposure to a drug of abuse to induce changes in a vulnerable brain that drive the compulsive seeking and taking of drugs, and loss of control over drug use, that define a state of addiction. ... A large body of literature has demonstrated that such ΔFosB induction in D1-type [nucleus accumbens] neurons increases an animal's sensitivity to drug as well as natural rewards and promotes drug self-administration, presumably through a process of positive reinforcement ... Another ΔFosB target is cFos: as ΔFosB accumulates with repeated drug exposure it represses c-Fos and contributes to the molecular switch whereby ΔFosB is selectively induced in the chronic drug-treated state.41. ... Moreover, there is increasing evidence that, despite a range of genetic risks for addiction across the population, exposure to sufficiently high doses of a drug for long periods of time can transform someone who has relatively lower genetic loading into an addict.
- ^ Volkow ND, Koob GF, McLellan AT (January 2016). "Neurobiologic Advances from the Brain Disease Model of Addiction". New England Journal of Medicine. 374 (4): 363–371. doi:10.1056/NEJMra1511480. PMC 6135257. PMID 26816013.
Substance-use disorder: A diagnostic term in the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) referring to recurrent use of alcohol or other drugs that causes clinically and functionally significant impairment, such as health problems, disability, and failure to meet major responsibilities at work, school, or home. Depending on the level of severity, this disorder is classified as mild, moderate, or severe.
Addiction: A term used to indicate the most severe, chronic stage of substance-use disorder, in which there is a substantial loss of self-control, as indicated by compulsive drug taking despite the desire to stop taking the drug. In the DSM-5, the term addiction is synonymous with the classification of severe substance-use disorder.
- ^ Thorndike E (June 1898). "Some Experiments on Animal Intelligence". Science. 7 (181): 818–24. Bibcode:1898Sci.....7..818T. doi:10.1126/science.7.181.818. PMID 17769765.
- ^ Skinner, B. F. "The Behavior of Organisms: An Experimental Analysis", 1938 New York: Appleton-Century-Crofts
- ^ Skinner BF (1948). Walden Two. Toronto: The Macmillan Company.
- ^ Honig W (1966). Operant Behavior: Areas of Research and Application. New York: Meredith Publishing Company. p. 381.
- ^ a b c d Flora S (2004). The Power of Reinforcement. Albany: State University of New York Press.
- ^ D'Amato MR (1969). Marx MH (ed.). Learning Processes: Instrumental Conditioning. Toronto: The Macmillan Company.
- ^ Harter JK (2002). Keyes CL (ed.). Well-Being in the Workplace and its Relationship to Business Outcomes: A Review of the Gallup Studies (PDF). Washington D.C.: American Psychological Association.
- ^ Skinner, B.F. (1974). About Behaviorism
- ^ a b c d e f g Miltenberger, R. G. "Behavioral Modification: Principles and Procedures". Thomson/Wadsworth, 2008.
- ^ Tucker M, Sigafoos J, Bushell H (October 1998). "Use of noncontingent reinforcement in the treatment of challenging behavior. A review and clinical guide". Behavior Modification. 22 (4): 529–47. doi:10.1177/01454455980224005. PMID 9755650. S2CID 21542125.
- ^ Droleskey RE, Andrews K, Chiarantini L, DeLoach JR (1992). "Use of fluorescent probes for describing the process of encapsulation by hypotonic dialysis". The Use of Resealed Erythrocytes as Carriers and Bioreactors. Advances in Experimental Medicine and Biology. Vol. 326. pp. 73–80. doi:10.1007/978-1-4615-3030-5_9. ISBN 978-1-4613-6321-7. PMID 1284187.
- ^ Baer DM, Wolf MM. "The entry into natural communities of reinforcement". In Ulrich R, Stachnik T, Mabry J (eds.). Control of human behavior. Vol. 2. Glenview, IL: Scott Foresman. pp. 319–24.
- ^ Kohler FW, Greenwood CR (1986). "Toward a technology of generalization: The identification of natural contingencies of reinforcement". The Behavior Analyst. 9 (1): 19–26. doi:10.1007/bf03391926. PMC 2741872. PMID 22478644.
- ^ Vollmer TR, Iwata BA (1992). "Differential reinforcement as treatment for behavior disorders: procedural and functional variations". Research in Developmental Disabilities. 13 (4): 393–417. doi:10.1016/0891-4222(92)90013-v. PMID 1509180.
- ^ Derenne A, Flannery KA (2007). "Within Session FR Pausing". The Behavior Analyst Today. 8 (2): 175–86. doi:10.1037/h0100611.
- ^ McSweeney FK, Murphy ES, Kowal BP (2001). "Dynamic changes in reinforcer value: Some misconceptions and why you should care". The Behavior Analyst Today. 2 (4): 341–349. doi:10.1037/h0099952.
- ^ Iversen IH, Lattal KA (1991). Experimental Analysis of Behavior. Amsterdam: Elsevier. ISBN 9781483291260.
- ^ Martin TL, Yu CT, Martin GL, Fazzio D (2006). "On Choice, Preference, and Preference For Choice". The Behavior Analyst Today. 7 (2): 234–48. doi:10.1037/h0100083. PMC 3558524. PMID 23372459.
- ^ Schacter DL, Gilbert DT, Wegner DM (2011). "Chapter 7: Learning". Psychology (2nd ed.). New York: Worth Publishers. pp. 284–85. ISBN 978-1-4292-3719-2.
- ^ Ghaemmaghami, Mahshid; Hanley, Gregory P.; Jessel, Joshua; Landa, Robin (14 May 2018). "Shaping complex functional communication responses". Journal of Applied Behavior Analysis. 51 (3): 502–520. doi:10.1002/jaba.468. ISSN 0021-8855. PMID 29761485.
- ^ Tarbox and Lanagan Bermudez, Jonathan and Taira (2017). Treating Feeding Challenges in Autism. San Diego: Academic Press. pp. 1–6. ISBN 978-0-12-813563-1.
- ^ Turner, Virginia R; et al. (2020). "Response Shaping to Improve Food Acceptance for Children with Autism: Effects of Small and Large Food Sets". Research in Developmental Disabilities. 98: 103574. doi:10.1016/j.ridd.2020.103574. PMID 31982827. S2CID 210922007.
- ^ "CORRIGENDUM to "Further Evaluations of Functional Communication Training and Chained Schedules of Reinforcement to Treat Multiple Functions of Challenging Behavior"". Behavior Modification. 46 (1): 254. 24 July 2020. doi:10.1177/0145445520945810. ISSN 0145-4455. PMID 32706269. S2CID 241136859.
- ^ Falcomata, Terry S.; Roane, Henry S.; Muething, Colin S.; Stephenson, Kasey M.; Ing, Anna D. (9 February 2012). "Functional Communication Training and Chained Schedules of Reinforcement to Treat Challenging Behavior Maintained by Terminations of Activity Interruptions". Behavior Modification. 36 (5): 630–649. doi:10.1177/0145445511433821. ISSN 0145-4455. PMID 22327267. S2CID 29108702.
- ^ Killeen PR (4 February 2010). "Mathematical principles of reinforcement". Behavioral and Brain Sciences. 17 (1): 105–135. doi:10.1017/S0140525X00033628.
- ^ a b c d Edwards S (2016). "Reinforcement principles for addiction medicine; from recreational drug use to psychiatric disorder". Neuroscience for Addiction Medicine: From Prevention to Rehabilitation - Constructs and Drugs. Progress in Brain Research. Vol. 223. pp. 63–76. doi:10.1016/bs.pbr.2015.07.005. ISBN 9780444635457. PMID 26806771.
Abused substances (ranging from alcohol to psychostimulants) are initially ingested at regular occasions according to their positive reinforcing properties. Importantly, repeated exposure to rewarding substances sets off a chain of secondary reinforcing events, whereby cues and contexts associated with drug use may themselves become reinforcing and thereby contribute to the continued use and possible abuse of the substance(s) of choice. ...
An important dimension of reinforcement highly relevant to the addiction process (and particularly relapse) is secondary reinforcement (Stewart, 1992). Secondary reinforcers (in many cases also considered conditioned reinforcers) likely drive the majority of reinforcement processes in humans. In the specific case of drug [addiction], cues and contexts that are intimately and repeatedly associated with drug use will often themselves become reinforcing ... A fundamental piece of Robinson and Berridge's incentive-sensitization theory of addiction posits that the incentive value or attractive nature of such secondary reinforcement processes, in addition to the primary reinforcers themselves, may persist and even become sensitized over time in league with the development of drug addiction (Robinson and Berridge, 1993). ...
Negative reinforcement is a special condition associated with a strengthening of behavioral responses that terminate some ongoing (presumably aversive) stimulus. In this case we can define a negative reinforcer as a motivational stimulus that strengthens such an "escape" response. Historically, in relation to drug addiction, this phenomenon has been consistently observed in humans whereby drugs of abuse are self-administered to quench a motivational need in the state of withdrawal (Wikler, 1952).
- ^ a b c Berridge KC (April 2012). "From prediction error to incentive salience: mesolimbic computation of reward motivation". The European Journal of Neuroscience. 35 (7): 1124–43. doi:10.1111/j.1460-9568.2012.07990.x. PMC 3325516. PMID 22487042.
When a Pavlovian CS+ is attributed with incentive salience it not only triggers 'wanting' for its UCS, but often the cue itself becomes highly attractive – even to an irrational degree. This cue attraction is another signature feature of incentive salience. The CS becomes hard not to look at (Wiers & Stacy, 2006; Hickey et al., 2010a; Piech et al., 2010; Anderson et al., 2011). The CS even takes on some incentive properties similar to its UCS. An attractive CS often elicits behavioral motivated approach, and sometimes an individual may even attempt to 'consume' the CS somewhat as its UCS (e.g., eat, drink, smoke, have sex with, take as drug). 'Wanting' of a CS can turn also turn the formerly neutral stimulus into an instrumental conditioned reinforcer, so that an individual will work to obtain the cue (however, there exist alternative psychological mechanisms for conditioned reinforcement too).
- ^ a b c Berridge KC, Kringelbach ML (May 2015). "Pleasure systems in the brain". Neuron. 86 (3): 646–64. doi:10.1016/j.neuron.2015.02.018. PMC 4425246. PMID 25950633.
An important goal in future for addiction neuroscience is to understand how intense motivation becomes narrowly focused on a particular target. Addiction has been suggested to be partly due to excessive incentive salience produced by sensitized or hyper-reactive dopamine systems that produce intense 'wanting' (Robinson and Berridge, 1993). But why one target becomes more 'wanted' than all others has not been fully explained. In addicts or agonist-stimulated patients, the repetition of dopamine-stimulation of incentive salience becomes attributed to particular individualized pursuits, such as taking the addictive drug or the particular compulsions. In Pavlovian reward situations, some cues for reward become more 'wanted' more than others as powerful motivational magnets, in ways that differ across individuals (Robinson et al., 2014b; Saunders and Robinson, 2013). ... However, hedonic effects might well change over time. As a drug was taken repeatedly, mesolimbic dopaminergic sensitization could consequently occur in susceptible individuals to amplify 'wanting' (Leyton and Vezina, 2013; Lodge and Grace, 2011; Wolf and Ferrario, 2010), even if opioid hedonic mechanisms underwent down-regulation due to continual drug stimulation, producing 'liking' tolerance. Incentive-sensitization would produce addiction, by selectively magnifying cue-triggered 'wanting' to take the drug again, and so powerfully cause motivation even if the drug became less pleasant (Robinson and Berridge, 1993).
- ^ McGreevy PD, Boakes RA (2007). Carrots and sticks: principles of animal training. Cambridge: Cambridge University Press. ISBN 978-0-521-68691-4.
- ^ a b Kazdin AE (2010). Problem-solving skills training and parent management training for oppositional defiant disorder and conduct disorder. Evidence-based psychotherapies for children and adolescents (2nd ed.), 211–226. New York: Guilford Press.
- ^ Forgatch MS, Patterson GR (2010). Parent management training — Oregon model: An intervention for antisocial behavior in children and adolescents. Evidence-based psychotherapies for children and adolescents (2nd ed.), 159–78. New York: Guilford Press.
- ^ Domjan, M. (2009). The Principles of Learning and Behavior. Wadsworth Publishing Company. 6th Edition. pages 244–249.
- ^ Lozano Bleda JH, Pérez Nieto MA (November 2012). "Impulsivity, intelligence, and discriminating reinforcement contingencies in a fixed-ratio 3 schedule". The Spanish Journal of Psychology. 15 (3): 922–9. doi:10.5209/rev_sjop.2012.v15.n3.39384. PMID 23156902. S2CID 144193503.
- ^ Kazdin, Alan (1978). History of behavior modification: Experimental foundations of contemporary research. Baltimore: University Park Press. ISBN 9780839112051.
- ^ Baker GL, Barnes HJ (1992). "Superior vena cava syndrome: etiology, diagnosis, and treatment". American Journal of Critical Care. 1 (1): 54–64. doi:10.4037/ajcc1992.1.1.54. PMID 1307879.
- ^ a b Garland AF, Hawley KM, Brookman-Frazee L, Hurlburt MS (May 2008). "Identifying common elements of evidence-based psychosocial treatments for children's disruptive behavior problems". Journal of the American Academy of Child and Adolescent Psychiatry. 47 (5): 505–14. doi:10.1097/CHI.0b013e31816765c2. PMID 18356768.
- ^ Crowell CR, Anderson DC, Abel DM, Sergio JP (1988). "Task clarification, performance feedback, and social praise: Procedures for improving the customer service of bank tellers". Journal of Applied Behavior Analysis. 21 (1): 65–71. doi:10.1901/jaba.1988.21-65. PMC 1286094. PMID 16795713.
- ^ Goldman NC (1992). "Adenoid cystic carcinoma of the external auditory canal". Otolaryngology–Head and Neck Surgery. 106 (2): 214–5. doi:10.1177/019459989210600211. PMID 1310808. S2CID 23782303.
- ^ Brophy J (1981). "On praising effectively". The Elementary School Journal. 81 (5): 269–278. doi:10.1086/461229. JSTOR 1001606. S2CID 144444174.
- ^ a b Simonsen B, Fairbanks S, Briesch A, Myers D, Sugai G (2008). "Evidence-based Practices in Classroom Management: Considerations for Research to Practice". Education and Treatment of Children. 31 (1): 351–380. doi:10.1353/etc.0.0007. S2CID 145087451.
- ^ Weisz JR, Kazdin AE (2010). Evidence-based psychotherapies for children and adolescents. Guilford Press. ISBN 9781606235256.
- ^ Dutton; Painter (1981). "Traumatic Bonding: The development of emotional attachments in battered women and other relationships of intermittent abuse". Victimology (7).
- ^ Chrissie Sanderson. Counselling Survivors of Domestic Abuse. Jessica Kingsley Publishers; 15 June 2008. ISBN 978-1-84642-811-1. p. 84.
- ^ "Traumatic Bonding | Encyclopedia.com".
- ^ Hopson J (27 April 2001). "Behavioral Game Design". Gamasutra.
- ^ Hood V (12 October 2017). "Are loot boxes gambling?". Eurogamer. Retrieved 12 October 2017.
- ^ Skinner BF (1982). Epstein R (ed.). Skinner for the classroom : selected papers. Champaign, Ill.: Research Press. ISBN 978-0-87822-261-2.
- ^ Vaccarino FJ, Schiff BB, Glickman SE (1989). Mowrer RR, Klein SB (eds.). Contemporary learning theories. Hillsdale, N.J.: Lawrence Erlbaum Associates. ISBN 978-0-89859-915-2.
- ^ Cowie S, Davison M, Elliffe D (July 2011). "Reinforcement: food signals the time and location of future food". Journal of the Experimental Analysis of Behavior. 96 (1): 63–86. doi:10.1901/jeab.2011.96-63. PMC 3136894. PMID 21765546.
- ^ McCormack J, Arnold-Saritepe A, Elliffe D (June 2017). "The differential outcomes effect in children with autism". Behavioral Interventions. 32 (4): 357–369. doi:10.1002/bin.1489.
Further reading
- Brechner KC (1974). An experimental analysis of social traps (PhD thesis). Arizona State University.
- Brechner KC (1977). "An experimental analysis of social traps". Journal of Experimental Social Psychology. 13 (6): 552–64. doi:10.1016/0022-1031(77)90054-3.
- Brechner KC (1987). Social Traps, Individual Traps, and Theory in Social Psychology. Bulletin No. 870001. Pasadena, CA: Time River Laboratory.
- Brechner KC (28 February 2003). "Superimposed schedules applied to rent control". In Levine DK, Pesendorfer W (eds.). Economic and Game Theory.
- Brechner KC, Linder DE (1981). "A social trap analysis of energy distribution systems". In Baum A, Singer JE (eds.). Advances in Environmental Psychology. Vol. 3. Hillsdale, NJ: Lawrence Erlbaum & Associates.
- Chance P (2003). Learning and Behavior (5th ed.). Toronto: Thomson-Wadsworth.
- Cowie S (2019). "Some weaknesses of a response-strength account of reinforcer effects". European Journal of Behavior Analysis. 21 (2): 1–16. doi:10.1080/15021149.2019.1685247. S2CID 210503231.
- Dinsmoor JA (November 2004). "The etymology of basic concepts in the experimental analysis of behavior". Journal of the Experimental Analysis of Behavior. 82 (3): 311–6. doi:10.1901/jeab.2004.82-311. PMC 1285013. PMID 15693525.
- Ferster CB, Skinner BF (1957). Schedules of reinforcement. New York: Appleton-Century-Crofts. ISBN 0-13-792309-0.
- Lewin K (1935). A dynamic theory of personality: Selected papers. New York: McGraw-Hill. ISBN 9781447497134.
- Skinner BF (1938). The behavior of organisms. New York: Appleton-Century-Crofts. ISBN 9780996453905.
- Skinner BF (1956). "A case history in scientific method". American Psychologist. 11 (5): 221–33. doi:10.1037/h0047662.
- Zeiler MD (July 1968). "Fixed and variable schedules of response-independent reinforcement". Journal of the Experimental Analysis of Behavior. 11 (4): 405–14. doi:10.1901/jeab.1968.11-405. PMC 1338502. PMID 5672249.
- "Glossary of reinforcement terms". University of Iowa. Archived from the original on 13 April 2007.
- Harter JK, Schmidt FL, Keyes CL (2002). "Well-Being in the Workplace and its Relationship to Business Outcomes: A Review of the Gallup Studies". In Keyes CL, Haidt J (eds.). Flourishing: The Positive Person and the Good Life. Washington D.C.: American Psychological Association. pp. 205–224.
External links
- An On-Line Positive Reinforcement Tutorial
- Scholarpedia Reinforcement
- scienceofbehavior.com Archived 2 October 2011 at the Wayback Machine
- ^ Burdon, William M.; St. De Lore, Jef; Prendergast, Michael L. (7 September 2011). "Developing and Implementing a Positive Behavioral Reinforcement Intervention in Prison-Based Drug Treatment: Project BRITE". Journal of Psychoactive Drugs. 43 (sup1): 40–50. doi:10.1080/02791072.2011.601990. ISSN 0279-1072. PMC 3429341. PMID 22185038.