The purpose of Experiment 2 was to repeat the procedure used in Experiment 1 but to equate the number of reinforcers that could be obtained on the fixed ratio and improving progressive schedules. Thus, in Experiment 2, independent of the pigeon's choice, four reinforcers could be obtained on either schedule. If the pigeon chose the fixed ratio schedule, it could obtain four reinforcers per trial, each requiring 23 pecks. If the pigeon chose the improving progressive schedule, it would also obtain four reinforcers, the first after 32 pecks, the second after 16 pecks, the third after 8 pecks, and the fourth after 4 pecks. If, for any reason, the pigeon switched between schedules, it would still obtain a total of four reinforcers on that trial.
Although the number of pecks required for each reinforcer on the improving progressive schedule regularly decreased, theories of delay discounting (Mazur, 1989) should assign greater value to the shorter delay associated with the first reinforcer. According to delay discounting, at the time of the initial choice, later reinforcers should be discounted to the point that they have greatly reduced value, especially for pigeons, a species known to have very steep delay discounting functions (Green & Myerson, 2004).
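This prediction can be illustrated with the hyperbolic discounting function commonly used to describe animal choice, in which the value V of a reinforcer of amount A declines with its delay D. The specific discounting parameter k used below is an arbitrary value chosen for illustration, not an estimate from the present data; delay is indexed here by the number of pecks required.

```latex
\[
V = \frac{A}{1 + kD}
\]
% Illustration with A = 1 and an arbitrary k = 0.2 per peck:
%   First reinforcer on the progressive schedule (D = 32):
%     V = 1/(1 + 0.2(32)) = 1/7.4 \approx 0.14
%   First reinforcer on the fixed ratio schedule (D = 23):
%     V = 1/(1 + 0.2(23)) = 1/5.6 \approx 0.18
```

With discounting this steep, the later reinforcers on the progressive schedule (the 16-, 8-, and 4-peck requirements) should add relatively little value at the time of choice, so the fixed ratio alternative should be favored.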
Method
Subjects
The subjects in Experiment 2 were nine unsexed White Carneau pigeons purchased from the Palmetto Pigeon Plant, Sumter, SC. All of the subjects had experience learning to discriminate colors. The pigeons were housed and treated as were the pigeons in Experiment 1.
Apparatus
The apparatus was the same as that used in Experiment 1.
Procedure
The pretraining in Experiment 2 was the same as in Experiment 1. The training in Experiment 2 was similar to that of Experiment 1, with the exception that choice of the fixed ratio schedule allowed the pigeon to obtain four successive fixed ratio reinforcers (each requiring 23 pecks), the same number of reinforcers as choice of the progressive schedule (which required 32, 16, 8, and 4 pecks). The design of Experiment 2 is presented in Figure 4. If a pigeon switched from one schedule to the other during a trial, it would still receive a total of four reinforcers on that trial (in fact, that never happened). Once again, each session consisted of 48 trials, with a 5-s intertrial interval. Because there was little evidence of a change in preference over training in Experiment 1, there were only 15 sessions of training in Experiment 2. Again, sessions were conducted six days a week.
Results
Once again, the pigeons quickly showed a preference for the improving progressive schedule. The training results are presented in Figure 5. A one-way repeated measures analysis of variance performed on the preference scores as a function of training session indicated that the effect of sessions was not significant, F(1, 14) = 1.35, p > .05. When the data were pooled over all 15 training sessions and tested for schedule preference, the preference for the improving progressive schedule (.67) was significantly different from chance (.50), t(11) = 5.20, p = .0003, d = 3.14.
Discussion
The results of Experiment 2 indicate that, in spite of the fact that the number of reinforcers was equated for the fixed-ratio 23-peck alternative and the improving progressive schedule starting with 32 pecks, the pigeons showed a significant preference for the progressive decreasing schedule. These results are inconsistent with what one would expect for pigeons based on their steeply decreasing hyperbolic delay discounting functions (Green & Myerson, 2004; Laude et al., 2014).
General Discussion
The results of the present experiments confirm and extend the result reported by Chandel et al. (2021), who found that pigeons left a progressively increasing ratio schedule earlier than would have been optimal. Specifically, the pigeons chose a larger number of pecks to reinforcement (on a signaled fixed ratio schedule) over a smaller number of pecks to reinforcement (on a progressive schedule with an increasing number of pecks to reinforcement). They presumably did so because the smaller number of pecks on the progressive schedule would be followed by an increasing number of pecks to subsequent reinforcers.
In the present experiments, we tested whether pigeons would choose to make a larger number of pecks (32) to reinforcement over a smaller number of pecks to reinforcement (23) if the larger number of pecks to the first reinforcer led to a smaller number of pecks to the second reinforcer (16), and so on. Delay discounting theory would predict that the pigeons should weigh the first reinforcer obtainable from choice of an alternative more heavily than successive reinforcers.
It is possible that delay discounting theory would predict that, because the second reinforcer on the improving progressive (decreasing requirement) schedule would come earlier than the second reinforcer on the fixed ratio schedule (16 pecks vs. 23 pecks), the progressive schedule might be preferred. However, the added delay from the initial schedule choice to the second reinforcer should make the difference in value between the 23 pecks and the 16 pecks required to obtain the second reinforcer quite small. Furthermore, the total number of pecks to the second reinforcer following choice of the fixed ratio schedule, 46 pecks (two times 23 pecks), is still fewer than the total number of pecks to the second reinforcer following choice of the progressive decreasing requirement schedule, 48 pecks (32 plus 16 pecks).
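The cumulative peck requirements implied by the two schedules, computed from the per-reinforcer requirements stated above, make this comparison explicit:

```latex
% Cumulative pecks to each of the four successive reinforcers on a trial
\begin{align*}
\text{Fixed ratio (23 pecks each):} \quad & 23,\; 46,\; 69,\; 92 \\
\text{Improving progressive (32, 16, 8, 4):} \quad & 32,\; 48,\; 56,\; 60
\end{align*}
```

Thus, the fixed ratio schedule reaches both the first and the second reinforcer sooner (23 < 32 and 46 < 48), yet the progressive schedule requires fewer total pecks over the trial as a whole (60 < 92).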
The present research confirms and extends the finding that pigeons are capable of serial pattern learning. That is, they are capable of taking into account not only the effort and delay to the next reinforcer but also the general pattern of increasing or decreasing effort and delay to successive reinforcers. More generally, this finding has implications for the ability of animals to make choices based on their prospective memory (Cook et al., 1985; Zentall et al., 1990) or future planning (Raby et al., 2007). That is, they have the ability to integrate the pattern of responses and reinforcers that result from their choices.