The fundamental goal of communication is to gain mutual understanding between conversational partners. Achieving this goal becomes more challenging in certain environments, such as in reverberant and noisy rooms, or in situations with communication barriers. Under these conditions, speakers must adjust their speech production and monitor its output to ensure intelligibility is maintained. These adjustments and monitoring processes likely increase the allocation of cognitive resources, yet the extent of resources required for these processes remains poorly understood. To address this knowledge gap, this study examined how speech modification and room acoustics affect a talker’s cognitive load.
The results support the hypothesis that cognitive load is elevated when producing clear speech compared to casual speech, as evidenced by both the subjective ratings of mental demand and the pupillometry data. Clear speech consistently led to higher ratings of mental demand across all room conditions. Furthermore, pupillometry data revealed that the ratio of pupil diameter change was significantly greater for clear speech than for casual speech in the No Effect and Small Room conditions, suggesting an increased cognitive load for clear speech. For casual speech, the ratio of pupil diameter change was less than 1, indicating that pupils constricted relative to the baseline. This constriction is believed to be an artifact of the stimulus presentation slide: unlike the baseline condition, which displayed only a single cross in the middle, the speaking trials showed twenty sentences, making the sentence slides brighter and leading to pupil constriction. Despite this, it is important to emphasize that pupils dilated more for clear speech than for casual speech in these conditions. Additionally, the order of room conditions and speaking styles was randomized to control for the effect of slide content, supporting the conclusion that the observed pupillary responses primarily reflect the cognitive load differences between clear and casual speech styles.
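To make the interpretation of the ratio concrete, the following is a minimal sketch that assumes the ratio is the mean pupil diameter during a speaking trial divided by the mean diameter during the baseline slide. The exact computation used in the study is not restated here, so the function name `pupil_change_ratio` and all sample values are hypothetical and serve only to illustrate why a ratio below 1 indicates constriction relative to baseline.

```python
from statistics import mean


def pupil_change_ratio(trial_diameters_mm, baseline_diameters_mm):
    """Return mean trial diameter divided by mean baseline diameter.

    Assumed definition for illustration only:
    a value below 1 means the pupil was smaller than at baseline,
    a value above 1 means it was larger.
    """
    if not trial_diameters_mm or not baseline_diameters_mm:
        raise ValueError("both sample lists must be non-empty")
    baseline = mean(baseline_diameters_mm)
    if baseline <= 0:
        raise ValueError("baseline diameter must be positive")
    return mean(trial_diameters_mm) / baseline


# Hypothetical samples in millimetres (not data from this study).
baseline = [4.0, 4.0, 4.0]   # single fixation cross, darker slide
casual = [3.6, 3.8, 3.7]     # brighter sentence slide, casual speech
clear = [3.9, 4.0, 3.8]      # brighter sentence slide, clear speech

casual_ratio = pupil_change_ratio(casual, baseline)
clear_ratio = pupil_change_ratio(clear, baseline)

# Both ratios can fall below 1 because of slide brightness, while the
# clear-speech ratio still exceeds the casual-speech ratio.
print(f"casual: {casual_ratio:.3f}, clear: {clear_ratio:.3f}")
```

Under this assumed definition, a brightness-driven constriction shifts both speaking styles below 1, so the comparison of interest is the difference between the two ratios rather than either ratio's distance from 1.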
The elevated cognitive load observed during clear speech production challenges the longstanding perception of clear speech as an “easy” method for enhancing intelligibility, traditionally thought to require minimal training for effective implementation12,14. Notably, the lack of significant differences in pupil response between casual and clear speech in environments with extended reverberation times suggests that the cognitive effort for clear speech mirrors that of casual speech under such conditions. These findings call for a deeper exploration of the cognitive demands of the various speech modification techniques used in voice and speech therapy, some of which may pose even greater cognitive challenges. According to Cognitive Load Theory, cognitive overload can critically hinder learning capabilities33. Techniques that impose excessive cognitive demands may be difficult for patients not only to learn but also to apply in real-world scenarios, where multitasking is often necessary. Thus, by optimizing cognitive load, therapists can facilitate a more effective learning process, increasing the likelihood that patients will successfully integrate and use new communication strategies in their daily interactions.
The expectation that reverberation would increase cognitive demand due to difficulty monitoring speech appears intuitive, as reverberation can blur speech sounds, making it harder for speakers to hear their own speech accurately and adjust it in real time. However, the results provide minimal support for the hypothesis that longer reverberation times increase cognitive load. Subjective ratings indicated that room acoustics influenced mental demand, with significant differences found between the No Effect and Small Room conditions for the casual speech style. However, no change in mental demand was reported for clear speech across any room conditions. Moreover, the pupillometry data did not show a significant effect of room acoustics on pupillary responses for either casual or clear speech styles. The absence of an effect of room acoustics on cognitive load does not align with the premise of the H&H model that speakers engage in hyper-articulation as a strategic response to optimize communication. This unexpected result suggests that longer reverberation times may have been needed to reveal such an effect. Consistent with this interpretation, the acoustic examination of speech recordings indicated that the effects of room acoustics on speech production behaviors were minimal.
Contrary to our observations, previous literature has demonstrated the influence of acoustic environments on speech and voice production. For instance, Hodoshima, Arai, and Kurisu found that speech produced in reverberant conditions was more intelligible than speech produced in quiet settings34. Similarly, research on singers by Bottalico, Łastowiecka, Glasner, and Redman demonstrated that room acoustics significantly influence vibrato rate, extent, and pitch inaccuracy35, indicating that singers modify their vocal production in response to different performance spaces. These findings illustrate the adaptive nature of vocal production to acoustic environments. Notably, the study by Hodoshima, Arai, and Kurisu used reverberation times of 3.6 and 2.6 seconds34, which are longer than those used in our study. This suggests that exploring room acoustics with longer reverberation times in future research may more effectively reveal their impact on talkers’ cognitive load.
To the best of our knowledge, this study is the first to utilize pupillometry to examine the cognitive load associated with speech modification in varying room acoustics. Employing self-reports and pupillometry as dual measures of cognitive load enables us to capture a more comprehensive picture of the mental effort involved in speech production. This approach is widely utilized in cognitive science because it reveals cognitive loads beyond what individuals can detect themselves, offering insights into unconscious cognitive processes36. The discrepancy between the subjective ratings and pupillometry data observed in our study may suggest that the cognitive load associated with adjusting to room acoustics is subtle and might be overshadowed by the more pronounced effect of speech production style. Alternatively, this discrepancy might indicate that the talker’s perception of increased effort does not directly translate to a measurable physiological response in terms of pupillary change. Other psychophysiological methods might be more sensitive to the nuanced effects of room acoustics on cognitive load.
The observation that cognitive demand for producing clear speech remains constant across varying room acoustic conditions, including environments with long reverberation times, stimulates further inquiry into how speakers manage speech production in challenging acoustic environments. Two potential theories can be offered for the underlying mechanisms. The first, the invariance of cognitive load, suggests that the cognitive effort involved in producing clear speech is stable across different acoustic environments. This theory posits that engaging in clear speech production sets a fixed cognitive load that remains unaffected by changes in room acoustics. The second theory, cognitive prioritization for speech modification, posits that focusing on clear speech minimizes the impact of room acoustics on cognitive demand. While room acoustics might usually influence cognitive load, this theory argues that the deliberate focus on clear speech production can make these acoustic challenges secondary, highlighting a strategic redirection of cognitive resources towards speech clarity over environmental adaptation. Elucidating the underlying mechanisms is crucial for advancing speaker training methods. For instance, training programs could be tailored to either manage cognitive load during speech production or assist individuals in adapting to diverse acoustic environments, depending on their specific needs.
Limitations
This study, while providing valuable insights into the cognitive demands of speech production in various acoustic environments, has several limitations that warrant consideration. Firstly, the small sample size may limit the generalizability of our findings. Secondly, the reverberation times used in our experimental setups may not have been long enough to fully capture the impact of room acoustics on cognitive load. Additionally, the study’s reliance on pupillometry and self-reports as the sole measures of cognitive load may not encompass all aspects of cognitive effort involved in speech production. While pupillometry provides a valuable objective measure, incorporating other psychophysiological markers could offer a more nuanced understanding of the cognitive processes at play. Finally, the study did not account for individual differences in speech production habits, auditory feedback sensitivity, or previous training in speech modification techniques, all of which could influence how speakers adjust to varying acoustic conditions. Despite these limitations, our findings lay the groundwork for further research on the relationship among cognitive load, speech production, and room acoustics, ultimately guiding the development of more effective communication strategies and therapeutic interventions.