Sleep is a dynamic, complex physiological process essential for homeostasis, recovery, and survival (1, 2). Disrupted or delayed sleep is associated with impaired immune function (3), increased susceptibility to infections and impaired wound healing (4, 5), impaired metabolic and endocrine function (6), increased pain perception (7, 8) and impairment of neurophysiologic organization and memory consolidation (9).
Sleep deprivation affects up to 60% of all critically ill patients admitted to an intensive care unit (ICU) (10, 11). Sleep in these patients is often fragmented by frequent arousals and awakenings, which hamper transitions to deeper stages of sleep. In addition, sleep duration is reduced and the distribution of sleep is disturbed, with up to half of the total sleep time occurring during the day (4, 5, 11, 12). Poor sleep during critical illness is considered to be a major stressor for patients during and after ICU admission. It is associated with the development of ICU delirium and long-term cognitive decline, and has detrimental effects on recovery, morbidity, and mortality (13, 14, 15).
The ICU is a unique environment where a multitude of intrinsic and environmental factors may hamper sleep (16, 17, 18, 19, 20, 21, 22). Although previous studies have provided new insights into the etiology and possible prevention of disturbed sleep in the ICU, their scope, statistical significance, and reliability have thus far been constrained by the logistical challenges of measuring and assessing sleep objectively (2, 4, 20, 22, 23, 24, 25, 26, 27, 28).
Electroencephalography (EEG) has historically been the primary tool for objective sleep monitoring (28, 29). Polysomnography (PSG), combining EEG, electromyography (EMG), and electrooculography (EOG), is the technique used to investigate sleep. The visual and manual annotation or scoring of these recordings commonly follows criteria originally set by Kales and Rechtschaffen (31), with additional changes later culminating in the American Academy of Sleep Medicine (AASM) Manual for the Scoring of Sleep (32). Hundreds or even thousands of 30-second epochs, each comprising multiple channels of PSG data, are typically processed by a single human expert. Although this method is considered to be the gold standard for routine clinical sleep analysis, most PSG studies in critically ill patients report difficulties in setting up, maintaining, and manually processing and scoring ICU sleep recordings (4, 12, 33, 34, 35, 36). The practical expertise required to apply and maintain the array of electrodes needed for human scoring further limits scalability and increases costs. Furthermore, the reliability and repeatability of manual analysis of ICU sleep recordings are lower than for other clinical recordings (37). While Elliott et al. reported ‘reasonable’ to ‘good’ agreement between two combinations of three human scorers in discerning wake from sleep activity, the agreement on detailed sleep staging was much lower and varied with the individual sleep stage and the combination of human scorers (23).
The objective of this study is to investigate human inter-rater agreement in sleep staging following the AASM rules for sleep scoring, in a heterogeneous population of ICU patients.
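As an illustration of how agreement between two scorers over a sequence of 30-second epochs might be quantified, the following minimal sketch computes Cohen's kappa for detailed staging and for a collapsed wake-versus-sleep labelling. Cohen's kappa is used here only as one common chance-corrected agreement statistic; it is an assumption that it suits this setting, not a description of the method used in this study. The stage labels, the helper names `cohen_kappa` and `collapse_to_wake_sleep`, and the example hypnograms are hypothetical and invented for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two raters over the same epochs."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must score the same, non-empty set of epochs")
    n = len(labels_a)
    # Observed agreement: fraction of epochs given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def collapse_to_wake_sleep(labels):
    """Map detailed stages to a binary wake ('W') versus sleep ('S') labelling."""
    return ["W" if stage == "W" else "S" for stage in labels]


# Hypothetical per-epoch scores from two human scorers (illustrative data only).
scorer_1 = ["W", "W", "N1", "N2", "N2", "N3", "N3", "R", "W", "N2"]
scorer_2 = ["W", "W", "N2", "N2", "N1", "N2", "N3", "R", "W", "N2"]

kappa_staging = cohen_kappa(scorer_1, scorer_2)
kappa_wake_sleep = cohen_kappa(
    collapse_to_wake_sleep(scorer_1), collapse_to_wake_sleep(scorer_2)
)
print(f"detailed staging kappa: {kappa_staging:.2f}")
print(f"wake vs sleep kappa:    {kappa_wake_sleep:.2f}")
```

In this invented example the two scorers disagree only on which sleep stage an epoch belongs to, so agreement on wake versus sleep is higher than agreement on detailed staging, mirroring the pattern described above. With more than two scorers, a multi-rater statistic would be needed instead.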