How Do People with Late Bedtimes Use Social Media?

Background: The use of social media before bedtime usually results in late bedtimes, which are a prevalent cause of insufficient sleep among the general population of most countries. However, it is still unclear how people with late bedtimes use social media, which is crucial for adopting targeted behavioral interventions to prevent insufficient sleep. Methods: In this study, we randomly selected 100,000 users from Sina Weibo and collected all their posts through web crawling. Posting time was used as a proxy to identify nights on which a user stays up late. A text classifier and a topic model were developed to identify the emotional states and themes of their posts. We also analyzed their posting/reposting activity, time-use patterns, and geographical distribution. Results: Our analyses show that habitually late sleepers express fewer emotions and use social media more for entertainment and getting information. People who rarely stay up late feel worse when staying up late, and they use social media more for emotional expression. People with late bedtimes mainly live in developed areas and use smartphones more when staying up late. Conclusion: This study depicts the online behavior of people with late bedtimes, which helps in understanding them and thereby in adopting appropriately targeted interventions to avoid insufficient sleep.


Background
Insufficient sleep is an important public health issue. It significantly increases the risk of mental and physical illness, such as obesity [1], hypertension [2], depression [3], and anxiety [4]. Besides, insufficient sleep can shorten lifespan and even cause death [5,6]. According to the World Health Organization, 27% of people worldwide suffer from insufficient sleep [7]. In China, more than 60% of adolescents suffer from insufficient sleep and 38.2% of adults experience insomnia [8,9]. Of even greater concern is that these percentages are increasing.
To help reduce the burden of insufficient sleep, researchers in various fields have explored factors that influence sleep. In the natural sciences, Patel et al. found that light pollution is significantly negatively correlated with sleep duration and sleep quality [8]. Meng et al. found that indoor noise is the most influential environmental factor that causes insufficient sleep [9]. Zheng et al. found that indoor temperature had significant effects on sleep quality [10]. Social scientists have explored factors related to insufficient sleep from the human behavior perspective. Bhurosy et al. found that healthy eating habits help prevent insufficient sleep and its adverse effects [11,12]. Liu et al. found that mobile phone addiction was negatively correlated with sleep quality [13]. Iglowstein et al. found that people get insufficient sleep mainly because bedtimes are increasingly delayed while wake times remain unchanged [14].
Late bedtime is a prevalent cause of insufficient sleep for the general population [15,16]. Late bedtime is therefore a significant behavior that needs to be addressed to improve people's sleep duration and quality [17].
Researchers have considered the issue of late bedtime from a human behavior perspective. For instance, Adamo et al. found that excessive daily energy intake is associated with later bedtime [18]. Guerrero et al. found that greater screen time is associated with later bedtime [19], while Grummon et al. found that later bedtime is associated with less healthy eating behavior [17]. Although these specific behaviors related to late bedtime have been examined, the lived experiences of people with late bedtimes have not been adequately studied. The emotions of people with late bedtimes are also poorly understood.
Exploring these issues may be helpful for understanding and providing appropriate interventions for people with late bedtimes to improve their sleep quality.
Social media are prevalent among the general public [20], which provides an opportunity to obtain data that users publish in real time. These data cover many different themes, such as users' personalities, opinions, behaviors, lifestyles, and thoughts. Existing studies have shown that social media data can be used to explore sleep health issues. For instance, users with insomnia and sleep disorders have been identified based on the presence of predefined insomnia-related keywords [21]. Insomnia symptoms have also been detected and identified using social media data [22]. However, these studies mainly focused on sleep disorder patients [23]. Research using social media data to investigate the emotional characteristics of people with late bedtimes is still in its infancy.
This study provides a detailed description of the lived experiences of a sample of China's general late bedtime population. The study was guided by the following research questions (RQ):
RQ1. What are the emotional characteristics of people with late bedtimes?
RQ2. How do people with late bedtimes use social media?

Methods

Sample and Data Collection
More than 20 million users were collected from Sina Weibo using snowball sampling, from which 100,000 users were randomly selected. All their posts and profile information were collected through web scraping. In total, 77,274,060 posts (including 13,151,050 original posts) were collected from 98,420 users. Other metadata included: 1) demographic data, such as gender, age, and location; 2) metadata attached to each post, including the post's source, time, and location tag. The location was mapped to its latitude and longitude. The local time of each post was calculated according to its longitude. To protect personal privacy, security measures were adopted to ensure that users cannot be identified from the collected information.
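The conversion from longitude to local time is not spelled out in the text. A minimal sketch, assuming the conventional rule that 15 degrees of longitude correspond to one hour of mean solar time, is given below. The function names are hypothetical, and the assumption that raw timestamps are recorded either in UTC or in Beijing time (referenced to 120°E) is ours, not the paper's.

```python
from datetime import datetime, timedelta

DEGREES_PER_HOUR = 15.0          # 360 degrees / 24 hours
BEIJING_REFERENCE_LONGITUDE = 120.0  # meridian of UTC+8 (assumption)


def local_time_from_utc(utc_time, longitude_deg):
    """Approximate local mean solar time for a UTC timestamp."""
    if not -180.0 <= longitude_deg <= 180.0:
        raise ValueError("longitude must be within [-180, 180] degrees")
    return utc_time + timedelta(hours=longitude_deg / DEGREES_PER_HOUR)


def local_time_from_beijing(beijing_time, longitude_deg):
    """Same conversion when the timestamp is already in Beijing time."""
    if not -180.0 <= longitude_deg <= 180.0:
        raise ValueError("longitude must be within [-180, 180] degrees")
    shift = (longitude_deg - BEIJING_REFERENCE_LONGITUDE) / DEGREES_PER_HOUR
    return beijing_time + timedelta(hours=shift)
```

Under this rule, a post stamped 23:00 Beijing time at 90°E would be treated as published at 21:00 local time, which matters when deciding whether a post counts as late-night activity.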
It should be noted that our study included only ordinary users, not celebrities or commercial organizations. Accounts of the latter two kinds may be managed by a team rather than one person, which makes them unsuitable for this study. The Python programming language was used for data collection, analysis, and visualization. Figure 1 provides a flowchart of data collection and processing.

Sleep Proxy
The timing of posts was used as a proxy for bedtime and sleep duration. We assume a user is awake when there is a post from their Weibo account. The local time of each post was used to identify nights on which a user goes to bed late.
We calculated the daily Weibo activity, which shows that the general public's activity on Weibo increases from 4 am to 11 pm and decreases from 11 pm to 4 am. We deem posts published between 11 pm and 4 am to be "late-night" posts. Then, according to the frequency of a user's activity between 11 pm and 4 am and the Pittsburgh Sleep Quality Index [24], the sample was divided into three groups:
• daytime group: users who rarely stay up late, posting late at night on less than one day per week;
• medium group: users who sometimes stay up late, posting late at night on one to three days per week;
• midnight group: habitually late sleepers, posting late at night on at least three days per week.
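The grouping rule above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: posts between midnight and 4 am are attributed to the previous calendar date so that one night is counted once, a user with exactly three late nights per week is placed in the midnight group (the text leaves this boundary ambiguous), and the function names and the `weeks_observed` parameter are hypothetical.

```python
from datetime import datetime, timedelta

LATE_START_HOUR = 23  # 11 pm
LATE_END_HOUR = 4     # 4 am


def late_night_key(local_time):
    """Return the date of the night a post belongs to, or None."""
    if local_time.hour >= LATE_START_HOUR:
        return local_time.date()
    if local_time.hour < LATE_END_HOUR:
        # After midnight: count the post toward the previous evening.
        return (local_time - timedelta(days=1)).date()
    return None


def classify_user(post_times, weeks_observed):
    """Assign a user to the daytime, medium, or midnight group."""
    if weeks_observed <= 0:
        raise ValueError("weeks_observed must be positive")
    nights = {late_night_key(t) for t in post_times}
    nights.discard(None)
    late_nights_per_week = len(nights) / weeks_observed
    if late_nights_per_week < 1:
        return "daytime"
    if late_nights_per_week < 3:
        return "medium"
    return "midnight"
```

For example, a user whose only late-night activity in one observed week is a post at 23:30 and another at 00:15 on the following calendar day has one late night, and so falls into the medium group.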

Emotion Analysis
To identify and analyze users' emotions, all posts were labeled with one of the following emotional attitudes: positive (+1), negative (-1), or neutral (0). A text classifier was developed to complete this task, and a labeled dataset was needed to train the classifier. However, as no publicly available dataset fitted our requirements, 20,000 posts were randomly selected and manually labeled as the training data for the classifier. Data labeling was undertaken independently by three members of our research team (native Chinese speakers). Any disagreement was resolved by consensus.
The training data was used to fine-tune the Chinese RoBERTa-large model, which provides the best results on sentiment analysis [25]. Fine-tuning was carried out to achieve the best performance by adjusting the hyperparameters, including the number of steps, learning rate, and batch size. The F1-score, which balances precision and recall, was selected to measure the performance of the model. The experimental results show that our fine-tuned RoBERTa-large model performs well. The F1-scores of the model for sentiment classification are: positive, 0.81; negative, 0.82; neutral, 0.88. The positive and negative emotions are included in further analysis.
Empirical studies by psychology researchers have repeatedly verified that positive and negative emotions are independent dimensions [26]. Positive emotions include happiness, enthusiasm, and optimism, whereas negative emotions include anger, depression, disgust, and anxiety. The disappearance of positive emotions does not mean the appearance of negative emotions [27]. Thus, we analyzed positive and negative emotions separately in this study. The 13,151,050 original text posts were included in the sentiment analysis, whereas reposts were excluded because they were not considered to reflect a truly personal opinion [28].
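For readers unfamiliar with the metric, the per-class F1-score is the harmonic mean of precision and recall computed separately for each label. The sketch below computes it in plain Python for the three labels used in this study (+1, -1, 0). It illustrates the metric only and does not reproduce the authors' evaluation pipeline; the function name is hypothetical.

```python
def f1_per_class(y_true, y_pred, labels=(1, -1, 0)):
    """Return {label: F1} for gold labels y_true and predictions y_pred."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores[label] = 2 * precision * recall / (precision + recall)
        else:
            scores[label] = 0.0
    return scores
```

Reporting one score per class, as done above, makes it visible when a classifier handles one sentiment noticeably better than another, which a single averaged score would hide.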
In this study, we calculated the percentage of users with positive emotions per hour and negative emotions per hour for each group. We then compared the differences in emotions among the three groups.
Since prior studies have found that emotions are related to gender, age, and latitude [29-31], it is necessary to examine the emotional characteristics of user subgroups to judge the applicability of the results. Users' demographic information was retrieved from our database, and the distribution of emotions was calculated for gender, age, and latitude subgroups.
The seasonal cycle was estimated through time series decomposition to learn the cycle characteristics of the three groups. The amplitude of the cycle was also calculated to learn how the amplitude of emotions varies across the three groups. The amplitude of a cycle was defined as the difference between the maximum and minimum values of the cycle [30].
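One way to compute such an hourly percentage is sketched below. The text does not state the denominator, so this sketch assumes it is the number of users in a group who posted at all during that hour; the record layout `(user_id, group, hour, label)` and the function name are likewise hypothetical.

```python
from collections import defaultdict


def hourly_emotion_share(posts, sentiment):
    """Percentage of active users per (group, hour) with >= 1 post of `sentiment`.

    posts: iterable of (user_id, group, hour, label), label in {+1, -1, 0}.
    """
    active = defaultdict(set)      # users who posted anything
    expressing = defaultdict(set)  # users who posted the given sentiment
    for user_id, group, hour, label in posts:
        key = (group, hour)
        active[key].add(user_id)
        if label == sentiment:
            expressing[key].add(user_id)
    return {
        key: 100.0 * len(expressing[key]) / len(users)
        for key, users in active.items()
    }
```

Calling the function once with `sentiment=1` and once with `sentiment=-1` keeps the two emotions separate, in line with treating them as independent dimensions.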
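The decomposition method is not specified further. As an illustration, the sketch below uses a classical additive decomposition (a centered moving-average trend, followed by per-phase averaging of the detrended values) and then applies the amplitude definition given above. The choice of classical decomposition and all function names are assumptions made for this example.

```python
def centered_moving_average(values, period):
    """Trend estimate; None where the window does not fit."""
    half = period // 2
    trend = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half:i + half + 1]
        total = sum(window)
        if period % 2 == 0:
            # Even period: the window holds period + 1 points, so the
            # two end points receive half weight each.
            total -= 0.5 * (window[0] + window[-1])
        trend[i] = total / period
    return trend


def seasonal_cycle(values, period):
    """Additive seasonal component, one value per phase of the cycle."""
    if len(values) < 2 * period:
        raise ValueError("need at least two full cycles of data")
    trend = centered_moving_average(values, period)
    by_phase = [[] for _ in range(period)]
    for i, (value, level) in enumerate(zip(values, trend)):
        if level is not None:
            by_phase[i % period].append(value - level)
    means = [sum(phase) / len(phase) for phase in by_phase]
    offset = sum(means) / period  # center the cycle around zero
    return [m - offset for m in means]


def cycle_amplitude(cycle):
    """Amplitude as defined in the text: maximum minus minimum."""
    return max(cycle) - min(cycle)
```

With monthly percentages of negative posts as `values` and `period=12`, `cycle_amplitude(seasonal_cycle(values, 12))` gives one number per group that can be compared directly.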

Thematic Analysis
A topic model was developed to identify the themes contained in the posts of the three groups. The Latent Dirichlet Allocation (LDA) generative statistical model was selected for this study because it is the most popular unsupervised algorithm for extracting topics from texts. LDA is a good method to "let the text talk" because its results do not depend on the evaluator's personal perspective or experience [32]. The number of topics (K) must be specified before running the LDA model. To determine the best value of K, we selected 100,000 posts, constructed LDA models with K from 1 to 10, and measured the quality of the topics based on coherence [33]. Since coherence was highest at K=4, we selected K=4 for the thematic analysis of the three user groups. All original posts were included in the LDA model. Based on users' posts and the keywords returned by LDA, we labeled each cluster with an appropriate theme.
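The selection of K by coherence can be illustrated as follows. The specific coherence measure of [33] is not named in the text, so this sketch assumes the UMass measure, which scores a topic by how often its top words co-occur in the same documents. The input `topics_by_k` (top words per topic for each candidate K, as produced by any LDA implementation) and both function names are hypothetical.

```python
import math


def umass_coherence(top_words, documents):
    """UMass coherence of one topic; documents are lists of tokens."""
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for j in range(1, len(top_words)):
        for i in range(j):
            denominator = doc_freq(top_words[i])
            if denominator == 0:
                raise ValueError("top word %r not in corpus" % top_words[i])
            joint = doc_freq(top_words[j], top_words[i])
            score += math.log((joint + 1) / denominator)
    return score


def select_best_k(topics_by_k, documents):
    """Return the K whose topics have the highest mean coherence."""
    def mean_coherence(topics):
        return sum(umass_coherence(t, documents) for t in topics) / len(topics)

    return max(topics_by_k, key=lambda k: mean_coherence(topics_by_k[k]))
```

In the setting described above, `topics_by_k` would hold the top keywords of the models fitted for K from 1 to 10, and the function would return the K with the highest mean coherence.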

Posting Behavior Analysis
On social media, users can publish original posts to express personal experiences, emotions, and thoughts. They can also repost each other's posts to spread information that interests them. By analyzing posting/reposting behavior, we can better understand users' behavior and interests. To learn how posting behavior differs among the three groups, we calculated the percentage of original posts published per hour for each of the three user groups.

Geographic Distribution
Economy, culture, lifestyle, and food habits vary across different provinces in China. To learn about differences in the social environment, we calculated the percentage of late-night posting for each province.
This allowed observation of the nationwide geographic distribution of people with late bedtimes to learn whether there are regional differences.
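The exact formula for the provincial percentage did not survive in the text. A minimal sketch under one plausible reading, namely the share of a province's posts that fall in the 11 pm to 4 am window, is shown below; this reading, the record layout `(province, local_hour)`, and the function names are assumptions.

```python
from collections import Counter


def is_late_night(hour):
    """True for local hours in the 11 pm to 4 am window."""
    return hour >= 23 or hour < 4


def late_night_share_by_province(posts):
    """Percentage of each province's posts published late at night.

    posts: iterable of (province, local_hour) pairs.
    """
    totals = Counter()
    late = Counter()
    for province, hour in posts:
        totals[province] += 1
        if is_late_night(hour):
            late[province] += 1
    return {p: 100.0 * late[p] / n for p, n in totals.items()}
```

Normalizing by each province's own post volume keeps heavily populated provinces from dominating the map simply because they produce more posts.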

Time Use Pattern
Learning the time-use pattern and what users do on social media before bedtime is of great significance for adopting effective methods to intervene in late bedtime behavior. To explore the daily time-use pattern of users with late bedtimes, the device used for each post was crawled from Weibo. These data were collected directly from posts published by individuals in real time, so they are not susceptible to memory bias. Each publishing device was categorized into one of four types of media device: smartphones, PCs, tablets, and others. The percentage of users who spent time engaging with each of these four media devices per hour was then calculated.
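The device categorization and hourly usage share might be implemented along the following lines. The keyword lists are purely illustrative guesses at what a post's source string may contain and are not taken from the study; the function names and the record layout `(user_id, hour, source)` are also hypothetical.

```python
from collections import defaultdict

CATEGORIES = ("smartphone", "pc", "tablet", "other")

# Illustrative keywords only; a real mapping would be built from the
# source strings actually observed in the crawled data.
DEVICE_KEYWORDS = {
    "tablet": ("ipad", "平板"),
    "smartphone": ("iphone", "android", "手机"),
    "pc": ("weibo.com", "浏览器"),
}


def categorize_device(source):
    """Map a post's source string to one of the four device types."""
    text = source.lower()
    # Tablets are checked first so that tablet-specific strings are not
    # swallowed by a broader smartphone keyword.
    for category in ("tablet", "smartphone", "pc"):
        if any(keyword in text for keyword in DEVICE_KEYWORDS[category]):
            return category
    return "other"


def hourly_device_share(posts):
    """Percentage of users active in each hour who used each device type."""
    active = defaultdict(set)
    by_device = defaultdict(set)
    for user_id, hour, source in posts:
        active[hour].add(user_id)
        by_device[(hour, categorize_device(source))].add(user_id)
    return {
        (hour, category): 100.0 * len(by_device[(hour, category)]) / len(users)
        for hour, users in active.items()
        for category in CATEGORIES
    }
```

Because one user may post from several devices within the same hour, the shares for a given hour can sum to more than 100% under this definition.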

Results

Emotion Analysis
Emotion analysis was conducted to explore the emotional characteristics of people with late bedtimes and to compare the emotional states of the three groups. Figure 2 depicts the weekly variation in emotions of the three groups. We observed that habitually late sleepers generally express fewer emotions than the other two groups. Notably, users who rarely stay up late express significantly more negative emotions late at night when they do stay up late.
We conducted a demographic subgroup analysis by calculating the distribution of emotions based on age, gender, and latitude to explore whether the emotional difference among the three groups was consistent across demographic characteristics. From Figure 3, we can observe that female users who habitually stay up late express fewer emotions than females in the other two groups. However, male users who rarely stay up late express relatively fewer negative emotions than males in the other two groups. The finding that habitually late sleepers generally express fewer emotions than the other two groups is consistent across age and latitude subgroups.
Figure 4 displays the seasonal cycle of emotions in the three groups, from which we can observe that the cycle amplitude of negative emotions is larger for users who more frequently stay up late.

Thematic Analysis
The LDA clustering algorithm was used to learn what kinds of themes were discussed by the three groups. Based on the results returned by LDA, each cluster was labeled with an appropriate theme by inspecting its top 10 high-frequency keywords. The themes mentioned in the posts of each user group are shown in Table 1. For users who rarely stay up late, family and daily life was the most mentioned theme. Approximately 25% of the posts expressed personal feelings, while 24% discussed the social environment. Although most users were ordinary users (celebrities and enterprises were excluded), personal care (14%) was also discussed. For users who sometimes stay up late, personal feelings were mentioned most, followed by social hotspots (28.3%). The remaining posts were about family and daily life (23.2%) and entertainment news (16.8%). For habitually late sleepers, the most common theme was entertainment news. About 26% of their posts were about social pressure. The remaining two themes were social hotspots and personal pursuits.

Posting Behavior Analysis
Figure 5 shows the average number of posts per hour and the percentage of original posts per hour for the three groups. We can observe that users who stay up late more frequently are more active on social media and publish fewer original posts. For habitually late sleepers, the percentage of original posts was far lower than the percentage of reposts late at night.

Geographic Distribution
Figure 6 depicts the geographic distribution of people with late bedtimes, which helps in understanding their geographic characteristics. People with late bedtimes mostly live in the Sichuan-Chongqing Region and in eastern coastal regions such as the Beijing-Tianjin-Hebei Region, the Yangtze River Delta Region, the Fujian Delta Region, and the Pearl River Delta Region. These areas are more developed than other regions in China.

Time Use Pattern
Figure 7 displays the hourly media usage of people with late bedtimes. We can see that people use PCs more than other media devices during work time. However, people engage significantly more with smartphones during leisure and sleep time. After midnight, more than 50% use smartphones, suggesting that smartphone addiction may be an important factor in staying up late.

Discussion
This study investigated how people with late bedtimes use social media. Through the analysis of variation in emotions, we found that users who habitually stay up late generally expressed fewer positive and negative emotions. Users who rarely stay up late mainly expressed negative emotions late at night when they did stay up late, which indicates that staying up late makes them feel worse; they therefore tend to express themselves on social media to cope with the impact of negative emotions. Positive emotions are, of course, a benefit to psychological well-being and are worth cultivating [34]. Negative emotions can have either a positive or a negative impact. Prior studies have shown that more frequent mixed emotions benefit health [27,35]. The key is learning to cope with negative emotions rather than suppressing them, which is beneficial for mental health [36,37].
This study also conducted a demographic subgroup analysis of the emotions of the three user groups. Previous studies mainly concentrated on adolescents or specific areas [38]. However, our study provides a nationwide survey across age, gender, and areas. The subgroup analysis was conducted to determine whether the emotional difference was consistent across gender, age, and latitude. The results suggest that the correlation between late bedtime and emotions is consistent across different ages. Male users who rarely stay up late express relatively fewer negative emotions than males in the other two groups. One possible explanation may be that males are less willing to express negative emotions because doing so may cause them to appear vulnerable [39]. The results also suggest that latitude does not affect the association between late bedtimes and mood, which is consistent with previous findings [30].
We also observed the seasonal cycle of emotions among the three groups. The cycle amplitude of negative emotions was larger for users who stay up late more frequently, suggesting that users who habitually stay up late are more vulnerable to negative emotions. In addition, prior studies have shown that a larger amplitude of seasonal mood variation indicates seasonal affective disorder [40], which also implies adverse effects of late bedtimes on mental health.
Thematic and posting behavior analyses were conducted to learn what kinds of topics were expressed in the posts. Through content analysis, it was found that habitually late sleepers mostly mentioned entertainment news topics and reposted more messages than they originally posted, which suggests that they tend to use social media more for entertainment and getting information. Approximately 26% of their posts were about social pressure, which indicates that users who habitually stay up late may experience fiercer competition and greater social pressure. For users who rarely stay up late, personal life and feelings were the most mentioned themes, and they published more original posts than reposts. This suggests that they tend to use social media more for expressing personal emotions, which also helps explain why they expressed more emotions than users who habitually stay up late.
The geographical distribution shows that people living in the eastern coastal regions and the Sichuan-Chongqing Region are more accustomed to staying up late. This is similar to Weibo's 2020 User Development Report, which shows that the areas with the highest coverage of Weibo users are the Sichuan-Chongqing Region and the eastern coastal regions, such as the Beijing-Tianjin-Hebei Region, the Yangtze River Delta Region, the Fujian Delta Region, and the Pearl River Delta Region. People in areas where social media is more popular are also more accustomed to staying up late, which indicates that social media use is positively correlated with staying up late. This distribution also displays the typical structural characteristics of social ecological theory. The economy of these regions is more developed than that of other regions in China, so people have to study more and work harder for "fear of missing out" [41]; they also experience greater social pressure. Therefore, the time before bed may be the only time for them to deal with the stress of the day.
The time-use pattern of users with late bedtimes was also observed. Compared with the daytime, the proportion of people using smart mobile devices increases significantly at night, which shows that people tend to spend a lot of time on smart devices at night for leisure activity. Studies have shown that media use before bedtime has adverse effects on sleep and mental health, such as depressive symptoms and suicidal tendencies [42]. Even light that is not intense at night can affect sleep quality [8]. Smartphones are a particularly significant factor in predicting sleep problems [43]. Smartphone addiction suggests a deficiency in self-regulation for people who delay their bedtime [41]. Thus, restricting mobile phone usage before bedtime may be an effective behavioral intervention to prevent late bedtimes and sleep problems.
There were some limitations to our study. Posting time was used to infer whether people stayed up late, which may not be highly accurate. People who stayed up late but did not post were missed, which may result in a bias between log data and users' actual schedules. New algorithms should be employed in future work to infer users' bedtimes and wake times accurately. Also, it was not possible to capture all online behavior data from users' profile pages, such as likes, comments, and browsing history. More relevant data should be collected in future related research.

Conclusion
Despite its limitations, our research shed light on the lived experiences of people with late bedtimes. We found that users who habitually stay up late expressed fewer personal emotions. They tend to use social media for entertainment and getting information. Users who rarely stay up late expressed more positive emotions, and they expressed more negative emotions late at night when they stayed up late; they also tend to use social media for expressing personal emotions. We also revealed individual factors and environmental factors that relate to late bedtimes. People with late bedtimes mainly live in developed areas, and they tend to indulge in smartphones late at night. This study sheds light on the behavior and needs of people with late bedtimes, and the findings can be used to develop targeted behavioral interventions to prevent insufficient sleep.
Figures
Figure 1. Flowchart of data collection and processing
Figure 2. Daily variation of the three groups
Figure 4. Seasonal cycle of the three groups
Activity pattern (the proportion of specific media use hourly throughout a day)

Table 1. Themes of the posts in the three user groups