Fat-Burning or Fat-Shaming? Theme and Community Overlap Between Online, Weight-Stigmatizing and Exercise-Promoting Networks.

Purpose: This study aimed to determine whether networks of users consistently posting about exercise and fat exist and overlap on social media sites. Method: We collected 3,772,507 posts from Twitter that included the words “fat” and/or “exercise”. Using network structure methods, we identified communities of interconnected users and overlaps between those tweeting “fat” and those tweeting “exercise”. Results: Common word pairings were identified using Natural Language Processing (NLP). Networks of users consistently talking about exercise (n=3,573) and fat (n=2,007) were found on Twitter. An increased mean total-degree and reduced average path length indicate that the fitness-talk network serves as a connecting bridge between highly scattered communities of the weight-talk network. Conclusion: We identified groups on Twitter dedicated to consistently producing weight-stigmatizing content and promoting exercise with weight-loss messages. These groups partially overlap with pro-health groups, which could lead users looking for exercise advice on Twitter to find themselves immersed in a stigmatizing network.


Introduction
Around 90% of posts on social networking sites (SNS) containing the words "fat" or "obesity" are found on Twitter, while Facebook, blogs, forums and internet comments contain the other 10% [1]. Out of posts containing the words "fat", "obesity", and "overweight", the word "fat" is by far the most common [1]. This makes Twitter the most important place to study online social networks, and the word "fat" the keyword to use when looking for information related to weight. Furthermore, posts containing the word "fat" have been found to have a mostly negative connotation. For example, one study found that 56% of the messages containing the word "fat" were explicitly negative towards higher weight individuals [2]. This suggests the presence of weight stigma, the pervasive social devaluation and denigration of overweight and obese people, which may be problematic given that weight stigma is related to unhealthy behaviors, such as high calorie intake, binge eating, and physical inactivity [3]. Specifically, negative portrayals of obesity in the media have been found to increase caloric intake and reduce physical activity and dieting self-efficacy [4].
Communities promoting weight loss online overlap with those promoting fitness [5,6], which means that users looking for information about healthy exercise habits might inadvertently end up receiving stigmatizing content. This could happen because weight loss messages are common within stigmatizing communications on Twitter [2]. Alternatively, the word "fat" works both as an adjective with a negative connotation (e.g., fat person) and as a noun that is commonly associated with exercise (e.g., burn fat), which could lead Twitter's recommendation algorithm to bundle the two concepts together.
Finally, there is a large body of research suggesting that individuals have an inaccurate understanding of physical activity, mainly reducing it to weight loss and appearance change [7], which could result in an overlap between stigmatizing and fitness-promoting content.
The focus on weight when promoting exercise could be particularly prejudicial if higher weight users are the ones searching for exercise-related information. Indeed, these users could be exposed to stigmatizing content if there is an overlap in themes and users. This study aimed to determine whether networks of users consistently talking about exercise and fat exist on Twitter (objective 1). Assuming they do, we hypothesized that they overlap (objective 2), and that, in accordance with previous findings, the content of their communications will be negative or weight loss-related (objective 3).

Data collection
Data were collected using the Twitter Application Programming Interface (API), a web-based program that allows users to interact with Twitter's data (i.e., tweets and metadata). This allowed for keyword searches of all tweets containing the words "exercise" and/or "fat". These searches were run daily over a period of 3 months (November 2017 to February 2018). Each daily search returned the most recent tweets of the day (up to 50,000), along with the number of times each tweet was retweeted, the user-id of the user who produced the tweet, and the number of followers of that user.

Community structure and overlap
To determine whether networks of users consistently talking about exercise and fat exist on Twitter (objective 1), users who tweeted at least once a week about "fat" or "exercise" were assigned to a core group, while those who tweeted only once in three months about "fat" or "exercise" were assigned to the visiting group. Users who tweeted more than once but not once a week were excluded from the study.
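The grouping rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: "at least once a week" is read here as at least one tweet in every consecutive 7-day window of the collection period, and the function name `classify_users`, the start date, and the sample tweets are hypothetical rather than taken from the study.

```python
from collections import defaultdict
from datetime import date

def classify_users(tweets, start, n_weeks):
    """Assign each user to 'core', 'visiting', or 'excluded'.

    tweets: iterable of (user_id, date) pairs for one keyword.
    """
    weeks_seen = defaultdict(set)   # user -> indices of weeks with >= 1 tweet
    n_tweets = defaultdict(int)     # user -> total tweets in the period
    for user, day in tweets:
        week = (day - start).days // 7
        if 0 <= week < n_weeks:
            weeks_seen[user].add(week)
            n_tweets[user] += 1

    groups = {}
    for user, count in n_tweets.items():
        if len(weeks_seen[user]) == n_weeks:
            groups[user] = "core"        # tweeted in every week (assumed reading)
        elif count == 1:
            groups[user] = "visiting"    # a single tweet in the whole period
        else:
            groups[user] = "excluded"    # more than once, but not weekly
    return groups

# Hypothetical two-week toy period; the real study covered about three months.
start = date(2017, 11, 1)
tweets = [("a", date(2017, 11, 1)), ("a", date(2017, 11, 9)),
          ("b", date(2017, 11, 3)),
          ("c", date(2017, 11, 2)), ("c", date(2017, 11, 4))]
groups = classify_users(tweets, start, n_weeks=2)
```

In this toy data, user "a" tweets in both weeks, "b" tweets once, and "c" tweets twice within a single week, so the three users fall into the core, visiting, and excluded groups respectively.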
To study a network of Twitter users, each user is considered a node. A node can follow another node, creating a link (e.g., A → B). In A → B, node A has an out-degree of 1 (because it follows B) and node B has an in-degree of 1 (because it is followed by A). In social networks, the direction of the arrow typically represents information flow, but in the case of Twitter the relationship is reversed because users follow other users to see the content they publicly post, not to send direct information. In short, if node A follows node B (A → B), node A is seeing information posted by node B.
Nodes with a large following within the network have a high in-degree, while nodes following a large number of users within the network have a high out-degree; the sum of both is hereafter referred to as total-degree. Additionally, communities can emerge within the social network [8]. These communities are composed of users with more connections among themselves and fewer connections to nodes outside the community, which may themselves be part of different communities.
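As a small illustration of these definitions, the sketch below counts in-degree, out-degree, and total-degree from a list of follow relationships. The three-user edge list is made up for the example.

```python
from collections import Counter

# Each pair (follower, followed) is one directed link: follower -> followed.
follows = [("A", "B"), ("C", "B"), ("B", "C")]

out_degree = Counter(follower for follower, _ in follows)   # accounts a node follows
in_degree = Counter(followed for _, followed in follows)    # followers a node has
nodes = {user for pair in follows for user in pair}
total_degree = {n: in_degree[n] + out_degree[n] for n in nodes}
```

Here B is followed by A and C and follows C, so it has an in-degree of 2, an out-degree of 1, and a total-degree of 3.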
Relationships between users were mapped to allow for social network analysis and to determine network structure. The user-ids of the core group were used to identify the followers of each individual user. The relationships were then filtered to only include relationships within the core group. Using these relationships, a core weight-talk network and a fitness-talk network were mapped. Additionally, to determine whether there is an overlap between weight-talk and fitness-talk communities, a combined network was mapped, and the metrics of this network were compared to those of the previous ones.
To answer questions regarding structure and overlap (objective 2), four standard network metrics were used: density, average path length, mean total-degree, and clustering coefficient. Density measures how many connections exist in the network and ranges from zero (no connections) to one (all possible connections exist). Average path length is the average length of the shortest paths between pairs of nodes and is important because it indicates how far, on average, information has to travel to get from one node to another. Mean total-degree is the average number of connections per node and is useful for determining how well connected a network is. Finally, the clustering coefficient measures the extent to which nodes in a network tend to cluster together, which allows us to understand the network structure in reference to its communities.
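The four metrics can be computed on a toy follower graph as sketched below. The exact variants used in the study are not stated, so the choices here are assumptions: density and path lengths are computed on the directed graph, path lengths are averaged over reachable ordered pairs only, and clustering is the average local coefficient on the undirected version of the graph. The function name `network_metrics` is hypothetical, and the edge list is assumed to contain distinct links with no self-loops.

```python
from collections import deque
from itertools import combinations

def network_metrics(nodes, edges):
    """Return density, average path length, mean total-degree and
    average clustering coefficient for a directed follower graph."""
    nodes = list(nodes)
    n, m = len(nodes), len(edges)
    succ = {v: set() for v in nodes}    # directed neighbours (accounts followed)
    neigh = {v: set() for v in nodes}   # undirected neighbours
    for a, b in edges:
        succ[a].add(b)
        neigh[a].add(b)
        neigh[b].add(a)

    density = m / (n * (n - 1))         # share of possible directed links present
    mean_total_degree = 2 * m / n       # each link adds one in- and one out-degree

    # Average shortest path length over reachable ordered pairs (BFS per node).
    total, pairs = 0, 0
    for source in nodes:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in succ[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    avg_path_length = total / pairs if pairs else 0.0

    # Local clustering on the undirected graph, averaged over all nodes.
    local = []
    for v in nodes:
        k = len(neigh[v])
        if k < 2:
            local.append(0.0)
            continue
        closed = sum(1 for a, b in combinations(neigh[v], 2) if b in neigh[a])
        local.append(2 * closed / (k * (k - 1)))
    clustering = sum(local) / n
    return density, avg_path_length, mean_total_degree, clustering

# Toy graph: a directed triangle A -> B -> C -> A plus one extra link C -> D.
density, apl, mean_deg, clustering = network_metrics(
    "ABCD", [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")])
```

Network analysis packages may use slightly different conventions (for example, for unreachable pairs or directed clustering), so values from this sketch are not guaranteed to match a given tool.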
Additionally, modularity was calculated for each network in order to identify communities based on the similarities between their connections. This was done using the fast unfolding of communities in large networks algorithm [8], which decomposes a network into sub-units or communities, that is, sets of highly interconnected nodes. Four main communities per network were extracted. Hubs were found by ranking the nodes by their in-degree. This allowed for a linguistic corpus analysis based on the communities of each network.
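The sketch below does not implement the community detection algorithm of [8] itself. It only computes the modularity score that this algorithm optimises, for a partition that is already given, and ranks hubs by in-degree within each community. The graph is treated as undirected for the modularity score, which is an assumption, and the function names and the toy graph are hypothetical.

```python
from collections import Counter

def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected graph.

    edges: list of distinct (u, v) pairs; community: node -> label.
    """
    m = len(edges)
    degree = Counter()
    internal = Counter()    # edges with both ends in the same community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    degree_sum = Counter()  # total degree per community
    for node, k in degree.items():
        degree_sum[community[node]] += k
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

def hubs(follows, community, top=1):
    """Rank nodes in each community by in-degree (number of followers)."""
    in_degree = Counter(followed for _, followed in follows)
    ranked = {}
    for node in sorted(community, key=lambda n: -in_degree[n]):
        ranked.setdefault(community[node], []).append(node)
    return {c: members[:top] for c, members in ranked.items()}

# Toy graph: two triangles joined by a single link (2 -> 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "x", 1: "x", 2: "x", 3: "y", 4: "y", 5: "y"}
q = modularity(edges, community)
top_hubs = hubs(edges, community)
```

A partition that keeps each triangle together scores higher than one that puts every node in a single community (which scores zero), which is the intuition behind extracting sets of highly interconnected nodes.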

Linguistic corpus
The linguistic corpus is the whole set of text data (tweets) to be analyzed. Simple Natural Language Processing (NLP) techniques, such as the division of sentences into individual words (tokenization) and their subsequent analysis in pairs or triplets (n-grams), were used to analyze this data. Latent Dirichlet Allocation (LDA), which allows the grouping of observations (words) to be explained by unobserved groups (themes), was also used. A total of 3,772,507 tweets were collected. Of those, non-English-language tweets (n=510,145) and tweets from people who were not in the core or visiting categories (n=1,291,155) were removed. As a result, the total corpus was reduced to 1,971,207 tweets.
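Tokenization and n-gram extraction can be illustrated with a minimal sketch. The regular expression used for splitting and the sample sentence are assumptions made for the example; the study does not specify its tokenizer.

```python
import re

def tokenize(text):
    """Lowercase a tweet and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Made-up example sentence, not a tweet from the corpus.
tokens = tokenize("Burn fat fast with exercise")
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)
```

A five-word sentence yields four bigrams and three trigrams, the first bigram here being ("burn", "fat").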

Linguistic n-grams
To confirm that the communicational content of the weight-talk and fitness-talk networks is explicitly negative or weight loss-related (objective 3), the tweets were divided into the four main communities of each core social network. After that, a list of linguistic bigrams and trigrams that excluded prepositions, conjunctions, and linguistic fillers was generated. Finally, a Latent Dirichlet Allocation (LDA) model was applied to differentiate individual unobserved clusters of words, which allowed for a qualitative lexical text analysis and the extraction of common themes within the communities.
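The per-community bigram lists can be sketched as follows. The stop-word list, community names, and sample tweets are all made up for illustration, and dropping stop words before pairing (so that words on either side of a removed word become adjacent) is one possible reading of the procedure, not a confirmed detail. The LDA step is not shown.

```python
import re
from collections import Counter

# Illustrative stop words only; this is not the list used in the study.
STOP_WORDS = {"a", "an", "the", "and", "or", "but", "of", "in", "on", "to",
              "with", "for", "um", "uh", "like"}

def top_bigrams(tweets, k=3):
    """Count bigrams over a list of tweets after dropping stop words."""
    counts = Counter()
    for text in tweets:
        words = [w for w in re.findall(r"[a-z']+", text.lower())
                 if w not in STOP_WORDS]
        counts.update(zip(words, words[1:]))   # consecutive word pairs
    return counts.most_common(k)

# Hypothetical tweets grouped by community.
community_tweets = {
    "community_1": ["Burn fat with exercise", "Exercise to burn fat"],
    "community_2": ["Exercise for health", "exercise and health every day"],
}
top = {name: top_bigrams(tweets, k=1)
       for name, tweets in community_tweets.items()}
```

Ranking the most frequent pairs per community gives a quick view of its dominant wording before any topic model is fitted.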

Network characteristics
The weight-talk and fitness-talk networks were similar in size, both in their core (N_fat = 2,007; N_exercise = 3,573) and visiting forms (N_fat = 737,102; N_exercise = 703,948). The number of tweets, however, was smaller in the core weight-talk network. A total of 3,573 users consistently use their profiles as a way to communicate about exercise, while 2,007 use them for "fat"-related communication, indicating the existence of dedicated networks. The number of users and tweets in the visiting and core networks, as well as in the four main communities, are described in Table 1.

Network structure
The structure and communities of the weight-talk and fitness-talk networks were calculated using the fast unfolding of communities in large networks algorithm [8]. The mean total-degree is larger in the fitness-talk network (13.40) than in the weight-talk one (5.54), indicating that fitness-talk network users are more connected overall, which makes the average path length smaller (3.79 for the fitness-talk network and 6.02 for the weight-talk network). The communities within the networks, however, seem to be more clustered together in the weight-talk network (0.17) than in the fitness-talk one (0.093), creating a divided network with a larger diameter (24 for the weight-talk network compared to 16 for the fitness-talk one).

Overlap between networks
A small percentage (7.6%) of users are in both networks. The mean total-degree of the combined network (12.91) is larger than the average of the means of the individual networks (9.71). The average path length (4.31) and the clustering coefficient (0.107) of the combined network fall below the averages of the two individual networks, while the diameter is much smaller than that of the weight-talk network. This could indicate that the users in the fitness-talk network serve as a connecting bridge between highly scattered communities of the weight-talk network. Table 2 shows descriptions of the networks.
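The share of users present in both networks can be computed from the two sets of user-ids, as in the sketch below. The denominator is an assumption: the text does not say whether the percentage is taken over all distinct users or over one of the networks, and this example uses all distinct users. The user-ids are made up.

```python
def overlap_share(users_a, users_b):
    """Fraction of all distinct users who appear in both sets."""
    a, b = set(users_a), set(users_b)
    return len(a & b) / len(a | b)

# Hypothetical user-ids for two small networks.
weight_talk = {"u1", "u2", "u3"}
fitness_talk = {"u3", "u4", "u5"}
share = overlap_share(weight_talk, fitness_talk)
```

With one shared user out of five distinct users, the share in this toy case is 0.2.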

Content of communications
To understand the overall content of each group, the tweets were divided into individual words (tokenization), pairs of words (bigrams) or triplets of words (trigrams). The lists reveal themes of physical fitness, health, and fitness for health in the fitness-talk network's communities. As for the weight-talk network, the communities contain mainly words regarding weight loss or fat burning, insults, and blame regarding food and eating behaviors.

Discussion
The belief that health can be achieved through weight loss behaviors such as physical activity seems to be common [7]. This study aimed to test this overlap within Twitter communities that mainly shared and produced content relating to weight and fitness by posing three questions: Are there dedicated communities of users discussing "fat" and "exercise" online? If so, do these communities overlap? And what are the themes of these communities?
Results corroborated the existence of a large core network of users tweeting about fitness or weight at least once per week. Dense and highly clustered communities were found within the individual networks, with the communities in the weight-talk network mainly focusing on weight loss, stigmatization, and internalized stigma, and communities in the fitness-talk network focusing on exercise for health and exercise for fitness. Although the average geodesic distance in the fitness-talk network was shorter than that of the weight-talk network (3.79 vs. 6.02), its clustering coefficient was lower (0.093 vs. 0.17). This corroborates previous findings that weight-loss-promoting communities are more closely knit than pro-health ones [5]. It is also in line with research that has found that health-promoting tweets focus on nutrition and exercise, while pro-thinness ones focus on thinness and disordered eating behaviors [6].
The combination of both networks showed a mild overlap between them, which could indicate that the users in the fitness-talk network serve as a connecting bridge between highly scattered communities of the weight-talk network. This is consistent with previous studies showing the prevalence of pro-thin type communications among weight-talk networks on Twitter [1,2]. Given that the Twitter algorithm recommends people whom the user is most likely to interact with, this could create an ever-growing community of bias. On the other hand, there could be an ingrained association between exercise and weight loss, through concepts like "fat-burning" [7]. This, in turn, might lead people looking for pro-health advice to find themselves within a stigmatizing network.
Finally, this study examined what people were tweeting and sharing about fat and exercise. We found that the networks were divided not only as a function of their interconnectivity, but also by their topics. In the fitness-talk network, fitness and health were the most common reasons given in tweets for practicing physical activity. Other topics were more generic and unrelated to physical activity, like English learning and politics. In the weight-talk network, on the other hand, the most popular topics were weight loss-related. This resembles previous findings in which weight loss was a common theme on Twitter [2]. The second most common category of tweets in the weight-talk network was explicit weight stigmatization, confirming previous findings [1,2]. Internalized stigma tweets, that is, tweets making explicit self-directed stigmatizing comments, were found in a specific sub-cluster that also happens to house pro-anorexia accounts. Finally, a cluster of porn-related accounts was found that held no relation to the study.