Who is Sharing Physical Activity, Diet and Weight Loss Information on Twitter? An Exploratory Thematic and Source Analysis of Tweets

Social media platforms such as Twitter are used to consume and share information, including health-related information. Previous research has analysed the obesity conversation on Twitter, but no work to date has explored content related to specific contributing health behaviours, such as physical activity, diet and weight loss. The aim of this study was to identify the common content and sources of physical activity, diet and weight loss information on Twitter.

motivate behaviour change (2). Twitter can support behaviour change by presenting the opportunity to engage with health messages (3). However, with no restrictions on who can tweet health information, and no requirement to give the source for claims, the spread of health misinformation (i.e. a health-related claim for which there is currently a lack of scientific evidence) is clearly a risk (4)(5)(6).
Previous research has identified the prevalence of health statements made on Twitter and the abundance of medically-relevant tweets posted by non-medical professionals (i.e. potentially non-credible sources) (7)(8)(9). For example, one study found that just 2% of tweets about colorectal cancer came from medical professionals (8). The credibility of Twitter content affects the influence that a message can have on readers, and may affect how a message is engaged with, shared and spread through a social network (10). The perceived credibility of a message on Twitter is affected by whether or not the source is a verified account, the number of shares (10,11), and information included in the related Twitter bio highlighting relevant expertise (10). For example, an individual with a verified account who describes themselves as a medical professional may garner trust or credibility in the eyes of the Twitter user.
Health messages on Twitter have been analysed across a multitude of health conditions, including diabetes, colorectal cancer and eating disorders, in an attempt to identify topics discussed, accounts that share educational information, and the social networks formed through this information sharing (8,12,13). Studies have also explored some key features of health-related conversations. For example, tweets on obesity have exhibited weight-stigmatising messages and derogatory, misogynistic sentiment (14,15), while other studies have observed humour being shared around the topic, including weight-related puns, repartee and parody (16). However, little work to date has explored social media (SM) conversation and information sharing around specific health behaviours that contribute to obesity and non-communicable diseases (NCDs), such as physical activity, diet, or weight loss. Such information could help public health researchers and practitioners to identify effective, efficient, sustainable and scalable strategies and interventions for combating misinformation and promoting evidence-based messages in NCD prevention on SM.
Thus, the aim of this study was to conduct an exploratory thematic and source analysis of physical activity, diet and weight loss tweets, in order to identify common content and sources of these tweets.

Data collection
Tweets which contained the terms "physical activity", "weight loss", and "diet" were captured in real-time using Twitter's Application Programming Interface (API) over a 7-day data collection period from 14th March 2017 to 21st March 2017. Twitter was selected because data on the platform is relatively open and publicly accessible. The reasons for choosing these terms were three-fold: i) they represent contributing factors to NCDs such as obesity; ii) they are broad enough to capture a variety of tweets; and iii) they use professional language, so as to capture tweets from professional bodies. The search was restricted to collect English language tweets. Location was not added to the search criteria, as location (determined by geotagging) can only be inferred from approximately 1% of tweets, which would have significantly restricted the search (17,18).
Sampling strategy
Figure 1 illustrates the sampling strategy for tweets, including: i) removal of retweets and 'quoted tweets' to reduce duplication of content; ii) removal of tweets that were direct replies to other users; and iii) exclusion of tweets by users with empty bios, as this prevented sufficient source analysis. This reduced the size of the dataset from n=381,713 to n=143,351. A random sample of 1% of tweets was selected for source and content analysis, comparable to sample sizes from previous research (10).
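The filtering and sampling steps described above could be sketched as follows. This is a minimal illustration and not the authors' code: the field names (`is_retweet`, `is_quote`, `in_reply_to`, `bio`) and the fixed random seed are assumptions made for the example.

```python
import random

def clean_tweets(tweets):
    """Keep only original tweets from accounts with a non-empty bio."""
    return [
        t for t in tweets
        if not t["is_retweet"]          # i) drop retweets ...
        and not t["is_quote"]           #    ... and quoted tweets
        and t["in_reply_to"] is None    # ii) drop direct replies to other users
        and t["bio"].strip()            # iii) drop users with empty bios
    ]

def sample_fraction(tweets, fraction=0.01, seed=0):
    """Draw a simple random sample (without replacement) of the given fraction."""
    k = int(len(tweets) * fraction)
    # The seed is hypothetical; it only makes the illustration repeatable.
    return random.Random(seed).sample(tweets, k)
```

Applied to a cleaned dataset of 143,351 tweets, a 1% fraction yields 1,433 tweets under this rounding rule.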

Coding strategy
Framework development
A thematic coding framework was developed and tested by two independent coders, with three source codes and six content codes (see Figure 2). All tweets were coded independently by one researcher (NOK), and a random 10% sample was independently coded by a second researcher (AG) to ensure reliability. This approach in determining inter-rater reliability using a small sample of the dataset is one taken in previous research (14,19). A good inter-rater agreement was achieved for both source (82.8%) and content (80.0%) of tweets.
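As an illustration of how such agreement figures can be computed, the sketch below calculates simple percent agreement between two coders. It assumes that the reported values are raw percent agreement (the text does not name the statistic), and the example codes are hypothetical.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of items to which two coders assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of items")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)

# Hypothetical source codes assigned by two coders to five tweets.
coder_1 = ["professional", "non-professional", "intermediary",
           "non-professional", "non-professional"]
coder_2 = ["professional", "non-professional", "non-professional",
           "non-professional", "non-professional"]
print(percent_agreement(coder_1, coder_2))  # 80.0
```

Percent agreement does not correct for agreement expected by chance; a chance-corrected statistic such as Cohen's kappa is a common alternative.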

Source analysis
The source of tweets was determined by assessing Twitter bios (i.e. the section of a Twitter account which provides insight into who or what the account is). The categories were: professional public health source (i.e. public health bodies and agencies, charities, professionals working in health-related academia, research or policy); and non-professional source (i.e. the general public, and those who were not operating from within the public health realm, such as news outlets). It became clear during the framework development process that several Twitter accounts were difficult to assign to either category and fell somewhere between the two. This group included individuals describing themselves with terms which are not protected titles, such as 'nutritionist', 'life coach', and 'personal trainer'. These terms may indicate to the general public that these individuals are experts, but a professional public health audience might have difficulty in trusting them as credible sources of information; we called this group the 'information intermediaries'. Information intermediaries represent an important possible source of information, in that the audience must make their own determination of the trustworthiness of the information they tweet. No attempt was made to assess the claims within the bio.

Thematic analysis
The content of tweets was coded according to two overarching categories, with different sub-categories within each, and according to any additional characteristics that could be identified by researchers. The two main categories were: i) educational tweets, which attempted to disseminate health information relevant to the topic, whether scientifically accurate or not; and ii) non-educational tweets, which did not attempt to inform. Within educational tweets, a specific sub-category was assigned to each tweet, illustrating whether the educational tweet was: i) presenting news or research; or ii) providing guidelines and recommendations on the health topics, including the dissemination of guidelines and goals, and recommendations including exercise advice, weight loss tips, or recipes.
Non-educational tweets were sub-coded as: i) promotional, the promotion of products and services; ii) conversational, or general commentary; iii) behavioural, the reporting of behaviour or intentions to change behaviour; and iv) directional (i.e. not providing any content that could be coded but using Twitter as a platform to direct the audience to another platform). For the latter, coders followed links included in tweets to determine what type of platform was being used and this was also recorded (though specific content from other platforms was not coded). Where tweets are presented in analysis, content is paraphrased where necessary to protect the identity of the source. Where possible, however, the format of the tweet has been preserved (i.e. inaccurate grammar and spelling).
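The coding framework described above (three source codes, and six content codes nested under two categories) can be represented as a small data structure. The sketch below is illustrative only; the coding in this study was carried out manually, and the names `SOURCE_CODES`, `CONTENT_CODES` and `CodedTweet` are hypothetical.

```python
from dataclasses import dataclass

# Three source codes.
SOURCE_CODES = {"professional", "information intermediary", "non-professional"}

# Six content codes, grouped under the two overarching categories.
CONTENT_CODES = {
    "educational": {"news and research", "guidelines and recommendations"},
    "non-educational": {"promotional", "conversational", "behavioural", "directional"},
}

@dataclass(frozen=True)
class CodedTweet:
    """One tweet with exactly one source code and one content code."""
    source: str
    category: str
    sub_category: str

    def __post_init__(self):
        # Reject codes that are not part of the framework.
        if self.source not in SOURCE_CODES:
            raise ValueError(f"unknown source code: {self.source!r}")
        if self.sub_category not in CONTENT_CODES.get(self.category, set()):
            raise ValueError(
                f"{self.sub_category!r} is not a sub-category of {self.category!r}"
            )
```

Validating on construction keeps each tweet to one discrete source code and one discrete content code.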

Overview
In total, 1,433 tweets were selected for analysis (1% random sample of cleaned data). Results from the source and thematic analysis can be found in Table 1. Just under a third of the sample comprised educational tweets (32.8%; n=470) and these tweets came mostly from a non-professional source (85.1%; n=400) as well as information intermediaries (11.7%; n=55). Most of the educational tweets from a non-professional source (74.0%; n=296) were providing advice in the form of recommendations or guidelines.
Additionally, many tweets promoting products or services were observed from non-professional sources and information intermediaries.
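The percentages in this overview follow directly from the reported counts. The short check below recomputes them; the helper `pct` and the variable names are introduced here for illustration only.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)

sample = 1433                       # tweets selected for analysis
educational = 470                   # educational tweets in the sample
edu_non_professional = 400          # educational tweets from non-professionals
edu_intermediaries = 55             # educational tweets from information intermediaries
edu_non_prof_recommendations = 296  # non-professional recommendation/guideline tweets

print(pct(educational, sample))                                 # 32.8
print(pct(edu_non_professional, educational))                   # 85.1
print(pct(edu_intermediaries, educational))                     # 11.7
print(pct(edu_non_prof_recommendations, edu_non_professional))  # 74.0

# The remaining educational tweets are those from professional sources.
edu_professional = educational - edu_non_professional - edu_intermediaries
print(edu_professional, pct(edu_professional, educational))     # 15 3.2
```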

Source analysis
A very small proportion (1.3%; n=18) of tweets were posted by those considered professional individuals or bodies (i.e. those most likely to share credible health information). Examples of accounts that were assigned professional status include: a cancer research foundation, the Twitter account of a scientific journal, and a US government-run nutrition campaign. One example of a Twitter bio coded as professional was "Associate Prof in Physical Activity and Health, and Associate Dean Postgraduate Research at [university redacted]."
A further 6.4% (n=91) of tweets were by information intermediaries. These included a variety of accounts where the owner had a self-proclaimed interest and expertise in health but without information on qualifications, such as: personal trainers, holistic healthcare providers, and nutritionists. One example of such a bio was "Holistic Health Enthusiast. Real Food Coach. Functional Fitness Expert. Visit my site below for 7 Things to Ignite Your Health & reclaim Joy, Energy, & Life."
Most tweets in the sample (92.3%; n=1,324) were from the non-professional general public. Examples of accounts included individuals and businesses with no mention of healthcare qualifications or public health interests, such as: tabloid newspapers, online stores and eBay sellers. An example of a non-professional bio [paraphrased] was: "Cooking enthusiast. Writer. Book lover. Food lover. Dealing with cancer."

Thematic analysis
Educational tweets from a professional source
Most tweets from a professional source (83.3%; n=15) were educational in nature and the majority of these served to disseminate news/research (60.0%; n=9), often with a link to the source of information. These included studies presenting a link between physical activity and reduced arthritic pain, the association between eating disorders and weight loss surgery complications, and a report which showed that the female diet was lacking in iron. For example, one tweet stated "Australian research shows that low carb diet reduces costs of medication for T2D [type 2 diabetes] by 50%".
There were also several educational tweets that provided recommendations or guidelines (40.0%; n=6). These tweets included the provision of links which directed the audience towards tips on what to eat after exercise, recommendations on what to add to your diet to help you lose weight, and improving nutrition to prevent weight loss and increase quality of life during cancer treatment. One example of recommendations and guidelines tweeted by a professional source was: "What is active design? Strategies for how health care establishments can encourage physical activity [link]".

Educational tweets from information intermediaries and non-professionals

Information intermediaries
Most tweets from information intermediaries were educational in nature (60.4%; n=55). News and research tweets made up 36.4% of the educational tweets from information intermediaries. Examples of news and research topics addressed by this population included various diets such as the Mediterranean diet, plant-based diets and the 5:2 diet, sources of salt in different foods, and exercise during menopause. A further example was a tweet that shared a news article which discussed the association between weight loss and heart failure: "Weight loss, particularly with surgery, tied to lower risk of heart failure [link] #news". These tweets often provided sources to back up their claims.
Topics addressed by this group included lists of recommended advice for weight loss, for example "Create a strong goal for yourself and work to accomplish it daily [link] …#FatLoss #WeightLoss". Many of these tweets focused on recommendations for rapid weight loss, with links which directed readers to non-credible websites such as blogs or forums, for example "5 Breakfast Smoothies for Extremely Fast Weight Loss [image] [link]". There were also tweets that promoted fad diets. For example, one tweet promoted the adoption of a 'blood type diet': "Choose A Diet Based On Your Blood Type To Fight Fatigue -Type O should avoid dairy, wheat and eat lean meat... [link]". Tweets from information intermediaries providing recommendations and guidelines (63.6%) often did not provide substantiated evidence for claims.

Non-professionals (general public)
Tweets from the general public presenting news/research (26.0%; n=104) included links to articles providing information on how poor diet is related to heart disease, how those who cook at home are more likely to eat a healthier diet, and the benefits of switching to a vegetarian diet. One further example of an educational, news or research-based tweet was "Avocado has good fats which are essential to a healthy diet [link]". Others shared articles regarding the dangers of diet pills and detox tea, or offered discussion of the harms of sugar in a variety of ways, including the context of damaging children's teeth. Whilst most of these tweets included information which could be validated with evidence and provided sound sources for their information, there were also tweets which made strong health claims that could be considered health misinformation. Examples include: information on how a particular diet 'heals' anxiety, an article from a well-known tabloid stating that probiotics only benefit those with a bad diet, and an article from a women's magazine detailing how late night eating affects the body.
Recommendations and guidelines from the general public were the most prevalent category, comprising over one fifth of the entire sample (22.4%; n=296). Most of these tweets shared recommendations on how to lose weight, such as the types of exercises to do or what kind of foods to eat, or provided recipes and insights into calorie intake and calorie content, but a small number also provided insight into guidelines regarding physical activity and diet. For example, "It's up to us all to make sure our children get the recommended one hour of physical activity every day [link]". "Clickbait" (i.e. 'content whose main purpose is to attract attention and encourage visitors to click on a link to a particular web page') (20) was in abundance among the recommendations/guideline-style tweets shared by non-professionals. This included the presence of the 'top' tips, the 'best' foods, and the 'fastest' weight loss techniques, for example "Best Snacks for Weight Loss [image] [link]". Within these tweets, the use of listicles (i.e. 'a piece of writing or other content presented wholly or partly in the form of a list') was widespread, for example, "The 12 Best #Foods to Eat When You Hit a Plateau in Your Weight-Loss
Most promotional tweets focussed on the topic of diet, including foods, supplements, and teas to promote appetite suppression, weight loss or detoxing. There were also tweets promoting items or services to assist in physical activity, diet or weight loss, such as recipe books, workout DVDs, and well-known commercial aids such as Slimming World. One tweet also promoted a weight loss hypnosis CD, "New CD for Weight Loss -Hypnotherapy Self Hypnosis for Weight Loss -[link] via @eBay_UK".

Discussion
This study adds to the scant evidence base exploring the landscape in which misinformation on important risk factors for NCDs, namely, physical activity, diet, and weight loss, is shared on social media.
Findings show that only 3.2% of educational information shared on Twitter regarding these health behaviours originated from a professional source. Instead, most educational information was shared by the general public (85.1%) and 'information intermediaries' (11.7%). These two groups frequently provided recommendations for physical activity, diet and weight loss, and promoted products and services using 'clickbait' techniques such as 'top' tips and 'best' foods. This often included the broadcast of potentially unsafe health misinformation and promotion of possibly unsafe products and methods for weight loss.
The spread of health information by non-professionals on Twitter has been noted in previous research, in areas including colorectal cancer and diabetes (8,13). The credibility of health information in any tweet may be called into question, but tweets by information intermediaries and non-professionals are particularly questionable, as the risk of health misinformation is inherently higher (8,10). Provision of information from a credible source ("Presenting verbal or visual communication from a credible source…" (21)) has been shown to be an effective behaviour change technique, particularly in digital interventions (21,22). However, on Twitter, credibility is essentially an individual judgement by the follower, and health misinformation from information intermediaries (who present themselves as professionals) is at particular risk of being viewed by the general public as evidence-based science. Authenticity is also an issue when judging credibility by an individual's description of themselves in their Twitter bio (23)(24)(25). For example, the words 'public health consultant' in a Twitter bio may be a false identity used to push an agenda. This is further complicated by the presence of automated bots on Twitter, which may exist to rapidly and intentionally spread health misinformation in order to promote products, services or sometimes discord (24,26,27). Therefore, the risk of health misinformation on Twitter may be even greater than stated within these findings.
As well as credibility, the format in which information is relayed is also important to consider. Many tweets from non-professionals and information intermediaries were presented through biased marketing techniques (e.g. "try this SHOCK diet!", or, "you won't BELIEVE…"). This 'clickbait' style of disseminating information is becoming ever more present in the SM landscape, including in the dissemination of health information (28). A unique feature of SM, as opposed to traditional media platforms, is the provision of a universal platform on which all users can access an audience, regardless of knowledge, expertise, or intention. It is this equal footing which often results in 'fake news' going viral, or infestation by hive minds built upon misinformation and half-truths (5).
The high number of promotional tweets is unsurprising, given the global influence of commercial interests, and the rising awareness in public health of commercial determinants of health (29,30). In particular, the heavy presence of the food and drink industry on SM platforms for marketing purposes is well documented (10,24,31). Whilst several tweets were promoting relatively harmless products such as recipe books and workout DVDs, there were also tweets promoting fad diets and potentially dangerous methods and products which promised rapid, and unhealthy, weight loss, including detox teas and appetite suppressing pills. This finding highlights the challenging landscape which faces health promoters on SM; one where health misinformation is shared for profit by a range of stakeholders, from successful, well-known commercial organisations to potentially dangerous and illegal sole traders. It also demonstrates the important role that health researchers can have on SM: the general public could benefit from credible, accurate and evidence-based health information and the promotion of safe and healthy avenues for weight loss from trusted sources on SM platforms (10). However, as seen within this sample, this information is getting 'lost in the crowd'; thus, health researchers must consider methods which increase the visibility of their information.
There are a number of study limitations which should be considered. Firstly, as tweets were collected in real-time, meta-data (such as number of retweets and likes) was not collected. This type of data can provide important insights into the relative 'influence' of individual tweets. The analysis concerns the distribution of original tweets and does not consider information such as followers, so it provides no evidence about the distribution of audience exposure. Collection of historical tweets may provide insight into exactly what information is frequently viewed or shared by followers, which would help shed further light on the spread of misinformation around physical activity, diet and weight loss on Twitter. Secondly, Twitter was chosen for this analysis due to the level of retrievable data from publicly accessible accounts. However, this limits generalisability of the findings. Some reports have suggested that individuals may be more likely to discuss food on Facebook or Pinterest rather than on Twitter (32). Despite its limitations, this study provides a much-needed insight into the landscape of physical activity, diet and weight loss information on Twitter.
Future research in this area would benefit from the collection of historical physical activity, diet and weight loss tweets, in order to determine the mechanism by which this information spreads through Twitter networks (e.g., who retweets/likes this information?). A multi-platform approach may better inform researchers on the full spectrum of physical activity, diet and weight loss conversation on social media. Additionally, novel methods could be adopted to combat the spread of misinformation. For example, bots are well documented as helping to spread misinformation on social media, and some disciplines have successfully deployed bots to combat misinformation (33,34).

Conclusion
This study furthers our understanding of physical activity, weight-loss and diet-related information on Twitter by highlighting that informational tweets on these topics come primarily from non-professionals. The findings demonstrate that the voice of professionals is drowned out on Twitter by the volume of tweets from non-professionals. Additionally, it provides insight into tell-tale characteristics of tweets containing potential misinformation (including marketing language and the use of clickbait-style techniques). Therefore, Twitter users may be at risk of receiving physical activity, weight loss and diet-related information from non-professionals, which could include non-evidence-based information as well as misinformation. The study suggests that there is a need for public health researchers and practitioners to combat misinformation on such platforms about physical activity, diet and weight loss, and the need to develop strategies and interventions that increase the diffusion and visibility of evidence-based information.
All data was publicly available and the study required no human participation, thus ethics approval and participant consent were not required.

Consent for publication:
No consent for publication was sought, as all data was publicly available. However, all tweets included in the study were paraphrased and thus cannot be traced to their original source online.
Availability of data and materials:

Figure 1. Sampling strategy for tweets

Figure 2. Coding strategy for analysing tweets*
*Definitions of each term are as follows. Source was determined by the Twitter bios of users and the source of each tweet was assigned a discrete code.
Professionals: those who describe working within a health/public health capacity, for example, academics, researchers, health charities and bodies, professional societies, and protected terms (e.g. dietitian).
Non-professionals: general public and bodies not described as working within a health/public health capacity, for example, individuals, and news agencies.
Information intermediaries: sources that fall into neither of the previous definitions, unprotected terms (e.g. nutritionist), professionals that could be considered (or perceived) as working within health/public health.
Tweet text was assessed, and additional items, such as URLs and images, were assessed where appropriate, and each tweet was assigned one discrete code.
Educational: tweet attempted to impart health information or knowledge: i) News and research: information which stated health-relevant findings or reports; ii) Guidelines and recommendations: information advocating a behaviour or idea.
Non-educational: tweet did not attempt to impart health information or knowledge: i) Promotional: advocating a product or service which required subscription or payment; ii) Conversational: general chatter intended for a specific individual or mass audience; iii) Behavioural: reporting a physical activity, diet or weight loss relevant behaviour, or intention to change behaviour; iv) Directional: redirection to another SM platform.