A meta-analysis of Twitter assisted learning outcomes in terms of country, gender, and Twitter usage


Social media such as Twitter have been widely used in education because of their positive learning outcomes. To summarize the effect of Twitter use on educational outcomes, this study included 22 high-quality peer-reviewed journal articles in a meta-analysis. It is concluded that the Twitter assisted learning approach can lead to significantly higher learning outcomes than the non-Twitter assisted approach, but the former cannot lead to significantly higher teaching effectiveness than the latter. The use of Twitter can improve learning outcomes in the USA and Sweden but not in Spain. Twitter can significantly improve learning engagement in the USA but not in Greece. Males tend to show significantly higher learning outcomes than females in the Twitter assisted learning context. Twitter can significantly improve learning outcomes whether it is used as a supplementary or an integrated tool. Future research may adopt more interdisciplinary methods and include more literature to summarize the effect of Twitter use on educational outcomes.


Introduction

Positive findings
Twitter assisted learning could improve learners' engagement in learning activities and communities.
Active engagement in learning could improve learning outcomes, although the link between them proved weak. Tips delivered through Twitter were useful to learners in Spain because they deemed Twitter a tool for educational purposes (Fouz-González, 2017). With Twitter, Spanish students could actively participate in learning activities, although they could hardly integrate the use of Twitter into learning interactions. Guidance should, therefore, be provided for students to improve the interaction. The restriction on the length of a message could encourage users to think and decide what to type next. The training schedule could focus on the design of learning activities (Feliz, Ricoy, & Feliz, 2013). With social media such as Twitter, American students, who consider Twitter a useful tool, could actively join large learning communities and enhance their learning interest (Hitchcock & Young, 2016). Greek students' learning attitudes towards Twitter were positive, leading to their active engagement in learning activities (Katrimpouza, Tselios, & Kasimati, 2019).
The use of Twitter could improve users' academic achievements due to various factors. Integrating Twitter into a course, coupled with teacher participation, could greatly improve students' academic achievements in the USA (Junco, Michael Elavsky, & Heiberger, 2013). Twitter assisted learning approaches could greatly improve students' academic achievements by enhancing their engagement (Junco, Heiberger, & Loken, 2010). Twitter use could greatly improve Greek students' laboratory performance, although it was not found to be significantly correlated with their personality traits. The use of Twitter could improve Greek students' academic achievements and social presence and enhance their self-efficacy (Loutou, Tselios, & Altanopoulou, 2018). In a classroom, American students who used Twitter frequently obtained significantly higher academic achievements than those who seldom used Twitter.
Some studies reported gender differences in the use of Twitter. For example, American men were more likely to examine tweets to share resources and criticize other tweets, while women tended to write tweets and positively evaluate other tweets. This indicates that teachers could use different teaching strategies for different genders in the USA (Kerr & Schmeichel, 2018). Females' contributions to Twitter seemed significantly larger than males' (Davidson-Shivers, Muilenburg, & Tanner, 2001). Males could produce significantly more voice messages than females (McConnell, 1997).

Negative findings
Although many studies revealed positive findings regarding the educational effect of Twitter use, many others reported negative results. For example, it was reported that the use of Twitter could not improve learning outcomes in the USA (Al-Bahrani, Patel, & Sheridan, 2017). Even when Twitter messages were related to lecture notes, they failed to improve American users' academic achievements in terms of multiple-choice grades and free-recall performance. Worse, when learners created and sent tweets frequently, the quality of their lecture notes was reduced (Kuznekoff, Munz, & Titsworth, 2015). The use of Twitter in class could not significantly influence American users' interest in politics and news reading, although it might positively influence their learning outcomes (Feezell, 2019).
We should try to see both sides of the coin when exploring the use of Twitter in education. Professional content on the Twitter platform, as well as American students' learning attitudes, could greatly influence teachers' credibility. Twitter could be both an advantage for and an obstacle to learning and teaching (DeGroot, Young & VanSlette, 2015). There were several drawbacks, e.g. technical issues, limitations on higher-order skill acquisition, and limited learning opportunities, in the Twitter integrated mobile learning system in Sri Lanka (Dissanayeke, Hewagamage, Ramberg, & Wikramanayake, 2016).
It was also found that Twitter may not be suitable for educational use. American Twitter users are younger and wealthier than those in other regions, which makes the Twitter population unrepresentative. This has prompted the claim that Twitter may be more properly used by corporations than by social science researchers in the USA (Blank, 2016). This argument may dampen the enthusiasm for Twitter use for educational purposes. Twitter may not be appropriate for educational purposes since tweets have numerous problems, e.g. a lack of relational cues, limitations on content, no hint of communicative behaviour, and an insufficient timeline (Yoshida, 2021).

A research gap
Little is known about the effect of different Twitter usages on learning outcomes. Some studies adopted ...
This study focuses on Twitter assisted learning outcomes in terms of country, gender, and usage, and it is not required to be registered. It was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) (Moher, Liberati, Tetzlaff, Altman, & Group, 2009) and approved by the institutional review board, which waived a review protocol.

Eligibility criteria
We selected and excluded studies based on inclusion and exclusion criteria. Studies were included if they (1) focused on Twitter assisted learning outcomes in terms of country, gender, and Twitter usage; (2) were of high quality; (3) divided participants into both control and treatment groups for a comparative analysis; (4) adopted a randomized controlled design with a comparative analysis between control and treatment groups; and (5) were peer-reviewed academic works.
Studies were excluded if they (1) focused on Twitter technology itself rather than Twitter use in education; (2) were reviews rather than empirical studies; (3) could not provide enough information for meta-analysis even after the authors were contacted; or (4) were written in a language other than English.

Information sources
The information was sourced from multiple databases, i.e. EBSCOhost, Taylor & Francis Online, Wiley Online Library, and Sage. We also obtained information by corresponding with the authors when the full text did not provide enough data for meta-analysis. We searched the above databases on January 27, 2021. The coverage of dates ranges from the commencement year until January 2021.
In the first step, we searched the online databases using corresponding terms to obtain literature and then removed the duplicated results. In the second step, we invited two independent reviewers to screen out irrelevant results by perusing abstracts and titles. In the third step, the two independent reviewers evaluated the eligibility of the full texts. In the fourth step, both reviewers presented the results of their evaluation and negotiated on any disagreements. A third experienced reviewer would join and determine the selection if the two reviewers could not persuade each other on any specific literature selection. After the four-step selection process, we finally determined 22 peer-reviewed journal articles for the meta-analysis (Figure 1).

Quality assessment
We ... The research design should be clearly described and the research methods should be appropriate for the topic being investigated. Two reviewers scored each selected article, and the average score was considered. We selected the 22 top-scored articles. The inter-rater Cohen's kappa coefficient is 0.82.
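As an illustration of how an inter-rater coefficient of this kind is obtained, the following minimal sketch computes Cohen's kappa for two reviewers. It assumes the reviewers' judgments are recorded as categorical labels; the function name `cohens_kappa` and the ten ratings are hypothetical and are not the study's data.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(ratings_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal label proportions.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical quality ratings for ten articles (not the study's data).
reviewer_1 = ["high", "high", "low", "low", "high", "low", "high", "high", "low", "low"]
reviewer_2 = ["low", "high", "low", "low", "high", "low", "high", "high", "low", "low"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

In this hypothetical case the reviewers agree on nine of ten articles, and half of that agreement is expected by chance, so kappa is about 0.8.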

Data extraction
The data were extracted by both reviewers from the eligible studies. To carry out the meta-analysis, they extracted data such as authors, publication years, the total number of participants, means, and standard deviations of both groups.
They also extracted data regarding country (e.g. USA, UK, Sri Lanka, Sweden, Spain, and Greece), learning outcomes, and Twitter usage. Learning outcomes are classified into academic achievements, attitude, gender differences in learning outcomes, engagement, and teaching effectiveness. Twitter usage is classified into Twitter as a supplemental learning tool and Twitter as an integrated learning tool.
Similarly, both reviewers met to decide on the final extracted data after they finished data extraction. In case they could not reach an agreement on any data, a third reviewer would be invited to make a decision.

Statistical analysis
Stata MP/14.0 was used to analyze the extracted data and carry out the meta-analysis. Z-statistics were used to test the significance of the pooled effects. Since the data type is continuous, we entered the numbers of participants, means, and standard deviations of both groups into Stata MP/14.0, labeled the data by authors and publication years, and selected country, learning outcome, and Twitter usage as grouping variables. The pooling model was either random (I-V heterogeneity) or fixed (inverse variance). The effect size was expressed as Cohen's d, or standardized mean difference (SMD). d is considered very small if the value is 0.01, small if 0.2, medium if 0.5, large if 0.8, very large if 1.2, and huge if 2.0 (Sawilowsky, 2009). SMD was calculated through the formula:

$$\mathrm{SMD} = \frac{\text{difference in mean outcome between groups}}{\text{standard deviation of outcome among participants}}$$

Because of varying situations, participant characteristics, and interventions, different studies tend to yield different effect sizes. Statistical heterogeneity is therefore common and unavoidable (Higgins & Green, 2011). We, therefore, quantify heterogeneity using I² through the formula below:

$$I^2 = \frac{Q - df}{Q} \times 100\%$$

where Q indicates the chi-squared result and df means the degrees of freedom (Higgins & Green, 2011).
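The two quantities defined in this paragraph can be sketched in a few lines of Python. This is a minimal illustration and not the Stata routine used in the study: the function names are hypothetical, the pooled standard deviation is assumed as the denominator of the SMD, and the standard error is a common large-sample approximation that may differ slightly from Stata's.

```python
import math


def cohens_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference using the pooled standard deviation (assumed)."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)


def d_standard_error(d, n_t, n_c):
    """Large-sample approximation of the standard error of d (an assumption)."""
    return math.sqrt((n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c)))


def i_squared(q, df):
    """I^2 = (Q - df) / Q * 100%, floored at zero when Q is smaller than df."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical treatment and control groups (not taken from the included studies).
d = cohens_d(mean_t=80.0, sd_t=10.0, n_t=30, mean_c=75.0, sd_c=10.0, n_c=30)
se_d = d_standard_error(d, 30, 30)
```

With equal standard deviations of 10 and a mean difference of 5, the hypothetical d is 0.5, a medium effect on the scale quoted above.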
Heterogeneity is considered not important if I² = 0% to 40%, moderate if 30% to 60%, substantial if 50% to 90%, and considerable if 75% to 100%. As a rough rule, we use a random-effect model to conduct the meta-analysis if I² is larger than 50%, and a fixed-effect model if it is smaller than 50%. If the value of I² is large, we test the sensitivity using the influence analysis in Stata MP/14.0. We also test for publication bias using Begg's (Begg & Mazumdar, 1994) and Egger's (Egger, Smith, Schneider, & Minder, 1997) tests.
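The model-selection rule can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the Stata code used in the study: the function name `pool_effects` is hypothetical, the random-effect branch is assumed to be the DerSimonian-Laird estimator with inverse-variance weights, and the input values are invented.

```python
import math


def pool_effects(effects, std_errors, i2_threshold=50.0):
    """Inverse-variance pooling; switches to random effects when I^2 exceeds the threshold."""
    if len(effects) != len(std_errors) or not effects:
        raise ValueError("need one standard error per effect size")
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q and I^2 = (Q - df) / Q * 100%, floored at zero.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    if i2 > i2_threshold:
        # Between-study variance tau^2 (method of moments) is added to each study's variance.
        c = sum_w - sum(w ** 2 for w in weights) / sum_w
        tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
        weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
        sum_w = sum(weights)
        pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w
        model = "random"
    else:
        pooled, model = fixed, "fixed"
    se_pooled = math.sqrt(1.0 / sum_w)
    return {"model": model, "d": pooled, "se": se_pooled,
            "z": pooled / se_pooled, "i2": i2}


# Two hypothetical, strongly disagreeing studies trigger the random-effect branch.
heterogeneous = pool_effects([0.1, 0.9], [0.1, 0.1])
# Two hypothetical studies with identical effects stay in the fixed-effect branch.
homogeneous = pool_effects([0.5, 0.5], [0.1, 0.2])
```

The random-effect branch widens the pooled standard error because the between-study variance is added to every study's own variance before the weights are recomputed.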

Results
Characteristics of included studies
Table 1 summarizes the characteristics of included studies, involving author, publication year, sample size, country, learning outcome, and Twitter usage. The specific data such as means and standard deviations of both control and treatment groups are provided in the data file.

Risk of bias within studies
Both Egger's and Begg's tests were used to detect publication bias. The data input format theta se_theta was assumed for both tests. Figure 2 presents an Egger's plot of publication bias. A dot in Figure 2 indicates an individual study. The uneven distribution of the dots on either side of the middle no-effect line indicates the presence of publication bias (t = 4.70, p < .01). Begg's test examines publication bias through the rank correlation between the standardized intervention effect and its standard error, which also indicates the presence of publication bias (z = 3.15, p = 0.002).
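The idea behind Egger's test can be sketched with ordinary least squares: the standardized effect (theta divided by se_theta) is regressed on precision (1 divided by se_theta), and an intercept far from zero suggests funnel-plot asymmetry. The code below is a minimal illustration, not the Stata command used in the study; the function name `egger_test` and the four input studies are hypothetical, and the p-value lookup from the t distribution is left out.

```python
import math


def egger_test(thetas, se_thetas):
    """Egger's regression of standardized effect on precision."""
    k = len(thetas)
    if k != len(se_thetas) or k < 3:
        raise ValueError("need at least three studies with matching standard errors")
    precision = [1.0 / se for se in se_thetas]
    std_effect = [theta / se for theta, se in zip(thetas, se_thetas)]
    mean_x = sum(precision) / k
    mean_y = sum(std_effect) / k
    sxx = sum((x - mean_x) ** 2 for x in precision)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    slope = sum((x - mean_x) * (y - mean_y)
                for x, y in zip(precision, std_effect)) / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance and the standard error of the intercept (ordinary least squares).
    rss = sum((y - intercept - slope * x) ** 2
              for x, y in zip(precision, std_effect))
    se_intercept = math.sqrt(rss / (k - 2) * (1.0 / k + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {"intercept": intercept, "slope": slope, "t": t_stat, "df": k - 2}


# Hypothetical studies in which smaller studies (larger se) report larger effects.
example = egger_test([0.52, 0.66, 0.85, 1.25], [0.1, 0.2, 0.25, 0.5])
```

The returned t statistic is compared with a t distribution on k - 2 degrees of freedom, where k is the number of studies.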

Results for individual and synthetic analysis
This section presents the findings regarding pooled learning outcomes, their differences across countries, and different usages of Twitter.

Results for learning outcomes
To determine whether to adopt a random-effect or a fixed-effect model, we tested the heterogeneity of the effect sizes for academic achievements. The result shows that the effect sizes are heterogeneous (Q = 2129.93, df = 73, p < .01, I² = 96.6%) (Table 2). We, therefore, adopted a random-effect model to conduct the meta-analysis of academic achievements.
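As a check, the reported heterogeneity statistic can be recovered from the Q and df values given above, using the standard definition of I²:

```latex
I^2 = \frac{Q - df}{Q} \times 100\%
    = \frac{2129.93 - 73}{2129.93} \times 100\%
    = \frac{2056.93}{2129.93} \times 100\%
    \approx 96.6\%
```

This value is far above the 50% threshold, which is consistent with the choice of a random-effect model.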
Through Stata MP/14.0, we obtained 74 effect sizes (SMD) for academic achievements, so we analyzed academic achievements separately to show them more clearly in a forest plot (Figure 3). In Figure 3, each horizontal line indicates a 95% confidence interval. The middle line is referred to as the no-effect line because if a horizontal line crosses it, the effect is considered not significant. The diamond indicates the pooled result of the effect sizes. If the diamond crosses the no-effect line, the result is considered not significant. Since the diamond does not cross the no-effect line and is located to its right, we conclude that Twitter assisted academic achievements are significantly higher than the non-Twitter assisted (d = 0.380, 95% CI: 0.16 to 0.60, z = 3.31, p = 0.001) (Table 2).
The effect sizes are also heterogeneous for other learning outcomes such as attitude (Q = 518.43, df = 17, ...) (Table 2). To keep the analysis method consistent, we adopted a random-effect model to conduct the meta-analysis regarding other learning outcomes.
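The link between a pooled estimate, its confidence interval, the z statistic, and the p-value can be illustrated with the standard library alone. This sketch assumes a symmetric normal-theory interval; the helper names are hypothetical. Because the interval in the text is rounded to two decimals, the z recovered from it (about 3.39) is only approximately equal to the reported 3.31.

```python
import math

Z_CRIT_95 = 1.959964  # two-sided 95% critical value of the standard normal


def z_from_ci(estimate, lower, upper):
    """Approximate z from a point estimate and its symmetric 95% CI."""
    se = (upper - lower) / (2.0 * Z_CRIT_95)
    return estimate / se


def two_sided_p(z):
    """Two-sided p-value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


z_approx = z_from_ci(0.380, 0.16, 0.60)  # from the rounded values in the text
p_reported_z = two_sided_p(3.31)         # p-value implied by the reported z
```

Rounded to three decimals, the p-value implied by z = 3.31 is 0.001, which agrees with the value reported for academic achievements.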

Results for learning outcomes in different countries
To ... (Table 3). We, therefore, adopted a random-effect model to conduct the meta-analysis regarding learning outcomes in different countries.
As shown in Table 3 ... We, therefore, conclude that Twitter assisted learning can lead to significantly higher learning outcomes than non-Twitter assisted learning in the USA, Greece, and Sweden, but no significant difference is revealed in Spain.

Results for academic achievements in different countries
To meta-analytically examine academic achievements in different countries, we obtained a total of 73 effect sizes. We finally selected 69 effect sizes after removing 4 results because their research venue was unclear (USA and UK mixed).

Results for learning attitudes in different countries
We obtained 6 effect sizes for learning attitudes in Sweden and 10 in the USA. A total of 16 effect sizes are summarized for the analysis of learning attitudes in different countries.
As shown in Figure 6, the effect sizes in Sweden (I² = 59.9%, p = .029) and the USA (I² = 95.8%, p < .01) are both significantly heterogeneous. We, therefore, adopted a random-effect model to conduct the meta-analysis. In Sweden, students hold significantly more positive learning attitudes toward Twitter assisted learning than toward the non-Twitter assisted (d = 0.40, 95% CI: 0.19 to 0.61) (Table 3) since the diamond is located to the right of the no-effect line. In the USA, there is no significant difference in learning attitudes between the Twitter and non-Twitter assisted approaches (d = 0.13, 95% CI: -0.07 to 0.34) (Table 3) since the diamond crosses the no-effect line. The overall results indicate that students hold significantly more positive learning attitudes towards Twitter assisted learning than towards the non-Twitter assisted (d = 0.29, 95% CI: 0.10 to 0.49) (Table 3) since the diamond is located to the right of the no-effect line. Therefore, we conclude that in Sweden users hold significantly more positive attitudes towards the use of Twitter in education, but no significant difference is found in the USA.

Results for learning engagement in different countries
We retrieved 1 effect size in Greece and 5 effect sizes in the USA for the meta-analysis of learning engagement in different countries (Figure 7) and adopted a fixed-effect model since the results are not significantly heterogeneous (I² = 0%, p = 0.942 for the USA; no result for Greece). As shown in Figure 7, the diamond for the effect size in Greece crosses the no-effect line. We, therefore, conclude that there is no significant difference in learning engagement between Twitter and non-Twitter assisted learning in Greece (d = 0.58, 95% CI: -0.11 to 1.26). In contrast, in the USA, the pooled diamond does not cross the no-effect line and is located to its right, so we report that Twitter assisted learning can lead to significantly more engagement than non-Twitter assisted learning in the USA (d = 0.39, 95% CI: 0.15 to 0.63).

Results for teaching effectiveness in different countries
We retrieved 5 effect sizes for the meta-analysis of teaching effectiveness in Spain and 1 effect size in Sri Lanka. Considering the significantly heterogeneous result in Sri Lanka (I² = 57.0%, p = 0.01) and the non-significantly heterogeneous results in Spain (I² = 29.5%, p = 0.225), we adopted the same random-effect model to conduct the meta-analysis. In Spain, the teaching effectiveness of the Twitter assisted approach is significantly lower than that of the non-Twitter assisted approach (d = -0.30, 95% CI: -0.56 to -0.03) (Figure 5) since the pooled diamond does not cross the no-effect line and is located to its left. In Sri Lanka, Twitter assisted teaching effectiveness is significantly higher than the non-Twitter assisted (d = 1.05, 95% CI: 0.22 to 1.87) (Figure 5). The overall effect shows no significant difference in teaching effectiveness between the Twitter and non-Twitter assisted approaches (d = -0.14, 95% CI: -0.52 to 0.24) (Figure 5) because the pooled diamond crosses the no-effect line.
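The forest-plot reading rule applied here (an interval that crosses the no-effect line at zero is not significant) can be written as a short function. This is an illustrative sketch; the function name `interpret_ci` is hypothetical, and the intervals are the ones reported above for teaching effectiveness.

```python
def interpret_ci(lower, upper):
    """Classify a 95% CI of d relative to the no-effect line at zero."""
    if lower > 0:
        return "significantly higher"
    if upper < 0:
        return "significantly lower"
    return "no significant difference"


# 95% confidence intervals reported for teaching effectiveness.
teaching_effectiveness = {
    "Spain": (-0.56, -0.03),
    "Sri Lanka": (0.22, 1.87),
    "Overall": (-0.52, 0.24),
}
verdicts = {place: interpret_ci(*ci) for place, ci in teaching_effectiveness.items()}
```

The three verdicts reproduce the reading of Figure 5 given in the text: lower in Spain, higher in Sri Lanka, and no significant overall difference.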

Results for gender differences in different countries
To determine gender differences in learning outcomes in different countries, we retrieved 7 effect sizes in the USA and 1 effect size in Spain. The results in the USA are significantly heterogeneous (I² = 99.9%, p < .01; no result for Spain). We, therefore, adopted a random-effect model to conduct the meta-analysis.
As shown in Table 5, whether Twitter is used as a supplementary tool (d = 0.501, 95% CI: 0.095 to 0.907, z = 2.42, p = 0.015) or as an integrated tool (d = 0.459, 95% CI: 0.216 to 0.702, z = 3.70, p < .01), the learning outcomes are significantly higher than with the non-Twitter assisted learning approach. Thus, we concluded that Twitter assisted learning could lead to significantly higher learning outcomes than the non-Twitter assisted whether it is used as a supplementary or an integrated tool.

A sensitivity analysis
To determine whether the results were stable, we conducted a sensitivity analysis in Stata MP/14.0. Extreme estimates of effect sizes tend to skew the average estimated values and fall outside the confidence interval. In Figure 9, a dot indicates an individual study. The lower confidence interval limit is 0.27, and the upper confidence interval limit is 0.70. All the meta-analysis estimates remain between the lower and upper confidence limits when any one named study is omitted. We, therefore, conclude that the meta-analytical results are stable.
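A leave-one-out influence analysis of this kind can be sketched as follows. The code is an illustration under stated assumptions and not the Stata routine used in the study: the function names are hypothetical, pooling is assumed to follow the DerSimonian-Laird random-effect estimator, and the three studies are invented.

```python
import math


def dl_pool(effects, std_errors):
    """DerSimonian-Laird random-effect pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w
    return pooled, math.sqrt(1.0 / sum_w)


def leave_one_out(labels, effects, std_errors):
    """Re-pool the data with each named study omitted in turn."""
    results = {}
    for i, label in enumerate(labels):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_errors = std_errors[:i] + std_errors[i + 1:]
        results[label] = dl_pool(rest_effects, rest_errors)[0]
    return results


def is_stable(labels, effects, std_errors, z_crit=1.96):
    """True if every leave-one-out estimate stays inside the full-sample 95% CI."""
    pooled, se = dl_pool(effects, std_errors)
    lower, upper = pooled - z_crit * se, pooled + z_crit * se
    omitted = leave_one_out(labels, effects, std_errors)
    return all(lower <= estimate <= upper for estimate in omitted.values())


# Hypothetical studies (not the included ones).
names = ["Study A", "Study B", "Study C"]
ds = [0.2, 0.4, 0.6]
ses = [0.1, 0.1, 0.1]
omitted_estimates = leave_one_out(names, ds, ses)
stable = is_stable(names, ds, ses)
```

Each entry of `omitted_estimates` is the pooled effect with that study removed, so a single influential study would show up as an estimate outside the full-sample interval.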

Summary of evidence
Generally, positive evidence has been revealed regarding the use of Twitter in education. This meta-analysis reports that the Twitter assisted learning approach can lead to significantly higher learning outcomes than the non-Twitter assisted approach in terms of academic achievements, learning attitudes, and learning engagement. However, the former cannot lead to significantly higher teaching effectiveness than the latter.
The use of Twitter can improve learning outcomes in the USA and Sweden but not in Spain. In Sweden, Twitter assisted learning is more positively evaluated than the non-Twitter assisted, while no difference was found in the USA. Twitter can significantly improve learning engagement in the USA but not in Greece. Surprisingly, Twitter can decrease teaching effectiveness in Spain but significantly improve it in Sri Lanka. Males tend to show significantly higher learning outcomes than females in the Twitter assisted learning context. Males assisted with Twitter perform significantly better than females in both the USA and Spain. Twitter can significantly improve learning outcomes, including academic achievements, learning attitudes, learning engagement, and teaching effectiveness, whether it is used as a supplementary or an integrated tool.
The findings are consistent with previous studies. For example, a Twitter-based mobile approach was used in an agricultural knowledge pedagogy, where students generally obtained satisfactory learning outcomes. However, the teaching effectiveness, as well as the higher-order skills, was not highly evaluated due to limitations of tweet language on the Twitter platform, such as the word number limit and English technical terms unfamiliar to Sri Lankan students (Dissanayeke, Hewagamage, Ramberg, & Wikramanayake, 2016).
Further support for the lower teaching effectiveness comes from Spain. Because tweets remained available for students to review throughout the teaching period, it was difficult to check whether students finished the assignment on time. It also became difficult to determine from the tweets when, and for how long, students had studied. The limited training stimuli, merely 22 tweets, may have further weakened Twitter assisted teaching effectiveness. Rewarding the participants may have also drawn some participants who were interested in rewards rather than learning, leading to poor teaching effectiveness (Fouz-González, 2017). Consequently, Twitter assisted teaching effectiveness may have been weakened, which calls for a rigorous Twitter-based teaching design. Due to confusion between Spanish and English, Spanish learners of English tend to insert epenthetic vowels before consonant clusters (Fouz-González, 2017), which may have negatively influenced their learning outcomes.
Previous studies also support the finding that men tend to achieve more success in Twitter assisted learning. For example, males tend to produce slightly more tweets than females (Feliz, Ricoy, & Feliz, 2013). Females tend to be less engaged in Twitter assisted learning than males, who may be more interested in technology-based learning. Consequently, females obtained significantly lower academic achievements than males in terms of the final course grade, gap-closing measure, and post- and pre-test scores (Al-Bahrani, Patel, & Sheridan, 2017). Men seem to share resources with peers through tweets more readily, and they like to show off their academic, professional, or other personal success in critical language, while women tend to affirm other tweets (Kerr & Schmeichel, 2018). The different language styles require men to invest more time and energy in tweet production, while they require women to invest more time in message browsing. Production needs more knowledge and creativity than browsing. Unsurprisingly, men tend to obtain more knowledge through Twitter assisted learning than women.
It is also found that Twitter can greatly improve students' learning enthusiasm, promote students' interaction and reflection, train students' skills in producing concise sentences, and build a real learning environment through learning feedback (Zhu, 2010). In Twitter assisted teaching, teachers can record students' feedback, trace students' learning progress, and obtain their learning performance. In Twitter assisted learning, students can discuss with peers to solve difficult problems, share their own opinions, and retrieve a sea of information from the platform. This will improve their engagement in learning, intensify their interest, cultivate their positive attitudes toward Twitter use in learning, and finally improve students' academic achievements and enhance teaching effectiveness.
However, Twitter use in education may also bring about negative results. Plentiful information and advertisements on the Twitter platform may distract students and teachers. They may also indulge in non-academic information. They may find it hard to concentrate on a given topic when they are confronted with excessive information. They may pursue entertainment rather than knowledge acquisition. Pieces of knowledge carried by Twitter may not benefit higher-order thinking skills or the organization of structured knowledge, resulting in easy attrition of acquired knowledge and disorganized knowledge stored in learners' brains. Teachers may be reluctant to change their teaching styles and methods when they are required to teach via Twitter. They may also find it awkward to deliver knowledge and difficult to focus on a topic if they are passively required to apply Twitter to their teaching practice. This may be an important reason for the decreased teaching effectiveness in some studies (e.g. Fouz-González, 2017).
Whether Twitter is used as a supplemental or as an integrated learning tool, it can exert a positive influence on learning outcomes to a certain degree. Besides, teachers and students should attempt to use Twitter to (1) ...
Secondly, the study may not include all studies in the meta-analysis. Those written in a language other than English are excluded. Thirdly, the included studies themselves may have limitations. Fourthly, we may not include all eligible studies due to limited library resources.

Conclusions
In general, Twitter use in education has been widely accepted and produced positive learning outcomes although there are still controversies in some countries regarding some aspects. Males tend to obtain more positive learning outcomes than females. Future research may adopt more interdisciplinary methods and include more literature to summarize the effect of Twitter on educational outcomes.

Declarations
Compliance with Ethical Standards: The study complies with Ethical Standards.