The results of the analysis are presented in two parts. The first part presents the results of the methods used for data acquisition. The second part focuses on regional differences in COVID-19 information and misinformation in Ghana. Figure 2 shows the grid and the central points used.
The Python-based data scraping script requires a buffer within which to scrape geotagged data. The grid provided 12 points across Ghana, and optimization techniques helped in creating an 80 km buffer around each point, which defined the area from which data were scraped. The Python script scraped the needed data as intended. The results were written in comma-separated values (CSV) format. In total, 1,167 tweets were scraped from Twitter using the Python script through the Twitter API. Below is a sample of the scraped data.
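The step from grid points to buffered search areas can be sketched as follows. This is a minimal illustration, not the script used in the study: the bounding box, the 4 x 3 grid layout, and the function names are assumptions made for the example. The output strings follow the `latitude,longitude,radius` form accepted by the `geocode` parameter of Twitter's standard search API; whether the original script used that parameter is also an assumption.

```python
from typing import List, Tuple

# Approximate bounding box of Ghana in decimal degrees (assumed values,
# not the extent used in the study).
LAT_MIN, LAT_MAX = 4.7, 11.2
LON_MIN, LON_MAX = -3.3, 1.2


def grid_centres(rows: int = 4, cols: int = 3) -> List[Tuple[float, float]]:
    """Return the centre point of each cell in a rows x cols grid."""
    lat_step = (LAT_MAX - LAT_MIN) / rows
    lon_step = (LON_MAX - LON_MIN) / cols
    return [
        (LAT_MIN + (r + 0.5) * lat_step, LON_MIN + (c + 0.5) * lon_step)
        for r in range(rows)
        for c in range(cols)
    ]


def geocode_filters(radius_km: int = 80) -> List[str]:
    """Format each grid centre as a 'lat,lon,radius' search filter."""
    return [f"{lat:.4f},{lon:.4f},{radius_km}km" for lat, lon in grid_centres()]


filters = geocode_filters()
print(len(filters))  # 12
```

Each string would be passed to one search request, so the 12 requests together cover the 12 buffers.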
The scraped data were cleaned to make them readable and suitable for analysis. There were 653 duplicate tweets because the 80 km buffers caused some of the buffer zones to overlap. All overlapping data were removed so that the data fit within their designated grid. Removing the 653 duplicates from the initial 1,167 tweets left 514 clean tweets, on which the various analyses were performed.
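A minimal sketch of duplicate removal is given below. The record fields `id` and `grid_point` are hypothetical, and keeping the first occurrence of each tweet is an assumption; the study assigned each tweet to its designated grid, which may follow a different rule.

```python
from typing import Dict, Iterable, List


def drop_duplicates(tweets: Iterable[dict]) -> List[dict]:
    """Keep the first occurrence of each tweet ID, preserving order."""
    seen: Dict[str, dict] = {}
    for tweet in tweets:
        # setdefault stores the tweet only if its ID has not been seen yet
        seen.setdefault(tweet["id"], tweet)
    return list(seen.values())


# Toy records: tweet "b" was returned by two overlapping buffers.
scraped = [
    {"id": "a", "grid_point": 1},
    {"id": "b", "grid_point": 1},
    {"id": "b", "grid_point": 2},
    {"id": "c", "grid_point": 2},
]
clean = drop_duplicates(scraped)
print(len(scraped) - len(clean), len(clean))  # 1 3
```

Applied to the study's data, the same bookkeeping gives 1,167 - 653 = 514 clean tweets.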
The clean tweets were categorized as accurate information, misinformation, or other information after the validation process was completed. Of the 514 tweets, 368 (about 71.6%) were accurate information, 71 (about 13.8%) were misinformation, and 75 (about 14.6%) were categorized as other information (Table 1).
Table 1
Data categorization types
Info Type | Count of Info Type
Accurate Information | 368
Misinformation | 71
Other Information | 75
Total | 514
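The category shares can be recomputed directly from the counts in Table 1. The sketch below rounds to one decimal place, which is a choice made for this example.

```python
# Counts copied from Table 1
counts = {
    "Accurate Information": 368,
    "Misinformation": 71,
    "Other Information": 75,
}

total = sum(counts.values())

# Percentage share of each category, rounded to one decimal place
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

print(total)  # 514
for label, share in shares.items():
    print(f"{label}: {share}%")
```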
In Table 2, we present samples of tweets that we classified as accurate information, misinformation, and other information.
Table 2
Samples of information, misinformation, and others
Tweet categorized as Accurate Information | "FACT: COVID-19 is NOT airborne. The coronavirus is mainly transmitted through droplets generated when an infected person coughs, sneezes, or speaks. To protect yourself: keep 1m distance from others; disinfect surfaces frequently; wash/rub your hands; avoid touching your face" ~ @AbdulRasheedAw
Tweet categorized as Misinformation | Akua Donkor - "COVID-19 is a hoax and non-existent. They don't want us to speak that's why they've ordered us to wear masks so they can win the elections" ~ @OseiAkoto_Kanu
Tweet categorized as Other Information | Oh! No, I pity Secondary School Students; they will write an Essay on this COVID-19 when school resumes? ~ @iam_clinton
Tweets were classified as accurate or otherwise after their content was validated against factcheck.org and snopes.com.
The 514 geotagged tweets were plotted to show their spatial distribution across Ghana. Figure 4 shows the exact locations from which the tweets were sent, within the various regions of Ghana.
Tweets came from all regions of Ghana. Table 3 presents the regional breakdown of tweets, by count and by percentage, within the study period.
Table 3
Regional breakdown of the information
Region | Tweet Count | % Misinformation | % Accurate Information | % Other Information | Total
Ahafo | 8 | 25 | 63 | 13 | 100
Ashanti | 46 | 22 | 65 | 13 | 100
Bono | 27 | 15 | 74 | 11 | 100
Bono East | 24 | 8 | 75 | 17 | 100
Central | 15 | 7 | 93 | 0 | 100
Eastern | 35 | 6 | 80 | 14 | 100
Greater Accra | 59 | 8 | 80 | 12 | 100
North East | 20 | 20 | 60 | 20 | 100
Northern | 49 | 12 | 76 | 12 | 100
Oti | 15 | 27 | 73 | 0 | 100
Savannah | 51 | 14 | 65 | 22 | 100
Upper East | 27 | 11 | 74 | 15 | 100
Upper West | 54 | 22 | 48 | 30 | 100
Volta | 29 | 14 | 83 | 3 | 100
Western | 39 | 8 | 85 | 8 | 100
Western North | 16 | 13 | 63 | 25 | 100
Table 3 shows the regional count of tweets as well as the percentage of tweets categorized as accurate information, misinformation, and other information. The Greater Accra region recorded the highest number of tweets, with a count of 59, while Ahafo recorded the fewest, with a total of 8 tweets.
The breakdown of tweets per region is represented graphically in Figure 5.
Analysis of misinformation tweets as a percentage of total tweets shows that 27% of tweets from the Oti region were misinformation, and one (1) out of every four (4) tweets from the Ahafo region was misinformation. In the Ashanti and Upper West regions, 22% of tweets were misinformation. The Eastern and Central regions had the lowest levels of misinformation, at 6% and 7% respectively.
The Central and Oti regions recorded zero (0) other-information tweets, whilst the Volta region recorded one (1). Upper West recorded the most other-information tweets (16), followed by the Savannah region (11).
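The counts quoted above can be approximated from Table 3 by multiplying each region's tweet count by its percentage of other information. The sketch below is an illustration only: because the published percentages are rounded, the results are estimates rather than the authors' raw counts.

```python
# (tweet count, % other information) for selected regions, copied from Table 3
table3 = {
    "Upper West": (54, 30),
    "Savannah": (51, 22),
    "Volta": (29, 3),
    "Central": (15, 0),
    "Oti": (15, 0),
}

# Estimated number of other-information tweets per region
other_counts = {
    region: round(tweets * pct / 100)
    for region, (tweets, pct) in table3.items()
}

print(other_counts)
# {'Upper West': 16, 'Savannah': 11, 'Volta': 1, 'Central': 0, 'Oti': 0}
```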
Regions were grouped by tweet count as high (40 or more tweets), average (20–39), and low (1–19). The high-tweet regions are Greater Accra, Upper West, Savannah, Northern, and Ashanti; the average-tweet regions are Western, Eastern, Volta, Bono, Upper East, Bono East, and North East; and the low-tweet regions are Western North, Central, Oti, and Ahafo.
Table 4
Range of tweets per region for the density analysis
Class | Range of Tweets per region | Regions
High | ≥ 40 | Greater Accra, Upper West, Savannah, Northern, and Ashanti
Average | 20–39 | Western, Eastern, Volta, Bono, Upper East, Bono East, and North East
Low | 1–19 | Western North, Central, Oti, and Ahafo
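The grouping in Table 4 can be reproduced from the regional tweet counts in Table 3 with a simple threshold rule. The function name `density_class` is hypothetical; the thresholds are those stated in Table 4.

```python
# Tweet counts per region, copied from Table 3
tweet_counts = {
    "Ahafo": 8, "Ashanti": 46, "Bono": 27, "Bono East": 24,
    "Central": 15, "Eastern": 35, "Greater Accra": 59, "North East": 20,
    "Northern": 49, "Oti": 15, "Savannah": 51, "Upper East": 27,
    "Upper West": 54, "Volta": 29, "Western": 39, "Western North": 16,
}


def density_class(n: int) -> str:
    """Map a regional tweet count to its Table 4 class."""
    if n >= 40:
        return "High"
    if n >= 20:
        return "Average"
    return "Low"  # 1-19 in Table 4; no region has zero tweets


# Group regions by class, listing the largest counts first
classes = {}
for region, n in sorted(tweet_counts.items(), key=lambda kv: -kv[1]):
    classes.setdefault(density_class(n), []).append(region)

for name, regions in classes.items():
    print(f"{name}: {', '.join(regions)}")
```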
On the map, the reddish color shows the regions with the highest number of geotagged tweets scraped from Twitter. The yellow-orange color shows the regions with an average number of tweets. Lastly, the light pink color shows the regions with the fewest tweets.