Global distribution of articles and the challenge of information retrieval
The general search in the All Databases dataset returned 18,875 articles published in journals. As Fig. 1 shows, the regional databases (the citation indexes from Korea, Russia and Ibero-America, “SciELO”) account for only a small share of the production (1.37%).
Most of the production comes from the WoS Core Collection, in which we distinguish three types of records distributed across two databases. One database is the Emerging Sources Citation Index (ESCI, with 265 records or 1.40%), which contains journals of regional importance and in emerging scientific fields that have not yet been accepted into the Core Collection. As a consequence, no Impact Factor is calculated for ESCI journals, which is a serious disadvantage for authors from countries where evaluation processes give special weight to this indicator.
The second type comprises the records retrieved with the same search expression we used to build the Core Collection dataset, that is, the intersection between both datasets (more than 13 thousand records, or 70.17%). Finally, we show the records that are also present in the Core Collection but cannot be retrieved with our search expression in that database (the yellow bar in Fig. 1, representing 27.06%).
Figure 2 reveals differences over time, mainly after 2003, when both series of articles grow, but the Core Collection dataset grows more markedly, which indicates that the terms in our search expression became established. That is the year of the first SARS crisis: the queried terms retrieved 56.43% of the articles in the Core Collection before it (1945-2002), but 77.75% afterward.
These differences should be taken into account when investigating coronavirus research in WoS, but we decided to focus only on the 10,026 articles retrieved from the Core Collection for the last two decades. One reason is simply that the authors themselves used the terms in the title and/or abstract of their articles; the other is the completeness of the metadata in the Core Collection when downloading data from its interface, compared to the data delivered by the WoS platform.
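As a quick arithmetic check, the four shares reported for Fig. 1 can be verified to partition the 18,875 articles. The sketch below uses only the figures quoted in the text; the category labels are our own, and the implied counts are approximations derived from rounded percentages, not values reported by the authors.

```python
# Figures quoted in the text for the All Databases result set.
TOTAL_ARTICLES = 18_875

# Category labels are illustrative names, not taken from the source.
shares_percent = {
    "regional_indexes": 1.37,     # Korea, Russia and SciELO citation indexes
    "esci": 1.40,                 # Emerging Sources Citation Index
    "core_retrieved": 70.17,      # Core Collection, matched by the search expression
    "core_not_retrieved": 27.06,  # Core Collection, missed by the search expression
}

# The four categories should add up to the whole result set.
total_share = round(sum(shares_percent.values()), 2)

# ESCI is the only category with a raw count in the text (265 records).
esci_share = round(100 * 265 / TOTAL_ARTICLES, 2)

# Approximate counts implied by the rounded percentages (an estimate only).
implied_counts = {
    name: round(TOTAL_ARTICLES * pct / 100) for name, pct in shares_percent.items()
}

print(total_share, esci_share, implied_counts)
```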
Scientific collaboration and impact
Our objective is to analyze the composition by countries and organizations of publications on coronavirus in the period 2001-2020 and in the most recent period, January 2019 - March 2020, covering the SARS crisis of 2003 (2002-2004) and that of 2019-2020. In this way, we can identify the most prolific countries and the composition of organizations around coronavirus research, considering their productivity, their degree of international collaboration and the number of citations accumulated.
The network map generated with the VOSviewer software for the 10,026 documents by country for the period 2001-2020 (fig. 3) gives a fairly clear picture of three main clusters. The first is the USA, in green, involved in 34.22% of the documents. It is followed by China, in yellow, with 25.24%. The third is the cluster formed by the European countries, in red, where the contributions from Germany (6.74%), the UK (6.35%), the Netherlands (5.52%) and France (4.25%) stand out above the rest. To these we must add a fourth cluster made up of Asian countries such as Japan (4.59%), South Korea (4.02%) and Taiwan (3.76%), among others.
Looking at the co-authorship links between countries, we can identify certain recurring relationships. The high interaction among European countries and their preferential relationship with the USA over China stand out. In fact, the European countries and the USA appear as preferential partners in the network, both with each other and with the rest of the countries, for the period 2001-2020. Since the map covers two decades of publications, we understand that its composition suggests a relatively established structure of leadership and recurring relationships in coronavirus research.
To capture the characteristics of the publications related to COVID-19, we focus on the period between January 2019 and the first twelve weeks of 2020. Since the search strategy relies on the different names of the virus, we assume that a large part of the 2019 documents does not refer to the recent pandemic, which began in November and December of that year in the Chinese region of Wuhan and spread through different areas of China, Asia, Europe and the rest of the world during the first quarter of 2020. Despite this, we consider this sample related to, and representative of, the first research outputs of the second wave of SARS.
The map generated by VOSviewer (fig. 4) for the 917 documents of the period 2019-2020 shows fewer countries involved than those accumulated over the last two decades and a somewhat different distribution of the clusters. Compared to the 2001-2020 map (fig. 3), China is involved in 32.50% of the documents and gains a more central position in the network, in close relationship with the USA, which participates in 29.44%. Again, both countries are the main producers of documents and the main relationship nodes in the network, together with a group of European countries led by the United Kingdom (5.67%), Germany (5.45%), France (5.13%) and the Netherlands (4.47%). There are also two distinct groups of countries whose main partner is the USA but which do not maintain relations with each other. One of them, in green, is formed by Canada, involved in 5.02% of the documents, together with Italy (3.27%), India (2.73%) and Iran (0.98%). The other is formed by Saudi Arabia, with a substantial contribution of documents (7.85%), Egypt (2.94%) and Lebanon (0.44%). We can also identify a representation of the Asian cluster, shown in blue in the 2001-2020 network map (fig. 3), in which South Korea, involved in 7.42% of the documents, gains prominence over Japan, involved in 2.73%.
For both periods we consider the number of documents contributed by each country, the number of citations accumulated by these documents and the total link strength (TLS), that is, the sum of all co-authorship links generated by a country's documents. A higher TLS in proportion to the number of documents indicates greater international co-authorship in the publication of scientific documents and a greater recurrence of international collaboration in the research of a given country. Conversely, a lower TLS suggests a greater weight of domestic scientific production.
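The indicators described above can be sketched as follows. This is a minimal illustration, assuming full counting (each pair of distinct countries co-occurring in a document adds one unit to the strength of their link); the counting method and thresholds actually configured in VOSviewer are not stated in the text. The records and the function name `collaboration_indicators` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: each document is the set of countries in its affiliations.
documents = [
    {"USA", "China"},
    {"USA"},
    {"China", "Germany", "USA"},
    {"Germany"},
    {"China"},
    {"China"},
]

def collaboration_indicators(docs):
    """Per country: documents, TLS, links per document, % international documents."""
    n_docs = Counter()
    n_international = Counter()
    link_strength = Counter()  # strength of each country-pair link
    for countries in docs:
        for country in countries:
            n_docs[country] += 1
            if len(countries) > 1:
                n_international[country] += 1
        # Assumed full counting: each co-occurring pair gains one unit of strength.
        for pair in combinations(sorted(countries), 2):
            link_strength[pair] += 1

    # TLS of a country = sum of the strengths of all its links.
    tls = Counter()
    for (a, b), strength in link_strength.items():
        tls[a] += strength
        tls[b] += strength

    return {
        country: {
            "documents": n_docs[country],
            "tls": tls[country],
            "links_per_document": round(tls[country] / n_docs[country], 2),
            "pct_international": round(
                100 * n_international[country] / n_docs[country], 2
            ),
        }
        for country in n_docs
    }

indicators = collaboration_indicators(documents)
for country, values in sorted(indicators.items()):
    print(country, values)
```

In this toy sample, the country with the most single-country documents ends up with the lowest number of links per document, which is the pattern the text associates with a greater weight of domestic production.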
For the 2001-2020 period (Fig. 5), the TLS in proportion to the number of documents is especially low in countries such as Japan, South Korea, Taiwan and Brazil, with 0.46, 0.35, 0.28 and 0.38 links per document and 35.87%, 25.81%, 19.89% and 31.79% of documents resulting from international collaboration, respectively. At the other extreme, we find especially high values in Egypt, with 1.77 links per document and 94.62% of documents resulting from international collaboration; in Switzerland, home to the WHO's headquarters, with 1.76 links per document and 80.26% of international collaboration; and in Saudi Arabia, with 1.31 links per document and 71.39% of documents resulting from international collaboration, whose organizations have been characterized in recent years by hiring foreign researchers who keep their original affiliations (Bhattacharjee, 2011; Messerly, 2014).
There appears to be a positive correlation between the number of citations accumulated by the documents of each country and greater international collaboration in scientific production, as measured by the TLS. The highest proportions of citations in relation to the number of documents are found in countries such as the Netherlands, Canada and Switzerland, which in turn stand out for a large number of accumulated links. In any case, this indicator should only serve as a starting point for a detailed study of the specific publications that accumulate the most citations. It would be fallacious to assume that citations are evenly distributed across the documents published.
The USA is the country with the most documents in the sample (3,431), the most accumulated citations (121,219) and the highest total link strength (1,957). China is in second place, with 2,531 documents that accumulate 69,852 citations and generate a total of 1,133 links. Both countries present figures above the rest of the countries in the sample, although their size and composition make them hardly comparable to the rest. If we compare their figures with those of the 28 countries of the European Union taken together (fig. 6), the EU ranks second, after the USA and above China, with 2,863 documents that accumulate 100,109 citations and generate a total of 1,636 links.
A previous study of MERS-CoV research publications from 2012 to 2015 in the Scopus database (Zyoud, 2016) found that the Netherlands produced the greatest proportion of publications with international research collaboration (72.7%), followed by the UK (71%) and Germany (69.1%), out of the total number of publications for each country. In our study, which covers a wider timespan, these countries also present high values (62.03%, 64.67% and 63.91%, respectively), and they are among the countries contributing most to the sample, behind the USA and China (fig. 5). Despite this, the data for the EU countries as a whole are closer to those of the USA and China. This is explained by the lower international collaboration of the remaining EU countries, which are not highlighted, and by the lower proportions presented by Germany, the United Kingdom, the Netherlands or France on their own. Similarly, it can be assumed that the level of international collaboration is not evenly distributed across the different US states or Chinese regions. In comparison, China has a lower percentage of documents resulting from international collaboration (33%) and lower ratios of links (0.45) and citations (27.6) per document than the USA, with 41% of internationally co-authored documents, 0.57 links and 35.33 citations per document, and the EU, also with 41% of documents co-authored with non-EU countries, 0.57 links and 34.97 citations per document.
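The per-document ratios quoted for the USA, China and the EU follow directly from the raw counts of documents, citations and links reported for the 2001-2020 period. A minimal sketch that reproduces them (the dictionary layout and the helper name `per_document_ratios` are our own):

```python
# Raw 2001-2020 counts as reported in the text.
reported = {
    "USA":   {"documents": 3431, "citations": 121219, "tls": 1957},
    "China": {"documents": 2531, "citations": 69852,  "tls": 1133},
    "EU-28": {"documents": 2863, "citations": 100109, "tls": 1636},
}

def per_document_ratios(row):
    """Links and citations per document, rounded to two decimals."""
    return {
        "links_per_document": round(row["tls"] / row["documents"], 2),
        "citations_per_document": round(row["citations"] / row["documents"], 2),
    }

ratios = {name: per_document_ratios(row) for name, row in reported.items()}
for name, values in ratios.items():
    print(name, values)
```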
Comparing these figures for the last two decades with the results of previous studies on the first SARS crisis, the same countries are among the most productive, with positions that vary depending on the database. For instance, the USA, Canada, the United Kingdom and Germany accumulated 63% of the publications about SARS in the Science Citation Index (SCI) database from the beginning of the outbreak, in March, until July 8, 2003. The USA had about 30% of the total share, followed by Hong Kong with 24% (Chiu et al, 2004). Another study with a short time-span (September 2002 to August 2004) and a sample of 1,646 documents from the MEDLINE database places China as the largest contributor, with 48% of the total share, followed by the USA with 15% (Yang, et al, 2005). A wider study of 2,874 papers retrieved from the SCI database for 2003-2008 shows China as the largest contributor, followed by the USA. China increases its number of publications over the years, getting closer to the first position and leading the public health response (Kostoff, 2011).
Our study covers a more extensive time-span (2001 - March 2020). We show that the two countries with the highest number of publications around the two SARS crises, China and the USA, are also the two largest contributors to SARS-related publications overall. Kostoff (2011) also suggests that the countries hit hardest by SARS increase their number of publications on the topic. In line with this, our sample time-span specifically captures the first publications related to the outbreak of the second SARS crisis (COVID-19).
The data for the 2019-2020 period (fig. 7) present a somewhat different picture. China leads the ranking with 298 documents, closely followed by the USA with 270. The rest of the countries have considerably lower values: Saudi Arabia is third with 72 documents and South Korea fourth with 68, both breaking into the top positions of the ranking above the European countries, namely the United Kingdom with 52 documents, Germany with 50 and France with 47.
If we relate the number of documents to the total number of links generated with other countries, we see proportionally higher values than in the 2001-2020 sample: the average goes from 0.96 links per document in the total period to 1.16 links in 2019-2020, for the 20 countries with the largest contribution of documents in each period. There is thus a higher level of international collaboration in the 2019-2020 period, with 64.84% of internationally co-authored documents, than in the 2001-2020 period, with 53.46%. The most notable exception is China, which falls from 32.75% to 27.52% of the documents (from 0.44 to 0.38 links per document). Japan increases its share of internationally co-authored documents from 35.87% to 40% in 2019-2020, although its ratio decreases from 0.41 to 0.24 links per document, meaning that fewer countries are involved. Italy, on the other hand, increases from 0.66 to 0.76 links per document but stays at around 40% of international documents in both periods, so more countries are linked in its most recent publications. Other countries with a low proportion of links over the whole period show a slight rise in 2019-2020, such as South Korea, which goes from 0.34 to 0.54 links per document and from 25.81% to 29.41% of documents resulting from international collaboration, or Brazil, which goes from 0.38 to 0.43 links per document and from 31.79% to 34.78% of international documents.
As for 2001-2020, in the period 2019-2020 a positive correlation can be seen between a higher level of international collaboration (TLS) and the number of accumulated citations, although citation counts are lower because the works were published recently. China stands out as an exception: despite a proportionally low value of 0.45 links per document and 27.52% of internationally co-authored documents, it accumulates a greater number (733) and proportion (2.46 per document) of citations than the other countries.
The 2019-2020 sample shows even sharper differences in ratios when we compare China, the USA and the countries of the European Union as a whole (fig. 8). The number of citations in proportion to the documents provided is higher in China and in the USA (1.91 citations and 0.88 links per document with 56.82% of international collaboration) than in the European Union (1.23 citations and 0.91 links per document with 55.96% of international collaboration). Contrasting these three cases, the proportion of citations accumulated per document is not related to the number of links generated or to the percentage of international collaboration for documents dated in the 2019-2020 period.
Our explanatory hypothesis for this significant difference in citations is, on the one hand, the initial impact of the COVID-19 pandemic in China, together with the fact that the data in our study reflect the first studies carried out. On the other hand, the sample very likely includes works that have served as a reference for further research, such as publications of Chinese origin describing the person-to-person transmission of COVID-19 (Chan et al. 2020).
We understand these data as a snapshot of the first impact of the COVID-19 pandemic on international scientific research, to be compared in the future with the complete records for 2020, which are expected to show a greater number of accumulated citations and a larger number and wider distribution of links between countries.
We found it useful to add each organization's country to see how the network changes in comparison. This lets us map which countries have the most collaborative organizations in coronavirus research and when they started. The number of collaborations also helps us identify whether individual organizations decided to work on this topic or whether it was a country-level decision, and whether different organizations from the same country behave alike. We can likewise observe whether, in international collaborations, organizations assume a central role because of political decisions in their countries or through the initiative of their researchers. Some of the organizations in these networks are supranational or depend on science-policy decisions from different countries.
In this figure, the network shows the main organizations that collaborated in the scientific production from 2001 to 2020. We can observe the total production (the size of the node) and the number of collaborations with other organizations (the size of the link). The network is divided into several clusters of different colors. The cluster in the middle of the map consists of organizations with a central role in collaborations. Many of them are based in the United States and, as we observe, collaborate with different parts of the world, mainly with the European Union and China. The red cluster contains Chinese organizations. They have a large number of publications, as the nodes show, but they do not have a central role in scientific collaboration. This means that many researchers from these organizations are leaders in the study of COVID-19, perhaps because of a top-down scientific decision from their government or because the topic is central in that society (bottom-up). In the case of China, we hypothesize that it is both.
On the periphery of this network, several organizations form small clusters of scientific collaboration. At the bottom of the map, for example, the pink cluster is composed of Korean universities with a medium number of publications and very active collaboration among themselves, but with little cooperation with organizations from other countries. The same holds for the yellow cluster, composed of Brazilian and Italian organizations, with the difference that they work in a more isolated dynamic.
In this graph, we can observe in more detail what the previous network introduced. The University of Hong Kong and the Chinese Academy of Sciences have the largest number of publications, with a high total link strength. Many of these publications result from collaboration between the two, which is why the two nodes appear very close in the network. These two organizations also collaborate with other centers, but they do not play as central a role as the organization in second position, the Centers for Disease Control and Prevention of the United States. This research organization has a high number of documents and a high total link strength, much of it with different international organizations. For this reason, it sits in the middle of the map. The situation is similar for the Ministry of Health of Saudi Arabia, which is close to the middle of the map and holds the third position in total link strength. Among non-Chinese organizations, it stands out for its number of scientific collaborations.
The University of Hong Kong leads in the number of documents (456), in total link strength (572) and, as we observe in this last graph, in the number of citations from 2001 to 2020. Its leading position, with 26,101 citations, is evident. The figure is more than double that of the Centers for Disease Control and Prevention (10,583 citations) and of the Chinese Academy of Sciences (10,500 citations), which hold the second and third positions, respectively, with a similar number of citations. The first European organization is the University of Utrecht, with a high number of documents (189) and citations (8,102). It is close to the middle of the previous network, which means that it also has a high number of scientific collaborations with other international organizations. This graph shows that the impact of these organizations is closely related to their number of publications and collaborations. It is not surprising that the University of Hong Kong, the leading organization of the last twenty years, is the world benchmark for coronavirus research.
In a previous study by Kostoff and Morse (2011), which mapped 2,874 SARS publications between 1998 and 2008, the organizations with the most records were, in this order, the University of Hong Kong, the Chinese University of Hong Kong, the Chinese Academy of Sciences, the Centers for Disease Control and Prevention USA, the National Taiwan University and the University of Toronto. Among the leading organizations in that study, four are from China, two from the USA, two from Singapore, one from Canada and one from Taiwan. These results are quite similar to ours for the last 20 years.
In this study it was important to compare the results of the last two decades with those of the last fifteen months (2019 and the first twelve weeks of 2020), when COVID-19 became a global pandemic with more than a million infections and several thousand deaths by April 2020. The interest of researchers and organizations has shifted toward studying the coronavirus and finding a solution.
In this network, we can observe how the clusters of Chinese organizations (blue, pink and red) have moved toward the middle of the map. This means that the number of collaborations has increased and that research on COVID-19 is now distributed across different Chinese centers rather than concentrated in a few, as in the last two decades. Moreover, the role of these Chinese organizations in relation to the rest of the world has grown, with several Chinese clusters connected to various other clusters. The map has also changed with respect to the position of the US cluster. US organizations have now moved to a more peripheral zone of the map and have lost their leading place in the middle. We can also observe that more US organizations have appeared in these last two years with stronger collaboration with Chinese organizations, which was not the case before.
The most relevant finding of this comparison is that China has increased its number of publications across different scientific organizations, and not only in a few centers. In recent months, eleven Chinese organizations lead the area with several publications each. This affects the dominant role of the University of Hong Kong and the Chinese Academy of Sciences, which still occupy the first two positions, but whose number of publications is no longer so high compared with the other organizations. As for the number of collaborations with foreign organizations, the trend has moved toward more collaboration than in the last two decades. Outside China, a South Korean organization, Seoul National University, appears in fourth position, and the Ministry of Health of Saudi Arabia plays a relevant role in collaborations. The first European university is the University of Utrecht, confirming the leading role it has in the global trend (2001-2020), also highlighted by its cooperation with foreign organizations. As we observed in the previous network, US organizations have completely disappeared from the top positions. The first one to appear, the National Institute of Allergy and Infectious Diseases, is in twelfth position. These data confirm the little importance that the US government has given to COVID-19 in recent months, compared to the non-Western world, fundamentally Asia, which gives the topic a relevant place.
Mapping the movement to open access by coronavirus publications and references
Coronavirus is a major issue for contemporary society and affects many countries. The open availability of the related scientific production is, beyond being desirable, an indicator that open science is a reality that can be achieved when needed. Open Access (OA) is an essential instrument based on new strategies for sharing knowledge and working cooperatively with digital technologies, showing the world that researchers are achieving results that would not be possible without the combination of collaborative networks and technological tools (Belli et al., 2019). The spread of the OA movement in several countries, exemplified by the growth of regional and national initiatives such as the creation of OA digital journal libraries and the establishment of supportive governmental policies (Minniti et al., 2018), provides evidence of the significant role OA is playing in reducing the scientific gap between countries and improving their participation in the so-called “global knowledge commons” (Chan et al. 2011).
In this section, our primary objective is to contribute to the mapping of OA coronavirus publications and their cited references, observing how OA evolves over the period. To this end, it is important to analyse whether the growth of OA in coronavirus research is associated with funding and whether it is associated with international collaboration. Another aspect of interest is the obsolescence of the literature that supports the research under consideration: we hypothesise that it increases due to higher productivity in times of epidemic or pandemic. Finally, we explore how the most prolific countries move forward and widen access to coronavirus scientific knowledge. The cited references, in turn, indicate whether the greater availability of openly accessible literature is actually being used to advance that research.
Table 1 shows the number and percentage of articles in OA. The percentage grows almost every year, especially in 2020, when it grows 22.4% in relation to the previous year. Comparing the percentages of the different OA types, Bronze predominates in the first three years, after which Green takes the lead until 2011, and again in 2013. In 2012 and from 2014 to 2019 Gold-DOAJ is the most common type, and in the present year it shares the top positions with Bronze, which actually takes the lead. This sharp increase is due to the announcement by many commercial publishers that they would give free access to COVID-19 and coronavirus-related articles published in their subscription journals. It is also worth highlighting the current prominence of DOAJ journals in widening access to coronavirus scientific knowledge. At the end of this section we compare the relative importance of the other types against Gold-DOAJ for the different countries.
Tab. 1 Annual distribution of articles, according to whether or not they are in OA journals, and percentage in OA (distinguishing the OA types), 2001-2020.
Figure 13 presents the temporal evolution of OA in the coronavirus literature, as well as its obsolescence. As we can observe, the total number of articles varies, decreasing between one epidemic and the next. The joint analysis of the presence of OA among cited references and articles, presented in Fig. 13-A, shows that from 2010 to 2019 the percentage of OA increases at a practically constant rate, from ~55% to ~75% among articles and from ~10% to ~25% among cited references. In the previous interval, its variation was somewhat erratic. In the current year it improves significantly among publications but decreases among cited references. On the right side (Fig. 13-B), the percentage of OA articles is broken down by whether the research is funded or not and internationally co-authored or not. Funded research is more associated with OA publication than research that does not acknowledge funding, especially in the three years with the largest gap between the two (~37%): 2004, 2007 and 2011. As observed about Table 2, through these years the Green type is growing while Bronze is dropping. After that, DOAJ-Gold takes over until 2019, and the gap in the percentage of OA articles between funded and unfunded research narrows progressively. To a lesser degree, the presence of foreign co-authors is also significant for OA publication (the average gap between 2010 and 2014 is around 15%). Finally, it is important to highlight that the lines converge in 2020, when the differences among categories vanish and the percentage reaches its maximum value, surpassing 90%.
Figures 13-C and D focus on the cited references. The former shows the average percentage of OA in relation to the presence (or absence) of funding, with no significant difference between the trends. By contrast, the presence of foreign co-authors shows some positive difference, meaning that more cited references relate to OA journals, mainly in the 2012-2019 period. However, we have observed that references to OA journals are scarcer, mainly until the middle of the period, when they leave the level of 10% and reach about 25% in 2019. If the greater, recent and growing availability of articles influences this tendency among cited references, the age of the references should show that they are getting younger, that is, that the Price Index is increasing. As pointed out by Larivière, Archambault and Gingras (2007), the scarcity of scientific production implied a decrease in the Price Index (as they observed during the two world wars). Conversely, Figure 13-D shows three peaks that coincide with periods when scientific production about the epidemics was high: one in 2004, following the SARS crisis; a second between 2014 and 2017, after MERS-CoV; and a third now, during the SARS-CoV-2 pandemic. In this specific case we call attention to the increase in cited references to OA journals in unfunded articles.
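For readers unfamiliar with the Price Index, the sketch below illustrates the usual definition: the percentage of an article's cited references that are at most five years old at the time of publication. The five-year window and the inclusive boundary are assumptions based on that common definition; the exact window used for Fig. 13-D is not stated in the text. The reference list is hypothetical.

```python
def price_index(citing_year, reference_years, window=5):
    """Percentage of cited references published at most `window` years
    before the citing article (window assumed to be 5, inclusive)."""
    # Ignore references dated after the citing article (likely metadata errors).
    valid = [year for year in reference_years if year <= citing_year]
    if not valid:
        return None  # undefined for an article without usable references
    recent = sum(1 for year in valid if citing_year - year <= window)
    return round(100 * recent / len(valid), 2)

# Hypothetical reference list of an article published in 2020.
example_references = [2020, 2019, 2018, 2016, 2015, 2012, 2004, 2003]
example_index = price_index(2020, example_references)
print(example_index)
```

A rising value of this index over time means that articles cite a larger share of recent literature, which is what the peaks around each epidemic reflect.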
For the scientific production of the 20 most prolific countries in each period, we compared the average percentage of OA in cited references with the percentage of OA in articles. To analyse the changes in both OA variables by country, we ran a cluster analysis that groups countries with similar behavior profiles, considering the relative growth between the periods in each variable (Fig. 14). The changes between periods for all countries in each cluster are presented below (Tab. 2), where the other variables can also be analyzed: international collaboration, funding and Price Index.
The seven OA clusters are identified in the scatter plot with markers of different geometric shapes, while the periods are differentiated by different shades of the same color (Fig. 14). In general, in the last two years the countries grow in both variables, with the exception of Belgium (alone in cluster 2), whose percentage of OA articles decreases in the recent period. This may be due to its small number of articles in the recent period, which denotes outlier behavior, as is also the case for Singapore, Taiwan and Vietnam (for this reason they do not present data bars in Tab. 2). Taiwan forms the other single-country cluster (7), owing to its remarkable increase in the percentage of cited references in OA journals, which has to be interpreted carefully.
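The input to such a cluster analysis can be sketched as one pair of relative-growth values per country. The example below assumes that growth is measured as the percentage change of the 2019-2020 value with respect to the 2001-2020 value; the text does not state the exact formula or the clustering algorithm, so the clustering step itself is not reproduced. Country names and percentages are hypothetical.

```python
def relative_growth(full_period, recent_period):
    """Percentage change of the recent value relative to the full-period value
    (assumed definition of 'relative growth between the periods')."""
    return round(100 * (recent_period - full_period) / full_period, 2)

# Hypothetical OA percentages per country: (2001-2020, 2019-2020).
oa_profiles = {
    "Country A": {"articles": (60.0, 78.0), "cited_references": (15.0, 24.0)},
    "Country B": {"articles": (70.0, 63.0), "cited_references": (20.0, 22.0)},
}

# One feature vector per country, ready to be passed to a clustering routine.
growth_features = {
    country: {
        variable: relative_growth(*values) for variable, values in profile.items()
    }
    for country, profile in oa_profiles.items()
}
print(growth_features)
```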
The fourth cluster is formed by Canada and Italy (shades of purple), increasing at least 60% their percentage in each of the variables. Also, in Table 2 we observe that Canada presents an increase in the percentage of international collaboration (we have observed it presents increase in its TLS in Figs. 6 and 7) and funding, while Italy increases in the Price Index. It is interesting to compare the previous cluster with the third one (shades of green), formed by Brazil and Singapore, that present similar patterns of increase in cited references, less in articles. However, in percentage of international collaboration and funding Singapore improves better and Brazil with the second highest increase in the Price Index (Tab. 2). The situation is not so different to cluster 6 (shades of red), differing due to its lower increase in percentage of publications in OA in the recent period – being formed by India, Spain Switzerland, USA and Vietnam. In Table 2 we observe that Spain and Switzerland increases their percentage of funded research and India and USA in international collaboration.
Cluster 5 (shades of brown) is composed of Egypt and Saudi Arabia, which present the highest percentage of cited references in OA journals, showing no increase in the recent period. Egypt decreases its percentage of international collaboration, while both countries modestly increase the percentage of funded research. Their percentage of cited references in the Price Index, however, deserves attention, due to a decrease of more than 12%, meaning less attention to recent articles in the recent period. They are probably more focused on the literature related to the previous epidemics. Finally, cluster 1 (shades of blue) is the largest one, with Australia, China, the majority of the European countries, Japan and South Korea. They present an increase in both OA variables, tending to improve more in cited references. Regarding the other variables (Tab. 2), we highlight a general increase in international collaboration (except for China, which recedes, as observed in Figs. 6 and 8) and funding (with the exception of Sweden), while the Price Index presents few increases (Australia and South Korea; Sweden has to be considered carefully due to its few publications in the recent period).
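To make the Price Index comparison concrete, the following sketch computes the percentage of recent cited references per article and averages it over a set of articles. It assumes the conventional five-year window for "recent" references, which is not stated in this section, and it applies the threshold of at least 10 references mentioned in the note to Tab. 2. The function names and the sample data are hypothetical.

```python
def price_index(article_year, reference_years, window=5):
    """Percentage of cited references at most `window` years older than the article.

    The five-year default is an assumption (the usual convention for
    Price's Index); the section does not state the window used.
    """
    recent = sum(1 for year in reference_years if article_year - year <= window)
    return 100.0 * recent / len(reference_years)


def mean_price_index(articles, min_refs=10, window=5):
    """Average Price Index over articles with at least `min_refs` references.

    `articles` is a list of (publication_year, [reference_years]) pairs.
    Returns None when no article reaches the threshold.
    """
    eligible = [(year, refs) for year, refs in articles if len(refs) >= min_refs]
    if not eligible:
        return None
    return sum(price_index(year, refs, window) for year, refs in eligible) / len(eligible)


# Hypothetical articles; not the paper's data.
sample = [
    (2020, [2019, 2018, 2017, 2016, 2015, 2014, 2010, 2005, 2003, 2003]),
    (2020, [2019, 2004, 2003]),           # discarded: fewer than 10 references
    (2004, [2003] * 8 + [1990, 1995]),
]

average = mean_price_index(sample)
```

A drop in this average between periods, as reported for Egypt and Saudi Arabia, would correspond to a larger share of older references in the recent articles.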
As we observe, OA plays an important role in all countries and has received more attention in recent years. However, this growth is accompanied by a tendency to cite more recent literature only in the cases of Australia, Brazil, India, Italy and South Korea. One variable whose change between periods correlates with these, albeit negatively, is funding, which increases less in all cases (except India, where it decreases).
Tab. 2 Clusters of countries and change between periods in the percentage of articles and cited references, according to different variables
Note: in order to guarantee consistent percentages in the articles' cited references, we discarded articles with fewer than 10 references.
Finally, we focus on the relative importance of the other OA types against Gold-DOAJ (Fig. 15). Considering that part of the articles in the other types can be openly accessible after payment by the author (Robinson-Garcia, Costas and van Leeuwen 2020), it is important to compare the share of this part of the production with the share in DOAJ journals. The same authors find that the Green type is strongly represented in Europe and North America, while in South America Gold has a comparable level of importance. We can observe in Fig. 15 that the majority of the European countries, USA and Canada are distributed along the x-axis (related to the complete period), with a ratio of at least 1.5 (Spain and USA almost 3.0). Japan is the positive outlier in this direction, showing that the DOAJ-Gold type is not its usual option (its authors prefer to publish in hybrid journals). On the opposite side we find Brazil, Egypt and Saudi Arabia, for which DOAJ-Gold is the favourite choice. China is in the middle, not far from Taiwan and Australia.
Considering the y-axis, we highlight those countries that increase the ratio in the recent period, which are Brazil and Italy. In the opposite situation, we find Egypt, Canada, Japan and Spain, whose ratio decreases significantly.
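The ratio discussed here can be sketched as follows, assuming it is the number of articles in the other OA types divided by the number of Gold-DOAJ articles, computed once for the complete period and once for the recent period. This reading of the ratio, the function name and the counts are assumptions for illustration only.

```python
def other_to_gold_ratio(other_oa, gold_doaj):
    """Ratio of articles in other OA types to Gold-DOAJ articles.

    Returns None when there are no Gold-DOAJ articles, since the ratio
    is undefined in that case.
    """
    if gold_doaj == 0:
        return None
    return other_oa / gold_doaj


# Hypothetical counts as (other OA types, Gold-DOAJ); not the paper's data.
counts = {
    "X": {"complete": (300, 100), "recent": (40, 20)},
    "Y": {"complete": (50, 100), "recent": (30, 20)},
}

ratios = {
    country: {period: other_to_gold_ratio(*pair) for period, pair in periods.items()}
    for country, periods in counts.items()
}

# Countries whose ratio is higher in the recent period than in the complete one.
increasing = [c for c, r in ratios.items() if r["recent"] > r["complete"]]
```

A ratio above 1 indicates that the other OA types outweigh Gold-DOAJ for that country, and a ratio below 1 indicates the opposite.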
The explanation for this came through consultation with a Clarivate specialist, who clarified that, beyond the record's “Title” and “Abstract” fields, the “Topic” field also searches the record's “specialist indexed” field, which exists in each of the specialist databases, regardless of which databases the organization subscribes to. One record retrieved in our All Databases dataset appeared, for example, in CABI (CAB Abstracts® and Global Health®), with “coronavirus” in the “Broad Descriptors” field, which explains why it was not retrieved through our query in Core Collection. Finally, because the organization where the download was executed does not subscribe to CABI, it was not possible to identify the record's Accession Number (UT) or its respective Broad Descriptors, and the downloaded record shows none of the terms from our search expression.