Preprints are intended to accelerate access to preliminary data for the scientific community, mainly so that authors can receive rapid feedback before entering peer review, which is required for publication in most indexed journals (10). We found that preprints that were eventually published had significantly more abstract and PDF downloads than preprints that were not published, which may reflect that higher-quality preprints were used more often by researchers.
Because preprints are preliminary, medRxiv and bioRxiv, two widely known preprint servers (11), display a disclaimer on their homepages that states: “these are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information”.
We found that although the time to acceptance in peer-reviewed journals during the COVID-19 crisis was almost four times shorter than the time to acceptance in 2019 for the same journals, only 14.4% of the preprints in our sample were published. Furthermore, in the same period (January to March 2020), we identified 1340 articles in PubMed related to therapeutic interventions against COVID-19, which is consistent with the idea that most of the preprints did not meet the requirements for publication. Interestingly, a recent preprint estimated that the average time to publication in journals during the COVID-19 pandemic is approximately two months (4,12).
We found that half of the preprints that were subsequently published had significant modifications in the results section. This suggests that preprints can change substantially after peer review, and it raises concerns about the possibility of significant errors in the data analysis of preprints that are not peer-reviewed and published, as previously reported (11,13).
In comparison to our findings, in previous infectious outbreaks such as Zika and Ebola, the publication rate for preprints in peer-reviewed journals was around 60% and 48%, respectively (14). However, only 174 and 75 preprints were posted during the Zika (Nov 2015 to Aug 2017) and Ebola (May 2014 to Jan 2016) outbreaks, respectively (14). As of May 30, 2020, 3544 preprints about COVID-19 had been posted on medRxiv and 842 on bioRxiv, evidencing the drastic increase in preprint production during the COVID-19 pandemic.
Of concern, we found that during the COVID-19 pandemic multiple preprints have been used to develop clinical guidelines (15), public health policies, and scientific articles (7,13,16).
Nonetheless, some preprints might contain useful information. For example, one study showed that the infectivity index, R0, calculated using data available in preprints did not differ from the one estimated in peer-reviewed articles (16), and preprints on the viral sequence and structure have allowed early investigation of potential therapeutic options or vaccines (4,17). However, preprints should be used responsibly, as they contain preliminary information that needs to be confirmed through a peer-review process.
The lack of a peer-review process for preprints may have important implications, because the basic screening process employed by preprint servers may not be enough to prevent the dissemination of flawed information (18). For example, a preprint posted on bioRxiv suggested significant molecular similarities between SARS-CoV-2 and HIV (19). Even though this preprint was later withdrawn, by that time it had already sparked controversy and conspiracy theories.
Although peer review is intended to be an exhaustive and thorough process that improves the quality of a manuscript, articles published in a peer-reviewed journal should not be taken as irrefutable knowledge. To illustrate this, a couple of peer-reviewed articles have recently been withdrawn from two prestigious journals due to significant concerns about the validity of their primary data (20,21).
We acknowledge that our study has limitations.
The main limitations of our study are that we only included preprints on pharmacological interventions against COVID-19, and that we only used medRxiv and bioRxiv as preprint servers to obtain our sample. However, given the large sample of preprints we included and the low publication rate we identified, it seems unlikely that our findings would differ significantly for other topics or preprint servers.