The origin of SARS-CoV-2 remains unclear. Put simply, its emergence was due to the
acquisition of a polybasic furin cleavage site in the S protein of one of its closest relatives. The
discovery in Laos of Rhinolophus bat coronaviruses whose receptor-binding domains are almost
identical to that of SARS-CoV-2, and which can therefore infect human cells despite lacking the
polybasic furin cleavage site (1), is a cornerstone for identifying the SARS-CoV-2 progenitor.
However, it is not yet known where the furin site inserted at the S1/S2 junction of the S protein of
the pandemic virus came from, nor how and when it was acquired. An arginine dimer encoded by
CGG-CGG is rare in coronaviruses (2), yet a dimer with exactly this coding is present in the
furin site acquired by SARS-CoV-2: the four-amino-acid PRRA motif. I therefore asked
whether the SARS-CoV-2 S gene insert encoding the furin site matches human transcripts.
Here, I address this question using the NCBI and GISAID databases, the NCBI Human Genome
Resources, sequence analysis tools and in-house bioinformatic tools. I found that the
possible 12-nucleotide fragments that, when properly inserted in the S gene, encode the SARS-CoV-2
furin site match several NCBI RefSeq human transcripts with 100% identity. Taken together with the
expression patterns of these genes and further evidence found here, such as the codon
optimization of the arginine dimer in the SARS-CoV-2 furin site and other PRRA-like insertions in
the S protein, these results strongly suggest that recombination between the SARS-CoV-2 precursor
genome and human host cell mRNA produced the recombinant pandemic virus, and with it the
SARS-CoV-2 furin site, during undetected human-to-human virus transmission.
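
The kind of 100% match described above can be illustrated with a minimal sketch of an exact 12-nucleotide search on both strands. This is not the in-house tooling used in this work. The function names, the toy transcripts and the choice of CCTCGGCGGGCA (read in frame as P-R-R-A) as the query fragment are assumptions made for illustration only, and real searches would be run against RefSeq records rather than made-up sequences.

```python
# Illustrative sketch only: exact-match search for a short query in transcripts.
# All names and sequences below are hypothetical, not taken from the study.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_exact_matches(query: str, transcripts: dict) -> list:
    """List (transcript_id, strand, 0-based start) for every exact hit.

    Strand "+" means the query itself occurs in the transcript;
    strand "-" means its reverse complement occurs there.
    """
    query = query.upper()
    hits = []
    for strand, probe in (("+", query), ("-", reverse_complement(query))):
        for tid, seq in transcripts.items():
            seq = seq.upper().replace("U", "T")  # accept RNA-style input
            start = seq.find(probe)
            while start != -1:
                hits.append((tid, strand, start))
                start = seq.find(probe, start + 1)  # allow overlapping hits
    return hits


# Assumed candidate fragment: one 12-mer that reads P-R-R-A in frame.
QUERY = "CCTCGGCGGGCA"

# Toy transcripts, invented purely to exercise the function.
TOY_TRANSCRIPTS = {
    "toy_plus": "AAAT" + QUERY + "GGTT",
    "toy_minus": "GG" + reverse_complement(QUERY) + "AC",
    "toy_none": "ACGTACGTACGTACGTACGT",
}

if __name__ == "__main__":
    for hit in find_exact_matches(QUERY, TOY_TRANSCRIPTS):
        print(hit)
```

An exact 12-mer hit is only a starting point: short queries can match large transcript sets by chance, which is why the argument above also weighs expression patterns and codon usage.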