Ohno’s famous 1984 paper claimed to show how a frameshift mutation might have given rise to a totally novel protein such as a nylonase enzyme. Ever since, it has been widely accepted that this was the correct explanation for the origin of the NylB enzyme. [12, 13, 14, 15, 16, 17] It has also been widely assumed that this happened in an extremely short timeframe, soon after the invention of nylon in 1935. By extension, it might be erroneously assumed that the NylB frameshift claim helps explain other nylonases such as NylA and NylC. Most broadly, Ohno’s frameshift paper is considered by many to be the best proof of the rapid evolution of a de novo gene/protein.
Many readers have not realized that Ohno’s 1984 claims were not supported by any type of evidence – his model was entirely speculative. Ohno presented his assertions very forcefully, as if they were facts. It seems that many readers of that paper got the impression that Ohno actually had observational evidence for the existence of his specified precursor protein and his specified frameshift mutation.
Experiments by Kato et al. in 1991 and Prijambada et al. in 1995 failed to confirm that NylB evolved via a frameshift mutation, and in fact they argue against Ohno’s hypothesis. Kato’s experiment suggests that NylB nylonase activity can evolve through as few as two amino acid changes in an ancestral homolog, rather than through a frameshift mutation affecting 400+ amino acids (as in Ohno’s hypothesis).
In Prijambada et al.’s experiment, nylon-digesting ability was evolved in the lab via directed evolution. The presence of a NylB homolog in strains of Pseudomonas aeruginosa PAO1 suggests that this lab-based directed evolution of nylon digestion from PAO1 involved point mutations of a pre-existing NylB homolog in PAO1, not a frameshift mutation.
Careful reading shows that Ohno’s proposed precursor protein and his proposed frameshift mutation were only inferred. Therefore, at that time Ohno did not even have a testable hypothesis. In addition to the experimental evidence that could reasonably be deemed sufficient to refute Ohno’s hypothesis, now, in the age of bioinformatics, we can do what Ohno could not do – we can further test his model using bioinformatic tools.
If the NylB frameshift hypothesis were correct, then a protein database search should reveal evidence for the existence of Ohno’s hypothetical precursor protein, which should have a history and should have protein homologs. At the same time, there should be evidence that the NylB protein is a unique protein, with no history and no protein homologs.
Conversely, if the NylB frameshift hypothesis were wrong, then a protein database search should reveal evidence that the hypothetical precursor protein never existed, has no history, and has few if any homologs. At the same time there should be evidence that the NylB protein is not unique, and so has a history and numerous homologs.
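The two sets of predictions above can be summarized as a simple decision rule. The Python sketch below is only an illustration of that logic: the function name, the cutoff `many`, and the example hit counts are hypothetical and are not values taken from the searches reported here.

```python
def favoured_model(precursor_hits: int, nylb_hits: int, many: int = 10) -> str:
    """Say which model a pair of homolog counts is more consistent with.

    precursor_hits: credible homologs found for the hypothetical precursor.
    nylb_hits:      credible homologs found for NylB.
    many:           hypothetical cutoff for "has a family of homologs".
    """
    precursor_has_family = precursor_hits >= many
    nylb_has_family = nylb_hits >= many

    # Frameshift model: an old precursor with a family, and a unique NylB.
    if precursor_has_family and not nylb_has_family:
        return "frameshift"
    # Alternative: no trace of the precursor, and NylB in a large family.
    if nylb_has_family and not precursor_has_family:
        return "pre-existing family"
    # Any other pattern does not discriminate between the two models.
    return "inconclusive"


# Hypothetical counts, for illustration only.
print(favoured_model(precursor_hits=0, nylb_hits=500))
print(favoured_model(precursor_hits=500, nylb_hits=0))
```

The rule is deliberately symmetric: each model predicts a family of homologs for one protein and none for the other, so a database search can in principle favour either one.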
Table 2 shows NylB is in the family of beta lactamases, and NylA is in the family of amidases. Therefore, NylB and NylA are both clearly members of very well-known protein families (independent of the BLAST and SPARCLE results).
It could be argued that PR.C is absent from the databases simply because it exists but has not yet been found. However, the most conclusive proof that the NylB frameshift hypothesis is false is that the NylB gene is not at all unique: it is found in many organisms, in many habitats, and has a vast number of homologs.
The searches for PR.C were conducted across a broad spectrum of strategies that would result in varying levels of false negatives and false positives. Even using the most generous search strategies, there was no credible evidence of the PR.C sequence.
Our BLASTP searches for PR.C included extremely relaxed constraints, such that false positive hits for PR.C would be expected. Furthermore, if PR.C existed long before 1935, while NylB only emerged after 1935, then PR.C should have many more homologs than NylB. The relative representation of BLASTP hits for PR.C versus NylB argues strongly against PR.C’s actual existence. Thus, the unstable numbers of BLASTP hits for PR.C (ranging from 0 to 9), all with weak e-values, can reasonably be regarded as spurious and as unconvincing evidence for PR.C’s existence.
Our search results indicate that homologs of NylB and various other 6-aminohexanoate hydrolases are very abundant. Some organisms with these homologous proteins have been experimentally shown to have the ability to digest nylon, [22, 24] but most have not been enzymatically tested. While sequence-based gene predictions cannot prove that all such NylB homologs can degrade nylon, such predictions point to a family of proteins that have very significant homology. All of the genes with the NylB designation (from our UniProt-derived list of entries that also had available CDD pages) had beta lactamase domains (Supplementary Table S1). Beta lactamases are considered to be among the most ancient proteins. [25] The divergence within the NylB class of enzymes was often very substantial. This precludes the possibility that all such enzymes arose from an isolated frameshift mutation that occurred sometime after 1935. A single frameshift mutation could not plausibly have proliferated via horizontal gene transfer, in just a few decades, across a very large number of unrelated organisms found all around the world.
Ohno’s hypothesis was based upon Kinoshita’s NylB protein sequence. We computed the sequence similarity of NylB to the architecture of the beta lactamase protein family. Based on the bit score assigned to this particular NylB gene by CDD (accession COG1680), we found these sequences were strikingly similar. According to CDD, the probability that this similarity to a COG1680 beta lactamase would arise by chance is 2^-130. Given the degree of non-random similarity of NylBs to beta lactamase domains (Supplementary Table S1), there is no doubt of the homology of NylB with beta lactamases. This is illustrated by the fact that several entries in GenBank list the same protein as both a NylB homolog and a beta lactamase (Supplementary Table S6).
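To put the chance probability quoted above on a more familiar scale, the Python sketch below converts 2 to the power of minus 130 into decimal notation. It assumes the quoted figure corresponds to a bit score of about 130. It also illustrates the standard BLAST relation between a bit score S′ and the expected number of chance hits, E = m · n · 2^(−S′). The function names and the query and database lengths are hypothetical, chosen only for illustration, and are not values from the searches reported here.

```python
def chance_probability(bit_score: float) -> float:
    """Probability scale implied by a bit score: 2 ** (-bit_score)."""
    return 2.0 ** (-bit_score)


def expected_chance_hits(bit_score: float, query_len: float, db_len: float) -> float:
    """Expected number of chance hits, E = m * n * 2 ** (-S'),
    where m is the query length and n is the total database length."""
    return query_len * db_len * chance_probability(bit_score)


# Assumption: the quoted figure corresponds to a bit score of about 130.
p = chance_probability(130)
print(f"{p:.2e}")  # 7.35e-40

# Hypothetical search space: a 400-residue query against 1e11 residues.
e_value = expected_chance_hits(130, query_len=400, db_len=1e11)
print(f"{e_value:.1e}")
```

In decimal terms, 2^-130 is roughly 7 × 10^-40, so even after multiplying by a very large hypothetical search space the expected number of chance matches remains negligible.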
The NylB frameshift hypothesis was premised upon numerous assumptions that we now know are incorrect, and so Ohno’s hypothesis is falsified on several levels:
- The widely held assumption that all nylonase enzymes evolved since 1935 was incorrect.
- Ohno’s assumption that the NylB protein was a new and unique protein was incorrect.
- Ohno assumed a hypothetical but specific precursor protein that now appears to have never existed, and thus the hypothetical frameshift mutation appears to have never happened.
- Ohno claimed that a random string of amino acids could reasonably be expected to give rise to a specific, functional, beneficial, and stable enzyme. Having all these things happen by chance is so unlikely that it is hard to imagine. This is especially clear given that, according to the CDD database, the probability that NylB would be so similar to beta lactamase by chance is vanishingly small (2^-130).
- Lastly, it was established 8 years after the frameshift hypothesis was published that NylB was not operating independently, but was clearly part of a catabolic chain, functioning in coordination with three other nylonases on the same plasmid (NylA, NylB′, and NylC). Indeed, NylB was shown to be co-regulated with NylC, sharing the very same promoter.[10] Ohno had no access to this data at the time he published his hypothesis.
Ironically, Ohno pointed out that the level of divergence between the paralogous pair NylB and NylB′ in KI72 suggests that this pair must have existed prior to 1935. [4] A similar level of divergence exists between the paralogous NylB and NylB′ proteins in Bacillus cereus, even though the NylB in Bacillus cereus is around 75% divergent from the NylB in KI72. These considerations by themselves cast serious doubt on any post-1935 gene duplication hypothesis. Furthermore, it appears from the case of Pseudomonas NK87 (which has a functioning NylB nylonase) that a paralog is unnecessary for the evolution of NylB nylonase activity. Importantly, since the frameshift hypothesis applies only to KI72, it cannot account for the NylB paralogs in Bacillus cereus that are over 75% divergent from their counterparts in KI72, nor for the NylB orthologs in the Pseudomonas strains.
Taken collectively, our findings very clearly refute Ohno’s frameshift hypothesis. However, our findings are consistent with Yomo et al.’s hypothesis that the NylB gene and its homologs have been around for a long time.
Overlapping reading frames are known to exist in biology, [26, 27] and Okamura [28] has speculated that several such human genes may have originated via frameshift mutations. However, the larger question of the de novo origination of genes and proteins in general, and of the role of frameshifts specifically in the creation of de novo proteins, is beyond the scope of this paper. Our present focus is specifically on whether a frameshift mutation after 1935 was the mechanism that created NylB.
We extended our search to look for homologs of other nylonases such as NylB′, NylA, and NylC (all of which were assumed to have evolved since 1935). While Kinoshita did not detect physiological amidase activity for NylA, [1, 9] our analysis clearly shows that NylA has amidase homology. Similarly, we found that NylC was homologous to a rare peptidase. We found that several proteins had dual classifications, such as beta lactamase and 6-aminohexanoate hydrolase (NylB), or amidase and 6-aminohexanoate cyclic hydrolase (NylA). In addition to experiments with proteases like trypsin, [6] experiments have shown that even triacylglycerol lipases can act as nylonases. [29] Thus it appears that the term “nylonase” could be applied to members of the protease, beta lactamase, amidase, peptidase, and lipase enzyme families. This is in broad agreement with some of the findings of Yasuhira et al. and Negoro that NylB and NylB′ are in the beta lactamase family, NylA is in the amidase family, and some nylonases share some passing similarities to lipases. [13, 23, 30] In every case the proteins were found in various organisms and in various natural habitats, along with a great many homologs. We conclude that all of these nylonases and their close homologs existed prior to 1935, although in some cases there may have been adaptive modifications after 1935. It appears that these various naturally occurring enzymes that happen to be able to degrade nylon have historically acted upon alternative nylon-like substrates.