Molecular Mimicry of Pathogenicity of Neisseria

Neisseria, a genus of the class Betaproteobacteria, is of considerable clinical importance. The genus contains both pathogenic and commensal strains. Gonorrhea and meningitis are the two major diseases caused by its pathogenic members. With the increased use of antimicrobial agents, these pathogens have evolved antimicrobial resistance (AMR), making the diseases nearly untreatable. The set of antibiotic resistance genes (resistome) and the set of genes associated with signal processing (secretome) are crucial for host-microbe interaction. With the availability of whole genome sequences and computational biology, it is now possible to study the genomic and proteomic riddles of Neisseria along with their comprehensive evolutionary and metabolic profiling. We studied relative synonymous codon usage, amino acid usage, reverse ecology, comparative genomics, evolutionary rates and pathogen-host (Neisseria-human) interaction through bioinformatics analysis. Our analysis revealed co-evolution of Neisseria genomes with the human host. Moreover, the co-occurrence of Neisseria and humans was supported by reverse ecology analysis. A differential pattern of evolutionary rate of resistomes and secretomes was evident between pathogenic and commensal strains. Comparative genomics supported the presence of virulence genes in both pathogenic and commensal strains of the genus. Our analysis also indicated a transition from commensal to pathogenic Neisseria strains over the course of evolution.


Introduction
Neisseria is a large microbial genus of the class Betaproteobacteria that mainly colonizes the mucosal surfaces of humans and other animals such as guinea pigs and dogs (Elias et al. 2015). Several strains of this genus have been isolated from the brain, oral cavity, respiratory tract and urogenital tract of humans (Elias et al. 2015). The genus consists of both commensal and pathogenic strains. For instance, N. lactamica (Snyder and Saunders, 2006) and N. polysaccharea (Albenne et al. 2004) are reported to be commensal strains, whereas N. meningitidis (Stephens et al. 2007) and N. gonorrhoeae (Kellogg et al. 1963) are potent pathogens. N. meningitidis has been reported to cause significant mortality and morbidity among both young adults and children worldwide through the spread of epidemic or sporadic meningitis and septicemia (Rouphael et al. 2012). Gonorrhea, a sexually transmitted disease (STD) that results in infertility if left untreated, is caused by asymptomatic infection with N. gonorrhoeae, which ascends the genital tract and disseminates to distal tissues (McSheffrey and Gray-Owen, 2015). The global rates of both diseases are increasing with the development of multidrug resistance in the causative pathogens. Continuous exposure to broad-spectrum and last-resort antimicrobial agents such as penicillin, colistin and carbapenems has led to antimicrobial resistance (AMR) among Neisseria. This has recently afforded these bacteria a 'superbug' status (McSheffrey and Gray-Owen, 2015). The evolution of Neisseria genomes into AMR strains is a journey of only ~80 years (Yang et al. 2020), and already they have become resistant to most broad-spectrum antibiotics. Recently it was reported that N. gonorrhoeae showed resistance to sulfonamides, penicillin (including TEM-type β-lactamase-mediated resistance), tetracycline, spectinomycin, cephalosporins and colistin (Yang et al. 2020). Ciprofloxacin-resistant N. meningitidis was found in a 5-month-old boy in the United States (Taormina et al. 2021). These clinical reports have raised concern about the possible threat of untreatable gonorrhea and meningitis in the near future.
The virulence genes causing the disease are transmitted from pathogens to hosts. Whenever a gene is transferred from one organism (donor) to another (host), its codon usage, which is adapted to the genomic context of the donor, might not work optimally in the host genome, hampering its expression (Grantham et al. 1980; Garcia-Vallve et al. 2003). The transferred gene must be expressed properly to show its effect on the host, and any mutation that increases the expression of that gene in the population will be favoured. This would increase the fixation probability of the transferred gene in the host population. Most transferred genes undergo an amelioration process in which their codon usage composition shifts drastically from the donor's genomic context towards the recipient's genomic context (Amorós-Moya et al. 2010). Such genes are often present in pathogenicity islands of medical significance and play pivotal roles in host-microbe interaction (Schmidt and Hensel, 2004). Since the genus Neisseria is composed of both obligate and facultative pathogens as well as commensal bacteria, understanding the mechanisms that shape the evolution of pathogenicity and AMR genes will shed new light on the dynamics of their host-microbe interaction (Nakamura et al. 2004). Moreover, the commensal strains of Neisseria are not free living but host dependent. A large share of the pathogenicity- and AMR-related proteins are detected in those non-pathogenic strains (Calder et al. 2020). This raises a question about their genomic evolution: whether there was a transition from pathogenicity to commensalism or vice versa.
Two species, N. flavescens and N. mucosa, were reported to be opportunistic pathogens (Huang et al. 2014; Mechergui et al. 2014). In healthy humans they remain non-pathogenic, whereas in immunocompromised patients they behave as pathogens. N. chenwenguii, N. cinerea, N. lactamica, N. musculi, N. polysaccharea, N. sicca, Neisseria sp. KEM232, N. subflava and Neisseria sp. 10022 are largely regarded as commensals (Calder et al. 2020; Kim and Seong, 2018). Although there are some reports of N. lactamica and N. sicca being occasional pathogens, further confirmation is needed (Hansman 1978; Gris et al. 1989). Complete genomes of all the aforementioned species are now available in public databases. Computational approaches targeting codon usage, mRNA expression pattern, evolutionary analysis and network modelling have become popular tools to study differential microbial genomics as well as host-microbe interaction. In this study we exploited these well-established methods to explore the genomic riddles of Neisseria and their interaction with the human host.

Sequence retrieval and Phylogeny construction
Complete genome sequences of twenty-one Neisseria type strains were downloaded from the Integrated Microbial Genomes (IMG) database (Markowitz et al. 2012). Together these strains represented 21 different species of the genus. Information about their habitat, pathogenicity, total number of protein-coding genes and KEGG Orthology (KO) was obtained from the same IMG database. Five housekeeping genes (dnaA, ftsZ, secA, atpB, gyrB) were used to generate a multilocus sequence analysis (MLSA) phylogenetic tree. For the MLSA phylogeny, the translated protein sequences of these housekeeping genes were concatenated in the order dnaA-ftsZ-secA-atpB-gyrB and aligned with ClustalW (Thompson et al. 2003). The aligned file was imported into MEGA X (Kumar et al. 2018) and a Neighbour-Joining (NJ) phylogeny with 1000 bootstrap replicates was generated.

Prediction of Neisseria Resistome
The Comprehensive Antibiotic Resistance Database (CARD) was used to predict antibiotic resistance genes in the studied strains (McArthur et al. 2013). We used the Resistance Gene Identifier (RGI) to predict the resistomes (sets of antibiotic resistance proteins) from the complete protein sequences of the selected Neisseria strains. We chose the 'perfect and strict hits only' option and discarded loose hits to minimise false positive results in our analysis. High quality coverage 'excluding nudge' was set as the parameter criterion for resistome prediction. To further validate the CARD result we used the 50/50 blastP algorithm. Previous reports on N. gonorrhoeae mentioned nine proteins associated with antibiotic resistance (Unemo and Shafer 2014). Those proteins were used as a database and blastP was run against the considered proteomes. Proteins with best hits (98-100% similarity) were considered resistance genes.
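The BLAST-based validation step can be sketched as a simple filter. This is an illustrative sketch only: it assumes the common reading of the 50/50 rule (at least 50% identity over at least 50% of the longer sequence) and treats the 98-100% threshold as percent identity. The `Hit` record, its field names and the example values are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One blastP hit; all field names are hypothetical."""
    query: str            # protein from a Neisseria proteome
    subject: str          # one of the nine reference resistance proteins
    identity_pct: float   # percent identity of the alignment
    aligned_len: int      # alignment length in residues
    query_len: int        # full length of the query protein
    subject_len: int      # full length of the reference protein


def passes_50_50(hit, min_identity=50.0, min_coverage=0.5):
    """Assumed 50/50 rule: identity and coverage of the longer sequence."""
    longest = max(hit.query_len, hit.subject_len)
    coverage = hit.aligned_len / longest
    return hit.identity_pct >= min_identity and coverage >= min_coverage


def resistance_candidates(hits, min_best_identity=98.0):
    """Keep hits that pass the 50/50 rule and reach 98-100% identity."""
    return [h for h in hits
            if passes_50_50(h) and h.identity_pct >= min_best_identity]


# Hypothetical example hits (identifiers and numbers are made up).
hits = [
    Hit("prot_A", "ref_1", 99.2, 390, 400, 398),  # high identity, full length
    Hit("prot_B", "ref_2", 62.0, 350, 400, 410),  # passes 50/50, below 98%
    Hit("prot_C", "ref_3", 99.0, 120, 400, 395),  # high identity, short match
]
print([h.query for h in resistance_candidates(hits)])
```

Only the first hypothetical hit survives both criteria; the third is rejected because a short, highly identical fragment does not cover half of the longer protein.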

Prediction scheme for identification of Neisseria secretomes
Secretomes can be defined as the set of proteins involved in cellular cross talk and communication with the surrounding environment as well as the host. A multi-step prediction scheme as described by Cornejo-Granados et al. (2017) and Roy et al.
SecretomeP 2.0 identified the non-classical secretory components. Lipoprotein signal peptides (Lipo type) were predicted with LipoP 1.0. Proteins predicted to be "cytoplasmic" by LipoP 1.0 were discarded from further analysis.
The twin-arginine (RR/KR) signal peptides (TAT), which are involved in the TAT signalling pathway, were identified with TatP 1.0. Positive sequences from all four servers were fed into TMHMM (v2.0) (http://www.cbs.dtu.dk/services/TMHMM/) to predict the number of transmembrane (TM) helices. Proteins with no TM helix were directly considered part of the secretome set. Proteins with one or more TM helices were further screened with the Phobius signal peptide predictor (https://phobius.sbc.su.se). Sequences with a positive Phobius result further enriched the secretome set of Neisseria.
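The decision flow described above can be summarised as a small function. This is a minimal sketch, assuming that the output of each prediction server has already been parsed into simple flags and counts. The function name and all argument names are hypothetical, and the sketch does not call any of the servers.

```python
def in_secretome(positive_in_any_server, lipop_cytoplasmic,
                 tm_helices, phobius_signal_peptide):
    """Return True if a protein would enter the secretome set.

    All arguments are hypothetical, pre-computed values:
    positive_in_any_server -- positive in at least one signal-prediction server
    lipop_cytoplasmic      -- LipoP 1.0 labelled the protein "cytoplasmic"
    tm_helices             -- number of TM helices reported by TMHMM
    phobius_signal_peptide -- Phobius predicts a signal peptide
    """
    if not positive_in_any_server or lipop_cytoplasmic:
        return False                  # no secretion signal, or discarded
    if tm_helices == 0:
        return True                   # no TM helix: accepted directly
    return phobius_signal_peptide     # one or more TM helices: Phobius decides


# Hypothetical proteins illustrating the three branches.
print(in_secretome(True, False, 0, False))   # accepted without Phobius
print(in_secretome(True, False, 2, True))    # rescued by Phobius
print(in_secretome(True, True, 0, True))     # discarded as cytoplasmic
```

Keeping the rule in one place makes it easy to see that Phobius is consulted only for proteins with at least one predicted TM helix.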

Identification of potentially virulent secretory proteins
The VirulentPred server (Garg and Gupta, 2008) was used to identify potentially virulent proteins among the predicted secretome set of the selected Neisseria strains. The cascaded SVM classifier employed by this server has been reported to be highly accurate in predicting virulent genes among prokaryotes. This approach helped us predict those Neisseria signal peptides that are directly associated with the infection cycle in the human host.

Reverse ecology analysis between Neisseria and Homo sapiens
In simple terms, reverse ecology is the study of ecology from the genomic data without any prior knowledge or assumption regarding the ecological interactions among considered organisms (

Codon usage analysis
The differential use of synonymous codons among different organisms is termed codon usage bias (CUB). We calculated the relative synonymous codon usage (RSCU) of the selected Neisseria strains to explore their CUB pattern. The codon usage signature of a pathogen can be shaped by multiple determinants, including mutational pressure and the translational selection constraint exerted by the host. Hence, pathogenic microbes are reported to co-evolve with and mimic the codon usage pattern of their respective hosts for efficient utilisation of the host tRNA pool along with other resources (Butt et al. 2016).
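For readers unfamiliar with RSCU, the value for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The following is a minimal sketch under that standard definition; it is not the software used in the study, and the function name and the toy sequence are illustrative assumptions.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order (codon assignments are the same in
# the bacterial translation table 11).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence.

    Stop codons and single-codon amino acids (Met, Trp) are skipped,
    as are amino acids that do not occur in the sequence.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if len(codons) < 2 or total == 0:
            continue
        for c in codons:
            # observed / (total / number of synonymous codons)
            values[c] = counts[c] * len(codons) / total
    return values


# Toy sequence: phenylalanine encoded three times by TTC and once by TTT.
toy = rscu("TTCTTCTTCTTT")
print(toy["TTC"], toy["TTT"])
```

An RSCU above 1 marks a codon used more often than expected under equal usage, which is the sense in which codons are called preferred in the comparison with the human host.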
Adaptation within the host is important for pathogen fitness and survival, since high competence inside the host body increases the magnitude of the infection caused by the microbe (Butt et al. 2016). To assess the co-evolution pattern between Neisseria (both pathogenic and non-pathogenic) and H. sapiens, we compared the CUB of both. A similar pattern of CUB would indicate co-evolution between the studied microbes and their host.
Along with CUB, other codon usage indices, namely GC, GC3, Fop (frequency of optimal codons), tRNA adaptation index (tAI), effective number of codons (ENc) and codon adaptation index (CAI), were calculated for the selected Neisseria strains. Spearman rank correlations among these indices were calculated with SPSS ver. 26.0. This gave an overall idea of the genomic composition and codon usage pattern of Neisseria.
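The rank correlation itself is simple to reproduce outside SPSS. The sketch below is a plain-Python illustration of Spearman's rho (Pearson correlation of average ranks, so ties are handled); it does not compute p-values, and the ENc and CAI numbers are hypothetical toy values, not data from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1           # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-gene values (made up for illustration).
enc = [55.0, 48.0, 40.0, 35.0, 30.0]
cai = [0.30, 0.42, 0.55, 0.61, 0.80]
print(round(spearman_rho(enc, cai), 3))
```

Because the toy ENc values fall as the toy CAI values rise, the coefficient is at the negative extreme; real gene sets give intermediate values.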
Biosynthesis of amino acids requires energy. Protein energy cost (PEC) can be defined as the total amount of energy (in terms of ATP or GTP) consumed for the biosynthesis of each amino acid (Akashi and Gojobori, 2002). We used the DAMBE software (Xia and Xie, 2001) to assess PEC for the selected strains. PEC for the secretomes of Neisseria was further correlated with their CAI to reveal whether secretomic energy consumption is directly or inversely related to expression level.

Protein-Protein Interaction study
Human proteins related to gonorrhoea and meningitis were obtained from the DisGeNET server (Piñero et al. 2016) with a cut-off value of 0.7. The interactions between those genes and pathogenicity-related Neisseria genes were obtained from the STRING server (Snel et al. 2000). The enrichment analysis was continued until the PPI network showed significantly more interactions than expected. The STRING network was exported to Cytoscape 3.8.2 (Smoot et al. 2011) for network generation.

Comparative genomics
CMG Biotools (Vesth et al. 2013) was used for whole-genome comparison among the selected microbes. A BLAST matrix was generated with the 50/50 blastP program. This analysis was important in two respects. First, it showed the inter-proteomic similarities among different species of Neisseria. Second, it revealed the level of duplication within a proteome (Vesth et al. 2013).
Pan-core genome analysis is important for comparative genomics since it exposes the unique and shared protein sets among the studied organisms. The pan genome can be described as the total gene set of all considered genomes, whereas the core genome is the set shared among all investigated strains (Vesth et al. 2013). Along with the pan-core genome analysis, a core genome-based phylogeny was generated and compared with the other phylogenies built in this study (as mentioned above).
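In set terms, the pan genome is the union of the protein families of all strains and the core genome is their intersection. The sketch below illustrates only this definition; it assumes that proteins have already been clustered into families (for example by a BLAST-based rule), and the strain names and family identifiers are hypothetical.

```python
def pan_and_core(genomes):
    """Return (pan, core) family sets for a {strain: families} mapping."""
    families = [set(members) for members in genomes.values()]
    if not families:
        return set(), set()
    pan = set().union(*families)          # every family seen in any strain
    core = set.intersection(*families)    # families present in all strains
    return pan, core


def unique_families(genomes):
    """Return {strain: families found in that strain only}."""
    unique = {}
    for strain, members in genomes.items():
        others = set()
        for other, other_members in genomes.items():
            if other != strain:
                others |= set(other_members)
        unique[strain] = set(members) - others
    return unique


# Hypothetical family assignments for three strains.
genomes = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam5"},
}
pan, core = pan_and_core(genomes)
print(len(pan), sorted(core))
print(unique_families(genomes)["strain_A"])
```

In this toy example the pan genome holds five families, the core genome holds one, and each strain carries a single unique family.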

Assessment of evolution of Neisseria
The ratio (ω) of the rate of non-synonymous substitutions per non-synonymous site (Ka) to the rate of synonymous substitutions per synonymous site (Ks) (Nekrutenko et al. 2002) was used to evaluate the evolutionary rate of Neisseria. We used the codeml program included in the PAML software package (ver. 4.5) to estimate ω (Yang et al. 1997). ω > 1 indicates positive Darwinian selection and ω < 1 indicates negative (purifying) selection. We divided the selected strains into pathogenic and non-pathogenic sets, and the evolutionary analysis was performed separately for each set.
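The interpretation of ω can be written down compactly. The sketch below is illustrative and does not replace codeml: it takes Ka and Ks values that have already been estimated, and the function names, the tolerance and the example numbers are assumptions made for the illustration.

```python
def omega(ka, ks):
    """Return the Ka/Ks ratio; Ks must be positive for the ratio to exist."""
    if ks <= 0:
        raise ValueError("Ks must be > 0 to compute Ka/Ks")
    return ka / ks


def selection_regime(w, tol=1e-9):
    """Classify an omega value using the thresholds stated in the text."""
    if w > 1 + tol:
        return "positive selection"
    if w < 1 - tol:
        return "purifying selection"
    return "neutral"


# Hypothetical Ka and Ks estimates for three genes.
for ka, ks in [(0.007, 0.10), (0.25, 0.20), (0.10, 0.10)]:
    w = omega(ka, ks)
    print(round(w, 2), selection_regime(w))
```

Guarding against Ks = 0 matters in practice, because nearly identical sequences can have no synonymous differences, which leaves the ratio undefined.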

Codon usage analysis of Neisseria
The genomic constitution of Neisseria revealed that this genus is neither GC biased (GC% ranged from 48-52%) nor AT biased (AT ranged from 49-53%). A significantly negative correlation (r = -0.67; p < 0.01) between ENc and CAI was found among all studied strains. The value of ENc ranged from 20 to 62, and the negative correlation between ENc and CAI indicates pivotal roles of codon usage indices other than compositional constraint. To evaluate how variation at the third codon position relates to gene expression level, A3, C3, G3 and T3 were correlated with CAI. A positive correlation (r = 0.86; p < 0.001) between CAI and C3 was observed among all selected strains. To assess the codon bias of Neisseria, Fop was correlated with CAI. A positive correlation between CAI and Fop (r = 0.84; p < 0.001) was found.

Evaluation of protein energy cost (PEC) among Neisseria
Spearman rank correlations between PEC, CAI and Fop were calculated for both potentially highly expressed (PHX) and potentially lowly expressed (PLX) genes. A positive correlation between CAI and Fop (r = 0.84; p < 0.001), along with negative correlations between PEC and CAI (r = -0.76; p < 0.001) and between Fop and PEC (r = -0.68; p < 0.001), was obtained for PHX proteins. In contrast, for PLX genes positive correlations were found between Fop and PEC (r = 0.34; p < 0.01) and between CAI and PEC (r = 0.36; p < 0.01).

Host-microbe interaction strategy of Neisseria
While assessing the relative synonymous codon usage (RSCU) pattern among the selected Neisseria strains, we found that fourteen codons (ATC, TAC, TTC, GCC, CTG, TCC, TGC, CAC, AAC, ACC, GGC, GTC, CCC, GAC) were optimally used in Neisseria. Interestingly, these codons are also optimally used in humans, pointing towards a host-microbe interaction strategy for Neisseria in terms of codon optimization and codon co-evolution. The host-pathogen interaction was studied further using reverse ecology analysis. The complementation indices among the considered microbes were found to be very low. However, all selected strains showed complementation (0.26-0.44) with humans. A metabolic reconstruction was performed on the basis of this reverse ecology analysis (Fig. 2c). There were 17677 edges in the constructed metabolic network directly connected to each other (blue circles). Only forty "seeds" (red circles) were found in the metabolic map (Supplementary file 1). The term "seed" denotes an exogenously acquired compound required for the metabolism of an organism. "Edges", on the other hand, represent the compounds or chemical resources used by all studied organisms.
In the next sub-cluster (sub-cluster a), an opportunistic pathogen (both pathogenic and commensal), N. flavescens ATCC 13120, grouped with the commensal N. subflava ATCC 49275 (66.3% similarity). Similarly, the following sub-cluster contained one non-pathogen, N. sicca FDAARGOS_260, and one opportunistic pathogen, N. mucosa ATCC 19696, sharing 81.9% proteomic identity. The potent pathogen N. animalis NCTC 10212 was close to them within the same cluster (43.5-50.0% similarity). Another commensal strain, N. chenwenguii 10023, was also placed within the same cluster (45.0-49.7% proteomic identity). Neisseria sp. KEM232 was described as a novel species of N. chenwenguii (Al Suwayyid et al. 2020); however, our pan-genomic dendrogram and BLAST matrix showed that they were distantly related, with only 30.1% proteomic similarity. To investigate this issue further, we performed Average Nucleotide Identity (ANI) analysis. It has been suggested that an ANI score >95% between two genomes indicates that they belong to the same species. The ANI score between Neisseria sp. KEM232 and N. chenwenguii 10023 was only 77.86%. Moreover, the ANI score between Neisseria sp. KEM232 and N. elongata glycolytica ATCC 29315 was 80.4%, and that between Neisseria sp. KEM232 and N. elongata M15910 was 80.6%. This result not only supported our BLAST matrix and pan-genomic dendrogram but also suggested a taxonomic reconsideration of Neisseria sp. KEM232. The MLSA phylogeny showed an identical clustering pattern among the considered strains (Fig. 3d).

Evolutionary analysis of Neisseria
The rate of evolution among protein-coding genes varies tremendously. Evolutionary analysis based on the Ka/Ks (or ω) value revealed a differential pattern among diverse sets of genes. PHX genes (mean ω 0.07) were found to be evolving more slowly (p<0.001) and to be more conserved than PLX genes (mean ω 0.32). The Ka/Ks analysis of Neisseria secretomes revealed that they evolve faster than PHX genes in all investigated strains (p<0.001). Moreover, the secretomes of pathogenic strains (mean ω 0.21) were evolving at a faster rate (p<0.001) than those of commensal strains (mean ω 0.14).

Genomic constitution of Neisseria is multi-factorial and complex
The GC/AT content of organisms is one of the most highly variable traits (Botzman and Margalit, 2011). Variation in nucleotide composition can be observed in protein-coding genes, non-protein-coding genes, synonymous sites as well as nonsynonymous sites of genomes (Reis et al. 2004). The genomic constitution of Neisseria revealed that this genus is neither GC biased nor AT biased. It has previously been reported that mutational biases determine nucleotide composition, meaning that a GC-biased mutation pattern results in GC-rich organisms and an AT-biased mutation pattern results in AT-rich organisms (Hershberg and Petrov, 2010). However, more recent work has revealed that mutational bias is ineffective in directing nucleotide composition in organisms that are both AT and GC rich (Hildebrand et al. 2010). This also appears to be the case for Neisseria. In such organisms, mutations are more likely to proceed from G/C towards A/T owing to the rapid deamination of cytosine to thymine or uracil (C > T/U) (Bohlin et al. 2017). However, increased AT content indicates reduced genomic stability (Yakovchuk et al. 2006). In such situations, another selectively neutral pressure termed 'amelioration' acts as a major force that evens out differences in base composition (Lawrence and Ochman, 1997). Such cases are beneficial for pathogens with an AT-rich host (human in our case) (Bohlin 2011). Thus, the genomic organization of Neisseria does not depend solely on its nucleotide variation. Instead, it is a multi-factorial process resulting from a complex combination of neutral and selective processes. This was also supported by the significantly negative correlation (r = -0.67; p < 0.01) between ENc and CAI. The value of ENc ranged from 20 to 62, and the negative correlation between ENc and CAI indicates pivotal roles of codon usage indices other than compositional constraint. To determine those factors, we used the Spearman rank correlation method.
A positive correlation (r = 0.86; p < 0.001) between CAI and C3 was observed among all selected strains. The preference for cytosine (C) in AT/GC-unbiased organisms further strengthens the role of mutational pressure on these genomes. The third position of the codon is a hotspot for random mutation without a drastic effect on amino acid usage, owing to the redundancy of codons. Thus, it can be postulated that deamination of C to uracil (U) in Neisseria not only increases the AT richness of the genome but also helps this genus make apt use of the host translational machinery through the human tRNA pool. The presence of an active cytidine deaminase responsible for C>U mutation has been reported in pathogens such as E. coli and S. typhimurium (Henderson and Paterson, 2014). Such activity may also be present in Neisseria, with a pivotal role in shaping their genomic constitution. This arrangement of codon usage indices was found to be consistent among the secretomes and pathogenicity-related genes present in Neisseria. Moreover, no significant difference in this pattern was found between pathogenic and non-pathogenic Neisseria strains.

Optimal codons are mounting the quantity of energy economic amino acids in Neisseria
Another factor found to play an important role in Neisseria genomes was Fop. A positive correlation between CAI and Fop (r = 0.84; p < 0.001) supported this. The value indicated higher usage of optimal codons in potentially highly expressed genes than in lowly expressed genes. Twenty-nine codons were found to be optimal codons. Among them, nineteen were GC-rich codons and fifteen ended with cytosine (C). This leads to an important point regarding the amino acid usage of the selected genomes. GC-rich codons code for more energy-economic amino acids than AT-rich codons (Bohlin et al. 2017). Thus, the higher usage of such GC-rich optimal codons in PHX genes indicated a lower biosynthetic energy cost for the respective translated proteins. To assess these findings further, the correlation between PEC, CAI and Fop was calculated for both PHX and PLX genes. A positive correlation between CAI and Fop (r = 0.84; p < 0.001), along with negative correlations between PEC and CAI (r = -0.76; p < 0.001) and between Fop and PEC (r = -0.68; p < 0.001), was obtained for PHX proteins. In contrast, for PLX genes positive correlations were found between Fop and PEC (r = 0.34; p < 0.01) and between CAI and PEC (r = 0.36; p < 0.01). This result indicated that, although the Neisseria genomes are not generally biased towards either AT- or GC-rich codons (Fig. 1a), natural selection discriminates among synonymous codons and prefers GC-rich codons in PHX genes. This enhances translational elongation rates and reduces misincorporation of amino acids during protein synthesis (Akashi and Gojobori, 2002). Previously, Akashi and Gojobori (2002) reported a relation between protein energy cost (PEC) and tRNA adaptation in differentially expressed genes. Hence, we correlated PEC and tAI and found a negative correlation (r = -0.75; p < 0.001) between them among PHX genes.
This further supported the translational efficiency of potentially highly expressed genes and indicated that these genes are translated into energy-economic proteins with higher expression levels. An overall amino acid usage calculation (Fig. 1b) indicated that alanine, valine, glycine, serine, asparagine, proline, glutamic acid and threonine were highly used in Neisseria. The overall usage of costly aromatic amino acids such as phenylalanine, tyrosine and tryptophan was comparatively lower than that of the aforementioned amino acids. No significant difference in this pattern was found between pathogenic and commensal Neisseria strains.

RSCU pattern indicated co-evolution of Neisseria for better host adaptation
Previous studies have shown a relation between codon adaptation and ecological preferences (Peden 1998). A relation between codon adaptation and co-evolution has also been drawn. To assess the co-evolutionary pattern between Neisseria and their host Homo sapiens, their RSCU patterns were compared. We found that fourteen codons (ATC, TAC, TTC, GCC, CTG, TCC, TGC, CAC, AAC, ACC, GGC, GTC, CCC, GAC) were optimally used in both Neisseria and human. Moreover, ~96% of the pathogenic island genes in Neisseria fell in the PHX category. This suggested elevated translational efficiency of those genes in the host body. The translational selection pressure towards these fourteen most adapted codons aided the microbes in living in the host environment and efficiently utilizing its metabolic resources (Botzman and Margalit, 2011). Thus, codon usage plays a pivotal role in enhancing the cellular fitness of Neisseria within the host body, mostly by mimicking the codon usage pattern of humans.

Co-existence of Neisseria with human host
The genus Neisseria is composed of both pathogenic and non-pathogenic commensal bacteria. According to ecological principles, co-existence can be governed by either competition or complementation (Carr and Borenstein 2012; Levy and Borenstein 2012). The reverse ecology analysis among the selected Neisseria strains and their host (human) revealed inter-species-specific and intra-species-specific competition among members of Neisseria (Fig. 2a, 2b). The pathogenic strains exerted more competition on commensal strains. Both types of strains were found to exert moderate competition against humans, indicating an efficient distribution of host-derived resources among pathogenic and commensal Neisseria (Fig. 2a). However, the competition exerted by humans on Neisseria was minimal. This makes humans a well-suited host for this microbial genus.
The complementation indices among the considered microbes were found to be very low. Thus, co-existence of different Neisseria strains in a small niche is unlikely. This also explains the broad distribution of Neisseria within the human body (for example, brain, oral cavity, respiratory system, reproductive system and urinary tract). However, all selected strains showed complementation (0.26-0.44) with humans. The metabolic reconstruction (Fig. 2c) clearly showed that a large number of resources are shared and utilized efficiently between humans and Neisseria. This suggested that the co-inhabitation of Neisseria within the human body is ecologically favorable.

Differential evolutionary pattern indicated transition from commensalism to pathogenicity among Neisseria
The rate of evolution among protein-coding genes varies tremendously. Evolutionary analysis based on the Ka/Ks (or ω) value revealed a differential pattern among diverse sets of genes. PHX genes were found to be evolving more slowly (p<0.001) and to be more conserved than PLX genes. The 'knock-out rate' prediction proposed that most PHX genes are essential or housekeeping genes with important functions (Hust and Smith, 1999). These essential genes evolve more slowly than non-essential genes (Wilson et al. 1977). Similar results were previously found in Escherichia coli, Helicobacter pylori and even Neisseria meningitidis (Jordan et al. 2002). Moreover, the secretomes of pathogens continuously struggle against the host immune system, which results in their faster evolution (Ehrlich et al. 2008; Saha et al. 2019). This differential evolutionary pattern in pathogens indicated the possibility that pathogenicity emerged from commensalism among Neisseria.
Another aspect of our Ka/Ks analysis was based on pathogenicity related (PI) genes. We found that a set of pathogenic genes was present in non-pathogenic Neisseria strains, which was unexpected, although it has been noted previously (Snyder and Saunders, 2006). However, no clear explanation for this result has yet been given. Hence, we calculated the evolutionary rates of PI genes from both pathogenic and non-pathogenic strains to reveal whether a transition from pathogenicity to commensalism, or vice versa, has occurred during the evolution of Neisseria. The PI genes were evolving significantly faster (p<0.001) in pathogenic strains than in their non-pathogenic counterparts. The ω value of PI genes in pathogenic strains ranged from 0.32 to 0.45, whereas in non-pathogenic strains it ranged from 0.05 to 0.08. The difference was statistically significant (p<0.001). Hence this result supports a transition from commensalism to pathogenicity in Neisseria. This type of transition was previously reported in the Mycobacterium avium complex (Saha et al. 2019). Moreover, nine protein-coding genes have been reported to be associated with antimicrobial resistance in N. gonorrhoeae. Orthologs of those genes were found in all considered strains. Evolutionary analysis predicted that they evolve faster in pathogens than in non-pathogens. The mean Ka/Ks value for each of the nine genes was lowest when only non-pathogenic strains were studied. The rate of evolution increased when pathogens and non-pathogens (strains from cluster III of the pan-genomic dendrogram) were considered together, and the value was highest when only pathogens were considered (Fig. 4).
This supported our aforementioned hypothesis of a transition from commensalism to pathogenicity among Neisseria. With the emergence of pathogenicity this genus became exposed to both narrow- and broad-spectrum antibiotics, and in the long run its antimicrobial resistance evolved.

PPI study of Neisseria-Human interaction
Protein-protein interaction (PPI) analysis has become a major tool in systems biology owing to its ability to handle a broad range of data related to biological processes, cell signaling and developmental strategies (Rao et al. 2014). In this study we examined the PPI network among N. gonorrhoeae (N_gon), N. meningitidis (N_men) and Homo sapiens. The PI protein related PPI networks of the considered pathogenic strains are given in Fig. 5a and 5b. The COG-based clustering of both networks showed that the "cellular processing and signaling" category (red circles) contained the most connected proteins. Those proteins were also connected with others associated with the "information storage and processing" (yellow circles) and "Metabolism" (blue circles) categories. A few proteins (crimson circles) had an uncharacterized COG category, and their connectedness was lower than that of other proteins. Overall, the PPI enrichment p-value was 1.0e-16. A similar pattern of clustering was observed for N. meningitidis, where the pink circles were proteins for "cellular processing and signaling", green circles were for "information storage and processing", yellow circles were for "Metabolism" and red circles were unknown categories. The PPI enrichment p-value for N. meningitidis was 1.0e-15. These values for both PPI networks indicated a stable and promising interaction among the pathogenic proteins.
Another aspect of this study was to analyze the human-Neisseria interaction. The human PPI networks associated with gonorrhea and meningitis were predicted. A large number of tightly interconnected proteins were found to be linked directly or indirectly with both disorders. Twenty human proteins were found to be directly associated with gonorrhea, with a DSI (disease specificity index) above 0.7. Their KEGG enrichment analysis revealed their involvement in oocyte meiosis, cell cycle, Epstein-Barr virus infection, dopaminergic synapse, acrosomal vesicle formation, Hippo signaling pathway, long-term depression, sphingolipid signaling pathway, p53 signaling pathway, FoxO signaling pathway and autophagy (Fig. 5c). Ten human proteins were found to be directly related to meningitis, with a DSI value above 0.7. KEGG enrichment of those proteins revealed their pivotal role in tryptophan metabolism, prion diseases, complement and coagulation cascades, systemic lupus erythematosus (SLE), seleno-compound metabolism, amoebiasis and axon development (Fig. 5d). The PPI analysis between N. gonorrhoeae (NG) and human revealed that proteins related to acrosomal vesicle formation, Hippo signaling pathway, Epstein-Barr virus infection, long-term depression and p53 signaling pathway interacted with NG PI proteins (P-value 1.0e-16). NG, which causes gonorrhea, a sexually transmitted disease (STD), thus interacts with human proteins that are directly related to the development of the urogenital tract, oocyte meiosis, placenta and sperm formation and development (Soncin and Parast 2020; Caini et al. 2014). The same analysis with N. meningitidis (NM) and human proteins revealed a strong biological interaction (P-value 1.0e-16) between NM PI proteins and human proteins related to prion diseases, axon development, tryptophan metabolism, SLE and blood-brain barrier formation. Clinical reports indicate that patients with SLE and prion diseases are more prone to meningitis (Al Mahmeed et al. 2020; Batra et al. 2016).
Thus, the PPI network analysis further established the complex machinery of Human-Neisseria interaction.
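To make the idea of pathway enrichment concrete, the following minimal sketch computes an over-representation p-value with a one-sided hypergeometric test. This is an assumption made for illustration: the text does not state which statistical test underlies the reported KEGG enrichment, and the function name and all numbers below are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(background, pathway_size, selected, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    background   : number of genes in the background set
    pathway_size : number of background genes annotated to the pathway
    selected     : number of genes in the list of interest
    overlap      : number of selected genes annotated to the pathway
    """
    upper = min(pathway_size, selected)
    total = comb(background, selected)
    # Sum the probabilities of observing `overlap` or more pathway genes.
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20 disease-associated genes drawn from a background
# of 100 genes, of which 10 belong to the pathway; 5 of the 20 are members.
p_value = hypergeom_enrichment_p(background=100, pathway_size=10,
                                 selected=20, overlap=5)
print(f"enrichment p-value: {p_value:.4f}")
```

In practice such p-values are computed for many pathways at once and then corrected for multiple testing; that step is omitted here for brevity.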

Conclusion
This study investigated different genomic and proteomic aspects of Neisseria along with their interaction with humans. The codon usage analysis revealed that this genus is biased neither towards GC-rich nor towards AT-rich codons. Similarities between Neisseria and human in synonymous codon usage pointed to co-evolution of microbes and hosts. Moreover, CAI, tAI and Fop were found to be the major indices governing the codon usage of Neisseria. The amino acid usage study showed a preference for energy-economic amino acids in Neisseria. The reverse ecology analysis supported the co-occurrence of Neisseria and humans, and the complementary effect of humans on Neisseria was evident from this analysis. Reverse ecology-based networking showed a strong metabolic interaction between human and the considered Neisseria strains. Comparative analysis revealed considerable proteomic similarities between pathogenic and commensal strains. This also supported previous reports of the presence of virulence genes in nonpathogenic Neisseria strains. The pan-genomic dendrogram suggested a taxonomic reconsideration of Neisseria sp. KEM232. The evolutionary analysis supported the slower-evolving and more conserved nature of potentially highly expressed genes. Moreover, the higher evolutionary rate of both secretomes and resistomes among pathogenic Neisseria suggested a transition from commensalism to pathogenicity in Neisseria. The human-pathogen interaction was studied mainly for N. gonorrhoeae and N. meningitidis. A strong biological interaction was established between the host and pathogen. The GO enrichment analysis and KEGG pathway study indicated that most of the interacting proteins were associated with biological processes such as cell signaling, cell cycle and developmental pathways. Pathogenicity-related genes of N. meningitidis were found to interact with human proteins associated with tryptophan metabolism, prion diseases, complement and coagulation cascades, systemic lupus erythematosus (SLE), seleno-compound metabolism, amoebiasis and axon development. Virulence genes of N. gonorrhoeae interacted with human proteins related to the development of the urogenital tract, oocyte meiosis, placenta and sperm formation.
Thus, the genomic and evolutionary study of Neisseria revealed considerable similarity with the genomic pattern of its human host, indicating a codon co-evolution strategy adopted by this genus.

* Still not confirmed as pathogen or commensal.

Figure 1 <p>(a) Heatmap of overall codon usage among 21 <em>Neisseria</em> strains. The color code is indicated in the figure. Both AT- and GC-rich codons are preferred. At the third position, C is preferred over other nucleotides. (b) Heatmap of overall amino acid usage among 21 <em>Neisseria</em> strains. The color code is indicated in the figure. Energy-economic amino acids were preferred over energetically costly aromatic amino acids.</p>

<p>Evolutionary analysis of the resistome among nine protein-coding genes associated with AMR. The evolutionary rate of these genes is highest among pathogens and lowest in non-pathogens.</p>

Figure 5 <p>(a) Network analysis among <em>N. gonorrhoeae</em> signal peptides. Only interactions with P-value &gt;0.8 were considered. The clustering was based on COG analysis. The COG-based clustering of both networks showed that the "cellular processing and signaling" category (red circles) contained the most connected proteins. Those proteins were also connected with others associated with the "information storage and processing" (yellow circles) and "Metabolism" (blue circles) categories. A few proteins (crimson circles) had an uncharacterized COG category and were less connected than the other proteins. (b) Network analysis among <em>N. meningitidis</em> signal peptides. Only interactions with P-value &gt;0.8 were considered. The clustering was based on COG analysis. Pink circles represent proteins for "cellular processing and signaling", green circles "information storage and processing", yellow circles "Metabolism" and red circles unknown categories. (c) PPI analysis among gonorrhea-related human proteins and <em>N. gonorrhoeae</em> pathogenic proteins.