The final dataset of human protein interactions comprises 311035 interactions among 16806 human proteins. After integrating the MTB-human and HIV-human PPIs with the human PPIs, four groups of human proteins were identified: (a) human proteins that interact with both HIV and MTB proteins (denoted as CHPs; 103 human proteins), (b) human proteins that interact only with MTB proteins (denoted as MHPs; 167 human proteins), (c) human proteins that interact only with HIV proteins (denoted as HHPs; 1744 human proteins), and (d) human proteins that interact with neither pathogen, considered untargeted human proteins (denoted as NtHPs; 14827 human proteins). All four protein sets were subjected to network, sequence, and structural analysis, and the results are given below.
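The four-way grouping described above amounts to simple set operations on the lists of targeted proteins. The following is a minimal sketch under stated assumptions, not the pipeline used in this study: the function name `classify_human_proteins` and the protein identifiers are hypothetical and serve only as illustration.

```python
def classify_human_proteins(human_proteins, hiv_targets, mtb_targets):
    """Split human proteins into four groups by pathogen targeting."""
    human = set(human_proteins)
    # Keep only targets that are present in the human interactome.
    hiv = set(hiv_targets) & human
    mtb = set(mtb_targets) & human
    return {
        "CHP": hiv & mtb,             # targeted by both HIV and MTB
        "MHP": mtb - hiv,             # targeted by MTB only
        "HHP": hiv - mtb,             # targeted by HIV only
        "NtHP": human - (hiv | mtb),  # targeted by neither pathogen
    }


# Toy example with hypothetical protein identifiers.
groups = classify_human_proteins(
    human_proteins=["P1", "P2", "P3", "P4", "P5"],
    hiv_targets=["P1", "P2"],
    mtb_targets=["P2", "P3"],
)
```

By construction the four groups are disjoint and together cover every protein in the human interactome.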
The human proteins targeted by both pathogens have higher centrality values than the proteins targeted by either one or none of the two pathogens
The degree, betweenness, eigenvector, and closeness centrality distributions for CHPs, MHPs, HHPs, and NtHPs are shown in Fig. 1. As can be seen from the figure, the CHPs are associated with the highest centrality values among the four groups (P << 10^-16). This suggests that the proteins commonly targeted by MTB and HIV proteins may be essential proteins. To test this, we calculated the percentage of essential proteins in each of the four categories. In general, the proportion of essential proteins tracks the centrality values across the four categories: the CHPs, which have the highest centrality values, are mostly essential proteins (88.34%), followed by HHPs (72%), MHPs (67%), and NtHPs (47%) (Table S1).
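To illustrate two of the centrality measures named above, the sketch below computes degree and closeness centrality on a toy undirected network using only the Python standard library. It is an illustrative assumption, not the software used in this study: the adjacency-dictionary representation, the function names, and the toy network are hypothetical, and closeness is taken here as the reciprocal of the mean shortest-path distance to the reachable nodes.

```python
from collections import deque


def degree_centrality(adjacency):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adjacency)
    if n < 2:
        return {node: 0.0 for node in adjacency}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


def closeness_centrality(adjacency):
    """Reciprocal of the mean shortest-path distance to reachable nodes."""
    result = {}
    for source in adjacency:
        # Breadth-first search gives shortest paths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adjacency[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


# Toy star-shaped network: hub "A" linked to "B", "C", and "D".
toy = {"A": {"B", "C", "D"}, "B": {"A"}, "C": {"A"}, "D": {"A"}}
deg = degree_centrality(toy)
clo = closeness_centrality(toy)
```

In this toy network the hub scores highest on both measures, which is the pattern reported for CHPs in the full interactome.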
Proteins with high centrality values are mostly conserved and are also the targets of pathogens. We calculated the evolutionary rates (dN/dS) for the four categories of proteins using mouse and chimpanzee orthologues and compared them across the categories. We found that CHPs, the highly central proteins, are associated with slower evolutionary rates than other human proteins (P << 10^-16, Fig 2); in other words, they are constrained to evolve slowly.
Commonly targeted proteins are the products of abundant and widely expressed genes
Expression breadth is the number of tissues in which a protein is expressed. Proteins that are expressed in almost every tissue are considered housekeeping proteins. We used gene expression profiles for 44 normal human tissues and found that CHPs are expressed across more tissues than MHPs, HHPs, and NtHPs (P << 10^-16, Fig 3-A).
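Expression breadth as defined above can be computed by counting the tissues in which a gene's expression passes a detection threshold. The sketch below is a minimal illustration: the threshold value of 1.0, the function name, and the tissue values are assumptions made for this example and are not taken from this study.

```python
def expression_breadth(expression_by_tissue, threshold=1.0):
    """Count tissues in which expression is at or above the threshold.

    The default threshold is a hypothetical value chosen for illustration.
    """
    return sum(1 for level in expression_by_tissue.values() if level >= threshold)


# Hypothetical expression profile for one gene across four tissues.
profile = {"liver": 12.4, "lung": 0.3, "brain": 5.1, "kidney": 1.0}
breadth = expression_breadth(profile)
```

With 44 tissues, a housekeeping protein would have a breadth close to 44 under this definition.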
Protein abundance is defined as the number of protein copies in a cell and is correlated with the expression level of the corresponding gene. We found that the CHPs are more abundant than other human proteins (P << 10^-16, Fig 3-B). These results indicate that the common human targets of HIV and MTB are housekeeping proteins that are abundantly expressed.
Commonly targeted proteins are components of the innate immune response against HIV and MTB
GO functional enrichment analysis covering cellular component (CC), molecular function (MF), and biological process (BP) terms was carried out for the human proteins using the FunRich tool [44]. CHPs are enriched in proteins found in most extracellular and intracellular components, which form the first line of defence as components of the innate immune response against the two pathogens (Fig 4, Fig S1, S2, S3).
Commonly targeted proteins are conformationally versatile
A node in the human protein-protein interaction network is an ensemble of all possible splice variants corresponding to a protein. It has been shown that hub proteins are enriched in splice variants relative to non-hubs [45]. We used BioMart [46] to obtain the splice variant counts for the human proteins, which revealed that CHPs are associated with a higher number of splice variants than the other human protein categories (P << 10^-16, Fig 5-A).
Intrinsically disordered regions (IDRs) in proteins lack stable 3D structures but transiently adopt multiple conformations. This conformational versatility enables the proteins harbouring them to interact with multiple partners, and hence such proteins mostly form hubs in protein-protein interaction networks [47]. We therefore analysed the percentage of disordered residues in the four categories of human proteins. Disorder was predicted using the IUPred2A tool [48]. We found that the highly connected CHPs are significantly more disordered than the other categories (P = 0.0197, Fig 5-B).
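The percentage of disordered residues can be derived from per-residue disorder scores by counting the residues that exceed a cutoff. The sketch below assumes a cutoff of 0.5, the conventional threshold for IUPred-style scores. The function name and the toy scores are hypothetical, and the exact cutoff used in this study is not stated here.

```python
def percent_disordered(scores, cutoff=0.5):
    """Percentage of residues whose disorder score exceeds the cutoff."""
    if not scores:
        return 0.0
    disordered = sum(1 for s in scores if s > cutoff)
    return 100.0 * disordered / len(scores)


# Hypothetical per-residue disorder scores for an 8-residue peptide.
toy_scores = [0.1, 0.2, 0.7, 0.9, 0.6, 0.3, 0.4, 0.8]
pct = percent_disordered(toy_scores)
```

Here four of the eight residues score above 0.5, so half of the toy peptide is classed as disordered.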
We also calculated the number of binding sites in the disordered regions using ANCHOR [48]. We found that CHPs have more binding sites in their disordered regions than the other categories (P = 0.003483, Fig 5-C); these sites are presumably promiscuous, enabling CHPs to interact with more proteins. Splice variants, disorder, and the number of binding sites together represent the conformational variability of proteins, and hence their ability to bind multiple partners. CHPs show the highest propensity to bind multiple partners, making them more prone to being targeted by pathogen proteins.
HIV and MTB drug targets and their CHP interacting partners
We further investigated the number of AIDS and tuberculosis drug targets among the HIV and MTB proteins using data available in DrugBank [49]. We found 157 drugs against 8 HIV proteins and 136 drugs against 81 MTB proteins (Table 1). Together, all the HIV drug targets interact with 70 CHPs. Of the 81 MTB drug targets, three interact with 3 CHPs (Fig. 6, Table S2). Taken together, the 157 drugs against HIV and the 4 drugs against MTB that are linked to CHPs could be useful for developing a cocktail of drugs for the treatment of coinfection.
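The link between drugs and CHPs described above is a two-step lookup: from each drug to its pathogen target, and from that target to its human interaction partners. The sketch below illustrates this under stated assumptions. The function name, the dictionary layout, and all identifiers are hypothetical and do not reflect the DrugBank data format or the procedure used in this study.

```python
def drugs_linked_to_chps(drug_targets, pathogen_human_ppi, chps):
    """Return drugs whose pathogen target interacts with at least one CHP.

    drug_targets: drug -> list of pathogen proteins it targets
    pathogen_human_ppi: pathogen protein -> list of human interaction partners
    chps: human proteins targeted by both pathogens
    """
    chps = set(chps)
    linked = {}
    for drug, targets in drug_targets.items():
        partners = set()
        for target in targets:
            # Keep only the human partners that belong to the CHP set.
            partners |= set(pathogen_human_ppi.get(target, ())) & chps
        if partners:
            linked[drug] = partners
    return linked


# Toy example with hypothetical drug, pathogen, and human identifiers.
drug_targets = {"drugA": ["pathX"], "drugB": ["pathY"], "drugC": ["pathZ"]}
ppi = {"pathX": ["H1", "H2"], "pathY": ["H3"]}
linked = drugs_linked_to_chps(drug_targets, ppi, chps=["H1", "H3"])
```

A drug whose target has no CHP partner, such as the hypothetical `drugC`, is left out of the result.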
Discussion
Our study reveals that the human proteins commonly targeted by both HIV and MTB have higher centrality values than the human proteins targeted by only one of the two pathogens. Since coinfection is more pathogenic than either mono-infection, the present network analysis further reinforces the lethality-centrality principle in the context of viral and bacterial infections. Furthermore, the CHPs are abundantly expressed across multiple tissues, indicating their housekeeping nature. CHPs are associated with conformational flexibility, constrained evolution, and involvement in various pathways. They are found in most cellular components and are components of the innate immune response against the two infections. Our study has also identified the CHPs that interact with the drug targets of the two pathogens. We find that 157 drugs target HIV proteins and 4 drugs target MTB proteins that interact with CHPs. We believe this information can help in developing a treatment regimen comprising a judicious cocktail of drugs for patients with coinfection.