Web Interface
HighAltitudeOmicsDB (Figure 1) is a user-friendly, free-to-access resource that requires no prior registration. It is a comprehensive, non-redundant, manually curated resource of genes/proteins whose expression levels have been experimentally validated to be associated with high-altitude stress. The database can be explored using the “Browse” and “Search” options.
The “Browse” option allows the user to select single or multiple genes/proteins of the database from a pull-down menu. Alternatively, the user may upload a file containing the official protein symbols or type them in directly. Clicking the adjacent ‘Browse’ button opens a table in which each protein is hyperlinked to its individual protein page. If the user's list contains protein symbols that are not in the database, these are reported in a separate table (Figure 1).
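The split between symbols found in the database and symbols reported separately can be illustrated with a minimal sketch. It is not the database's actual implementation: the function name `split_symbols`, the upper-case normalisation and the example symbols are assumptions made for illustration.

```python
from typing import Iterable, List, Set, Tuple


def split_symbols(query: Iterable[str], db_symbols: Set[str]) -> Tuple[List[str], List[str]]:
    """Partition user-supplied symbols into those present in and absent from the database."""
    found: List[str] = []
    missing: List[str] = []
    seen: Set[str] = set()
    for raw in query:
        # Assumption: symbols are matched after trimming whitespace and upper-casing.
        symbol = raw.strip().upper()
        if not symbol or symbol in seen:
            continue  # skip blank entries and duplicates
        seen.add(symbol)
        if symbol in db_symbols:
            found.append(symbol)
        else:
            missing.append(symbol)
    return found, missing


# Hypothetical database content and user list, for illustration only.
db = {"EPAS1", "EGLN1", "HIF1A"}
found, missing = split_symbols(["epas1", "HIF1A", "FOO1", " epas1 "], db)
print("In database:", found)
print("Not in database:", missing)
```

In this sketch the `found` list would populate the hyperlinked results table and the `missing` list the separate table of unmatched symbols.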
The “Search” option offers multiple ways to explore the database according to the user's research interests. Search by chromosome allows the user to click on any human chromosome number and identify the proteins of HighAltitudeOmicsDB that lie on that chromosome (Figure 2). Search by ‘duration of experiment’ lists the genes/proteins whose expression changes over hours/days/weeks/months/years. Searching by ‘Tissue of expression’ opens a pull-down menu from which the user can choose the tissue of interest. Searching by ‘Ethnicity’, ‘source organism’, ‘level of regulation’ or ‘geographical location’ similarly opens a pull-down menu from which the user may choose the ethnicity, source organism, up/down-regulation or location, respectively, and obtain a tabular list of genes/proteins, each hyperlinked to the detailed information page of the protein (as discussed in the following sections).
Additionally, the ‘Associated as Biomarker’ option leads to a tabular list of proteins that have been proposed/validated as molecular biomarkers for HA stress (Figure 2). The protein symbols are hyperlinked to the respective protein page, which provides a link to the PubMed record of the publication that validates the protein as a biomarker. To fetch proteins that are differentially expressed (DE) in an altitude-dependent manner, a user-interactive slider (ranging from 2200 m to 9800 m) is provided. The user may set the slider values and fetch genes/proteins associated with a defined altitude range. The slider can be combined (AND/OR) with time of exposure to HA and level of regulation (Up/Down). The user can thus make combination queries such as up/down-regulated proteins expressed over days at an altitude range of 2200 m to 4500 m. The resulting list of proteins can be downloaded in Excel/CSV format for further analysis.
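The combination query described above can be sketched as follows. This is only an illustration under stated assumptions: the record fields, the `matches` function and the way AND/OR is applied across all selected criteria are hypothetical, and the web server may combine the filters differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Association:
    """One hypothetical protein/high-altitude association record."""
    symbol: str
    altitude_m: int      # altitude of the study, in metres
    duration_unit: str   # "hours", "days", "weeks", "months" or "years"
    regulation: str      # "up" or "down"


def matches(rec: Association, alt_min: int, alt_max: int,
            duration_unit: Optional[str] = None,
            regulation: Optional[str] = None,
            mode: str = "AND") -> bool:
    """Return True if the record satisfies the selected criteria.

    Assumption: AND requires every selected criterion to hold,
    OR requires at least one of them to hold.
    """
    if mode not in ("AND", "OR"):
        raise ValueError("mode must be 'AND' or 'OR'")
    checks = [alt_min <= rec.altitude_m <= alt_max]
    if duration_unit is not None:
        checks.append(rec.duration_unit == duration_unit)
    if regulation is not None:
        checks.append(rec.regulation == regulation)
    return all(checks) if mode == "AND" else any(checks)


# Made-up records, for illustration only.
records = [
    Association("GENE_A", 3500, "days", "up"),
    Association("GENE_B", 5000, "days", "up"),
    Association("GENE_C", 4000, "hours", "down"),
    Association("GENE_D", 2500, "days", "down"),
]

and_hits = [r.symbol for r in records if matches(r, 2200, 4500, "days", "up", "AND")]
or_hits = [r.symbol for r in records if matches(r, 2200, 4500, "days", "up", "OR")]
print(and_hits)
print(or_hits)
```

With AND, only records inside the altitude range that are also up-regulated over days are returned; with OR, any record meeting at least one criterion is returned.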
The webserver also allows the user to explore the proteins of HighAltitudeOmicsDB associated with a particular transcription factor (TF), microRNA (miRNA), disease, drug, GO term or KEGG pathway (Figure 3).
The details of the protein and its association with HA are provided in the detailed information page, which is divided into six sections.
The first section gives general information about the protein, such as the Protein Official Symbol, Aliases, Chromosomal location, Length, UniProt ID, EC number, Pfam ID, PDB ID, InterPro ID and dbSNP ID, which makes cross-linking to additional databases quick and easy. The UniProt ID is hyperlinked to the UniProt database (Figure 4(i)).
- Interactions and Semantics
The top-50 direct protein interactors of each protein are identified from the STRING database using the cut-offs described in the methodology section. The network is displayed in a user-interactive format with translation, zoom-in and zoom-out features. The nodes are color coded (yellow: the protein being studied; blue: the top-50 interactors) (Figure 4). The edges are also color coded (yellow: interactions between the protein being studied and its 50 direct interactors; blue: interactions among the top-50 interactors). The network can be downloaded in .sif format, which can be visualised in network visualization software such as Cytoscape, BiNA, etc. The list of interactions and their combined scores is also provided in a tabular format, which can be downloaded in Excel/CSV format. The table has a ‘search’ option for quickly locating a protein of interest.
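The edge colouring rule and the .sif export can be sketched in a few lines. The sketch assumes the simple interaction format read by Cytoscape (source node, interaction type, target node, tab-separated); the interaction label "pp", the function names and the example edges with their scores are illustrative assumptions, not values taken from the database.

```python
from typing import List, Tuple

Edge = Tuple[str, str, float]  # (protein A, protein B, combined score)


def edge_color(query: str, a: str, b: str) -> str:
    """Yellow if the edge touches the protein being studied, blue otherwise."""
    return "yellow" if query in (a, b) else "blue"


def to_sif(edges: List[Edge], interaction: str = "pp") -> str:
    """Serialise edges as tab-separated SIF lines: source, interaction type, target."""
    return "\n".join(f"{a}\t{interaction}\t{b}" for a, b, _score in edges)


# Made-up network around a hypothetical query protein.
query_protein = "EPAS1"
edges = [
    ("EPAS1", "EGLN1", 0.99),
    ("EPAS1", "VHL", 0.98),
    ("EGLN1", "VHL", 0.95),
]

colors = [edge_color(query_protein, a, b) for a, b, _ in edges]
sif_text = to_sif(edges)
print(colors)
print(sif_text)
```

The SIF format carries no edge weights, which is consistent with the combined scores being offered in a separate downloadable table.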
The pairwise GO semantic similarity score is calculated between the protein being studied and its top-50 interacting proteins, as described in the methodology section. The results are visualised as a 51 × 51 matrix. GO semantic similarity scores > 0.8 are highlighted in red in the matrix. If any protein among the top-50 interactors is also part of HighAltitudeOmicsDB, its symbol in the matrix is hyperlinked to the respective detailed protein information page within the database. This helps to identify functional hubs of proteins associated with HA stress and could therefore shed light on the molecular basis of acclimatization/adaptation (Figure 4).
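The highlighting rule for the similarity matrix can be illustrated with a small sketch. It assumes a symmetric matrix of precomputed scores and uses a toy 3 × 3 example instead of the full 51 × 51 matrix; the function name and the values are hypothetical, and the similarity computation itself is not shown.

```python
from typing import List, Tuple


def highlighted_pairs(labels: List[str], matrix: List[List[float]],
                      threshold: float = 0.8) -> List[Tuple[str, str, float]]:
    """Return the protein pairs whose similarity score is strictly above the threshold.

    Only the upper triangle is scanned, so each pair is reported once and
    the diagonal (self-similarity) is ignored.
    """
    pairs = []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] > threshold:
                pairs.append((labels[i], labels[j], matrix[i][j]))
    return pairs


# Toy symmetric matrix with made-up scores.
labels = ["P0", "P1", "P2"]
scores = [
    [1.00, 0.85, 0.40],
    [0.85, 1.00, 0.80],
    [0.40, 0.80, 1.00],
]

red_cells = highlighted_pairs(labels, scores)
print(red_cells)
```

Because the comparison is strict, a score of exactly 0.8 is not highlighted in this sketch, matching the "> 0.8" criterion stated above.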
- Association with High Altitude
For each protein, its association with HA stress is compiled in a tabular format. The table lists the human protein symbol, source organism (the organism in which the study was performed), tissue of expression, level of hypoxia, altitude, duration of experiment, level of expression, fold change, experiment details, geographical location, ethnicity, control group expression, control group details and reference paper (Figure 4). The association of the protein as a biomarker is also recorded, i.e. if the protein has been experimentally validated as a biomarker, the entry in the column is “Yes”, otherwise “No”. The papers are hyperlinked to PubMed, which allows ready access to the original publication. In this format, the expression changes of a protein across different durations, tissues and altitude conditions can be quickly explored, compared and analysed.
- Association with TFs and miRNAs
Transcription factors and miRNAs are two of the most important transcriptional and post-transcriptional regulatory molecules that fine-tune the expression of genes. Thus, the lists of TFs and miRNAs known to regulate the protein being studied are presented in a tabular format. The TF association table lists the TF symbol (hyperlinked to the GeneCards database), its Entrez ID, the symbol and Entrez ID of the protein being studied, the type of association, a link to the publication that ascertained the association and the database from which the association was extracted. The table is downloadable in Excel/CSV format and has a ‘search’ option to explore it with a user-defined keyword (Figure 5).
Similarly, the miRNA-gene association table lists the miRTarBase ID, the miRNA, the symbol and Entrez ID of the protein being studied, the experiment (luciferase reporter assay/western blot/PCR/immunohistochemistry, etc.), the support type and a link to the publication (hyperlinked to PubMed) that ascertained the association. The table may be downloaded in Excel/CSV format and also has a ‘search’ option to explore it with a user-defined keyword.
- Gene Ontology and KEGG Pathway annotations
The Gene Ontology annotations are presented in a tabular format that lists the GO ID, GO Term and GO type. The GO ID is hyperlinked to QuickGO, which provides detailed GO annotations. The KEGG pathway annotations are also compiled and presented as KEGG ID and KEGG Term. The KEGG ID is hyperlinked to the KEGG database, which provides additional details about the respective pathways (Figure 5).
Both tables can be downloaded in Excel/CSV format and have a built-in ‘search’ option for keyword search.
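The keyword search and CSV download offered for these tables can be approximated with Python's standard `csv` module. This is a minimal sketch, not the server's code: the function names, the case-insensitive substring matching and the three example rows are assumptions chosen for illustration.

```python
import csv
import io
from typing import Dict, List

Row = Dict[str, str]


def search_rows(rows: List[Row], keyword: str) -> List[Row]:
    """Keep rows in which any cell contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [row for row in rows if any(needle in str(v).lower() for v in row.values())]


def to_csv(rows: List[Row], fieldnames: List[str]) -> str:
    """Serialise rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Small illustrative annotation table.
columns = ["GO ID", "GO Term", "GO Type"]
annotations = [
    {"GO ID": "GO:0001666", "GO Term": "response to hypoxia", "GO Type": "biological_process"},
    {"GO ID": "GO:0005515", "GO Term": "protein binding", "GO Type": "molecular_function"},
    {"GO ID": "GO:0005634", "GO Term": "nucleus", "GO Type": "cellular_component"},
]

hits = search_rows(annotations, "Hypoxia")
csv_text = to_csv(hits, columns)
print(csv_text)
```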
- Association of proteins with other diseases and drugs
This section provides details of the drug, tissue and disease associations of HA-associated genes/proteins. The information is presented in two tables, one for each category (Figure 5). The first table shows information about the protein and its associated drugs. This type of information can help users design protein-based drug-targeting experiments. Both tables are equipped with a “search” option, which helps in searching for user-defined terms across lengthy tables. The tables can also be downloaded in Excel/CSV format.
Web Statistics
HighAltitudeOmicsDB contains ~1300 associations of 820 proteins that have been found to be differentially expressed at high altitude. A detailed review of the database shows that all proteins were sourced from experimental studies in 25 tissues (Figure 6a). These tissues are sourced from 7 animal species, i.e. human, sheep, rat, mouse, yak, bird and toad (Figure 6b). Human samples can be further characterized by ethnicity, i.e. Americans, Tibetans, Han-Chinese, Italians, Nepali, Ladakhi, and Germans. The time of exposure depends on the source organism and ranges from 0.5 hours to 110 days for the native population.
The database contains two types of functional annotations: GO and KEGG pathway enrichment. The GO enrichment shows ‘Metabolic Process’ (GO:0042572), ‘Outer Dynein Arm Assembly’ (GO:0036158) and ‘Response To Reactive Oxygen Species’ (GO:0000302) as the top biological processes (Figure 7a). ‘Metabolic Process’ is strongly associated with the weight loss that accompanies adaptation to high altitude ([16]). At high altitude, hypobaric hypoxia activates the HIF protein, which in turn regulates genes that mediate changes in cellular metabolism/energetics, leading to weight loss through increased energy expenditure ([17]). The second biological process, ‘Outer Dynein Arm Assembly’, is involved in axonemal assembly. The increase in the length and density of axoneme-like cilia due to hypoxia has been associated with cell death ([18]). Lastly, ‘Response To Reactive Oxygen Species’ reflects the redox status of the cell, and disturbances in redox status due to hypobaric hypoxia can lead to oxidative stress and DNA damage ([3]). Similarly, terms such as ‘Fructose-Bisphosphate Aldolase Activity’, ‘Oxidoreductase Activity, Acting On Paired Donors, With Incorporation Or Reduction Of Molecular Oxygen’, ‘Oxidoreductase Activity, Acting On Peroxide As Acceptor’, ‘Electron Transfer Activity’ and ‘ATP Binding’ are found to be the top molecular functions of the proteins present in the database (Figure 7b). All these molecular functions are direct steps or feedback mechanisms associated with oxidative phosphorylation (aerobic respiration). Mitochondria play an important role in oxidative phosphorylation, and recent clinical studies have revealed that a high percentage of mitochondria is present in the gastrocnemius muscle tissue of highlanders, which helps them adapt to a high-energy-expenditure environment ([19]).
‘COP9 signalosome’ and ‘Actomyosin’ are the two cellular component terms found to be most enriched in the differentially expressed protein sets present in the database (Figure 7c). The COP9 signalosome is part of the ubiquitin-proteasomal degradation complex that controls the expression of pVHL, HIF-1α and other oxygen-responsive transcription factors regulated during hypobaric hypoxia ([20]). Actomyosin, in contrast, is the cytoskeletal actin-myosin fiber complex present in different muscle tissues such as skeletal muscle. The muscle fiber-type composition of both adult animals and humans is markedly altered during chronic exposure to high altitude.
The KEGG pathway enrichment shows ‘hsa00910: Nitrogen metabolism’ as the most enriched pathway in the differentially expressed HA protein set (Figure 7d). Nitrogen metabolism involves the production of nitrogen oxides, and these oxides (such as nitrous, nitrite and nitrate) have been found to play an important role in high-altitude acclimatization responses ([21]). Thus, the proteins in the database are associated with hallmark responses to hypobaric hypoxic stress, which supports the comprehensiveness of the database.