ImmPort SGA dataset and data processing
We accessed the NIAID's ImmPort platform127 to obtain the relevant dataset, SDY18718, focusing on miRNA expression in Small-for-Gestational-Age (SGA) cases. The original study8 involved whole blood collection from N = 29 women, including N = 16 with normal birth outcomes and N = 13 with SGA births. Samples were collected at three gestation time points: (time-point A), (time-point B), and (time-point C) weeks. Plasma purification was followed by Nanostring nCounter miRNA assays covering 800 miRNAs. Normalized miRNA count data, available on ImmPort, underwent analysis using the R package DESeq2 (ver.1.36.0)128 to identify differentially expressed miRNAs in SGA vs. Controls at each time point and across all samples (time-independent). Principal Component Analysis (PCA) plots, based on normalized miRNA counts, and heatmaps, using log2(fold-change) values from DESeq2 analysis, were generated with ggplot2 (ver. 3.4.3) and pheatmap (ver. 1.0.12) R packages, respectively.
GeneLab simulated spaceflight datasets and data processing
We accessed datasets OSD-55 and OSD-336 from NASA's Open Science Data Repository (OSDR) and GeneLab for our analysis. OSD-336 involved a simulated space environment experiment on mice, with miRNA-seq data available on GeneLab. The methods and data analysis for this experiment are briefly outlined below. OSD-5532 was a previous in vitro experiment simulating modeled microgravity (MMG) on human peripheral blood lymphocytes (PBLs) using a Rotating Wall Vessel (RWV) bioreactor. In the original publication33, PBLs from 12 healthy donors were split for MMG and 1 g control conditions. At 24 hours, RNA was isolated from 10^7 PBLs from each replicate, and miRNA profiling was performed using the Whole Human Genome Oligo Microarray (Agilent). The miRNA microarray data were deposited in the Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/) under accession number GSE57400. Data processing involved "Analyze with GEO2R" on GEO, applying log transformation and determining significant miRNAs using Benjamini and Hochberg false discovery rate (FDR) adjusted p-values with a cutoff of < 0.05. Principal Component Analysis (PCA) plots and heatmaps were generated using the ggplot2 (ver. 3.4.3) and pheatmap (ver. 1.0.12) R packages, respectively.
Murine simulated space environment experiments
We utilized murine data from our prior experiments reported in Malkani et al.27 and Paul et al.29. C57BL/6J wildtype female mice (15 weeks +/- 3 days old) were obtained from Jackson Laboratories and housed at Brookhaven National Laboratory (BNL, Upton, NY). After quarantine and acclimation to a 12:12 hour light:dark cycle with controlled temperature/humidity for a week, mice were cage acclimated (n=10 mice per group; 2 mice per cage) 3 days before hindlimb unloading (HU). Food and water were given ad libitum, and standard bedding was changed once per week. The normally loaded (NL) mice, used in parallel experiments not reported here, underwent the same acclimation. HU was conducted for 14 days, with irradiation on day 13 (0.5 Gy GCR, 1 Gy SPE, 5 Gy Gamma, and 0 Gy Sham control). SPE simulated irradiation consisted of protons of different energies ranging from 50 MeV to 150 MeV. On day 13, mice were transported to the NASA Space Radiation Laboratory (NSRL). They were individually placed in HU boxes and exposed to GCR or Sham control (no irradiation) in the plateau region of the Bragg curve at room temperature. Dosimetry was performed by NSRL physics staff. The radiation dose simulated the exposure an astronaut might receive during a Mars mission, modeled as a single 25-minute exposure rather than the actual chronic exposures over 1.5 years. A 60 cm x 60 cm beam was utilized at the NSRL for irradiation. Sham controls underwent the same procedures without irradiation. Blood was collected 24 hours post-irradiation and post-euthanasia. Plasma was separated, flash-frozen, and stored at -80°C. A cellular fraction aliquot was stored for RNA analyses, while the remaining fraction underwent flow cytometric preparation. All organs (i.e., heart, liver, and soleus muscle) were flash frozen at dissection and stored at -80°C. Body weight tracking was performed on days -3, 0, 7, and 14.
All experiments were approved by Brookhaven National Laboratory’s (BNL) Institutional Animal Care and Use Committee (IACUC) (protocol number: 506) and all experiments were performed by trained personnel in AAALAC accredited animal facilities at BNL, while conforming to the U.S. National Institutes of Health Guide for the Care and Use of Laboratory Animals. All methods were carried out in accordance with the relevant guidelines and regulations and are reported in accordance with ARRIVE guidelines.
Simulated Galactic Cosmic Radiation (GCR) Exposure
The animals were irradiated at the NASA Space Radiation Laboratory (NSRL) at Brookhaven National Laboratory (BNL). Positioned in the plateau region of the Bragg curve, mice received irradiation at room temperature. NSRL physics staff conducted dosimetry using a 60 cm x 60 cm beam. All mice were exposed to 0.5 Gy of simplified simulated Galactic Cosmic Radiation (GCR). The irradiation utilized ions, energies, and doses determined by a NASA consensus formula for five ions (six beams, since protons were delivered at two energies): protons at 1000 MeV, 28Si at 600 MeV/n, 4He at 250 MeV/n, 16O at 350 MeV/n, 56Fe at 600 MeV/n, and protons at 250 MeV, in the following proportions: 1000 MeV protons at 34.8%, 250 MeV protons at 39.3%, 28Si at 1.1%, 4He at 18%, 16O at 5.8%, and 56Fe at 1%. This simplified mixture mirrors the ion proportions in space, making it relevant to exploratory class missions31. While low Linear Energy Transfer (LET) particles (protons and helium) dominate, high LET ions generally have a greater relative biological effect (RBE).
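As a worked illustration of the beam mixture above, the sketch below splits the 0.5 Gy total across the six beams. It assumes the listed percentages are fractions of the total dose (the text gives them only as "proportions"); the dictionary keys and the function name `beam_doses` are illustrative labels, not NSRL nomenclature.

```python
# Percent contribution of each beam, as listed in the text.
# ASSUMPTION: these percentages are fractions of the total absorbed dose.
GCR_BEAM_PERCENT = {
    "1H 1000 MeV": 34.8,
    "1H 250 MeV": 39.3,
    "28Si 600 MeV/n": 1.1,
    "4He 250 MeV/n": 18.0,
    "16O 350 MeV/n": 5.8,
    "56Fe 600 MeV/n": 1.0,
}


def beam_doses(total_dose_gy, percents=GCR_BEAM_PERCENT):
    """Return the dose (Gy) attributed to each beam for a given total dose."""
    total_percent = sum(percents.values())
    # Guard against a mistyped table: the proportions must add up to 100%.
    if abs(total_percent - 100.0) > 1e-6:
        raise ValueError(f"proportions sum to {total_percent}, expected 100")
    return {beam: total_dose_gy * p / 100.0 for beam, p in percents.items()}


doses = beam_doses(0.5)
for beam, dose in doses.items():
    print(f"{beam:>16}: {dose:.4f} Gy")
```

Under this assumption, the two proton beams together account for 74.1% of the dose, consistent with the statement that low-LET particles dominate.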
miRNA extraction from murine tissues
miRNA extractions from plasma were carried out using the Qiagen miRNeasy serum/plasma kit (Cat# 217184). Quantitation of miRNA samples was done using a NanoDrop 2000 Spectrophotometer (ThermoFisher Scientific).
miRNA sequencing on murine samples
For miRNA library construction and sequencing, plasma-derived miRNAs from the aforementioned mouse experiments were isolated using the Qiagen miRNeasy kit (#217004). Total RNA quality and quantity were assessed with a Bioanalyzer 2100 (Agilent, CA, USA), ensuring a RIN > 7. The TruSeq Small RNA Sample Prep Kit (Illumina, San Diego, USA) protocol was followed, utilizing approximately 1 μg of total RNA to prepare each small RNA library. Single-end 50 bp sequencing was conducted on an Illumina HiSeq 2500 at LC Sciences (Hangzhou, China), adhering to the vendor's recommended protocol. Raw miRNA-sequence data are available on the NASA Open Science Data Repository with the following identifiers: heart-related data: OSD-334, DOI: 10.26030/cg2g-as49; liver-related data: OSD-335, DOI: 10.26030/72ke-1k67; plasma-related data: OSD-336, DOI: 10.26030/qasa-rr29; and soleus muscle-related data: OSD-337, DOI: 10.26030/m73g-2477.
Analysis of miRNA sequencing from murine samples
Raw reads underwent preprocessing with ACGT101-miR software (LC Sciences, Houston, Texas, USA), eliminating adapter dimers, junk, low complexity, and common RNA families (rRNA, tRNA, snRNA, snoRNA), as well as repeats. Unique sequences of 18-26 nucleotides were aligned to miRBase 22.0 by BLAST search for identification of known miRNAs and novel 3p- and 5p-derived miRNAs. Allowances for length variations at both ends and one mismatch inside the sequence were applied. Sequences mapping to species-specific mature miRNAs were recognized as known miRNAs, while those on the opposite arm of annotated miRNA-containing arms were deemed novel 5p- or 3p-derived candidates. Unmapped sequences were subjected to BLAST against specific genomes, and hairpin RNA structures were predicted using RNAfold software based on predefined criteria. Known miRNAs were identified using the same criteria. Differential expression analysis of miRNAs, utilizing normalized deep-sequencing counts, employed Fisher's exact test, Chi-squared 2x2 test, Chi-squared nxn test, Student's t-test, or ANOVA, with significance thresholds set at 0.01 and 0.05 for each test.
miRNA pathway analysis
To determine which Hallmark34 and MitoPathway39 pathways were regulated by the miRNAs, we performed miRNA gene set analysis with the RBiomirGS129 v0.2.12 R package on the processed miRNA results for all conditions in the plasma, PBL, and SGA data. Pathways with FDR < 0.25 were considered significantly regulated. We plotted the specific pathways as lollipop plots with the R package ggplot2 (ver. 3.4.3).
Determining common miRNAs between SGA and spaceflight data
To identify common miRNAs between the SGA and spaceflight datasets, we overlaid significantly regulated miRNAs (adj. p-value < 0.05) from the SGA dataset (SDY1871) with simulated space environment data (OSD-55 and OSD-336). We narrowed down the list to include only miRNAs that exhibited consistent regulation in the same direction between OSD-336 and SDY1871. This refinement identified 13 common miRNAs, forming the miRNA signature associated with both SGA and spaceflight.
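The overlap-and-direction filter described above can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes miRNA names have already been harmonized across species, that each dataset is available as a mapping from miRNA name to (log2 fold change, adjusted p-value), and that the significance cutoff is applied to both datasets (the text states it explicitly only for the SGA data). All names and values in the toy input are hypothetical.

```python
def common_signature(sga, space, alpha=0.05):
    """Return miRNAs significant in both tables and regulated in the same direction.

    sga, space: dict mapping miRNA name -> (log2_fold_change, adjusted_p_value).
    ASSUMPTION: alpha is applied to both datasets.
    """
    shared = []
    for mirna, (lfc_sga, padj_sga) in sga.items():
        if mirna not in space:
            continue  # not measured or not reported in the other dataset
        lfc_space, padj_space = space[mirna]
        significant = padj_sga < alpha and padj_space < alpha
        same_direction = lfc_sga * lfc_space > 0  # both up or both down
        if significant and same_direction:
            shared.append(mirna)
    return sorted(shared)


# Hypothetical toy tables (not values from the study).
sga_table = {
    "miR-A": (1.2, 0.01),    # up in SGA
    "miR-B": (-0.8, 0.03),   # down in SGA
    "miR-C": (0.9, 0.20),    # not significant in SGA
    "miR-D": (1.5, 0.001),   # up in SGA
}
space_table = {
    "miR-A": (0.7, 0.02),    # up, significant -> kept
    "miR-B": (0.6, 0.01),    # opposite direction -> dropped
    "miR-C": (1.1, 0.01),
    "miR-E": (2.0, 0.001),   # absent from SGA table
}

signature = common_signature(sga_table, space_table)
print(signature)
```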
Conserved miRNA analysis between humans and mice
Selected precursor and mature miRNA sequences from human and mouse were extracted from miRBase v.22.1130. After BLASTN alignment, conservation between mouse and human sequences was determined as the percentage of aligned nucleotide identities in mature and pre-miRNA sequences.
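To make the conservation measure concrete, the sketch below computes percent identity from a pair of already-aligned sequences. It assumes the alignment is supplied as two equal-length strings with `-` marking gaps and that identity is reported over the full alignment length, as BLASTN does; the function name and the toy sequences are hypothetical and are not miRBase entries.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same base.

    ASSUMPTION: the denominator is the alignment length, gap columns included.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != gap  # a gap never counts as an identity
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment: one gap column and one mismatch in nine columns.
identity = percent_identity("ACGU-ACGU", "ACGUAACGA")
print(f"{identity:.1f}% identity")
```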
Analysis of pathways, diseases, and gene targets for miRNAs and network generation
To predict the functions and diseases associated with the 13 common miRNAs, we employed miRNet131 and visualized the results using ggplot2 (ver. 3.4.3) in R. For identifying gene targets of each miRNA, we utilized six different miRNA-gene target databases: miRmap132, miRWalk133, miRNet, miRDB134, miRTarBase135, and mirDIP136. To ensure robust predictions, we considered only the gene targets shared by three or more databases (Table S1). Further refinement narrowed down the essential gene targets for the miRNA signature associated with SGA and spaceflight to those common across 10 or more of the 13 miRNAs, resulting in 45 genes. UpSet plots were made with the ComplexUpset (ver. 1.3.3) R package. Using Cytoscape's137 ClueGO/CluePedia plugin138, we created a network map illustrating the connectivity between the genes and miRNAs. To explore the global pathways regulated by the 45 genes, we employed CluePedia, presenting the key pathways as a connected network. Additionally, a more detailed pathway interaction analysis was conducted using the GeneMANIA plugin139 in Cytoscape. GeneMANIA settings included 0 related genes and 50 attributes with automated weighting, revealing the top 50 pathways associated with the 45 genes.
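The two-stage target filter described above (targets supported by three or more databases per miRNA, then genes targeted by 10 or more of the 13 miRNAs) can be sketched with simple vote counting. This is an illustrative sketch only: it assumes each database's predictions have been exported as sets of gene symbols, and the function names, database labels, and gene names in the toy input are hypothetical.

```python
from collections import Counter


def consensus_targets(db_predictions, min_databases=3):
    """Genes predicted for ONE miRNA by at least `min_databases` databases.

    db_predictions: dict mapping database name -> set of gene symbols.
    """
    votes = Counter(gene for genes in db_predictions.values() for gene in set(genes))
    return {gene for gene, n in votes.items() if n >= min_databases}


def shared_targets(per_mirna_targets, min_mirnas=10):
    """Genes that are consensus targets of at least `min_mirnas` miRNAs.

    per_mirna_targets: dict mapping miRNA name -> set of consensus target genes.
    """
    votes = Counter(gene for genes in per_mirna_targets.values() for gene in set(genes))
    return {gene for gene, n in votes.items() if n >= min_mirnas}


# Hypothetical toy input for a single miRNA across four databases.
toy_dbs = {
    "db1": {"G1", "G2"},
    "db2": {"G1", "G2"},
    "db3": {"G1"},
    "db4": {"G3"},
}
robust = consensus_targets(toy_dbs, min_databases=3)

# Hypothetical consensus targets for three miRNAs; the study used 10 of 13.
toy_mirnas = {"m1": {"G1", "G2"}, "m2": {"G1"}, "m3": {"G1", "G2"}}
core = shared_targets(toy_mirnas, min_mirnas=3)
print(robust, core)
```

Counting votes per gene, rather than intersecting sets, is what allows an "at least k of n" rule; a plain intersection would require agreement across all databases or all miRNAs.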
Meta-analysis of GeneLab RNA-seq data from 817 samples across 27 datasets encompassing 10 different mouse tissues
FASTQ files were programmatically downloaded from GeneLab (https://genelab.nasa.gov/) and processed using the MTD pipeline140. A meta-analysis of the GeneLab data was performed in two steps. First, we performed batch correction using ComBat-seq141 to standardize data across different datasets. Subsequently, DESeq2128 was used to perform differential expression analysis between the spaceflight and ground control mice while adjusting for age, sex, tissue, sacrifice site, and mission duration. We then visually examined the miRNA targets (top 45 or all) across datasets. The following GeneLab datasets were used: OSD-98, 99, 100, 101, 102, 103, 104, 105, 137, 161, 162, 163, 168, 194, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 253, and 254.
Inspiration4 (I4) astronaut sample collection
Detailed methods for sample collection and data processing are described in Overbey et al.71 and Kim et al.142. Blood samples were collected before (Pre-launch: L-92, L-44, and L-3) and after (Return: R+1, R+45, and R+82) the spaceflight. Chromium Next GEM Single Cell 5′ v2 (10x Genomics) was used to generate single cell data from isolated PBMCs. Subpopulations were annotated based on the Azimuth human PBMC reference.
The top 45 or all miRNA target genes identified in this study were used to identify DEGs in I4 PBMCs and subpopulations. The Seurat method was used to normalize RNA count data and calculate the average expression of each gene. We compared the average expression of the genes from four astronauts post-flight (R+1) vs pre-flight (L-92, L-44, and L-3) to identify DEGs with the Wilcoxon signed-rank test (adjusted p-value < 0.05). Average expression of the miRNA target genes (top 45 or all) was used to generate heatmaps plotted with the pheatmap R package.
The procedure followed guidelines set by the Health Insurance Portability and Accountability Act (HIPAA) and operated under Institutional Review Board (IRB) approved protocols, and informed consent was obtained. Experiments were conducted in accordance with local regulations and with the approval of the IRB at Weill Cornell Medicine (IRB #21-05023569).
Cumulative plots for the miRNA targets, and statistical test
Cumulative plots for miRNA gene targets were constructed by retrieving information on 8mer sites matching the seed region of miRNAs from the TargetScan v8.0 database143. mRNAs were categorized by seed match (8mer or 'no site'), and the cumulative plot was generated from mRNA log2 fold change values with the ecdfplot function from seaborn144. To assess the differences in the distribution of mRNA fold change values between each seed match category and the 'no site' category, the Kolmogorov–Smirnov test was performed using the kstest function from scipy in Python145.
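For readers unfamiliar with the quantities behind these plots, the sketch below computes an empirical cumulative distribution function and the two-sample Kolmogorov–Smirnov statistic using only the Python standard library. It is a teaching sketch rather than the seaborn or scipy implementation: it returns only the statistic D (the largest vertical gap between the two curves) and no p-value, and the log2 fold change values are hypothetical.

```python
from bisect import bisect_right


def ecdf(values):
    """Return a function x -> fraction of `values` that are <= x."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one value")

    def cdf(x):
        return bisect_right(ordered, x) / n

    return cdf


def ks_statistic(sample_a, sample_b):
    """Largest absolute difference between the two empirical CDFs."""
    cdf_a, cdf_b = ecdf(sample_a), ecdf(sample_b)
    # The step functions only change at observed values, so it is enough
    # to compare them at every observed point.
    points = set(sample_a) | set(sample_b)
    return max(abs(cdf_a(x) - cdf_b(x)) for x in points)


# Hypothetical log2 fold changes for mRNAs with an 8mer site vs. no site.
lfc_8mer = [-1.0, -0.5, 0.0, 0.5]
lfc_no_site = [-0.25, 0.25, 0.75, 1.0]

d_stat = ks_statistic(lfc_8mer, lfc_no_site)
print(f"KS statistic D = {d_stat}")
```

A leftward shift of the 8mer curve relative to the 'no site' curve, that is, lower fold changes among seed-matched mRNAs, is the pattern expected when a miRNA represses its targets.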
Machine learning analysis for small molecule drug predictions for targeting miRNAs
We used the sChemNET machine learning framework82 to predict small molecules that might affect miRNAs or their gene targets. sChemNET is a deep learning approach that takes as input the chemical structure of a small molecule in the form of a binary fingerprint and outputs a score indicating how likely that small molecule is to affect a given miRNA. We trained sChemNET on Homo sapiens small molecule-miRNA data from the SM2miR database, from which we used 1,102 associations between 131 small molecules and 126 miRNA targets. Unlabeled small molecules were obtained from the Drug Repurposing Hub (see details in ref. 82).
We also performed an enrichment analysis of the modes of action and indications of the drugs predicted by sChemNET. To this end, we trained sChemNET for each miRNA using all the available labeled small molecules and a randomly selected set of 2,400 unlabeled small molecules. Then, sChemNET was used to rank the remaining set of unlabeled small molecules based on the average prediction score of 20 random independent repetitions. Small molecules with scores at or above the 98th percentile were then kept as predictions for the miRNA. We then retrieved the mode of action and indication information of small molecules from the Drug Repurposing Hub database. The enrichment score was based on p-values calculated using Fisher's Exact Test and adjusted with the Benjamini-Hochberg correction for multiple testing.
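The enrichment statistics named above can be illustrated with a short standard-library sketch. It assumes a one-sided (over-representation) Fisher's exact test, which is equivalent to the upper tail of the hypergeometric distribution; the sidedness, the function names, and the toy numbers are assumptions for illustration and are not taken from the sChemNET analysis.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact p-value for over-representation.

    N: ranked small molecules in total; K: those carrying the annotation
    (e.g. one mode of action); n: molecules kept as predictions;
    k: kept molecules carrying the annotation.
    ASSUMPTION: one-sided test (hypergeometric upper tail).
    """
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: 3 of 3 kept molecules share an annotation
# carried by 3 of 10 ranked molecules.
p_single = fisher_enrichment_p(k=3, n=3, K=3, N=10)

# Hypothetical raw p-values for four annotations.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(p_single, adjusted)
```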