Data collection
Single-cell transcriptomic data were retrieved from Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo/) database. RB samples diagnosed for 4 months or 2 years and Normal samples were obtained from GSE159977 dataset. Besides, bulk transcriptomic data were retrieved from GEO databases (GSE97508, GSE110811, GSE24673, GSE208143). GSE97508, GSE208143 and GSE24673 datasets were merged as train dataset. Batch effects from non-biological technical biases were corrected using the “ComBat” algorithm of “sva” package. GSE110811 dataset was used for validation.
ScRNA-Seq Analysis
The analysis of the single-cell dataset was conducted using Scanpy (version 1.9.1) within Python (version 3.8.13). To ensure data quality, rigorous quality control procedures were applied to exclude the cells with undesirable attributes while the retaining cells meeting predefined criteria. These criteria encompassed gene counts ranging from 0.02 to 0.98, and mitochondrial gene counts below 20%. Additionally, Scrublet (version 1.0) was employed to identify and eliminate doublet cells. Principal Component Analysis (PCA) was executed via Scanpy to reduce the dimensionality of the cell data. To mitigate batch effects, the "harmony_integrate" function in Scanpy was employed. Clustering and visualization were performed using Leiden and UMAP algorithms. Finally, cell clusters were annotated with known cell types based on cell type-specific markers.
Disulfidptosis‑related genes (DRGs) scoring
The "AUCell" R package, designed to assess the active status of gene sets in single-cell RNA data, was employed to compute activity scores for DRGs in each individual cell. Following this, cells were categorized into two distinct groups based on their DRG-AUC (Disulfide-Related Genes Area Under the Curve) values: those with high DRG-AUC and those with low DRG-AUC, with the median value being utilized as the threshold for classification.
Chromosome copy number variation analysis
The approach described in inferCNV and implemented as infercnvpy (https://github.com/icbi-lab/infercnvpy) was employed to detect potential changes in chromosomal copy numbers. Control cells were selected from healthy tissues, including cone cells and immune cells. The differentiation between normal cone cells and malignant cells was accomplished through assessment of iterative clustering impact and computation of the copy number variation score.
Drug Response Prediction
We utilized the scDrug Python package, built upon the CaDRReS-Sc model, to predict IC50 values and tumor cell killing ratios associated with potential drug responses. This framework incorporates pre-trained GDSC and PRISM models, which have undergone rigorous training using gene expression data to establish robust pharmacogenomic relationships, ensuring precise drug response predictions. Effective data visualization using the seaborn package allowed us to focus on IC50 values and tumor cell killing ratios across different subpopulations. Our drug selection criteria were based on minimizing IC50 values and maximizing tumor cell killing ratios.
RNA velocity analysis
Velocyto allows for the estimation of single-cell RNA velocity by distinguishing between unspliced and spliced reads in bam files. The process involves generating Loom files using velocyto.py and subsequently importing this data into scVelo for further analysis. The 'scv.pp.moments' function is utilized to calculate the first and second order velocity vector moments, while the 'scv.t.velocity' functions are employed to compute dynamic velocity. The RNA velocities are then embedded into UMAP using the velocity graph.
Developmental trajectory inference
Monocle2 ((version 2.14.0) was employed for trajectory analysis to uncover alterations in cell states. The "estimateSizeFactors" and "estimateDispersions" functions in Monocle2 were utilized to identify the "dispersion" genes, which were subsequently employed for cell ordering. Dimensionality reduction was conducted through the 'reduceDimension' function, with the 'DDRTree' reduction method being applied.
Consensus cluster analysis
The Disulfidptosis-related genes were uniformly clustered using the ConsensusClusterPlus R package. This clustering analysis utilized the "pam" algorithm with "euclidean" distance measurement. Subsequently, 1000 bootstraps were executed, with each bootstrap comprising 80% of the patients from the training dataset. The range of cluster numbers, denoted as "k," was set from 2 to 9. The optimal "k" was determined based on both the cumulative distribution function (CDF) and the area under the curve (AUC) analysis. Additionally, Principal Component Analysis (PCA) was conducted to assess the validity of the molecular subtype distributions.
Weighted correlation network analysis
We initiated the process by constructing a clustering tree for the samples to detect and eliminate any outliers. Subsequently, the gene expression data from the training dataset underwent analysis using Weighted Gene Co-expression Network Analysis (WGCNA). The adjacency matrix was transformed into the Topological Overlapping Matrix (TOM). A threshold of R² = 0.85 was established as the criterion for module construction. Dynamic Tree Cut was employed with parameters set to deepSplit = 2 and minModuleSize = 100 to generate these modules. Key modules were selected based on the correlation between their members and the significance of the included genes.
Selection of characteristic genes
Predictive models were created using common genes identified from both scRNA and WGCNA. These models, which included Random Forest (RF), Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), and General Linear Model (GLM), were implemented within the R software. Model performance assessment included the generation of reverse cumulative distribution curves for residuals and Receiver Operating Characteristic (ROC) curves. Model selection was based on scores derived from reverse cumulative residuals and Area Under the Curve (AUC) values obtained from the ROC curves.
Cell culture
Human retinal pigment epithelial cells (ARPE-19) were obtained from ATCC (Manassas, VA, USA). RB cell lines WERI-Rb1 and Y79 were purchased from Procell Life Science & Technology Co., Ltd (Wuhan, China) and National Collection of Authenticated Cell Cultures (Shanghai, China), respectively. All cells were cultured in RPMI-1640 medium (Gibco, Grand Island, NY, USA) at 37°C and 5% CO2 atmosphere. The medium was supplemented with 1% L-glutamine (Sigma-Aldrich, St. Louis, MO, USA), 1% penicillin/streptomycin (Sangon Biotech, Shanghai, China) and 10% fetal bovine serum (FBS; Gibco).
Quantitative real-time PCR (qRT-PCR)
RNA extraction was carried out applying TRIzol regent (Thermo Fisher Scientific, San Jose, CA, USA) to isolate RNA from cells. PrimeScript™ RT reagent Kit (Takara, Dalian, China) was applied to synthesize cDNA with RNA as template. Genomic DNA was eliminated with gDNA Eraser. PCR reacted was conducted to examine CDT1 expression applying TB Green® Premix Ex Taq™ (Takara). Primer sequences (5’-3’) were shown as follow: CDT1: forward-GACATGATGCGTAGGCGTTTT and reverse-GAGCTGGTAATCTGACCTCCT-3; GAPDH: forward-CCACCCATGGCAAATTCCATGGCA and reverse-TCTAGACGGCAGTCAGGTCCACC. The relative expression of CDT1 was analyzed by 2−ΔΔCT method.
Colony formation assay
Cells at logarithmic phase was seeded into a 6-well plate with density of 1 × 103 cells/plate. After 2 weeks of incubation at 37°C and 5% CO2, cells were fixed with 4% paraformaldehyde, and then stained with 0.5% crystal violet. Finally, the number of colony was counted.
Transwell migration and invasion assays
Transwell assay was conducted to examine migration and invasion of cells applying the 8 µm migration Transwell chambers (Corning, NY, USA). For cell migration assay, cells were added to the upper chambers at a concentration of 5× 104 cells/200 µL of FBS-free medium, whereas the lower chambers contained 700 µL of medium and 10% FBS. For invasion assay, cells were seeded into the upper chambers coating with 25 µL Matrigel (Sigma-Aldrich, St. Louis, MO, USA). After 24 h of culture, the migrated and invaded cells were stained with 0.5% crystal violet reagent, which were then observed under an optical microscope.
Statistical analysis
Each assay was performed for 3 times. Data were analyzed by SPSS 22.0 statistical software (IBM, Armonk, NY, USA) and expressed as mean ± standard deviation. Two-tailed Student’s t test and one-way ANOVA were used to analyze the statistical difference. P < 0.05 was considered as a significant difference.