Cell lines and Culture
Fibroblasts were isolated by taking skin biopsies from the nether region from subjects without known psychiatric disorders. Fibroblast cultures were established following standard procedures [63] and stored as frozen aliquots in liquid nitrogen. 6 fibroblast cell lines matched for sex, age and passage number were thawed out and grown to confluence in T75 culture flasks in standard culture media (DMEM containing 10% fetal bovine serum (FBS) and 1x Penicillin-Streptomycin).
Upon reaching confluence, 5x10^4 cells were plated per line into 13 different 6 well plates (1 well per line per plate). All 6 lines were collected in the same experiment for the RNA-seq experiment. Due to the labor-intensive nature of the ATAC protocol and the need to process cells fresh, the 6 lines were split into 2 batches, so 3 lines per batch were processed.
Assessment of Circadian Expression in vitro
In order to collect RNA or cells every 4 hours for 48 hours, cells were split into two batches, which were reset 12 hours apart (see supplementary Fig. 9). Cells were reset 12 hours before the first collection to exclude the acute effects of dexamethasone and variation in synchronization conditions [59]. 5 days after being plated the cells from batch one were synchronized by treatment with 100 nM Dexamethasone for 30 min. Cells were then washed with PBS and switched to collection media (DMEM containing 5% FBS and 1x Penicillin-Streptomycin). Lower concentration of FBS was used in this media to stop the cells from growing during the experiment, in order to keep all time points at approximately the same culture density. 12 hours later cells from batch 2 were synchronized and switched to collection media and the RNA/cell collection was started (from batch one).
RNA and Cell collection
For the collection of RNA, cells were lysed using 350uL RLT lysis buffer from the Qiagen RNeasy mini kit. Lysed cells were then scraped off the plate, transferred to a Qiaschredder (Qiagen 79656) and centrifuged for 2 min at max speed to further homogenize. Cell lysates were kept in -80 until extraction.
For the collection of cells for the ATAC protocol, cells were dissociated using 500uL of prewarmed TrypLE (ThermoFisher 12604013) and left for 5 min at 37℃. TrypLE was inactivated using 500uL of DMEM. Cells were then counted using the Logos Biosystems LUNA-FL automated cell counter, and 50x10^4 cells were used as input for tagmentation. Tagmented DNA for library preparation was collected following the previously described protocol [64].
RNA extraction
RNA from cell lysates was extracted using the Qiagen RNeasy mini kit (Qiagen 74106). Cell lysates were extracted in a randomized order to prevent batch effects in downstream analysis. In order to collect total RNA including small RNAs, the standard extraction protocol (Purification of Total RNA from Animal Cells using Spin Technology) was adjusted by making the following changes: (i) adding 1.5 volumes of 100% ethanol, instead of 70%, after the lysis step (step 4 in handbook protocol) and (ii) adding 700 mL of buffer RWT (Qiagen 1067933) instead of the provided RW1 (step 6 in handbook protocol).
RNA and ATAC sequencing
For the RNA sequencing, library preps were made using the Lexogen QuantSeq 3’ mRNA-Seq Library Prep Kit and sequenced with 65-base single end reads, and sequenced at a targeted depth of 3.8M reads per sample, which is well above the recommended minimum 1M reads per sample read depth for these types of libraries. ATAC seq libraries were generated following the previously described protocol [64] and sequenced with 75-base double end reads, and sequenced at a targeted depth of 61M reads per sample. Library preparation and sequencing was performed at the UCLA Neuroscience Genomics Core (https://www.semel.ucla.edu/ungc). All samples were sequenced on a Illumina HiSeq 4000 sequencer.
RNA-seq data processing and analysis
Fastqc [65] software was used to assess the quality of the read files. Low quality reads were trimmed using TrimGalore and Cutadapt.
Alignment of reads was performed with the STAR[66] software and to human gene ensembl version GrCh38. STAR was indexed to the genome using the –runMode genomeGenerate function. For aligning, STAR was run with the parameters –outFilterType BySJout –outFilterMultimapNmax 20 –alignSJoverhangMin 8 –alignSJDBoverhangMin 1 –outFilterMismatchNmax 999 --outFilterMismatchNoverLmax 0.1 --alignIntronMin 20 --alignIntronMax 1000000 --alignMatesGapMax 1000000.
Samtools was used to index the aligned files from STAR.
Read counts were associated with genes using featureCounts software with the NCBI GRCh38 gene annotation file.
Analysis of the RNA-seq data used the R packages limma, Glimma and edgeR, as previously described (Law et al., 2018). Genes with low read counts were removed and reads were normalized by CPM. GeneIDs were converted to Gene Symbols using the package Homo.sapiens.
WGCNA [24] software was used to classify genes with similar temporal expression patterns. WGCNA was run using a power value of 12 obtained from diagnostic plots and with the "signed" argument. MetaScape was used for Gene Ontology analysis of the resulting gene sets from WGCNA.
Following the method described in [67], MetaCycle [30] was used to run circadian detection tools such as ARSER, JTK [27], LS and metacycle. In order to integrate results from the different individuals, the function meta3d was used from the Metacycle R package. RAIN [31] was run separately.
According to the EdgeR user guide, cubic splines were generated using the splines package in R, with the ns function and 5 degrees of freedom. Resulting p-values were corrected using a false discovery rate of 0.05. Significant genes were then compared to a previously published dataset of circadian human skin gene expression, resulting in 267 genes that were classified according to their time series using the “dtwclust” R package. The resulting clusters are available in the supplementary material. This analysis was also performed with only the known circadian clock genes that had consistent expression patterns across the cell-lines.
ATAC-seq data processing and analysis
Sequenced open chromatin data from the ATAC-seq assay followed the standard ENCODE Pipeline for the identification of open chromatin regions (OCRs) of the genome. The steps included using fastqc to evaluate the quality of the sequenced library. Followed by trimming of low quality reads with Trimgalore and Cutadapt. Alignment of the raw reads data to human gene Ensembl version GRCh38 was performed using bowtie2 [68] with a 2kb insert size and allowing up to 4 alignments. Reads within black-listed regions alongside PCR duplicates were removed with samtools. MACS2 [69] software was used to identify OCRs with parameters -g hs -q 0.01 –nomodel –shift − 100 –extsize 200 –keep-dup all -B. PCR. Quality control metrics for the ATAC-seq dataset such as peak counts, PCR bottlenecking coefficients, fraction of reads in peaks and enrichment of transcription starting site are provided in the supplementary material.
To compare the ATAC-seq signal across timepoints and subjects, we created a consensus bed file using the bedtools[70] function merge function, combining all the overlapping peak regions across timepoints and subjects into a single file. The Featurecounts software was then used to assign read counts to those regions. Read counts were normalized by RPM.
WGCNA [24] software was used to classify peaks with similar temporal accessibility patterns. WGCNA was run using a power value of 12 obtained from diagnostic plots and with the "signed" argument, identifying 4 modules after merging.
HOMER (Heinz et al. 2010) findMotifsGenome.pllp program was used to identify enriched transcription factor motifs first individually in the peaks that belonged to the largest WGCNA modules, as well as in the resulting set of grouping all the modules that displayed a similar increasing or decreasing pattern of accessibility.
Stratified linkage disequilibrium score regression analysis
Following the procedure described in [71], we applied an extension to sLDSR, a statistical method that partitions SNP-based heritability(h2) from GWAS summary statistics [37]. We ran sLDSR (ldsc.py –h2), using an ancestry-match 1000 Genomes Project phase 3 release reference panel, for each annotation of interest while accounting for the full baseline model, as recommended by the developers ([37]; [72]), and an extra annotation of all the ATAC-seq detected in our in vitro model (n = 3126 for peaks that were decreasing in accessibility, n = 4415 for peaks increasing in accessibility), as well as extension of these regions by 1kb and 10kb genomic windows in both directions.