Genomic and epigenomic insights into the mechanism of cold response in upland cotton (Gossypium hirsutum)

Functional genome research, including gene transcriptional and posttranslational modi�cations of histones, can bene�t greatly from a high-quality genome assembly. Histone modi�cation plays a signi�cant role in modulating the responses to abiotic stress in plants. However, there are limited reports on the involvement of dynamic changes in histone modi�cation in cold stress response in cotton. In this study, the genome of an elite accession, YM11, with considerable cold stress resistance was de novo assembled, which yielded a genome of 2343.06 Mb with a contig N50 of 88.96 Mb, and a total of 73,821 protein-coding gene models were annotated. Comparisons among YM11 and �ve Gossypium allopolyploid cotton assemblies highlighted a large amount of structural variations and presence/absence variations. We analyzed transcriptome and metabolome changes in YM11 seedlings subjected to cold stress. Using the CUT&Tag method, genome-wide H3K3me3 and H3K9ac modi�cation patterns and effect of histone changes on gene expression were pro�led during cold stress. Signi�cant and consistently changing histone modi�cations and the gene expressions were screened, of which transcription factors (TFs) were highlighted. Our results suggest a positive correlation between the changes in H3K4me3, H3K9ac modi�cations and cold stress-responsive gene activation. This genome assembly and comprehensive analysis of genome-wide histone modi�cations and gene expression provide insights into the genomic variation and epigenetic responses to cold stress in upland cotton.


Background
Upland cotton (Gossypium hirsutum, 2n = 4x = 52), which constitutes > 90% of annual ber production worldwide, is the most important ber-producing species (Hu et al. 2019).In China, over 60% of cotton production comes from the northwest region, of which the most important cotton planting region is Xinjiang.Xinjiang produces high-quality cotton annually above 350×10 4 t, accounting for 67% of China's output (Lu et al. 2018a; Lu et al. 2022a).In addition, cotton is an ideal research system for studying polyploidization (Paterson et al. 2012), which has bene ted from genome sequencing and assembly technology.To date, the reference genomes of several upland cotton accessions have been released ( Li et al. 2020).These accessions focused on a genetic standard line (TM-1) and elite varieties (ZM24,  primarily cultivated in the Huang-Huai-Hai Region and Yellow River basin of China.However, genomes corresponding to upland cotton with cold resistance features that are suitable for the Xinjiang climate are still not characterized. Atmospheric abiotic stresses, such as temperature uctuations and cold and heat stress, cause yield cuts and quality losses because of the immobile growth characteristics of crops (Farooq et al. 2018;Drobek et al. 2019).Cold stress can affect the morphological development of plants and lead to growth retardation, leaf yellowing, withering and even death by changing cell morphology, subcellular structure and protoplasm, dysregulation of respiration, enzyme activity and metabolic imbalance (Mahajan et al. 2015;Mehrotra et al. 2020;Guo et al. 2021).To address this unfavorable condition, plants have evolved sophisticated genetic and epigenetic regulatory systems to quickly respond.In cotton, cold stress is a main factor hampering attempts to increase seedling percentage and ber quality.The suitable temperature range for cotton growth and development is 20-30 ℃.Low temperature in the early growing season can cause serious damage to cotton seedlings, and the plants may not fully recover even after the temperature returns to normal (Bange et al. 2004).Moreover, encountering low temperature during reproductive growth will lead to the abortion of cotton pollen and ber development disability, which will seriously affect cotton ber yield and quality production (Thakur, 2008).Cotton in the Xinjiang region is especially susceptible to cold damage during the seedling stage and ber establishment stage due to the short frost-free period and frequent late spring coldness (Rihan et al. 2017).Reducing or avoiding the damage of low temperature to cotton has been one of the urgent problems to be solved (Zafar et al. 2018).
Low temperature can induce genome-wide transcriptional changes (Roelofs et al. 2010).In cotton, the membranes play a direct role in adapting to cold stress by enhancing microsomal delta 12 fatty acid desaturase expression (Kargiotidou et al. 2008).Li et al. (Li et al. 2019) reported transcriptomic datasets for cotton in response to low temperature exposure, laying a foundation for genome-wide differential expression analysis.Furthermore, Shen et al. (Shen et al. 2020) identi ed 7525 differentially expressed genes (DEGs) by sequencing the mRNA of a cold treatment time course of 12 h, 18 h, 30 h, 42 h and 54 h, which were enriched in energy metabolism, such as the tricarboxylic acid cycle and glyoxylate cycle.In addition, DEGs associated with hormone metabolism were annotated.Recently, C-repeat-binding factors (CBFs) and GARP family genes associated with cold stress were functionally characterized in cotton (Liu et al. 2021a;Liu et al. 2021b).
Histone modi cations, as posttranslational epigenetic marks, can allow alternative nucleosome con gurations at transcription sites of transcription factors with dynamic access to an otherwise tightly packaged genome, which play important roles in plant development, function, growth, and survival to environmental cues (Deal et al. 2011;Lu et al. 2018b;Kidokoro et al. 2022).In Arabidopsis, three types of H3K4 methylation marks were found at gene bodies (H3K4me1) and promoters (H3K4me2 and H3K4me3) of actively transcribed genes (Zhang et al. 2009).Furthermore, histone modi cations were found to transcriptionally activate euchromatic regions or genes associated with histone methylation (H3K4) and acetylation (H3K9/H3K23/H3K29), while transcriptional repression of histone methylation (H3K9 and H3K27) was observed under abiotic stress (Dhar et al. 2014).Recent studies have demonstrated that histone modi cations, such as H3K4me3, H3K9ac, H3K9me2, H3K23ac, H3K27ac, H3K27me3, and H4ac, are correlated with gene expression responsible for abiotic stresses, such as temperature threats (Miura et al. 2020).
To counter cold stress, plants develop and coordinate elaborate widespread alteration of chromatin structural changes and thus regulate gene expression (Hu et al. 2011;Kim et al. 2015;Probst et al. 2015).Various differential chromatin and histone modi ers, such as H3K27ac and H3K27me3, increase DNA accessibility, thereby promoting the transcription of cold-responsive genes in indica rice (Dasgupta et al. 2022).Under cold stress, H3K4me3 is recognized by the PHD nger of SAP and MIZ1 domain-containing ligase 1 (SIZ1) and accumulates signi cantly in the WRKY70 promoter in Arabidopsis, and WRKY70 is upregulated simultaneously (Miura et al. 2020).Kwon et al. (2009) reported that cold stress caused a decrease in H3K27me3 deposition that could be maintained even after normal growth temperature recovery, suggesting that H3K27me3 may serve as a memory marker for recent cold-induced transcriptional activity in Arabidopsis (Chang et al. 2020).Other studies reported that cold stress treatment in maize led to the upregulation of HDACs, which resulted in the deacetylation of H3 and H4 and the activation of heterochromatic tandem repeats, which caused a reduction in DNA methylation and histone dimethylation (H3K9me2) at the targeted region of the genome (Ding et al. 2012;Hu et al. 2012).
Additionally, H3K4me3 and H3K27me3 marks are associated with active genes in response to cold stress, which may be explained by enhanced chromatin accessibility and thus facilitate the access of regulatory proteins required for gene expression (Zeng et al. 2019).
It is important to strengthen the cold stress resistance of cotton for the development of the cotton industry.However, it is still largely unknown how histone modi cation-mediated transcriptional regulation mechanisms are associated with the expression of cold-responsive genes in cotton.In this study, YM11, an accession with elite cold tolerance and adaptation to the climate of Xinjiang, was selected for genome de novo assembly.In addition, RNA-seq and CUT&Tag (Cleavage Under Targets and Tagmentation) were adopted to explore the transcriptional and epigenetic changes (H3K4me3 and H3K9ac) in addition to genetic factors to understand the molecular mechanisms underlying the cold stress response, guiding the study of cold resistance and improving the stable yield production of upland cotton.

Plant materials and cold stress treatment
After pregermination, the sprouted seeds of YM11 (pure lines) were uniformly planted in seedling trays and then placed in a greenhouse at a 12 h/12 h day/night regime, 32 ℃/27 ℃ accordingly.The planting medium was a mixture of peat soil, perlite and vermiculite at a ratio of 3:1:1.During the three-leaf stage, the seedlings were uniformly placed in an arti cial climate chamber BMJ-250C (Boxun, Shanghai, China) at 4 ℃ with a light to dark cycle of 14 h: 10 h and illumination intensity of 1500 ~ 2000 lx for cold stress treatment.The sampling time was set at 0 h and 6 h.Transcriptome sequencing (RNA-seq) and metabolome analysis were performed at the two time points with three and six biological replicates, respectively.Approximately 2 g of leaves from each biological replicate were frozen in liquid nitrogen immediately, ground into ne powder, and then transferred to a -80°C refrigerator for storage.In particular, two replicated samples (~ 1 g) at both 0 h and 6 h from the same tissue used in RNA-seq were selected for the CUT&Tag assay.In addition, approximately 5 g young leaves of YM11 were sampled to extract genomic DNA for genome assembly, and a collection of YM11 root, stem, leaf, ower and boll mixed samples from the eld were used to conduct RNA-seq for transcriptome-assisted genome annotation.Short-read, long-read library construction and sequencing.
High-quality genomic DNA of the fresh leaves was isolated using a Genomic DNA DP360 Kit (Tiangen Biotech, Beijing, China) and was used for 350 bp paired-end library construction according to the manufacturer's recommendations (Illumina, San Diego, CA, USA).After fragment size and concentration checking by an Agilent 2100 bioanalyzer (Agilent Technologies, Palo Alto, CA) and qPCR quanti cation, the library was loaded onto Cbot for cluster generation and Novaseq 6500 for short-read sequencing.For long-read sequencing, we isolated high-integrity DNA from young leaves according to the protocols of Mayjonade et al. (2016), followed by DNA purity and concentration assessment using a NanoDrop NP-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA) and Qubit (Carlsbad, CA, USA), respectively.DNA (~ 15 µg) was sheared to a size range of 10 kb to 30 kb by using Covaris g-TUBE (Covaris, MA, USA).The HiFi sequencing library was prepared using the SMRTbell Express Template Prep Kit 2.0 according to PacBio's standard protocols.BluePippin (Sege Science, MA, USA) was used to conduct size selection (10 kb-30 kb).Finally, the library was sequenced on a PacBio Seqel II instrument (circular consensus sequencing (CCS) model) using four ow cells for 30 h (Adsen Biotechnology Company, Urumqi, China).
Living seedlings at the three-leaf stage were used for Hi-C library preparation as previously described (Belton et al. 2012).Brie y, the fresh leaves were immediately cross-linked by immersion in 3% formaldehyde for 15 min, followed by ne powder grounding and nuclei isolation.Then, the cross-linked and puri ed nuclei were digested by Hind III (NEB, England).Subsequently, the sticky ends of these enzyme-digested fragments were end-repaired by biotin-modi ed base.Then, the repaired DNA molecules were cyclized using blunt-end proximity-ligated fragments.Finally, the circular-like products were fragmented into 300-600 bp fragments and enriched by biotin beads.After fragment size and concentration monitoring as mentioned above, the library was subjected to paired-end 150 bp sequencing in one lane (Illumina, San Diego, CA, USA).This Hi-C library generated approximately 278.05 Gb clean data (~ 121 × depth).

Comparative genome analysis
To investigate the colinearity between YM11 and TM-1, we used this newly assembled genome as queries to map it to the published genome (Hu et  ) as queries to map them one by one to the genome of YM11 using minimap2 v2.24 (Li et al. 2018) under the setting of -eqx -ax asm5, followed by structure variation (SV) and presence/absence variation (PAV) calling using SyRI (Goel et al. 2019) under default settings.

CUT&Tag assay
Two biological replicates at 0 h and 6 h were selected for CUT&Tag library construction.H3K4me3 (Abcam, ab8580) and H3K9ac (Abcam, ab10812) were used as antibodies, and IgG (Millipore cat.no.12-370, 1 mg/mL) was used as a control.The samples were named C0-H3K4me3 and C0-H3K9ac at 0 h and C6-H3K4me3 and C6-H3K9ac at 6 h.Library construction was performed according to the method of Tao (Tao et al. 2020).Brie y, it is mainly divided into the following steps: 1. Extract and purify the nucleus; 2. Incubate with primary antibody and secondary antibody; 3. Incubate with pG-Tn5/pA-Tn5 enzyme (Vazyme, cat.no.S602/S603); 4. DNA extraction and fragmentation; 5. Puri cation, ampli cation and screening of the DNA fragments; 6. Illumina sequence library construction and sequencing were performed on the Novaseq 6500 platform.Finally, a total of ~ 5 Gb of data were obtained for each sample.We used Fastp to evaluate the quality of the data to obtain clean data (Chen et al. 2018).The clean data were then aligned to the genome of YM11 using bowtie2 (Langmead et al. 2012) with the parameters '--local --very-sensitive --no-mixed --no-discordant --phred33 -I 10 -X 700', followed by PCR duplication removal using Picard (https://broadinstitute.github.io/picard/).Then, the signal intensities of each sample were calculated separately under a resolution of 500 bp bin size, and the Pearson correlation coe cient between samples was analyzed.We used macs2 version 2.1.4(Zhang et al. 2008) to perform peak calling with the parameter '-f BAMPE -q 0.1 --keep-dup all'.The signal distribution and heatmaps are displayed within 3 kb upstream of the transcription start site (TSS) and gene region and 3 kb downstream of the transcription termination site (TES).An R package Genomation (Akalin et al. 2015) was used to annotate the peaks and count the distribution ratios of peaks in each functional area.For differential peak selection, peaks with overlapping regions were merged rst, then the read abundance of each peak was calculated, and nally, the difference between peaks was analyzed using DESeq2 (Love et al. 2014).Neighboring genes of differential peaks were annotated by Gene Ontology (GO) (Dimmer et al. 2012) and Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa et al. 2000) enrichment analysis.IGV was used to view the peak abundance of speci c locations.

RNA-seq analysis
With regard to RNA-seq bioinformatic analysis, the fresh leaves of YM11 at 0 h and 6 h of cold treatment were harvested with three biological replications for total RNA extraction using the RNAprep Pure Plant Kit (Tiangen Biotech, Beijing, China).The samples were titled C-0 at 0 h and C-6 at 6 h.RNA-seq libraries were generated and sequenced (paired-end 150 bp model) on the Illumina NovaSeq 6000 (Illumina, San Diego, CA, USA) platform according to the manufacturer's recommendations.The raw sequencing reads were processed to generate clean data after adaptor removal, paired reads with unidenti ed nucleotides (ratio in one read > 10%) ltering, and low-quality read removal.The clean reads were aligned to the genome of YM11 using HISAT2 with default settings (Kim et al. 2019).The fragments per kilobase of transcript per million mapped reads (FPKM) method was used to quantify transcript expression levels (Li et

Metabolomic testing and data analysis
The same treatment samples used for the above RNA-seq with six biological replications were collected and ash-frozen in liquid nitrogen.The samples were titled Z-0 at 0 h and Z-6 at 6 h.The samples were sent to MetWare Biological Science and Technology Co., Ltd.(Wuhan, China) for untargeted metabolomics assays.Brie y, the samples were ground to a ne powder, and 10 mg of leaf powder was added to a 400 µL solution (methanol:water = 7:3, V/V) containing an internal standard, vortexed for 3 min, and then placed at -20°C for 30 min.The sample was then centrifuged at 12000 rpm for 10 min (4°C).The sediment was removed, and the supernatant was centrifuged at 12000 rpm for 3 min (4°C).Two hundred microliters of supernatant from all samples was acquired by the LC-MS system following the manufacturer's instructions.The analytical conditions were as follows: UPLC: column, Waters ACQUITY UPLC HSS T3 C18 (1.8 µm, 2.1 mm*100 mm); column temperature, 40°C; ow rate, 0.4 mL/min; injection volume, 2 µL; solvent system, water (0.1% formic acid): acetonitrile (0.1% formic acid); gradient program, 95:5 V/V at 0 min, 10:90 V/V at 11.0 min, 10:90 V/V at 12.0 min, 95:5 V/V at 12.1 min, 95:5 V/V at 14.0 min.The raw mass spectrometry data were converted into mzML format using ProteoWizard software (Chambers et al. 2012).Peak extraction, peak alignment and retention time correction were performed by the XCMS program (Mahieu et al. 2016).The "SVR" method was used to correct the peak area.The peaks with detection rates lower than 50% in each group of samples were discarded.Afterward, metabolic identi cation information was obtained by searching the laboratory's self-built database, integrated public database, AI database and metDNA.Unsupervised PCA (principal component analysis) was performed by the statistics function prcomp (Chen et al. 2009).Differential metabolites were determined by VIP (VIP ≥ 1), p value (p value < 0.05, Student's t test) and absolute Log2FC (|Log2FC| ≥ 1.0).VIP values were extracted from the OPLS-DA results (Thévenot et al. 2015), which also contain score plots and permutation plots, and were generated using MetaboAnalystR (Chong et al. 2018).The data were log transformed (log2) and mean centered before OPLS-DA.To avoid over tting, a permutation test (200 permutations) was performed.Identi ed metabolites were annotated using the KEGG Compound database (Kanehisa et al. 2000).

Combined analysis of the transcriptome and metabolome
Using the Hmisc R package (Harrell et al. 2019), Pearson correlation coe cients were calculated to detect associations between the quanti cation data of gene expression and the metabolite content, and only the detected associations with r > 0.80 and p value ≤ 0.05 were maintained.The network diagrams were visualized using Cytoscape (version 3.9.1,Cytoscape, San Diego, CA, USA).In addition, the differentially expressed genes (DEGs) and differentially expressed metabolites (DEMs) were mapped to the KEGG pathway database to obtain their common pathway information.

Results
Genome assembly, annotation and variation mining A total of ~ 78.58 Gb CCS long reads, with a read N50 of 17.44 kb and the longest read length of 50.46 kb, were generated by PacBio sequencing, covering approximately 33-fold of the genome.After hi asm assembly, we obtained 2343.06 megabases (Mb) of this allotetraploid genome with 725 contigs and a contig N50 of 88.96 Mb (Table 1).Hi-C oriented 2296.43Mb (98.01%) into 26 chromosomes successfully (Table 1; Fig. 1).BUSCO assessment showed that 95.90% of the complete single-copy genes were assembled from 1,614 orthologous homologous single-copy genes.Furthermore, the average LAI score was 17.27 (Fig. 1), indicating that the assembly was relatively complete and that a reference genome standard was reached.In addition, synteny analysis between this genome and TM-1 revealed that both the A and D subgenomes displayed high collinear relationships (Fig. S1).Moreover, synteny of YM11 with G. barbadense, G. darwinii, G. tomentosum, and G. mustelinum was performed (Fig. S2).
The assembled genome harbored 68.24% repeat sequences with proportions of 54.45%, 10.66% and 3.13% for LTR, TIR and non-TIR, respectively (Table S1).A total of 73,822 protein-coding gene models were annotated, of which 97.85% were homologous in the KEGG, eggNOG, GO and Pfam databases (Table S2).The mean coding sequence length, mean exon length and mean intron length were 1,196.75bp, 243.54 bp and 345.25 bp, respectively (Table 1).
Structural variation (SV) and presence/absence variation (PAV) have proven to be closely correlated with dynamic phenotypic traits across different species.We used the genome sequence of YM11 as a reference and ve other allotetraploid cotton accessions as queries to detect structural variations (SV) (Table 2; Fig. 2-3).Three kinds of variations, inversions, translocations and duplications, were identi ed in comparisons of G. darwinii vs. YM11, G. mustelinum vs. YM11, G. tomentosum vs. YM11, G. barbadense vs. YM11 and G. hirsutum vs. YM11 with total variation length proportions of 5.98%, 6.08%, 6.42%, 3.54% and 2.49%, respectively, indicating the genetic distance between YM11 and these ve queries.In each comparison pair, inversions possessed the minimum numbers, but the length was the largest, while the number of translocations was the largest with the shortest length.Comparison of G. mustelinum vs. YM11 harbored the largest number of each of three types of SVs, inversion, translocations and duplications, comprising 298, 1926 and 611 with a total length of 127.20 Mb, 11.60 MB and 3.75 MB, respectively.As expected, the lowest number occurred in G. hirsutum vs. YM11, which included 107, 324 and 51 inversions, translocations and duplications, respectively, with a total length of 54.04 Mb, 3.79 Mb and 549.65 Kb, respectively (Table 2).In the A subgenome, two visible inversions were shared between YM11 and G. darwinii, G. mustelinum, and G. tomentosum on chromosomes A02 and A06.On chromosome A08, however, two obvious inversions could be simultaneously detected in the ve comparisons (Fig. 2).In the D subgenome, only one visible inversion was observed in YM11 vs. G.darwinii, YM11 vs. G.mustelinum and YM11 vs. G.tomentosum simultaneously.In addition, a distinct inversion was found on chromosome D12 in YM11 vs. G.tomentosum (Fig. 3).Furthermore, we annotated the insertions and deletions in the SVs and found that there were 683 and 584 shared insertions and deletions, respectively (Fig. S3).The identi cation of these SVs and PAVs will lay a foundation for subsequent functional and evolution studies.Transcriptome and metabolome analysis of YM11 under cold stress To explore the response mechanism of YM11 to cold stress, we conducted transcriptome and metabolome analyses.The PCA of transcriptome and metabolome samples showed a good correlation for further bioinformatic processing (Fig. S4-S5).qRT-PCR results showed a high consistent expression quanti cation with RNA-seq (Fig. S6).Comparing RNA-seq at 0 h to 6 h, we identi ed a total of 8710 differentially expressed genes (DEGs), including 4278 upregulated expression genes and 4432 downregulated expression genes (Fig. 4A).These DEGs were clustered into two groups (Fig. 4B).
According to the GO analysis, the catalytic activity process was the largest enriched group in the biological process, followed by the abiotic stimulus and chloroplast process, while KEGG was enriched mainly in biosynthesis of secondary metabolites and metabolic pathways (Fig. 4C-D).Further analysis predicted 310 upregulated TFs and 312 downregulated TFs, mainly MYB, ERF, bHLH and NAC TF classi cations (Table S3).Notably, the typical ERF-type CBF TFs A12G30590 and D12G30970 were upregulated signi cantly, as were the CAMTA-type TFs D02G23960, RVE-type TFs D12G13640, D12G32290, D07G07080 and D12G32290, and BTF-type TFs D12G26250, A12G25920 and D05G13570.However, the RGA-type TFs D07G09110, A07G09020, A05G02960 and D05G02220 were signi cantly downregulated.For ICE-type TFs, D08G23510, A11G13330 and D11G13450 showed signi cant upregulation, while A07G29440, A11G04330, D12G25510 and A03G00570 were signi cantly downregulated.The PIF-type TFs D07G14720, A07G15190 and A11G11490 were signi cantly upregulated, and A11G21050, D11G21450, A07G08010, A03G08920 and D07G07990 were signi cantly downregulated (Table S4).These results suggested that TFs may play a role in coupling with cold stress in cotton.Then, we tested the metabolic pro les accordingly, and a total of 10818 metabolites were annotated, including 283 upregulated differentially expressed metabolites (DEMs) and 358 downregulated DEMs.The top enriched KEGG terms among the DEMs detected for all the compared samples were metabolic pathways and biosynthesis of secondary metabolites (Fig. S7).The results of Pearson's correlation coe cient calculation between DEGs and DEMs showed that there were eight strong interaction modules in ten metabolic pathways (Fig. 4E).The DEGs and DEMs were mapped to common bean biological pathways in the KEGG online database.These comapped pathways, the TCA cycle, metabolic pathways, avonoid biosynthesis, and biosynthesis of secondary metabolites were the signi cantly enriched pathways.Conjoint analysis revealed that DEGs and DEMs were commonly enriched in porphyrin metabolism, α-linolenic acid metabolism, alanine, aspartate and glutamate metabolism, etc., for a total of 6 KEGG pathways (Fig. S8).

Histone modi cation features of YM11 under cold stress
To gain insight into the histone-mediated modi cation changes under cold stress, we identi ed genomewide H3K4me3 and H3K9ac marks using corresponding recognition antibodies.The sequencing data are summarized in Table S5.Correlation clustering analysis showed that samples with different replicates can be clustered together with high correlation, indicating that the repeatability was good for the next step of analysis (Fig. S9).From the CUT&Tag results, we found that the read alignments of H3K4me3 and H3K9ac in the eight samples were concentrated in the chromosomal end region (Fig. 5A).At 0 h, we called 58777 and 26995 peaks for H3K4me3 and H3K9ac peaks, respectively.Under cold stress at 6 h, a total of 60096 and 15022 H3K4me3-modi ed peaks and H3K9ac-modi ed peaks were identi ed in the genome, respectively.The read densities of histone modi cation intensity (H3K4me3 and H3K9ac) surrounded the centers of genes (Fig. 5C).Structural annotation showed that over 50% of H3K4me3 and H3K9ac modi cations were located in the promoter region, followed by exons, and the least mutations were located in introns (Fig. 5B).Compared with 0 h, we detected 5620 differential peaks for H3K4me3 and 3619 differential peaks for H3K9ac, indicating temperature-induced chromatin changes in cotton (Table S6-S7).The differential peaks of H3K4me3 and H3K9ac varied from 354 bp to 4776 bp and from 374 bp to 3510 bp with average lengths of 1423 bp and 1142 bp, respectively (Table S6-S7).H3K4me3 served as an active mark and modi ed 2381 upregulated differential peaks and 3239 downregulated differential peaks.For H3K9ac, also an active mark, 1823 peaks were found to be signi cantly upregulated, and 1796 peaks were downregulated.Taken together, all the above analyses demonstrate that low-temperature treatment induces local or global chromatin changes.
To illustrate the possible functions of histone modi cation changes, genes with an enriched fold change ≥ 1.2 and a p value ≤ 0.05 were isolated and considered signi cantly differentially histone-modi ed genes (DMGs).We identi ed 5497 DMGs and 3531 DMGs for H3K4me3 and H3K9ac under cold stress, respectively, of which only 727 DMGs were shared (Fig. 5D), indicating that different response mechanisms existed between H3K4me3 and H3K9ac.

Gene expression with altered histone modi cation levels related to cold stress
To investigate the functional relationships between H3K4me3 and H3K9ac marks and gene expression, we analyzed RNA-seq data obtained for the same samples and inspected the expression of H3K4me3/H3K9ac-enriched genes.We screened 957 (H3K4me3) and 651 (H3K9ac) DMGs that were differentially expressed between 0 h and 6 h, of which 137 were shared (Fig. 6A).Of the 412 DMGs with boosted H3K4me modi cation levels under cold stress, 229 showed highly signi cant upregulation (p value = 3.34E-18, hypergeometric test), a relatively high ratio of 55.58% with gene expression for gain of the H3K4me3 mark and upregulation.Of the 343 DMGs that signi cantly gained the H3K9ac mark, 211 (61.51%) showed a highly signi cant upward trend in expression during the course of cold stress (p value = 6.15E-31, hypergeometric test).In contrast, DMGs with reduced H3K4me3 or H3K9ac modi cation levels showed an insigni cant upregulation trend (p = 1.00, hypergeometric test).This result indicated that elevated H3K4me3 and H3K9ac modi cation levels under cold stress could lead to a signi cant increase in gene expression.We also examined the expression levels of genes modi ed by different combinations of the two marks.As shown in Fig. 6B, genes marked by H3K4me3 and H3K9ac exhibited higher expression than all the expressed genes, and genes modi ed by a combination of the two histone marks showed higher expression than those modi ed by the H3K4me3 mark.All these results indicated that H3K4me3 and H3K9ac are associated with active gene expression.
Furthermore, those genes for which histone modi cations and gene expression were signi cantly and consistently upregulated were screened and subjected to GO and KEGG enrichment analyses (q value < 0.05).As shown in Fig. 6C (H3K4me3), GO was enriched in the terms "response to lipid", "response to gibberellin", etc. KEGG pathways enriched in plant hormone signal transduction, phototransduction, avonoid biosynthesis, etc. (Fig. S10A).The corresponding genes marked by H3K9ac were enriched in response to stress, response to organic substances and response to oxygen-containing compound GO terms (Fig. 6D).KEGG analysis showed enrichment in the plant hormone signal transduction and plantpathogen interaction pathways (Fig. S10B).The 229 genes with increases in both H3K4me3 and mRNA levels included 31 TFs, including bHLH, ERF, MYB and WRKY domain transcription factors.The number of genes with increased H3K9ac and mRNA levels was slightly smaller (211), and these included 34 TFs, mostly ERF, NAC and bHLH, indicating that these TFs play a role in the cold response (Table S8).

A high-quality genome of upland cotton can aid large-scale genetic variation mining
This study utilized high-depth HiFi data for contig assembly and Hi-C to assist the assembly of the YM11 genome at the chromosome level and evaluated the contiguity and accuracy of assembly via BUSCO and LAI (Fig. 1), which were higher than those of recent assemblies (Wang et al. 2022).The genome size and the number of predicted genes in the YM11 genome were slightly greater than those in upland cotton TM-1 (Hu et al. 2019), ZM24 (Hu et al. 2019), and CRI12 (Lu et al. 2022), which may be correlated with the higher integrity and continuity of the YM11 genome.Compared with previous releases, the proportion of genome repeats is similar to that of other upland cotton, both approximately 63% (Hu et al. 2019;Huang et al. 2020;Lu et al. 2022).Genome collinearity analysis revealed good collinearity between YM11 and TM-1 (Fig. S1).These results indicate that a new high-quality reference genome of upland cotton was established.Identifying structural variations (SVs) in the sequencing era is important for deciphering the function of these variations (Alonge et al. 2020).By comparing YM11 and ve other tetraploid cotton species, we identi ed a total of 2454 (G.darwinii vs. YM11), 3814 (G.mustelinum vs. YM11), 1832 (G.tomentosum vs. YM11), 2254 (G.barbadense vs. YM11), and 956 (G.hirsutum vs. YM11) large-scale variations, with sizes accounting for 6.53%, 6.10%, 6.80%, 3.68% and 2.63% of the YM11 genome, respectively (Table S2).Speci cally, three structural variations were shared on chromosome A08 of upland cotton except for the comparison Group G. tomentosum vs. YM11, which was reported by genome comparison of TM-1 and ZM24 (Hu et al. 2019).In addition, we observed more consistent SVs in the three comparisons (G.darwinii vs. YM11, G. mustelinum vs. YM11 and G. tomentosum vs. YM11) on chromosomes A02, A03, A06, A08 and A12 (Fig. 2) of the A subgenome and D03 and D04 of the D subgenome (Fig. 3), suggesting that these large-scale variations may play a role in species differentiation and trait establishment.The D subgenome harbored fewer variations than the A subgenome, indicating that the D subgenome is relatively conserved.This high-quality genome and large-scale variation dataset may accelerate the functional genome and subgenomic evolution study of Gossypium.

CUT&Tag is a promising histone modi cation research tool in plants
Chromatin pro ling features play important roles in regulating gene expression to address environmental stress.In this NGS era, chromatin immunoprecipitation sequencing (ChIP-Seq) technology, a method combining ChIP and NGS sequencing, enables the detection of DNA regions that interact with proteins on a genome-wide scale (Barski et al. 2007;Nakato et al. 2021).ChIP-seq has been applied to explore protein-DNA interactions and gene transcription regulation in plants (Pajoro et al. 2017;Tu et al. 2020).
However, the application of ChIP-Seq remains a challenging experimental operation due to the presence of cell walls, vacuoles and complex secondary metabolites in plants (Haring et al. 2007).In 2019, the CUT&Tag method was developed for epigenomic chromatin analysis (Kaya-Okur et al. 2019).As a newly emerging DNA-protein interaction technology, it uses an enzyme-tethering strategy to analyze different chromatin compounds, which can provide e cient sequencing libraries and high-resolution interactions.
Compared with ChIP-Seq, CUT&Tag technology showed easy library construction with higher sensitivity to the target, smaller amounts of starting materials, higher resolution, lower background signal and better repeatability, resulting in a promising alternative to ChIP-seq to study epigenomics (Kaya-Okur et al. 2020).In plants, Tao et al. (2020) rst developed a CUT&Tag protocol using the nuclei of cotton leaves and compared the e ciency of CUT&Tag with that of ChIP-seq.They revealed genome-wide histone H3K4me3 modi cation patterns and found that the CUT&Tag procedure was faster and had a higher pro ling resolution with a lower background signal than ChIP-seq.In this paper, we adopted CUT&Tag to investigate the region and alternation of H3K4me3 and H3K9ac modi cation patterns genome-wide, indicating major genic region deposition, which was in line with previous reports by ChIP-seq (Wang et   Chromatin structure serves as a direct template for the binding and recruitment of proteins that are involved in the regulation of transcription factor binding, such as histone modi cations.The elucidation of dynamic histone modi cation pro ling underlying cold-responsive genes facilitates understanding of the mechanisms that confer cold tolerance in plants (Kim et al. 2015).Although the causal relationship between chromatin dynamics and transcriptional changes is still lacking for plant cold stress responses, there is accumulating evidence revealing concurrent chromatin modi cations in transcriptional stress responses.Through genome-wide RNA-seq and CUT&Tag analysis, we found that elevated H3K4me3 and H3K9ac modi cation levels under cold stress may lead to a signi cant increase in gene expression, and those genes for which signi cant and consistently changing histone modi cations and expression level enriched in "response to plant hormone and stress" (Fig. 6C-D; Fig. S10), indicating H3K4me3 and H3K9ac modi cation levels were elevated, and subsequently, the activation of gene expression to deal with cold stress in cotton, which was in accordance with rice, tomato and barley (Roy et al. 2014;Zeng et al. 2019;Li et al. 2022).We further used MEME-ChIP analysis to nd the potential conserved motifs in DMGs under cold stress (Machanick et al. 2011).We found three AP2-EREBP TF recognition sites and one C2H2 recognition site in the top 10 conserved DNA sequences in H3K4me3-modi ed DMGs.In H3K9ac, it was found that the conserved DNA sequences of the top 10 motifs were all AP2-EREBP TF recognition sites (Table S9), suggesting that target gene sites may be recognized through these two types of TFs.Meanwhile, 31 TFs out of the 229 (differential H3K4me3) and 34 out of the 211 (H3K9ac) genes with increased levels in both histone modi cation and mRNA expression mostly included ERF TFs, suggesting that ERF TFs may play a crucial role.In addition, CBF, a typical ERF TF, has been found to be involved in the induction and response to cold stress in different species (Shi et al. 2018;Ding et al., 2022).In these types of gene families, both A12G30590 and D12G30970 are upregulated, and the upregulated ratio of A12G30590 is up to nine times.However, important questions regarding how chromatin receives the external signal in the rst place to regulate gene expression and how those target genes are determined for upregulation or downregulation in response to plant hormones or stresses at the genome-wide level are still largely unknown.In particular, tissue-speci c and cell-type-speci c chromatin regulation remain less understood.All these outstanding questions will be the focus of future research on stress responses (Wang et al. 2020).

Declarations
Author contribution statement YX and XL made the concept, edited and reviewed the manuscript.JW and YL prepared the CUT&Tag assay and data analysis.ZL and GZ planted the seedling of cotton and RNAseq analysis.YX conduct genome assembly and annotation analysis.All authors read and approved the nal manuscript.

Supplementary Files
This is a list of supplementary les associated with this preprint.Click to download.
al. 2017; Zhang et al. 2021).Afterward, the relationship between DMGs and their corresponding gene differential expression was revealed, indicating that it is informative to study abiotic stress response using CUT&Tag in upland cotton.Recently, Ouyang et al. (2022) described a simple and practical singlecell nuclear chromatin immunosplicing method combined with NGS sequencing, called snCUT&Tag, to analyze the single-cell-level H3K4me3 histone modi cation characteristics of Minghui 63 seedlings, demonstrating application scenarios of snCUT&Tag in plant single-cell-level chromatin conformation studies and providing a new technical route for single-cell epigenetic research in plants.Ever emerging CUT&Tag-devied technologies such as multi-CUT&Tag (Gopalan et al. 2021) and spatial-CUT&Tag (Deng et al. 2022) in animals will provide novel intelligence into chromatin modi cation and regulation studies in a multidimensional manner, which will inspire relevant studies for plants.

4. 3
Dynamic changes in histone modi cation state function in regulating the transcription of cold-responsive genes

Funding
This was sponsored by the Major Special Project of Xinjiang Uygur Autonomous Region "Cultivation and Technical Demonstration of New Varieties of Mechanized Cotton" (2021A0201-4).

Figures Figure 1
Figures

Figure 5 Dynamic
Figure 5