Population genomics provides insights into the population structure and climate-driven adaptation of Collichthys lucidus

Background: Understanding the genetic structure and local adaptive evolutionary mechanisms of marine organisms is crucial for the conservation and management of biological resources. Collichthys lucidus is an ideal candidate for investigating population differentiation and local adaptation under heterogeneous environmental pressure.

Results: To elucidate the fine-scale genetic structure and local thermal adaptation of C. lucidus, we performed restriction site-associated DNA tag sequencing (RAD-seq) of 177 individuals from 8 populations, and a total of 184,708 high-quality single nucleotide polymorphisms (SNPs) were identified. All the results revealed significant population structure with high support for two distinct genetic clusters, namely, the northern group (populations DL, TJ, LYG, NT, ZS, and WZ) and southern group (populations XM and ZH). The genetic diversity of the southern group was evidently lower than that of the northern group, which indicated that the southern group was possibly under climate-driven natural selection. In addition, a total of 314 SNPs were found to be significantly associated with temperature variation. Annotations of temperature-related SNPs suggested that genes involved in material (protein, lipid, and carbohydrate) metabolism and immune responses were critical for adaptation to spatially heterogeneous temperatures in natural C. lucidus populations.

Conclusion: In the context of anthropogenic activities and environmental change, the results of the present population genomic work could make important contributions to the understanding of genetic differentiation and adaptation to changing environments.

NetView P with kNN = 20 was applied to reveal the clustering relationships of all C. lucidus individuals at a fine scale, and the results further supported the previous ADMIXTURE clustering pattern with K = 2 and 3, showing that individuals were grouped into two different clusters, with all individuals from Xiamen and Zhuhai clustered together (Figure 4). Additionally, the hierarchical AMOVA (Table 3) showed that the FST across the eight populations was 0.07084, and there was significant genetic differentiation between the two groups ("Dalian, Tianjin, Lianyungang, Nantong, Zhoushan, and Wenzhou" and "Xiamen and Zhuhai", FCT = 0.13, P = 0.0004).

Candidate genomic regions under temperature-driven selection
In the present study, we first calculated the average ASST, LSST and HSST of eight sea areas over 68 years (Table S2). Bayenv then identified a total of 314 SNPs associated with temperature variables. Of these SNPs, 255 were associated with ASST, 56 were associated with LSST, and 4 were associated with HSST. There was little overlap among SNPs associated with ASST, LSST and HSST. Thereafter, we used the 314 unique SNPs from the combined ASST-related, LSST-related, and HSST-related sets as the candidate temperature-selected SNPs. Whole-genome sequences containing the 314 SNPs were then used for further annotations, and the results showed that 105 sequences containing temperature-selected SNPs matched homologous protein sequences in the nonredundant protein sequences (Nr) database (Table S3).

Therefore, temperature may be an important selective force affecting the genotypic and phenotypic compositions of local populations. In the present study, we detected a number of important candidate SNPs that appeared to be affected by temperature heterogeneity. Notably, different numbers of candidate SNPs were obtained using different datasets (ASST, LSST, and HSST). We speculate that the LSST is more likely than the HSST to affect the spatial distribution of C. lucidus.

The GO annotation results showed that the sequences containing the temperature-selected SNPs were mainly involved in metabolic and cellular processes, and their functions were mainly in binding and catalytic activity. This suggests that temperature differences drove adaptive differentiation in parts of the genome of C. lucidus populations, ultimately leading to differences in physiological regulation.

Conclusion
We revealed the population genetic structure and genomic regions under temperature-driven selection based on genome-wide SNPs in eight C. lucidus populations. Genetic structure analysis revealed significant population structure, with high support for two distinct clusters among the eight populations. We speculate that long-term geographic isolation during the glacial maximum may have intensified the development of limited dispersal potential, reproductive isolation and local adaptive heterogeneity between the two C.

RAD data processing and SNP filtering
All raw reads in FASTQ format were filtered using Trimmomatic software (version 0.36; [26]), and reads meeting any of the following criteria were removed: (I) raw reads with sequencing adaptors; (II) raw reads with a ratio of unidentified nucleotides ≥8%; and (III) raw reads in which more than 50% of base calls had a low quality score (Q<30). After filtering, we downloaded the whole-genome sequence of C. lucidus [27] and used it as a reference sequence for subsequent SNP filtering. The whole-genome sequence was first constructed into an index file using BWA software
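The three read-removal criteria listed above can be illustrated with a minimal Python sketch. This is not Trimmomatic's algorithm or interface. The function names `keep_read` and `phred33`, the plain substring adaptor check, and the assumption of Phred+33 quality encoding are all hypothetical choices made for illustration only.

```python
def phred33(qual_string):
    """Decode a FASTQ quality string, assuming Phred+33 encoding (an assumption)."""
    return [ord(c) - 33 for c in qual_string]


def keep_read(seq, quals, adapters=(), max_n_ratio=0.08,
              min_q=30, max_lowq_ratio=0.5):
    """Return True if a read survives the three removal criteria.

    seq      -- read bases as a string
    quals    -- per-base Phred scores (list of ints)
    adapters -- illustrative adaptor sequences; matched by plain substring
                search, which is far simpler than real adaptor trimming
    """
    if not seq or not quals:
        return False
    upper = seq.upper()
    # (I) discard reads that contain an adaptor sequence
    if any(a and a.upper() in upper for a in adapters):
        return False
    # (II) discard reads whose ratio of unidentified bases (N) is >= 8%
    if upper.count("N") / len(upper) >= max_n_ratio:
        return False
    # (III) discard reads in which more than 50% of base calls have Q < 30
    low_quality = sum(1 for q in quals if q < min_q)
    if low_quality / len(quals) > max_lowq_ratio:
        return False
    return True
```

The thresholds are exposed as keyword arguments so that the boundary behaviour is explicit: a read with exactly 8% N is removed, whereas a read with exactly 50% low-quality base calls is kept.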

Outlier SNP detection and annotation
The genotype-environment association method implemented in Bayenv (version 2.0; [39]) was applied to detect putative SNPs correlated with temperature variations. First, we obtained the high-resolution mean lowest sea surface temperature (LSST), mean annual sea surface temperature (ASST) and mean highest sea surface temperature (HSST) data of eight sea areas over 68 years (from 1950 [...] identify putative SNPs correlated with temperature variations, and a Bayes factor (BF) value higher than 10 was set as the filtering condition for putative SNPs. We repeated the Bayenv analysis four times to reduce false positives, and only the SNPs that were detected consistently across all runs were used for subsequent analysis. Thereafter, we used the combined (nonredundant) ASST-related, LSST-related, and HSST-related SNPs as the candidate temperature-selected SNPs. To determine the genetic mechanisms underlying temperature-related adaptive differentiation between C. lucidus populations, gene sequences containing these SNPs were then annotated using Blast2GO software [40].

Table S1. Sequence information for all individuals.
Table S2. The average ASST, LSST and HSST of eight sea areas.
Table S3. Nr annotation information for whole-genome sequences containing temperature-selected SNPs.
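The candidate-SNP filtering described in the outlier detection section (BF > 10, retained only when detected in every repeated run) can be sketched in Python as follows. This is a hedged illustration, not the authors' pipeline and not Bayenv's output format. It assumes that each run has already been parsed into a dictionary mapping a SNP identifier to its Bayes factor, and that the final candidate set pools the ASST-, LSST- and HSST-related SNPs. The function names and input structure are hypothetical.

```python
def consistent_hits(runs, bf_threshold=10.0):
    """SNPs whose Bayes factor exceeds the threshold in every run.

    runs -- list of dicts {snp_id: bayes_factor}, one dict per repeated run
            (a hypothetical, pre-parsed representation of the results)
    """
    passing = [
        {snp for snp, bf in run.items() if bf > bf_threshold}
        for run in runs
    ]
    return set.intersection(*passing) if passing else set()


def candidate_snps(runs_by_variable, bf_threshold=10.0):
    """Per-variable consistent hits and their pooled, nonredundant set.

    runs_by_variable -- dict such as {"ASST": [...], "LSST": [...], "HSST": [...]},
                        where each value is a list of run dicts
    """
    per_variable = {
        variable: consistent_hits(runs, bf_threshold)
        for variable, runs in runs_by_variable.items()
    }
    # Pool the per-variable sets; a SNP found for several variables counts once
    pooled = set().union(*per_variable.values())
    return per_variable, pooled
```

Under this reading, a SNP associated with more than one temperature variable contributes only once to the pooled set, so the pooled count can be smaller than the sum of the per-variable counts.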