Overview of the comparative transcriptome sequencing using ovules of Xu142fl and Xu142 at early fiber initiation stage
In order to study Xu142fl and Xu142 in the context of fiber initiation, we first performed transcriptome sequencing by mixing ovules at -3 and -1 DPA of Xu142fl and Xu142 respectively, before obvious fiber initials could be observed from the epidermis of the wild type cotton seeds under optical microscope, by setting 3 biological replicates for each variety. After removing adapter contamination and low quality tags, a total of 66.12-72.19 million clean reads were generated from each library, with clean read ratios between 92.51%-93.74%, and ~ 95% of the clean reads can be mapped to cotton TM-1 genome (Table 1). Besides this, 49381 novel transcripts were identified, including 36093 candidate protein coding and 13288 noncoding transcripts and 5604 novel genes were predicted.
After calculating the expression levels of each gene in each sample; Pearson correlation (R2) was calculated based on the whole gene expression profile between each sample pair among the total 6 samples. The result showed that the correlations between biological repeats were 0.966-0.993 for the mutant pairs, and 0.971-0.996 for the wild type pairs, but lower (0.921-0.960) between mutant and wild type pairs (Additional file 1 Figure 1). The results indicated high uniformity between biological repeats. Finally, 5516 DEGs including 1840 upregulated and 3676 downregulated genes were identified in the mutant compared with the wild type.
Table 1. Overview of the data quality and genome mapping of the transcriptome sequencing of Xu142 and Xu142fl.
Ovule Sample
|
Total Raw Reads (M)
|
Total Clean Reads (M)
|
Clean Reads Q20(%)
|
Clean Reads Ratio (%)
|
Total Mapped (%)
|
Uniquely Mapped (%)
|
Xu142-1
|
70.82
|
66.39
|
98.56
|
93.74
|
95.89
|
76.20
|
Xu142-2
|
75.39
|
70.42
|
98.59
|
93.41
|
95.55
|
75.11
|
Xu142-3
|
73.31
|
68.7
|
98.59
|
93.71
|
95.95
|
75.99
|
Xu142fl-1
|
70.61
|
66.12
|
98.53
|
93.64
|
95.75
|
74.58
|
Xu142fl-2
|
75.72
|
70.78
|
98.55
|
93.48
|
95.85
|
75.43
|
Xu142fl-3
|
78.04
|
72.19
|
98.6
|
92.51
|
95.87
|
75.52
|
Note: M means megabase.
Complicated upstream transcription and downstream biosynthesis and metabolism events occurred during early lint fiber initiation
To investigate the biological processes and functions attributed to the DEGs, GO enrichment analysis of the downregulated and upregulated DEGs were conducted respectively. The results showed that the downregulated genes were enriched in 15 level 3 biological process (BP) terms including transcription (262 genes), RNA (288 genes) and nucleic acid metabolism (339 genes), nucleobase-containing compound biosynthesis (267 genes) and metabolism (358 genes), heterocycle biosynthesis (275 genes) and metabolism (377 genes), aromatic compound biosynthesis (273 genes) and metabolism (383 genes) and organic cyclic compound biosynthesis (276 genes) and metabolism (380 genes), photosynthesis (42 genes) and light harvesting (15 genes), respond to chitin (7 genes), and plant cell wall organization (20 genes) (Fig. 1a; Additional file 2 Table S1), with molecular functions (MF) including DNA binding transcription factor activity (205 genes), transcription regulator activity (208 genes) and dioxygenase activity (32 genes), and DNA (404 genes), nucleic acid (585 genes), organic cyclic compound and heterocyclic compound binding activities (1008 genes) (Fig. 1b; Additional file 3 Table S2). These results demonstrated that the essential transcriptional regulations in Xu142fl were impaired and led to the aborted lint fiber initiation.
Compared with the significantly enriched GO terms for the downregulated DEGs, the enriched GO terms for the upregulated DEGs were fewer on the BP level, and no GO terms were found on the MF level. The significantly enriched level 3 terms was DNA replication (21 genes), DNA-dependent DNA replication (11 genes), DNA replication initiation (8 genes) and phospholipid biosynthesis (19 genes) (Fig. 1c; Additional file 4 Table S3). Overall, these findings suggest that DNA replication and phospholipid biosynthesis in the process of cell mitosis division, were inhibited in the fiber initials of wild type during lint fiber initiation.
Transcription factor expression dynamics during early fiber initiation
Next, to characterize the complicated transcription regulations during lint fiber initiation, the TF coding genes were firstly predicted and then filtered to obtain the differentially expressed TF genes closely related to lint fiber initiation (only TF family containing more than two genes were considered), and finally 490 genes belonging to 26 TF families were identified, consisting of 424 downregulated genes and 66 upregulated genes. Moreover, most TF families consisted more downregulated genes (57.14%-100%), except for LOB domain-containing and MADS domain-containing families which contained more up-regulated genes (76.92% and 71.43% respectively) (Table 2).
As shown in Table 2, among the down-regulated TF families containing members more than 20, the biggest TF family was AP2-EREBP, which contained 123 DEGs, followed by MYB (54 DEGs), WRKY (41 DEGs), NAC(33 DEGs), C2C2-Dof (33 DEGs), bHLH (29 DEGs) and GRAS (23 DEGs). Subsequently, we chose six TF families including MYB, bHLH, NAC, C2C2 Dof, GRAS, and WRKY containing members ranging from 23-54 to plot expression heat maps. The results showed that 6 Malvaceae-specific MML homologs (Wu et al., 2018) including two MML3, one MML4, two MML8 and one MML9 were all downregulated. The MML3 on chromosome D12 chromosome (Gh_D12G1628) had higher expression level in the wild type and were much more downregulated in the mutant than the one on A12 chromosome (Gh_A12G1503) (Fig. 2a). Although most members were downregulated in the identified TF families, subclasses of members were still upregulated for 5 TF families, including the MYB family which contains one RL6, 3 MYB44 and 4 GAM1 (Fig. 3a), bHLH family which contains 5 members including BHLH82, BHLH130, BEE3 and two novel genes (Fig. 2b), NAC family which contains 6 NAC genes, three of which annotated as NAC100 had dominant expression in both varieties (Fig. 2c), the GRAS family which contains 2 members (Fig. 2d), and WRKY family which contains 7 members, with the genes annotated WRKY48 and WRKY65 had higher expression levels while expression of the other five genes were lower in both varieties (Fig. 2e), except for the C2C2 Dof family genes which were all down-regulated (Fig. 2f).
Table 2 Statistics of the differentially expressed transcription factor genes.
TF Family
|
Total DEGs
|
Upreg
|
Downreg
|
Upreg (%)
|
Downreg (%)
|
ABI3VP1
|
7
|
1
|
6
|
14.29
|
85.71
|
AP2-EREBP
|
123
|
4
|
119
|
3.25
|
96.75
|
bHLH
|
29
|
9
|
20
|
31.03
|
68.97
|
C2C2-Dof
|
33
|
0
|
33
|
0
|
100
|
C2C2-GATA
|
6
|
0
|
6
|
0
|
100
|
C2H2
|
17
|
2
|
15
|
11.76
|
88.24
|
C3H
|
11
|
0
|
11
|
0
|
100
|
FAR1
|
5
|
1
|
4
|
20
|
80
|
G2-like
|
7
|
2
|
5
|
28.57
|
71.43
|
GRAS
|
23
|
2
|
21
|
8.7
|
91.3
|
GRF
|
4
|
0
|
4
|
0
|
100
|
HSF
|
12
|
1
|
11
|
8.33
|
91.67
|
LIM
|
3
|
1
|
2
|
33.33
|
66.67
|
LOB
|
13
|
10
|
3
|
76.92
|
23.08
|
MADS
|
7
|
5
|
2
|
71.43
|
28.57
|
mTERF
|
7
|
1
|
6
|
14.29
|
85.71
|
MYB
|
54
|
8
|
46
|
14.81
|
85.19
|
NAC
|
33
|
6
|
27
|
18.18
|
81.82
|
OFP
|
11
|
0
|
11
|
0
|
100
|
PLATZ
|
4
|
0
|
4
|
0
|
100
|
SBP
|
4
|
1
|
3
|
25
|
75
|
TCP
|
5
|
0
|
5
|
0
|
100
|
Tify
|
14
|
0
|
14
|
0
|
100
|
Trihelix
|
7
|
3
|
4
|
42.86
|
57.14
|
WRKY
|
41
|
7
|
34
|
17.07
|
82.93
|
zf-HD
|
10
|
2
|
8
|
20
|
80
|
Total
|
490
|
66
|
424
|
|
|
Note: Upreg means the upregulated genes; Downreg means the downregulated genes.
Main GhMMLs contributed to lint fiber initiation
The 9th subfamily R2R3-MYB transcription factors MMLs is considered as Malvaceae-specific through evolutionary analysis [12] and among the 10 GhMMLs from GhMML1 to GhMML10, GhMML3 and GhMML4 had been demonstrated responsible for fuzz and lint fiber initiation respectively [11, 12] . Our TF classification had revealed that GhMML3, GhMML4, GhMML8, and GhMML9 were involved in lint fiber initiation, to confirm that, we conducted RT-PCR of all ten GhMMLs in n2NSM and Xu142fl during early lint fiber initiation period from -1 DPA to 1 DPA, given the fact that n2NSM and Xu142fl are all naked seed mutants and the only difference is whether the lint fiber initiates or not [12]. The results showed that 8 MMLs can be detected except GhMML8 and GhMML9 (Fig. 3), and 4 were down-regulated including GhMML1, GhMML3, GhMML4 and GhMML7 in Xu142fl compared with n2NSM,. Further investigation of their expression patterns by qRT-PCR confirmed that GhMML3, GhMML4 and GhMML7 were significantly down-regulated in Xu142fl compared with n2NSM at all three time points while no obvious expression differences were observed for GhMML1 (Fig. 4). We also found that expression levels of GhMML3 and GhMML4 in n2NSM were decreased, while that of GhMML7 were increased from -1 DPA to 1 DPA, implying different mechanisms between GhMML7, and GhMML3 and GhMML4 (Fig. 4).
Expansins enriched in cell wall organization may contributed to early lint fiber initiation
Cell wall reorganization is an essential event during fiber development involving multiple enzymes and wall proteins [5]. Here, we had a detailed investigation of the GO term-plant cell wall organization which contains 20 genes, and the results showed that 17 were EXPA encoding genes, including 4 EXPA1, 1 EXPA2, 6 EXPA4, 3 EXPA8 and 3 EXPA15, one was Pectinesterase (PE) encoding gene, and 2 were COBL10 (protein transport protein SEC61 subunit alpha) encoding genes, and they were all downregulated in Xu142fl comparing with that in the wild type (Table 3). It was worth noting that the EXPA2 gene, one EXPA4 encoding gene (Gh_A10G2323) and the two COBL10 encoding genes had very low expression levels in the wild type and hardly detectable in the mutant, implying that they might not the dominant genes for early lint fiber initiation. Of special note, data indicated that EXPAs might be the most important cell wall proteins for the early fiber cell initiation.
Table 3. Expression differences of the cell wall organization related genes.
Gene ID
|
Gene Annotation
|
Xu142_RPKM
|
Xu142fl_RPKM
|
log2FoldChange
(Xu142fl/Xu142)
|
Gh_A06G0018
|
EXPA1
|
8.39
|
3.13
|
-1.41
|
Gh_D05G1754
|
EXPA1
|
35.18
|
13.74
|
-1.36
|
Gh_A05G1576
|
EXPA1
|
27.35
|
11.60
|
-1.25
|
Gh_D12G1759
|
EXPA1
|
19.72
|
9.50
|
-1.06
|
Gh_A13G0672
|
EXPA15
|
16.72
|
4.63
|
-1.86
|
Gh_D13G0786
|
EXPA15
|
20.71
|
6.30
|
-1.71
|
Gh_A03G0885
|
EXPA15
|
48.83
|
21.23
|
-1.21
|
Gh_D10G1145
|
EXPA2
|
0.69
|
0.00
|
-6.20
|
Gh_A10G2323
|
EXPA4
|
0.32
|
0.00
|
-5.12
|
Gh_A07G0902
|
EXPA4
|
30.35
|
9.97
|
-1.59
|
Gh_D09G1463
|
EXPA4
|
125.90
|
43.37
|
-1.54
|
Gh_A04G0707
|
EXPA4
|
27.78
|
11.04
|
-1.35
|
Gh_A09G1454
|
EXPA4
|
63.47
|
27.08
|
-1.23
|
Gh_D07G0974
|
EXPA4
|
28.64
|
13.56
|
-1.01
|
Gh_A05G3493
|
EXPA8
|
17.43
|
2.33
|
-2.93
|
Gh_D04G1924
|
EXPA8
|
3.52
|
1.60
|
-1.17
|
Gh_D10G1861
|
EXPA8
|
11.23
|
5.25
|
-1.11
|
Gh_A10G1502
|
Pectinesterase
|
26.62
|
10.09
|
-1.40
|
Gh_D06G1606
|
COBL10
|
0.27
|
0.02
|
-3.70
|
Gh_A06G1281
|
COBL10
|
0.78
|
0.09
|
-3.11
|
Note: COBL10 encodes a protein transport protein SEC61 subunit alpha