Fiber yield is a crucial parameter affecting the commercial production of textiles and has been studied in multiple crops [29, 30]. Identifying the key genetic drivers of this complex yet essential trait is challenging, and the genes regulating the fiber content of hemp remain unexplored to date. Hence, in this study, we leveraged next-generation sequencing (NGS), which is more efficient, cost-effective, and accurate than conventional methods, to develop new genetic markers associated with essential crop traits such as fiber content [31–33]. Recently, SLAF-seq has emerged as a technique with exceptionally high resolution and efficiency for identifying SNP markers in specific populations. Molecular markers can be developed directly from paired-end analysis of sequence-specific restriction fragment lengths. Our study identified 389,687 SNP loci that can be used to develop SNP molecular markers. KEGG analysis of the genes identified in our study showed that the candidate regions were enriched in the following pathways: amino sugar and nucleotide sugar metabolism, glycosphingolipid biosynthesis (both globo and ganglio series), one carbon pool by folate, basal transcription factors, lysine biosynthesis, photosynthesis, glycosaminoglycan degradation, and starch and sucrose metabolism (Fig. 5). Moreover, quantitative analysis indicated that six genes (LOC115705530, LOC115705875, LOC115704794, LOC115705371, LOC115705688, and LOC115707511) may play essential roles in regulating hemp fiber content.
One of the candidate genes identified by our analysis, LOC115705530, is located in region 1. A comparative analysis revealed that this gene has the highest similarity with the Arabidopsis thaliana gene At3G11280, which encodes a duplicated homeodomain-like superfamily protein with a typical MYB domain. Studies of novel regulators of vascular development in Arabidopsis thaliana show that MYB has 40 interacting molecules [34]. It has also been reported that FSM1, a tomato gene encoding a short protein with an atypical MYB-like domain, negatively regulates cell expansion in the vascular bundle of the fruit pericarp [35]. Collectively, these studies indicate an essential role of the MYB domain in regulating vascular development in multiple plants. MYB is also known to regulate lignin biosynthesis by recognizing AC elements in the promoters of many lignin monomer biosynthesis genes [36] and is strongly implicated in the regulation of secondary cell wall biosynthesis [37–39]. MYB transcription factors are also closely associated with lignification in jute and ramie [40, 41], while MYB46-1, another MYB family transcription factor, regulates secondary cell wall and lignin biosynthesis in hemp [42]. However, further investigation is required to establish the actual role of LOC115705530 in hemp.
Another gene identified in our study was LOC115705875, the only candidate gene located in region 2. The protein encoded by this gene had the highest homology with AUXIN1 (AUX1), which has been reported to encode a high-affinity auxin influx carrier. In Arabidopsis thaliana, AUX1 belongs to the AUX/LAX multigene family, which consists of four highly conserved genes: AUX1 and the Like AUX1 (LAX) genes LAX1, LAX2, and LAX3. All four AUX/LAX family members are known to function in auxin uptake [43]. Auxin is an essential hormone for plant growth and development. The AUX1/LAX auxin influx carriers transport auxin into cells and promote xylem differentiation in stem and root tissues by increasing cytoplasmic auxin signaling, thereby regulating vascular patterning and differentiation [44].
Our RT-qPCR analysis showed that LOC115705875 differed from the other candidate genes: its expression pattern indicated negative regulation during both the seedling and technical maturity periods. Low-fiber hemp varieties appeared to require more auxin, possibly because more auxin is needed for xylem differentiation when fiber content is lower. The expression of this gene at the technical maturity stage was significantly lower than that at the seedling stage, which might reflect the greater auxin requirement of hemp during early development. Guerriero et al. found that the expression of genes related to auxin metabolism was higher in the older stem nodes at the 6-week seedling stage [45]. Studies on the secondary growth stage of hemp showed high biological activity of auxin during the deposition and remodeling of the primary cell walls [46]. In other words, there is an increased demand for auxin during the early development of secondary phloem fibers in hemp. Collectively, these reports are consistent with the expression pattern we observed for this gene and suggest a probable mechanism by which auxin might regulate the fiber content of hemp.
The next candidate gene, LOC115704794, was located in region 3 and had the highest homology with AT4G00430, which encodes the aquaporin protein AtPIP1;4. Aquaporins (AQPs) are transmembrane channel proteins that regulate the intracellular and intercellular diffusion of water and other uncharged solutes such as glycerol, hydrogen peroxide, ammonia, small organic acids, urea, and metalloids. AQPs are essential for maintaining water balance, osmotic regulation, signal transduction, detoxification processes, and the acquisition and transport of nutrients in various organisms [47, 48]. Plasma membrane intrinsic proteins (PIPs) are one of five AQP subfamilies and have attracted particular attention for their potential to improve water retention and photosynthesis in plants [49]. PIPs can be divided into the subtypes PIP1 and PIP2 [50], which share 80% amino acid sequence homology. The main differences between the two groups lie in the N- and C-terminal ends, the length of loop A, and the amino acid composition [51]. The PIP1 subfamily was initially thought to be nonfunctional due to its failure to localize to the plasma membrane [52]. However, later experiments have shown that PIP1 does have a functional role in the cell. For example, in stem parenchyma cells, the response to drought stress involved significant upregulation of the PIP1 subfamily of water channels rather than the PIP2 subfamily [53].
Moreover, AtPIP1;2 in Arabidopsis thaliana is known to promote the water conductivity of roots and rosette leaves [54], while AtPIP1;4 mediates the transport of CO2, an essential regulator of photosynthesis [55]. However, a recent study reported negligible changes in the photosynthetic efficiency and mesophyll conductivity of Arabidopsis aquaporin knockout mutants (PIP1;2, PIP1;3, PIP2;6) compared to the control group [56]. A study in cotton showed that GhPIP1-2, which belongs to the PIP1 family, is mainly expressed during the fiber elongation period. Its expression was highest at five days post-flowering, suggesting a vital role in supporting the rapid flow of water into the vacuoles during cell elongation [57]. The transcriptional abundance of the PIP gene family in Calotropis procera fiber cells is greater than that in cotton. Studies on long thorns, which are adapted to survive in harsh environmental conditions such as drought and saline-alkaline conditions, also verified the role of PIP aquaporins in the elongation of fiber cells, although they suggest a more critical role for PIP2 than for PIP1 [58]. Therefore, we propose that LOC115704794 may affect fiber yield by regulating CO2 transport for photosynthesis. We also suggest that LOC115704794 has a role similar to that of GhPIP1-2, adjusting the length and width of fibers and thereby affecting fiber content. However, further experiments are needed to determine the exact mechanism by which LOC115704794 regulates hemp fiber content.
The next candidate gene, LOC115705371, also located in region 3, had the highest similarity to the Arabidopsis gene AT2G28760, which encodes the protein UXS6. There are six UXS genes in the Arabidopsis genome, among which UXS3, UXS5, and UXS6 have the highest expression in the stem, mainly in the xylem cells and the interfascicular fibers. These genes, whose products regulate secondary wall formation, are directly regulated by the secondary wall NAC transcription factors. The simultaneous down-regulation/mutation of UXS3, UXS5, and UXS6 results in significant decreases in primary wall xyloglucan content, secondary wall thickening, and xylan content, as well as severe deformation of xylem vessels. Xylan and xyloglucan are the two main hemicelluloses in plant cell walls. In dicotyledonous plants, xyloglucan is the main hemicellulose in the primary wall, whereas xylan is the main hemicellulose in the secondary wall. Their biosynthesis requires a stable supply of sugar donors such as UDP-xylose, which is synthesized from UDP-glucuronic acid by UDP-xylose synthase. UXS3, UXS5, and UXS6 play a significant role in supplying UDP-xylose for the biosynthesis of xylan and xyloglucan [59]. Hence, we propose that LOC115705371 may affect fiber content by regulating the hemicellulose content of hemp phloem fibers.
The next candidate gene, LOC115705688, located in region 3, showed the highest similarity with AT2G35040. This gene encodes phosphoribosylaminoimidazolecarboxamide formyltransferase, which belongs to the AICARFT/IMPCHase bienzyme protein family [60]. GO analysis showed that this gene was involved in nucleotide transport and metabolism, while KEGG pathway analysis showed its involvement in one-carbon (C1) metabolism. C1 metabolism is closely related to lignin biosynthesis [61–63]. In hemp, lignification is one of the processes that differentiate bast fiber from core fiber. A comparative gene expression analysis of hemp bast fiber and core fiber showed that most of the encoded proteins were involved in regulating C1 metabolism and lignin biosynthesis [64]. We therefore propose that LOC115705688 may participate in the lignification process and thereby regulate hemp fiber content.
The final candidate gene, LOC115707511, located in region 4, had the highest similarity with the WRKY transcription factor WRKY70. WRKY transcription factors form one of the largest transcription factor families discovered to date and participate in regulating development, signal transduction, and stress defense processes in various plants. WRKY proteins activate or repress transcription as either homodimers or heterodimers [65]. WRKY70 has also been reported to regulate jasmonic acid and salicylic acid signaling [66]. A study in cotton showed that the laccase gene GhLac1 regulates fiber initiation and elongation by coordinating jasmonic acid and flavonoid metabolism [67]. WRKY transcription factors have also been found to be up-regulated in jute during early fiber development [68]. The expression of genes related to jasmonic acid biosynthesis in adult stem nodes was higher in the phloem [69]. Taken together, these studies suggest that LOC115707511, which most closely resembles WRKY70, may affect fiber development by regulating the jasmonic acid pathway.
Our study reports the identification of genes involved in regulating the fiber content of hemp using an integrated SLAF-seq and BSA approach. These findings have the potential to lay a good foundation for hemp breeding via molecular marker-assisted selection. However, further in-depth analysis and functional characterization of these candidate genes, by transformation or mutant analysis, are required to conclusively delineate their roles in regulating hemp fiber content.