Phylogeny of NFCs and distribution of RWP-RKs
To learn about the distribution of RWP-RKs across the NFC, based on gene duplication events, we built a phylogenetic tree with high bootstrap values for 26 species from three clades, defined as clade I (Brassicales), clade II (Fagales, Cucurbitales, and Rosales), and clade III (Fabales), with A. thaliana as the outgroup. Of the 25 species from the NFC, 21 form nodules, whereas the four remaining species lack the capacity for nodulation (Fig. 1A). A total of 292 RWP-RK domains and 156 PB1 domains were identified in the analyzed species (Table S1). RWP-RKs appear to be randomly distributed in each order, indicating that genome-wide gene duplication events did not contribute to the expansion of RWP-RKs in specific orders. Moreover, it appears that the RWP-RKs did not expand in nodulated plants (Fig. 1A and Fig. S1). For example, Begonia fuchsioides (which lacks nodules) contains the second-highest number (22) of RWP-RKs after G. max, and A. thaliana contains 14 RWP-RKs, more than many nodule-forming plants. These results suggest that the expansion of RWP-RKs involved in the nitrate signaling pathway and nodule inception might represent an adaptive response of plants to diverse nitrogen conditions. Compared to RKDs, NLPs (such as NIN, which is specifically required for nodulation in the NFC) contain an additional PB1 domain, which increases their interaction with other proteins [33, 34]. Although the median number of each domain is similar between nodule-forming and non-nodule-forming plants, the numbers of RWP-RK and PB1 domains are more variable in non-nodulating than in nodulating plants, apart from the outlying large number of RWP-RKs in G. max (possibly a result of its two genome-wide duplications) (Fig. 1B). We speculate that non-nodulating plants require more RWP-RK and PB1 domains to improve NUE because they cannot fix atmospheric nitrogen.
We also analyzed the consensus motifs of both the RWP-RK and PB1 domains within each species. The consensus motifs could not be analyzed in some species with few instances of these domains, for example, both the RWP-RK and PB1 domains in Lupinus angustifolius and the PB1 domain in Vigna angularis and Lotus japonicus (Table S1). A few species have large insertions in their RWP-RK domains, such as A. thaliana AT2G43500 (AtNLP8) and Discaria trinervis Distr2169S18482. Alongside the conserved motifs present in almost all RWP-RKs, similar insertions also occur in PB1 domains to some extent (Fig. S2).
Analysis of RWP-RK and PB1 domains in the NFC
Despite the important roles of RWP-RKs in both nodulating and non-nodulating plants [7, 9, 10], the features of these proteins within the NFC, such as protein motifs, gene structures, and evolutionary relationships, have not been investigated in detail. We therefore analyzed the consensus domains of RWP-RK and PB1 in nodule-forming and non-nodule-forming plants separately. These domains are strongly conserved but contain some modified insertions at different positions (Fig. 2A). Alignment of the domains revealed that the 49th site (K, Lys) and 63rd site (R, Arg) are conserved across all RWP-RKs (Fig. S3), suggesting that these sites are vital for RWP-RK activity. Except for Arachis ipaensis Araip.KR88K, B. fuchsioides Begfu255S10103, and B. fuchsioides Begfu91828S44750, which have lost the RWP-RK domain, all RWP-RKs contain the conserved RWPxRK signature, with some modifications, such as HWP-RK (HWPHRK of Araip.YWB61, Fabales), NWP-RK (NWPHRK of C.cajan_36762, Fabales), WWP-RK (WWPYRK of Datgl206S24145, Cucurbitales), HWP-RK (HWPSRK of Datgl229S25120, Cucurbitales), KWP-RQ (KWPHRQ of Glyma.04G054800.1.p, Fabales), and KWP-RK (KWPQRK of Glyma.06G054900.1.p) (Fig. S3).
Interestingly, almost all of these modifications occurred in the first amino acid of the conserved RWP-RK motif in nodule-forming Fabales and Cucurbitales plants. Family members with losses or modifications of RWP-RK might be nonfunctional or might have divergent functions. Some proteins (Araip.16Y1B, Araip.6QW2Y, Araip.G0MMI, Vang0010ss03040.1, Vang01g00130.1, Vradi06g00060.1) containing only PB1 domains but lacking RWP-RK domains might have different origins or play different roles and were therefore excluded from our analysis. The 4th K, 13th R, 62nd D, and 75th D sites of the PB1 domains of RWP-RKs are conserved across nodulating and non-nodulating plants (Fig. S4).
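The position-by-position comparison against the RWPxRK signature can be illustrated with a minimal sketch. The function name is hypothetical and the code is not the procedure used in this study; it only takes the six-residue variants quoted above and reports which positions depart from the canonical signature.

```python
CANONICAL = "RWPxRK"  # "x" stands for any residue

def signature_differences(hexamer):
    """Return the 1-based positions where a six-residue signature departs from RWPxRK."""
    if len(hexamer) != 6:
        raise ValueError("expected a six-residue signature")
    diffs = []
    for pos, (expected, observed) in enumerate(zip(CANONICAL, hexamer.upper()), start=1):
        # The wildcard position accepts any residue.
        if expected != "x" and expected != observed:
            diffs.append(pos)
    return diffs

# Variant signatures as quoted in the text (Fig. S3).
variants = {
    "Araip.YWB61": "HWPHRK",
    "C.cajan_36762": "NWPHRK",
    "Datgl206S24145": "WWPYRK",
    "Datgl229S25120": "HWPSRK",
    "Glyma.04G054800.1.p": "KWPHRQ",
    "Glyma.06G054900.1.p": "KWPQRK",
}

for gene, signature in variants.items():
    print(gene, signature, signature_differences(signature))
```

For the six variants listed, every signature differs at position 1, and KWPHRQ additionally differs at position 6.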
To explore the possible origins of all the identified RWP-RKs among the 26 species, we grouped the RWP-RKs into different orthogroups. An orthogroup is defined as a set of genes that descended from a single gene in the last common ancestor of the analyzed species [17]. Except for one A. thaliana RWP-RK (AT4G38340) and six Arachis RWP-RKs from A. duranensis and A. ipaensis (Aradu.I1BME, Aradu.TG0QF, Araip.377BK, Araip.6M6N8, Araip.YB35N, Araip.YWB61), which lack orthogroups (and are possibly orphan genes), the 285 remaining RWP-RKs clustered into 11 orthogroups containing 1–69 RWP-RK genes, suggesting that these genes have diverse origins (Fig. 2B and Table S3).
We predicted the length, isoelectric point, and subcellular localization of each RWP-RK using various databases [24, 35]. The isoelectric points of the 292 RWP-RKs range from 4.73 for Drydr146S16094 to 10.6 for Aradu.I1BME, and the molecular weights range from 7.14 kDa for Aradu.I1BME to 156.01 kDa for Aradu.G4SB3. Most proteins were predicted to localize to the nucleus (Table S4).
Phylogeny and characteristics of the RWP-RKs
To investigate the phylogenetic relationships among RWP-RK family members in the 26 species, we constructed a phylogenetic tree of 292 RWP-RKs via the neighbor-joining method with 1000 bootstrap replicates. To limit issues related to high divergence between proteins, we selected the RWP-RK domains with 30 additional amino acids upstream and downstream of the domains for alignment and phylogenetic analysis. The tree formed six clades based on the relationships of the NLPs and RKDs in A. thaliana. The NLP subfamily clustered into a single clade with three subclades, NLP-1 (AtNLP1, AtNLP5), NLP-2 (AtNLP6, AtNLP7), and NLP-3 (AtNLP8, AtNLP9), containing all NLPs with PB1 domains. The RKD subfamily includes RKD-1 (AtRKD4, AtRKD5), RKD-2 (AtRKD1, AtRKD2, AtRKD3), and RKD-3 (Fig. 3A). The NLP subfamily also includes several non-NLPs without PB1 domains (Glyma.06G000400.1, vigan.Vang06g08410.1, Araip.YWB61, vigan.Vang02g05230.1, Araip.377BK, and Araip.5C6JK), perhaps due to partial gene duplications of the NLP genes [8].
To learn more about the features of each clade of RWP-RKs in the phylogenetic tree, we investigated their gene structures using GSDS 2.0 [36] and predicted additional conserved motifs using MEME 5.0.1 [25]. Due to the lack of exon information in the annotated gff3 file of Trifolium subterraneum (with 13 RWP-RKs), we omitted those data and used the 279 remaining RWP-RKs for analysis. The average number of exons in the NLP subfamily and RKD subfamily is 4.9 (ranging from 2–13) and 4.0 (1–9), respectively. The number of exons in each subclade of NLP subfamily members is 4.1 for NLP-1 (2–11), 5.3 for NLP-2 (3–8), and 5.6 for NLP-3 (4–13). The number of exons in each subclade of RKDs is 3.7 for RKD-1 (1–6), 5.6 for RKD-2 (2–9), and 3.5 for RKD-3 (3–5) (Fig. 3B and Table S5). Therefore, the average number of exons is higher in the NLPs than in the RKDs.
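The per-subfamily exon averages depend on exon records being present in the gff3 annotation, which is why T. subterraneum was omitted. As a rough illustration only, exon counts per transcript can be tallied from gff3 lines as sketched below. The function names and the toy records are hypothetical, and the sketch assumes that each exon feature carries a Parent attribute naming its transcript.

```python
from collections import Counter

def exon_counts(gff3_lines):
    """Count exon features per parent transcript in GFF3 text lines."""
    counts = Counter()
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "exon":
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(item.split("=", 1) for item in fields[8].split(";") if "=" in item)
        # An exon shared by several transcripts lists them comma-separated.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                counts[parent] += 1
    return counts

def mean_exons(counts, members):
    """Average exon number over the listed transcripts that have exon records."""
    values = [counts[m] for m in members if m in counts]
    return sum(values) / len(values) if values else float("nan")

# Toy records with made-up identifiers, for illustration only.
toy_gff3 = [
    "##gff-version 3",
    "chr1\tsrc\tmRNA\t1\t900\t.\t+\t.\tID=tx1;Parent=gene1",
    "chr1\tsrc\texon\t1\t100\t.\t+\t.\tID=e1;Parent=tx1",
    "chr1\tsrc\texon\t200\t300\t.\t+\t.\tID=e2;Parent=tx1",
    "chr1\tsrc\texon\t500\t900\t.\t+\t.\tID=e3;Parent=tx2",
]
toy_counts = exon_counts(toy_gff3)
print(dict(toy_counts), mean_exons(toy_counts, ["tx1", "tx2"]))
```

A transcript with no exon records is simply left out of the average, which mirrors the omission described above.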
Of the 50 consensus motifs predicted by MEME, in addition to the three common motifs (including known RWP-RK and PB1 motifs) in both the RKDs and NLPs, the NLP subfamily contains 39 enriched motifs, whereas the RKD subfamily contains only 8 enriched motifs. Each subclade contains some unique motifs, pointing to the diverse roles of RWP-RKs, such as motif #24 in NLP-1, motif #18 in NLP-2, and motif #23 in NLP-3. Based on predictions from NetPhos 3.1 Server [37], motif #24 contains a predicted phosphorylation site (Y) with a score of 0.89; motif #18 contains a predicted phosphorylation site (S) with a score of 1.00; and motif #23 contains multiple predicted phosphorylation sites (S) with high scores. In addition, members of RKD-3 (only in the NFC) contain several unique motifs, including motif #34, motif #37, and motif #41 (Fig. S5, Fig. S6 and Table S6). These unique motifs in each subclade may increase the number of specific interactions with other proteins.
Statistical analysis of intron phases, that is, whether and where an intron disrupts a codon, revealed that phase 0 introns (which do not disrupt a codon) are the most abundant in RWP-RKs, followed by phase 2 and phase 1 introns, which disrupt the codon between bases 2 and 3 or bases 1 and 2, respectively. However, RKDs contain more phase 1 introns than NLPs do (Fig. 3C). Overall, the results of exon number and intron phase analyses were consistent with the phylogenetic analysis, but there were some exceptions. For example, Cerca58S27147 has more short additional exons than other members of the NLP-1 clade but lacks additional protein motifs, suggesting that additional exons of this gene might play a regulatory role in Cercis canadensis (Fig. S6).
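The phase definition above can be made concrete with a short sketch. It assumes that the coding segments of a transcript are supplied in 5' to 3' order (so minus-strand genes must be reversed first) and that the first segment starts at the first base of the start codon. The function name is hypothetical and this is not the tool used for Fig. 3C.

```python
def intron_phases(cds_lengths):
    """Return the phase of each intron, given CDS segment lengths in transcript order.

    The phase is the number of bases of the current codon that lie upstream
    of the intron: 0 = between codons, 1 = between bases 1 and 2,
    2 = between bases 2 and 3.
    """
    phases = []
    coding_bases = 0
    # The last segment is followed by the stop codon, not by an intron.
    for length in cds_lengths[:-1]:
        coding_bases += length
        phases.append(coding_bases % 3)
    return phases

# Three coding segments give two introns: 90 % 3 = 0 and (90 + 41) % 3 = 2.
print(intron_phases([90, 41, 100]))
```

Tallying these values per gene and per subfamily yields the kind of phase distribution compared between NLPs and RKDs.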
Comparison of RWP-RKs in non-nodulating vs. nodulating plants
A. thaliana has many advantages for studying interactions between diazotrophic bacteria and dicots [14], and knowledge from A. thaliana therefore helps to translate biological findings from model organisms to crops. To explore the relationship of RWP-RKs in nodulating and non-nodulating plants, we selected A. thaliana (non-nodulating), G. max (nodulating), and P. vulgaris (nodulating) for subsequent comparative analysis. We classified the RWP-RKs from A. thaliana (14), G. max (28), and P. vulgaris (12) into three clades: unique RKDs in G. max and P. vulgaris (Clade I), RKDs in all three species (Clade II, including AtRKD1–AtRKD5), and NLPs in all three species (Clade III, including AtNLP1–AtNLP9). Analysis of gene structure revealed that clade III genes contain many more exons than those of clade I, which is consistent with the finding that additional protein motifs (such as the important PB1 domain) are essential for the functioning of NLPs (Fig. 4A, Fig. S6 and Table S6). In addition to motif 1 and motif 3 (motif RWPxRK), which are present in all three clades, clade II contains a unique motif (#10), indicating that this motif plays a unique role in the clade. Based on the phylogenetic analysis and the presence of protein motifs in different positions, clade III was classified into three subclades: subclades IIIa (including AtNLP6 and AtNLP7), IIIb (including AtNLP8 and AtNLP9), and IIIc (including AtNLP1–AtNLP5). Almost all members of subclade IIIa contain a unique motif (#21) in their centers and motif #25 and motif #18 at their C termini; subclade IIIb members contain a unique motif (#23) at their C termini; and subclade IIIc members contain a unique motif (#24) at their N termini (Fig. 4A and Fig. S5). The additions and deletions in genes at different positions within clade III might have given rise to different functions, such as different cellular localizations and protein interactions.
Despite the important roles of nitrate in plant growth, N limitation, including N starvation or low N, is essential for nodulation in legumes [11, 12]. To explore the relationship between N limitation and nodulation, we integrated time-series transcriptome datasets from A. thaliana, including root samples treated with KCl (defined as N starvation) and KNO3 (defined as N supplementation) [38], as well as expression atlases from G. max and P. vulgaris [39, 40]. In A. thaliana, none of the AtRKDs are expressed in roots, whereas AtNLPs are differentially regulated. Specifically, AtNLP1 and AtNLP3 are downregulated and the other AtNLPs are upregulated, indicating that the different NLPs have different effects on plant responses to N starvation (Fig. 4B).
NLP genes are expressed in different tissues, with some specifically expressed in the nodules of G. max and P. vulgaris (Glyma.02G311000, Glyma.06G000400, Glyma.04G000600, Glyma.14G001600, Phvul.008G291800, and Phvul.009G115800). These nodule-specific genes clustered together and are closely related to AtNLP1 and AtNLP2 within subclade IIIc (Fig. 4A, C, D). However, AtNLP1 and AtNLP2 showed opposite fold changes in response to N starvation, indicating that regulatory elements have rapidly evolved. Taking the evolutionary relationships and expression patterns of model dicot and legume crops together, we conclude that compared to subclades IIIa and IIIb, the acquisition of additional protein motifs in subclade IIIc has provided the prerequisite for nodule inception and that the further evolution of regulatory elements within IIIc will be crucial for the development of this process.
The connected genes of AtNLPs in coexpression networks provide the prerequisite for nodulation
Our results demonstrate that AtNLPs are widely involved in plant responses to N starvation. Weighted correlation network analysis (WGCNA) is a powerful tool for dividing genes with correlated expression patterns into different modules with biological significance [15, 27, 41]. To investigate the possible relationship between N starvation and nodulation, we explored differences in the modules containing AtNLPs under N starvation vs. N supplementation. We used WGCNA to construct a weighted network from N-starvation datasets with rlog normalization (Fig. S7) implemented in the DESeq2 package [31]. Expression correlation (rho = 0.99) is more conserved than connectivity correlation (rho = 0.84) under N starvation (Fig. S8), suggesting that changes in connected neighbors play more important roles than changes in expression levels under N starvation. AtNLPs were present in five of the 12 modules (Fig. 5A–E). Different modules showed different expression patterns across a series of time points, indicating that their component genes play different roles under N starvation.
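To make the notion of connectivity concrete, the following is a minimal sketch of an unsigned, soft-thresholded co-expression network. It is not the WGCNA implementation: the soft-thresholding power of 6, the scaling of connectivity by its maximum, and all names are assumptions made for illustration, since the power and scaling used here are not stated in this section.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / sqrt(sxx * syy)

def scaled_connectivity(expr, beta=6):
    """Connectivity per gene, scaled to the most connected gene.

    expr maps gene -> expression values across samples (e.g. rlog values).
    Adjacency is |correlation| ** beta (assumed power); connectivity is the
    sum of a gene's adjacencies to all other genes.
    """
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, gene_i in enumerate(genes):
        for gene_j in genes[i + 1:]:
            adjacency = abs(pearson(expr[gene_i], expr[gene_j])) ** beta
            k[gene_i] += adjacency
            k[gene_j] += adjacency
    k_max = max(k.values())
    return {g: (v / k_max if k_max > 0 else 0.0) for g, v in k.items()}

# Toy profiles with made-up gene names, for illustration only.
toy_expr = {"geneA": [1, 2, 3, 4], "geneB": [2, 4, 6, 8], "geneC": [4, 1, 3, 2]}
toy_k = scaled_connectivity(toy_expr)
print(toy_k)
```

Running the same calculation separately on the N-starvation and N-supplementation samples gives two connectivity values per gene, which is the type of comparison reported for the AtNLPs.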
GO enrichment analysis of genes in the biological process category revealed no commonly enriched biological process in the five modules (Fig. 5F and Table S7). In the purple module (with AtNLP2, AtNLP4 and AtNLP9), in which genes were continuously upregulated after 20 min of N starvation, we detected a high proportion of unique GO terms associated with transport (20/54), such as “calcium ion transport” (GO:0006816) and “calcium ion transmembrane transport” (GO:0070588), which may be related to “calcium spiking”, a symbiotic signaling event. Genes in the blue module (with AtNLP3) were relatively upregulated at the early stage (10–15 min of treatment). We discovered uniquely enriched GO terms in the blue module, such as “endocytosis” (GO:0006897). The unique GO terms in the green module (with AtNLP6 and AtNLP7) included many important terms, such as “cell wall organization or biogenesis” (GO:0071554), “cell wall organization” (GO:0071555), “plant-type cell wall organization or biogenesis” (GO:0071669), “cell wall modification” (GO:0042545), “root hair elongation” (GO:0048767), “root hair cell development” (GO:0080147), and “auxin-activated signaling pathway” (GO:0009734). Interestingly, the blue and green modules were both enriched in “symbiosis, encompassing mutualism through parasitism” (GO:0044403), with 36 and 24 genes, respectively (Table S7). Although genes in both modules were highly expressed only at 10–15 min of N starvation, the genes enriched for the GO term “symbiosis, encompassing mutualism through parasitism” did not overlap, highlighting the independent roles of genes regulated by AtNLP3, AtNLP6, and AtNLP7 within each module. This result also suggests that the response to N starvation at an early stage may be essential for symbiosis.
The connectivity value of AtNLP3 within the blue module was 0.32 under N-starvation conditions and 0.08 under N-supplementation conditions. AtNLP3 was positively correlated with various pathogen/virus-related genes, such as AT5G03210 (ATDIP2, resistance to virus), AT5G08790 (ANAC081, may function as a repressor of pathogenesis-related proteins), AT1G32400 (TOM2A, viral replication complex formation and maintenance), AT5G48160 (OBE2, transport of virus in host), AT5G10270 (CDKC;1, response to virus), AT5G42950 (EXA1, defense response to virus), AT3G11650 (NHL2, defense response to virus), AT5G23570 (ATSGS3, defense response to virus), AT2G25620 (ATDBP1, regulation of defense response to virus), AT2G23350 (PAB4, viral process), AT1G60800 (AtNIK3, defense response), and AT3G04720 (AtPR4, defense response to bacterium).
Genes positively correlated with AtNLP6/AtNLP7 also included several pathogen/virus-related genes, such as AT3G11660 (NHL1, defense response to virus), AT5G06320 (NHL3, defense response to bacterium), AT2G35980 (ATNHL10, defense response to virus), AT4G13350 (NIG, response to virus), AT1G70690 (HWI1, defense response to bacterium), AT3G60240 (CUM2, response to virus), and AT5G04430 (BTR1, regulation by virus of viral protein levels in host cell). Correlation analysis showed that AtNLP3, AtNLP6, and AtNLP7 were more highly correlated with genes in GO categories GO:0044403 (symbiosis, encompassing mutualism through parasitism) and GO:0009267 (cellular response to starvation) under N-starvation than under N-supplementation conditions (Fig. 6A,B,C). In addition, AtNLP3 was downregulated under N starvation compared to N supplementation (Fig. 5B). The downregulation of AtNLP3, along with the downregulation of immune response genes, may benefit rhizobial invasion by reducing the immune response. The expression levels of AtNLP6 and AtNLP7, which have high sequence similarity, changed little (fold change < 2) between N-starvation and N-supplementation conditions. AtNLP7 is a major regulatory element among NLP proteins [42]. Given their weak upregulation, AtNLP6 and AtNLP7 might function primarily at the protein level rather than the transcriptional level, as previously reported [42].
We constructed a protein association network connecting AtNLP3, AtNLP6, and AtNLP7 to the correlated genes using STRING 11.0 (Fig. 6C,D). While these proteins were connected to genes from GO term GO:0044403 (symbiosis, encompassing mutualism through parasitism), there was no significant association between these genes and other proteins. Additional experiments should be carried out to investigate the interactions of these highly connected genes, such as AT4G13350 (response to virus, connectivity value of 0.80 for AtNLP6 and 0.90 for AtNLP7 under N starvation, 0.26 for AtNLP6 and 0.35 for AtNLP7 under N supplementation). A set of genes from GO:0009267 (cellular response to starvation), including AT5G45380 (ATDUR3), AT4G35090 (CAT2), AT3G05630 (PDLZ2), AT1G20620 (ATCAT3), and AT1G20630 (CAT1), were also enriched in GO:0006995 (cellular response to nitrogen starvation). Moreover, AtNLP6 and AtNLP7 were both associated with AT1G13300 (NIGT1/HRS1), which is regulated by AtNLP7 [43]. Although AtNLP6 and AtNLP7 might play redundant roles [44], greater differences in correlation were observed between AtNLP6 and NIGT1/HRS1 than between AtNLP7 and NIGT1/HRS1 under N-starvation and N-supplementation conditions. NIGT1/HRS1 is induced by NO3− [43]. The higher correlation between AtNLP6/AtNLP7 and NIGT1/HRS1, together with the higher connectivity of AtNLP6 (0.62 under N starvation vs. 0.16 under N supplementation), AtNLP7 (0.47 vs. 0.11), and NIGT1/HRS1 (0.41 vs. 0.06) in the green module (Table S8), indicates that these genes strongly interact under N-starvation conditions.
Most NLPs (AtNLP2–AtNLP7), especially AtNLP5 from the turquoise module, which was significantly upregulated under N starvation (Fig. 4B), showed higher correlations to more genes under N starvation than under N supplementation (absolute pcc ≥ 0.8) (Fig. S9). AtNLP5 (in the turquoise module) was significantly upregulated after 20 min of N starvation, whereas AtNLP3 (in the blue module) was significantly downregulated (Fig. 5C,E); these opposite expression patterns indicate that AtNLPs play diverse roles under N starvation. AtNLP1, AtNLP8, and AtNLP9, which lacked strong correlation (absolute pcc ≥ 0.8) to any other gene under N supplementation (defined as lonely expressed genes), showed differential expression patterns under N starvation vs. N supplementation. The expression patterns of these genes were not significantly correlated to those of other genes under N starvation, indicating that they have lost many connected neighbors.
Among the genes most closely related to nodule-specific NLPs of G. max and P. vulgaris, AtNLP1 and AtNLP2 had opposite expression patterns. AtNLP2 has many connected neighbors, whereas AtNLP1 has completely lost highly connected neighbors (Fig. 4 and Fig. S9). AtNLP2 reflects the evolutionary imprint of nodule-specific NLP genes. Taken together with the expression patterns of AtNLP6/AtNLP7, these results suggest that the differential connectivities of the NLPs, along with their differential expression patterns, may be essential for nodulation under N starvation. The association of NLP genes under N starvation with genes related to the biological processes of symbiosis, cell cycle, cell wall organization or biogenesis, and calcium ion transport might have paved the way for nodulation in the NFC.
Genes regulated under N starvation and nodulation are connected to multiple transcription factors, transcriptional regulators, and protein kinases
TFs, transcriptional regulators (TRs), and protein kinases (PKs) are important classes of regulatory proteins associated with numerous aspects of plant growth and development, as well as biotic and abiotic stress responses [45]. To further explore the relationships of NLPs to other TFs, TRs, and PKs under N-starvation and N-supplementation conditions in A. thaliana, and in the transcriptome atlases of G. max and P. vulgaris, we analyzed sub-networks of the NLPs connected to these regulatory proteins with absolute pcc ≥ 0.8. Many more TFs, TRs, and PKs were highly correlated to NLPs under N starvation (1,371 with 674 TFs, 199 TRs, and 498 PKs, with 397,777 edges) vs. N supplementation (288 with 166 TFs, 29 TRs, and 93 PKs, with 7,298 edges), indicating that N starvation strongly influences the TFs, TRs, and PKs connected to NLPs at the transcriptional level (Fig. 7A,B and Table S10).
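The filtering step behind these counts can be sketched as follows. This is only an illustration under stated assumptions: the function name, input layout, and identifiers are hypothetical, and the sketch tallies direct NLP-to-regulator pairs only, so it does not attempt to reproduce the edge totals reported above.

```python
from collections import Counter

def summarize_subnetwork(correlations, regulator_class, cutoff=0.8):
    """Select NLP-regulator pairs with |pcc| >= cutoff and tally regulator classes.

    correlations: iterable of (nlp, gene, pcc) tuples.
    regulator_class: dict mapping a gene to "TF", "TR", or "PK".
    Returns (Counter of distinct regulators per class, list of retained edges).
    """
    edges = [
        (nlp, gene, pcc)
        for nlp, gene, pcc in correlations
        if abs(pcc) >= cutoff and gene in regulator_class
    ]
    # Each regulator is counted once, however many NLPs it is linked to.
    distinct_regulators = {gene for _, gene, _ in edges}
    per_class = Counter(regulator_class[gene] for gene in distinct_regulators)
    return per_class, edges

# Toy input with made-up identifiers, for illustration only.
toy_correlations = [
    ("NLP_a", "tf1", 0.90),
    ("NLP_a", "pk1", -0.85),
    ("NLP_b", "tf1", 0.81),
    ("NLP_b", "tr1", 0.50),        # below the cutoff
    ("NLP_a", "unannotated", 0.95),  # not a TF, TR, or PK
]
toy_classes = {"tf1": "TF", "pk1": "PK", "tr1": "TR"}
toy_per_class, toy_edges = summarize_subnetwork(toy_correlations, toy_classes)
print(dict(toy_per_class), len(toy_edges))
```

Using the absolute value keeps strongly negative correlations as edges, in line with the "absolute pcc ≥ 0.8" criterion.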
The highly correlated TFs, TRs, and PKs were enriched in both the same and unique GO terms. For example, of the 577 TFs highly correlated only to AtNLPs under N starvation (pcc ≥ 0.8), 502 were enriched in “regulation of nitrogen compound metabolic process” (GO:0051171). In addition, 109 of 182 TRs were enriched in “regulation of nitrogen compound metabolic process” (GO:0051171) (Fig. 7C,D and Table S11). The highly correlated PKs were enriched in “protein phosphorylation” (GO:0006468), “cell communication” (GO:0007154), and “cell surface receptor signaling pathway” (GO:0007166), indicating that N starvation influences plant cell–cell signaling pathways mediated by NLPs (Fig. 7E and Table S11).
To further explore the possible roles of RWP-RKs in nodule-forming plants, we constructed networks for G. max and P. vulgaris. P. vulgaris contains 9 RWP-RKs, including 7 NLPs and 2 RKDs, which are highly correlated to 130 regulators, with 1,639 edges. G. max contains 16 RWP-RKs, including 14 NLPs and 2 RKDs, which are highly correlated to 270 regulators, with 4,113 edges (Fig. 8A,B and Table S12). A recent whole-genome duplication event in G. max affecting RWP-RKs has increased the number of edges connecting RWP-RKs to gene regulators compared to P. vulgaris. Among these RWP-RKs, Phvul.009G115800 and Phvul.008G291800 in P. vulgaris and Glyma.04G000600, Glyma.02G311000, Glyma.14G001600, and Glyma.06G000400 in G. max, which are NIN (nodule inception) genes [6], were highly expressed in separate nodules and are located in separate modules, pointing to their overall upregulation in nodule tissue (Fig. 8C, D and Table S12). These nodule-specific NLPs clustered together in clade IIIc.
Two other genes in G. max, Glyma.13G346300 and Glyma.12G050100, which are closely related to AtNLP8 and AtNLP9 within clade IIIb, were highly expressed in nodules. Interestingly, Phvul.005G15510 in clade IIIb was expressed at higher levels in ineffective N-fixation nodules than in effective N-fixation nodules. This observation suggests that the regulation of NLPs from clade IIIb is also influenced by rhizobia without the capacity for N fixation and that NLPs from clade IIIc are critical for N-fixing nodules.
We also detected divergence within IIIc: Phvul.009G01120, together with Glyma.04G017400 and Glyma.06G017800, which are closely related to AtNLP4 and AtNLP5, also showed the highest expression levels in ineffective N-fixation nodules. Only genes in the small clade closely related to AtNLP1 and AtNLP2 were specifically upregulated in effective N-fixation nodules in both G. max and P. vulgaris (Fig. 4). These findings reflect the functional divergence of NLPs with regard to nodulation, that is, the existence of nodulation with and without N fixation.
Effect of nitrogen on nodulation via the regulation of NLPs in P. vulgaris
Not only are NLPs involved in the nitrate signaling pathway, but they also influence nodule inception [6]. Whether nitrate influences nodulation via the regulation of NLPs mediating nitrate signaling in P. vulgaris has been unclear. In plants inoculated with Rhizobium tropici and treated with different concentrations of nitrate, 10 mM nitrate inhibited nodulation in both early and mature nodules. Mature nodules from plants treated with 5 mM nitrate appeared to be browner, and the plants were taller, than those treated with either 0 mM or 10 mM nitrate (Fig. 9A, B, C and Fig. S10).
Expression analysis of six PvNLP genes in early roots and root-nodule mixtures under different concentrations of nitrate indicated that Phvul.004G114100, Phvul.008G291800, and Phvul.011G052100 showed higher expression at 5 mM nitrate (low-nitrogen conditions) with inoculation than they did under either 0 mM nitrate (nitrogen-free conditions) or 10 mM nitrate (high-nitrogen conditions) regardless of inoculation (Fig. 10A). When the roots were treated with different concentrations of nitrate without inoculation, Phvul.009G011200 and Phvul.009G115800 showed the highest expression under low-nitrogen conditions as compared to both nitrogen-free and high-nitrogen conditions, whereas the other genes showed gradual inhibition with increasing nitrogen concentration. Finally, Phvul.008G291800, Phvul.009G011200, and Phvul.011G052100 were significantly inhibited under high-nitrogen conditions (Fig. 10B). In mature nodules, Phvul.007G071900 was significantly upregulated under low-nitrogen vs. nitrogen-free conditions, whereas other genes were downregulated or varied in expression pattern (Fig. 10C).