Nodule-Inception-Like Protein (NLP) Gene Family Identified in Physcomitrella patens Genome Responds to Variable Nitrogen Supply

NODULE-INCEPTION-like proteins (NLPs) are plant-specific transcription factors that play a significant role in orchestrating the nitrogen response. NLPs have been widely studied in vascular plants, but very little is known to date about NLPs in non-vascular bryophytes. In the current study, in silico tools were first employed to identify and characterize NLPs in the genome of the model bryophyte Physcomitrella patens. The expression profiles of PpNLPs were then assessed under variable nitrogen supply. A total of 6 Physcomitrella patens NLP genes (PpNLPs) were identified that resembled Arabidopsis thaliana NLPs (AtNLPs) in their physical and chemical attributes. PpNLPs were similar to AtNLPs in their isoelectric point and hydropathicity values, while gene lengths, protein lengths, and molecular weights were higher in PpNLPs than in AtNLPs. It was further observed that all PpNLPs except PpNLP6 yield acidic hydrophilic proteins localized in the nucleus and share a significant degree of homology with AtNLPs in their gene structures and protein motifs. Phylogenetic analysis indicated that PpNLPs possess significant evolutionary linkage with NLPs of Arabidopsis thaliana, Oryza sativa, and Zea mays. Protein-protein interaction analysis suggested that PpNLPs coordinate substantially with nitrogen-responsive genes such as nitrate reductase. Expression of all PpNLPs was up-regulated when nitrate (5 and 10 mM) was available as the sole nitrogen source, while no significant increase was observed in the absence (0 mM) of nitrogen. Expression levels increased with treatment time across 0, 6, 12, 24, 48, and 72 hours. The results suggest that NLPs are responsive to, and significantly regulated by, nitrogen supply.
It is established in various vascular plants that the NODULE-INCEPTION-LIKE PROTEIN (NLP) gene family is responsive to nitrogen.
Our study aimed at genome-wide identification and characterization of the NLP gene family in the Physcomitrella patens genome. Furthermore, expression analysis showed that NLPs are responsive to nitrogen supply in this model bryophyte.


Introduction
Nitrogen (N) is an essential macronutrient for plant growth and yield (Tegeder and Masclaux-Daubresse 2018). Usable N is limited in soil; therefore, N fertilizers are supplemented in agriculture to achieve high crop yield (Li et al. 2018). However, plants absorb only a fraction (30-40%) of applied N, while more than half (60-70%) is lost to the soil, causing severe soil and water pollution (Garnett et al. 2009). Inefficient conversion and consumption of N fertilizer also induces emission of nitrous oxide and hence contributes to global warming (Fagodiya et al. 2017). Despite these potential threats to the environment, global demand for N fertilizer in agriculture increases continuously. Approximately 112 million tons (Mt) of N fertilizer were applied worldwide in 2015, and 118 Mt were recorded in 2019 (FAO 2019). Such a progressive increase in demand for enormous fertilizer quantities raises agricultural costs as well. Therefore, enhancing the plant's ability to use N efficiently can elevate crop yield while reducing fertilizer input, agricultural costs, and environmental pollution (Alfatih et al. 2020). The term NUE (N use efficiency) jointly describes the processes of N-uptake efficiency (NUpE) and N-utilization efficiency (NUtE) in plants. NUE has been precisely defined as the amount of crop biomass or grain yield achieved per unit of N applied (Moll et al. 1982). Improving crop NUE is widely recognized as an economical, effective, and desirable way of reducing N-associated agricultural and environmental problems. It is estimated that increasing crop NUE by merely 1% can significantly enhance crop yield and could save up to 1.1 billion US dollars a year (Kant et al. 2011). However, the comprehensive molecular mechanisms regulating NUE are yet to be understood.
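The definition of NUE given above can be written compactly. The symbols below are assumptions introduced here for illustration and are not taken from this paper: G_w is grain yield (or biomass), N_s is the N supplied, and N_t is the total N taken up by the plant. Under that notation, NUE factors into the uptake and utilization components named in the text:

```latex
\mathrm{NUE} \;=\; \frac{G_w}{N_s}
\;=\; \underbrace{\frac{N_t}{N_s}}_{\mathrm{NUpE}}
\;\times\;
\underbrace{\frac{G_w}{N_t}}_{\mathrm{NUtE}}
```

The factorization makes explicit why the two efficiencies are discussed jointly: a gain in either uptake or utilization raises NUE proportionally when the other is held constant.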
Plants have evolved effective and highly coordinated molecular mechanisms of N acquisition, assimilation, transport, and metabolism, governed by several transcription factors (TFs) and gene families. Plants absorb predominantly inorganic nitrate (NO3−) from soil and transport it into the cell with the help of nitrate transporters such as NRT1 (Orsel et al. 2002) and NRT2 (Orsel et al. 2002), and across channels including CLC (chloride channel) (Zifarelli and Pusch 2010) and SLAH (slow anion channel-associated homologues) (Qiu et al. 2016). The absorbed inorganic nitrate is then reduced to ammonium (NH4+) by nitrate reductases (NIA1, NIA2) (Olas and Wahl 2019) and nitrite reductase (NiR) (Takahashi et al. 2001).
The moss Physcomitrella patens is an established model non-vascular bryophyte for modern plants because it lies at the base of the evolutionary lineage of today's plants and algae. The similarities and dissimilarities between mosses and modern plants should be evident from their genomes. As the Arabidopsis thaliana and Physcomitrella patens genomes have been sequenced, a genome-wide comparison of A. thaliana with P. patens to find orthologous and paralogous genes is a plausible way of finding the evolutionary linkage between these two model organisms (Rensing et al. 2008). In this study, we first used in silico tools to identify NLP genes in P. patens genome databases. Subsequently, the expression patterns of NLP genes in various parts of P. patens in response to varying N concentrations were assessed. Our study provides valuable ground for understanding the evolutionary relationship among NLPs of model vascular and non-vascular plants, which will facilitate in vivo functional characterization of PpNLPs in the future.

Physcomitrella patens Growth Conditions
The Physcomitrella patens growth conditions were optimized according to an established protocol (Koduri et al. 2010). Gametophores of P. patens ecotype Gransden 2004 were grown axenically at 25 ± 1 °C in continuous light (intensity: 50 µmol m−2 s−1) and sub-cultured for three weeks. Explants from precultures were allowed to grow for a week and then treated with variable N supply in liquid BCDA medium (Table S1). KNO3 was used as the sole N source to provide N-deficient (0 mM), N-limiting (5 mM), and N-sufficient (10 mM) conditions in BCDA medium. The plants were treated for 0, 6, 12, 24, 48, and 72 hours. The rhizoid, stem, and phylloid were harvested and stored at −80 °C.
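The treatment scheme described above is a full factorial layout, which can be enumerated as a quick sanity check on sample counts. The sketch below is illustrative only: the concentrations, time points, and tissues come from the text, while the variable and function names are assumptions, and the number of biological replicates is not stated in this section and is therefore left out.

```python
from itertools import product

# Values taken from the text; names are illustrative assumptions.
NITRATE_MM = [0, 5, 10]                  # N-deficient, N-limiting, N-sufficient KNO3
TIME_POINTS_H = [0, 6, 12, 24, 48, 72]   # treatment duration in hours
TISSUES = ["rhizoid", "stem", "phylloid"]


def build_design():
    """Return one record per (nitrate, time, tissue) combination."""
    return [
        {"kno3_mM": conc, "hours": hours, "tissue": tissue}
        for conc, hours, tissue in product(NITRATE_MM, TIME_POINTS_H, TISSUES)
    ]


design = build_design()
print(len(design))  # 3 concentrations x 6 time points x 3 tissues = 54
```

Multiplying the 54 conditions by the replicate count actually used would give the number of RNA samples to plan for.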

RNA Extraction and qPCR
Total cellular RNA from three selected parts (rhizoid, stem, and phylloid) was extracted with a column-based RNA extraction method (Seçgin et al. 2020). cDNA was synthesized from the extracted RNA using oligo-dT primers and reverse transcription (TaKaRa) as per the supplier's protocol. The quantified cDNA was subjected to reverse transcription qPCR (StepOnePlus Real-Time PCR System) using the P. patens Actin3 gene as internal reference. Gene-specific primers (Table S2) were obtained from qPrimerDB version 1.2 (Bustin and Huggett 2017).
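This section does not state how relative expression was calculated from the qPCR data. A common choice when a single internal reference such as Actin3 is used is the 2^-ΔΔCt method, sketched below purely as an assumption. The function name, argument names, and Ct values are hypothetical, and the calculation assumes roughly 100% amplification efficiency for both primer pairs.

```python
def relative_expression(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Fold change by the 2^-ddCt method (assumed here, not stated in the paper).

    ct_target / ct_ref: Ct of the gene of interest and of the reference gene
    in the treated sample; *_cal: the same values in the calibrator sample
    (for example the 0 h time point).
    """
    delta_ct_sample = ct_target - ct_ref
    delta_ct_calibrator = ct_target_cal - ct_ref_cal
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only.
fold = relative_expression(ct_target=24.0, ct_ref=18.0,
                           ct_target_cal=26.0, ct_ref_cal=18.0)
print(fold)  # 4.0
```

A fold change above 1 indicates up-regulation relative to the calibrator, and a value of 1 indicates no change.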

Screening of Genome and Transcription Factors Databases
The full-length gene, protein, and coding sequences of all members of the Arabidopsis thaliana NLP (AtNLP) gene family were retrieved from the Arabidopsis genome database (TAIR: http://arabidopsis.org/). In total, three genome databases and one plant TF database were screened for identification of putative PpNLPs. First, the AtNLP protein sequences were used as BLAST queries in screening NCBI (

Statistical Analysis
The results were statistically validated at a significance level of P < 0.05, and graphs were prepared using GraphPad Prism 8.

Results
Genome-Wide Identification and Analysis of Physcomitrella patens NLP Homologues
In the present study, three genome databases (NCBI, Phytozome v12, Phytozome v13) and one plant TF database (iTAK) were screened to identify NLPs in the Physcomitrella patens genome (Taxonomy ID: 3218), using Arabidopsis thaliana NLP protein sequences as well as the Pfam accessions of the RWP-RK (PF02042) and PB1 (PF00564) domains as queries. Initially, 62 sequences were obtained, comprising 25 from NCBI, 24 from Phytozome, and 13 from iTAK. All sequences and their information obtained from the updated version of Phytozome (v13) were the same as those in v12 except for their accession numbers. Splice variants, repeated or redundant sequences, and short or incomplete fragments were excluded from the retrieved sequences, and the remaining sequences were validated through conserved domain identification. Finally, 6 PpNLPs were identified that contained both RWP-RK and PB1 domains (Table S3); these were numbered 1 to 6 according to their chromosomal location. Accession numbers of identical or redundant sequences found in the selected databases are listed in Table 1, while the physical and chemical properties of the A. thaliana and P. patens NLP gene families are summarized in Table 2.
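The filtering logic described above (drop redundant sequences, then keep only candidates carrying both required domains) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the two Pfam accessions are from the text, while the function name, the record layout, and the toy candidates are assumptions. Removal of splice variants and incomplete fragments is not modelled.

```python
# Pfam accessions named in the text: RWP-RK (PF02042) and PB1 (PF00564).
REQUIRED_DOMAINS = {"PF02042", "PF00564"}


def filter_candidates(candidates):
    """Keep unique sequences that contain every required domain.

    candidates: iterable of (accession, protein_sequence, pfam_ids) tuples.
    Returns the accessions that pass, in input order.
    """
    seen_sequences = set()
    kept = []
    for accession, sequence, pfam_ids in candidates:
        if sequence in seen_sequences:
            continue  # redundant hit retrieved from another database
        seen_sequences.add(sequence)
        if REQUIRED_DOMAINS <= set(pfam_ids):
            kept.append(accession)
    return kept


# Hypothetical toy records; accessions and sequences are placeholders.
toy_hits = [
    ("cand_A", "MKDERS", {"PF02042", "PF00564"}),
    ("cand_A_dup", "MKDERS", {"PF02042", "PF00564"}),  # same sequence, second database
    ("cand_B", "MSSTRW", {"PF02042"}),                 # lacks the PB1 domain
]
print(filter_candidates(toy_hits))  # ['cand_A']
```

Requiring both domains, rather than either one, is what narrows the 62 initial hits reported above down to the final gene set.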

Sequence Alignment and Phylogenetic Relationship of PpNLPs Gene Family
The percent similarities of PpNLPs and AtNLPs were compared to confirm the appropriate selection as well as the uniqueness of each identified PpNLP gene used for further analysis (Table S4). Up to 15 consensus motifs were identified in PpNLP proteins using MEME (Fig. 3, Table S5) and compared with those of AtNLPs. The motifs were significantly conserved in both A. thaliana and P. patens proteins. All AtNLPs and all PpNLPs contained all motifs, except AtNLP4, -8, and -9, which contain 14 motifs, and AtNLP3, which has 11 motifs.
Appropriate localization of genes upon chromosome (Fig. 4

Identification of cis-Regulatory Elements in Promoter Regions of PpNLPs
The recognition of cis-regulatory elements in upstream promoter regions (2000 bp) is a significant approach for proposing gene function and regulation. The cis-regulatory elements identified in the promoter regions of both AtNLPs and PpNLPs were grouped into three categories: phytohormone responsive (PR), stress responsive (SR), and plant growth and development (PGD), as shown in Table 3. Comparatively, AtNLPs possess a higher number of regulatory elements than PpNLPs. The highest total number of cis-elements identified in AtNLPs (87) were responsive to phytohormones, while the total numbers of AtNLP cis-elements in the SR and PGD groups were 45 and 46, respectively (Fig. 5). All AtNLPs contained a higher number of PR cis-elements except AtNLP7, in which PGD-responsive cis-elements outnumbered SR and PR elements. Likewise, among PpNLPs, PpNLP4 possesses a higher number of PGD-responsive cis-elements, while the remaining PpNLPs have a higher number of cis-elements in the PR group. The total numbers of PGD, SR, and PR cis-elements identified in PpNLPs are 19, 21, and 35, respectively.
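The category totals reported above can be tallied to compare the two gene families on a common scale. The counts below are taken directly from the text; the dictionary layout and function name are illustrative assumptions.

```python
# Totals per category as reported in the text (PR, SR, PGD).
CIS_TOTALS = {
    "AtNLPs": {"PR": 87, "SR": 45, "PGD": 46},
    "PpNLPs": {"PR": 35, "SR": 21, "PGD": 19},
}


def summarize(totals):
    """Return total count, dominant category, and percentage share per family."""
    summary = {}
    for family, counts in totals.items():
        total = sum(counts.values())
        summary[family] = {
            "total": total,
            "dominant": max(counts, key=counts.get),
            "share_pct": {cat: round(100 * n / total, 1) for cat, n in counts.items()},
        }
    return summary


summary = summarize(CIS_TOTALS)
for family, stats in summary.items():
    print(family, stats)
```

On these numbers the family-wide totals are 178 for AtNLPs and 75 for PpNLPs, and the phytohormone-responsive group is the largest category in both families.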

Protein-Protein Interaction of PpNLPs
The interaction networks of NLP proteins were predicted online through STRING (Table S7). All PpNLP proteins were predicted to interact with a plethora of N-related genes. Among them, 10 genes interacted with all PpNLP proteins. Most of these 10 genes are un-annotated predicted proteins; however, three nitrate reductase (NIA) genes (PP1S58_252V6.1, PP1S58_249V6.1, and PP1S79_76V6.2) were identified as significant putative N-related genes interacting with PpNLPs. Figure 6 shows a schematic model of all PpNLPs interacting with cellular proteins.
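Finding the partners shared by every PpNLP, as described above, amounts to intersecting the per-protein partner lists. The sketch below assumes the STRING predictions have already been exported into a mapping from each PpNLP to its predicted partners. The partner lists shown are hypothetical toy data, although the NIA gene identifier is one named in the text, and the function name is an assumption.

```python
def common_partners(partners_by_protein):
    """Return the set of partners predicted for every protein in the mapping."""
    partner_sets = [set(partners) for partners in partners_by_protein.values()]
    if not partner_sets:
        return set()
    return set.intersection(*partner_sets)


# Hypothetical partner lists; only the NIA identifier comes from the text.
toy_network = {
    "PpNLP1": ["PP1S58_252V6.1", "partner_x"],
    "PpNLP2": ["PP1S58_252V6.1", "partner_y"],
    "PpNLP3": ["partner_x", "PP1S58_252V6.1"],
}
print(common_partners(toy_network))  # {'PP1S58_252V6.1'}
```

Applied to the full prediction table, the size of the returned set would correspond to the number of genes interacting with all PpNLPs.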

Expression Pattern of PpNLPs Gene Family
Real-time quantitative PCR was performed to assess the expression levels of PpNLPs in the rhizoid, stem, and phylloids of P. patens, with Actin3 as internal control. Three N treatments, 0 mM (deficient), 5 mM (limiting), and 10 mM (sufficient), were provided for 0, 6, 12, 24, 48, and 72 hours. The results indicated a significant differential pattern common to all PpNLPs in the rhizoid, stem, and phylloids (Fig. 7, 8). Expression of PpNLPs increased with treatment time from 6 to 72 hours under limiting (5 mM) and sufficient (10 mM) N supply, while no changes were observed under N-deficient (0 mM) conditions. This indicates that PpNLPs are highly regulated by N availability. The overall expression pattern showed significant up-regulation of all PpNLPs, with an immediate response evident from the increase in expression within 0 to 6 hours in all three plant parts.

Discussion
In the current study, we identified 6 NLP genes through genome-wide in silico analysis of P. patens genome databases and compared their attributes with the NLPs of A. thaliana. In silico studies are largely based on comparison algorithms; therefore, the similarities observed when comparing genomic information can be used to predict the function of a gene. We observed that the gene lengths, protein lengths, and molecular weights of PpNLPs were higher than those of AtNLPs; however, the pI and GRAVY values of both gene families were close, indicating putative functional homology among the members of both gene families.
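The GRAVY value mentioned above is the mean hydropathy of a protein's residues, conventionally computed on the Kyte-Doolittle scale; a negative value indicates an overall hydrophilic protein. The sketch below shows that calculation as an illustration. The tool the authors used is not named in this section, and the function names and the toy peptide are assumptions.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue.

    Raises KeyError for non-standard residue codes (for example 'X').
    """
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def is_hydrophilic(sequence):
    """A negative GRAVY value is read as an overall hydrophilic protein."""
    return gravy(sequence) < 0


toy_peptide = "MKDERS"  # hypothetical peptide, not a PpNLP sequence
print(round(gravy(toy_peptide), 3), is_hydrophilic(toy_peptide))
```

Running the same function over the AtNLP and PpNLP protein sequences would give the per-protein values that the comparison in the text relies on.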
The study of the evolutionary relationship among AtNLPs and PpNLPs clustered them into three distinct clades in a phylogenetic tree, as shown in Fig. 1. All PpNLPs were clustered in one sub-clade, while the sister group contained 6 members, 2 from each of A. thaliana (AtNLP8, 9), O. sativa (OsNLP2, 5), and Z. mays (ZmNLP2, 9). Two explanations can be inferred from this phylogenetic relationship. First, all PpNLPs are grouped in a separate sub-cluster, which may be due to the separate evolutionary lineages of vascular and non-vascular plants. Second, the presence of PpNLPs in close relationship with NLPs from vascular plants in the sister group confirms the ancestral lineage of NLPs among bryophytes and vascular plants. A relevant study assessing the significance of evolution in amino acid permease (AAP) gene families of 17 plants confirmed that bryophytes and vascular plants had a common ancestor and that gene duplications occurred in evolutionary phases. The evolutionary relationship can also be linked with the properties of NLP gene and protein sequences (Yandell et al. 2006). The gene structure analysis (Fig. 2) showed that PpNLPs had 3-4 introns, while the number varied between 4 and 6 among AtNLPs. Previous reports suggest that gene structure evolves through the loss or gain of introns (Zhang et al. 2014). Our findings imply high phylogenetic divergence together with strong ancestral linkage among vascular and non-vascular NLPs. The presence of one or both of the two protein domains (RWP-RK and PB1) also indicates the evolutionary relationship among members of AtNLPs and PpNLPs. Likewise, the presence of consensus protein motifs among all PpNLPs further confirms both the ancestral relationship and the evolutionary divergence of NLP gene families in bryophytes and vascular plants.
Identification of cis-elements in the promoter region of a gene is an effective way of proposing the role and regulation of that gene. It was observed in our study that PpNLPs have a higher frequency of cis-elements responsive to plant growth and development, which can be related to the growth and development of the plant as affected by N supply and regulation. The results suggest that the greater the number of cis-elements, the stronger the associated function. Our in silico analysis thus suggests that all PpNLPs are primarily involved in plant growth and development mechanisms, while stress and phytohormone responses may be their secondary roles; however, this can only be confirmed through detailed investigations using advanced molecular techniques.
Predicting the proteins that interact with a gene family is yet another preliminary step in directing functional characterization. Comparison with the expression profiles suggests that the predicted proteins listed might have conserved functions in N uptake, transport, and assimilation. As demonstrated in previous studies, functional characterization of NLP genes in rice showed that they are responsive to N and are significant in improving overall NUE (Alfatih et al. 2020; Wu et al. 2020).
On the basis of our findings in this study, compared with those reported earlier, we conclude that PpNLPs are responsive to, and significantly regulated by, N availability. NLPs are a promising group of transcription factors that could potentially contribute to improving crop N use efficiency (NUE). Our study provides only a hypothetical basis for the study of NLPs and thus highlights questions for further detailed investigation. First, detailed structural and functional characterization employing mutant studies can truly specify their molecular attributes. Our aim in studying NLPs in Physcomitrella patens was to fill the gap caused by the lack of relevant reports. Physcomitrella patens should be a focus for such studies, particularly for N transport, because it lies on the borderline between algae and vascular plants and can thus be promising for exploring the detailed mechanisms and key factors involved in N regulation for improving crop NUE.
Declarations
Funding: Not applicable.

Figure 6. Schematic model of all PpNLPs interacting with cellular proteins, as predicted through STRING (Table S7).

Figure 7. Expression pattern of PpNLPs in the rhizoid, stem, and phylloids of P. patens under N-deficient (0 mM), N-limiting (5 mM), and N-sufficient (10 mM) supply for 0, 6, 12, 24, 48, and 72 hours, with Actin3 as internal control.