An LTR retrotransposon insertion inside CsERECTA for an LRR receptor-like serine/threonine-protein kinase results in compact (cp) plant architecture in cucumber

The compact (cp) phenotype in cucumber (Cucumis sativus L.) is an important plant architecture-related trait with a great potential for cucumber improvement. In this study, we conducted map-based cloning of the cp locus, identified and functionally characterized the candidate gene. Comparative microscopic analysis suggested that the short internode in the cp mutant is due to fewer cell numbers. Fine genetic mapping delimited cp into an 8.8-kb region on chromosome 4 harboring only one gene, CsERECTA (CsER) that encodes a leucine-rich repeat receptor-like kinase. A 5.5-kb insertion of a long terminal repeat retrotransposon in the 22nd exon resulted in loss-of-function of CsER in the cp plant. Spatiotemporal expression analysis in cucumber and CsER promoter-driven GUS assays in Arabidopsis indicated that CsER was highly expressed in the stem apical meristem and young organs, but the expression level was similar in the wild type and mutant cucumber plants. However, CsER protein accumulation was reduced in the mutant as revealed by western hybridization. The mutation in cp also did not seem to affect self-association of CsER for formation of dimers. Ectopic expression of CsER in Arabidopsis was able to rescue the plant height of the loss-of-function AtERECTA mutant, whereas the compact inflorescence and small rosette leaves of the mutant could be partially recovered. Transcriptome profiling in the mutant and wild type cucumber plants revealed hormone biosynthesis/signaling, and photosynthesis pathways associated with CsER-dependent regulatory network. Our work provides new insights for the use of cp in cucumber breeding.


Introduction
Plant height is an important yield-related trait in many crops. Plants with dwarf, semi-dwarf or bushy growth habit usually have short internodes and strong stems, which allows manipulation of planting density to increase productivity per unit area (Hedden 2003; Teng et al. 2013). The best-known examples are the wheat reduced height (Rht) genes (Peng et al. 1999) and the rice semi-dwarf (sd1) gene (Monna et al. 2002; Sasaki et al. 2002; Spielmeyer et al. 2002), which played an important role in the "Green Revolution" and are involved in the signaling and biosynthesis of the plant hormone gibberellic acid (GA), respectively. Cucumber (Cucumis sativus L.) is an important vegetable crop. Most cucumber varieties have an indeterminate growth habit with tall vines, multiple lateral branches, and long internodes that are adapted to production systems with trellis support. Cucumbers with compact growth could be advantageous for increasing yield and reducing labor cost with mechanical harvesting (Cramer and Wehner 2000). A number of cucumber vine length or plant architecture mutants have been reported and characterized, including compact, super compact, dwarf and short internode. The compact (cp) gene was found in two plant introduction (PI) lines, PI 308915 and PI 308916 (Kauffman and Lower 1976), and has been mapped to a 220-kb genomic region on cucumber chromosome (Chr) 4 (Li et al. 2011a, b, c). A chemically induced compact mutant, cp-2 (Kubicki et al. 1986), displays a strong reduction of internode length and small seeds. The compact cucumber mutant reported by Crienen et al. (2009), which exhibits semi-dominant inheritance, is different from cp and cp-2. The super compact mutant was controlled by a recessive gene (scp) that shows severely reduced stem length, no branches, and dark-green and wrinkled leaves (Niemirowicz-Szczytt et al. 1996). The EMS-induced cucumber mutant short internode (si) was due to a mutation in the gene for an F-box protein (Lin et al. 2016).
A deletion inside a cyclin-dependent protein kinase inhibitor gene (CsSMR1) leads to dwarf and determinate growth (Li et al. 2021a, b). The cucumber gene CsCLAVATA1, which encodes a CLAVATA-type receptor-like kinase, was believed to underlie the dwarf phenotype in the Csdw mutant, which could be partially rescued by GA3 application (Xu et al. 2018). Interestingly, several compact or super compact cucumber mutants were all shown to be caused by mutations in genes in the brassinosteroid (BR) pathway, including super compact-1 (scp-1) encoding a cytochrome P450 protein CsCYP85A1 (Wang et al. 2017a, b), super compact-2 (scp-2) encoding the steroid 5-alpha-reductase (CsDET2) (Hou et al. 2017), and compact plant architecture (cpa) for the 7-dehydrocholesterol reductase. These studies suggested that there are diverse mechanisms in control of plant height, and that the BR biosynthesis/signaling pathway plays a prominent role in regulating plant architecture in cucumber.
Reports on cloning of plant architecture-related genes in other cucurbits (family Cucurbitaceae) are sporadic. In a recent study, Yang et al. (2020) characterized the short internode (si) locus in melon (Cucumis melo), which encodes CmERECTA (CmER), a leucine-rich repeat (LRR) receptor-like kinase (RLK) protein. Since the melon si locus is located in a region that is syntenic to where cp is located, this raised the possibility that ER also underlies the cp locus. ERs belong to the large RLK family in plants (Walker 1994; Shiu and Bleecker 2001; Diévart et al. 2011). Each ER protein has an extracellular ligand-binding LRR domain to perceive signals, a transmembrane domain to anchor the protein in the plasma membrane, and an intracellular kinase domain with Ser/Thr kinase activity to transduce the signals received by the ligand-binding domain (Walker 1994; Torii et al. 1996; Lease et al. 2001). In Arabidopsis, ERs play an important role in extracellular to intracellular signal transduction. For example, ERs can perceive signals from the small secreted peptides epidermal patterning factor-like 4 (EPFL4) and EPFL6 (CHALLAH) to regulate organogenesis in the phloem (Abrash et al. 2011; Uchida et al. 2012). EPFL9/STOMAGEN is reported as the ligand for ER to promote root cortex cell proliferation in redox-mediated pathways (Cui et al. 2014). The two ligand genes EPF1 and EPF2 are expressed in the epidermis and can activate ER family proteins to prevent stomatal differentiation (Hara et al. 2007, 2009; Hunt and Gray 2009).
So far, the functions of ERECTA have been characterized primarily in Arabidopsis. It has been demonstrated that ERECTA plays an important role in regulating organ shape and inflorescence architecture (Liu et al. 2021). The widely used Arabidopsis ecotype Landsberg erecta (Ler or Ler-0) carries a loss-of-function mutation of AtER and exhibits a dwarf phenotype with short internodes compared to the WT; its inflorescence is compact with clustered flower buds, short pedicels and siliques (Torii et al. 1996). ER has two paralogs in the Arabidopsis genome, ER-LIKE1 (ERL1) and ER-LIKE2 (ERL2), that have partially redundant functions with ER. Loss-of-function of all three paralogs leads to an extremely dwarf and compact phenotype (Shpak et al. 2004). ERECTA is a negative regulator of leaf size and number of stomata, which affect transpiration, and may improve water use efficiency and thermotolerance (Masle et al. 2005; Shanmugam et al. 2020; Li et al. 2021a, b). In addition, as a member of the LRR family, ER also affects resistance to pathogens (Sánchez-Rodríguez et al. 2009; Jordá et al. 2016; Cai et al. 2021). Therefore, understanding the functions of ER is of importance for crop improvement. However, little is known about ER functions and their practical potential in crop plants.
Phytohormones play critical roles in cell division and cell expansion, and thus the development of plant architecture. Most dwarf phenotypes reported in various plant species are due to mutations in genes of GA or BR biosynthesis or signaling pathways (e.g., Ashikari et al. 2002; Tanabe et al. 2005; Yang et al. 2014, 2020; Tian et al. 2019; Ren et al. 2020). ER has been shown to be directly or indirectly involved in regulation of biosynthesis and signaling of auxin/IAA (Qu et al. 2017; Du et al. 2018; Yang et al. 2020), BR, GA (Du et al. 2018), or cytokinin (CK) (Guo et al. 2020). However, it is not well understood whether ER acts through multiple plant hormones at the same time.
We previously mapped the cp locus into a 178-kb region on cucumber Chr4 (Li et al. 2011a, b, c; Yong et al. 2013). In this study, we cloned and functionally characterized this gene. We show that cp is due to loss-of-function in CsERECTA (CsER) caused by the insertion of a 5.5-kb long terminal repeat retrotransposon (LTR-RT) in the 22nd exon. Complementation and ectopic expression of CsER in Arabidopsis rescued the er105 mutant phenotype. Transcriptome profiling revealed that biosynthesis of BR and ethylene, and signaling of auxin, cytokinin and ethylene, were inhibited in the cp mutant.

Plant materials and mapping populations
For fine mapping of the cp locus, two parental lines, WI7201 (PI 308915, cpcp, mutant) and WI7123 (aka 'CCMC', CpCp, wild type, WT) (Li et al. 2011a, b, c), were employed to develop an F2 population consisting of 9545 plants. The population was qualitatively scored as either regular vining (WT) or compact (dwarf). The χ2 test was used to evaluate the deviation from the expected 3:1 ratio in the F2 population. All plants were grown in plastic greenhouses under natural conditions at the Horticulture Research Station of Northwest A&F University, China, or the University of Wisconsin Madison Hancock Agricultural Research Station (HARS). Only compact mutant F2 plants were used for fine mapping of the cp locus. Three hundred cucumber accessions with normal vine types (WT) including WI7200 (PI 249561) (Bo et al. 2016) were also genotyped to examine allelic diversity at the cp locus.
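The goodness-of-fit test described above can be sketched as follows. This is a minimal illustration, not the authors' script: the function name and the example counts are hypothetical, and the P value uses the closed form of the chi-square upper tail for one degree of freedom.

```python
import math


def chi_square_3_to_1(n_wt, n_mutant):
    """Test observed F2 counts against a 3:1 (WT:mutant) ratio.

    Returns the chi-square statistic and its P value (1 degree of freedom).
    """
    total = n_wt + n_mutant
    expected_wt = total * 3 / 4
    expected_mutant = total / 4
    chi2 = ((n_wt - expected_wt) ** 2 / expected_wt
            + (n_mutant - expected_mutant) ** 2 / expected_mutant)
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts for illustration only (not the counts from this study).
chi2, p = chi_square_3_to_1(305, 95)
print(f"chi2 = {chi2:.3f}, P = {p:.3f}")
```

A large P value means the observed segregation does not deviate significantly from 3:1, as expected for a single recessive gene.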

Histological investigation
We measured 100 cortex cells for average cell size and number in internode tissues of the WT and mutant plants with the ImageJ software (https://imagej.nih.gov/ij/download.html). The samples (5 mm thickness) were fixed overnight in a solution containing ethanol, water, formaldehyde and acetic acid (50:35:10:5 by volume), and dehydrated with a graded series of ethanol (50, 70, 85, 95 and 100%) for 90 min at each concentration. The samples were then embedded in paraffin, sectioned (10 μm thickness), stained with toluidine blue, examined and imaged under a microscope (Olympus, Model BX63). Each sample had three biological and three technical replicates.

Map-based cloning of cp and sequence analysis
For screening recombinant plants, 2243 compact F2 plants were genotyped with markers UW084680 and UW084870 that flanked the cp locus. New molecular markers were developed from DNA sequences in the target region of the two parental lines by Sanger sequencing. Primers were designed using Primer3 (http://bioinfo.ut.ee/primer3-0.4.0/). "dCAPS Finder 2.0" (http://helix.wustl.edu/dcaps/dcaps.html) was used for designing CAPS and dCAPS markers (Neff et al. 1998). The Gy14v2.0 cucumber reference genome (https://www.cucurbitgenomics.org/) was used for determining the physical locations of all the primers and gene annotations in the final mapping region.
To examine allelic diversity of the cp locus, two pairs of primers, "cp-locus" and "cp-CK", were designed to amplify the cp locus band and the positive control band, respectively, in 300 cucumber accessions, together with WI7201 and WI7039 (PI 308916), which have the compact vining type. PCR products were separated by electrophoresis on agarose gels. Information for all primers used in this study is provided in supplemental Table S1.

Phylogenetic analysis of ER homologs in plants
The deduced full-length amino acid sequence of CsER was used to search for homologs in other species by BLASTP at NCBI (https://www.ncbi.nlm.nih.gov/). Protein sequence alignment was accomplished with ClustalW in the MEGA7 software package. Phylogenetic analysis was carried out using the neighbor-joining method with 1000 bootstrap replicates. Protein information and accession numbers are listed in supplemental Table S2.

Quantitative real-time PCR (qPCR) analysis and in situ hybridization
Total RNA was extracted from cotyledon, hypocotyl, leaf, stem and SAM tissues of cucumber and from young inflorescences of Arabidopsis, and reverse transcription was performed as for gene cloning. RT-qPCR was performed with the CFX96 Touch Real-Time PCR Detection System (Bio-Rad). For each gene, three biological and technical replicates were used. UBI-ep (Wan et al. 2010) and actin (Shpak et al. 2003) were used as the housekeeping genes in cucumber and Arabidopsis, respectively. Relative expression level was calculated by the 2^(−ΔΔCT) method (Livak and Schmittgen 2001). Cucumber SAM tissues were fixed in 3.7% formaldehyde-acetic acid-alcohol (FAA) solution and in situ hybridization was carried out as described by Zhang et al. (2013). The product from 1509 to 2026 bp of the CDS was amplified and used for synthesis of sense and anti-sense probes using the SP6 and T7 polymerase, respectively.
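The relative expression calculation of Livak and Schmittgen (2001) can be sketched as below. The function name and the Ct values are hypothetical and serve only to show the arithmetic; the sketch assumes 100% amplification efficiency for both the target and the housekeeping gene, as the method itself does.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change of a target gene by the 2^(-ddCt) method.

    'ref' is the housekeeping gene; 'calibrator' is the reference sample
    against which all other samples are expressed.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta_ct = delta_sample - delta_calibrator
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold 4 cycles earlier in
# the sample than in the calibrator, with an unchanged housekeeping gene.
fold = relative_expression(22.0, 18.0, 26.0, 18.0)
print(fold)  # 16.0
```

A value of 1 means expression equal to the calibrator; each cycle of difference in ΔΔCt corresponds to a two-fold change.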

Prokaryotic expression and western blotting
Prokaryotic expression of amino acids 317-517 of the ER protein was conducted in strain BL21 (DE3). PCR primers are provided in Supplemental Table S1. After purification, the ER peptide was used to produce polyclonal antibodies in rabbit. Transmembrane protein from cucumber SAM was extracted and measured for concentration. A total of 30 μg protein was used for western blotting analysis.

Subcellular localization
To determine the subcellular localization of CsER, the CDS of CsER without the stop codon was cloned from WT and inserted into the modified pCambia3301 (3301m) vector, fused with eGFP. The 3301m vector was modified from pCambia3301 by replacing the GUS gene with an eGFP gene. The cleavage sites were NcoI and BglII. The recombinant vector was subsequently transformed into Agrobacterium tumefaciens strain GV3101. Co-infiltration into tobacco leaves was performed as described by Walter et al. (2004). The GFP fluorescence was observed and imaged using a fluorescence microscope with excitation at a wavelength of 488 nm. DAPI (4′,6-diamidino-2-phenylindole) was applied to locate the nucleus. The empty 3301m vector was used as the positive control.

Transformation in Arabidopsis
Several sets of recombinant vectors were constructed. The full-length genomic CsER sequence of 9812 bp, including a 2053-bp region upstream of the start codon as the native promoter and an 896-bp region downstream of the stop codon as the terminator, was amplified and sub-cloned into pCambia3301 through the EcoRI and HindIII restriction sites. A second construct (8940 bp) carried a flag tag in place of the stop codon and lacked the terminator region; it was inserted into pCambia3301 through the same EcoRI and HindIII sites. Both constructs were transformed into Arabidopsis Col and the er105 mutant. The vector used for the subcellular localization assay was also used for ectopic expression analysis of CsER in er105.
The 2053-bp promoter was inserted into the pCambia1391Z vector through the HindIII and EcoRI sites for analyzing the promoter activity of CsER. Transformation was performed using the floral dip method (Clough and Bent 1998). Transgenic lines were screened on MS medium with 25 mg L−1 hygromycin or by spraying diluted BASTA (2000-fold dilution of a 13.5% stock).
Promoter activity in T2 promoter-transgenic Arabidopsis plants was assessed with GUS staining (Meng et al. 2012), which was performed in tissues/organs at different developmental stages. For complementation and ectopic expression assays, T3 transgenic Arabidopsis plants were used for phenotyping. Pictures were taken with a digital camera.

Bulked Segregant RNA-Seq (BSR-Seq)
BSR-Seq was performed using WT and cp bulks. Each bulk was constructed by pooling equal amounts of SAM tissue from five WI7200 × WI7201 F2 seedlings at 14 days after germination. Each selected plant was genotyped with markers to confirm its genotype at the cp locus (CpCp or cpcp). There were two replicates for each genotype for BSR-Seq. Raw reads were mapped to the 9930v3.0 genome with TopHat (Trapnell et al. 2009). Read summarization was conducted with featureCounts (Liao et al. 2013). Differentially expressed genes (DEGs) were obtained with DESeq2 (Love et al. 2014). GO enrichment was performed at CuGenDBv2 (http://cucurbitgenomics.org/v2/) and KEGG enrichment was performed with clusterProfiler (Yu et al. 2012).

Phytohormone treatments
The WT and cp mutant were grown on MS media supplemented with four hormones at different treatments: IAA (0.1 µM), 6-BA (0.1 mg/L), GA3 (10 µM) and epi-BR (low: 0.01 µM and high: 5 µM). All test materials were grown under 24 °C day/18 °C night and 14 h light/10 h dark photoperiod in a growth chamber. Data were collected at 21 days after germination.

Measurement of photosynthesis-related parameters
WI7200 and cp mutant plants were used to measure photosynthesis-related parameters. Total chlorophylls were extracted from 0.1 g leaf tissue by immersing it in a solution of ethanol and acetone (1:1, V/V) and keeping it at room temperature for 24 h in the dark. Chlorophyll a and b were measured at OD645 and OD663 with a spectrophotometer. Internal CO2 concentration (ICC), transpiration rate (TR) and net photosynthetic rate (NPR) were measured with a photosynthesis instrument (LI-6400-40). Water use efficiency (WUE) was calculated as the ratio of CO2 assimilated to transpiration. The stoma number was counted with the ImageJ software. Five plants were measured per biological replicate and each biological replicate had three technical replicates.
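The WUE ratio defined above can be written as a one-line calculation. The sketch below is illustrative only: the function name and the example readings are hypothetical, and the units (µmol CO2 m−2 s−1 for NPR, mmol H2O m−2 s−1 for TR) are an assumption based on common gas-exchange conventions rather than something stated in the text.

```python
def water_use_efficiency(net_photosynthetic_rate, transpiration_rate):
    """Instantaneous WUE as CO2 assimilated per unit of water transpired.

    Assumed units: NPR in umol CO2 m^-2 s^-1, TR in mmol H2O m^-2 s^-1,
    giving WUE in umol CO2 per mmol H2O.
    """
    if transpiration_rate <= 0:
        raise ValueError("transpiration rate must be positive")
    return net_photosynthetic_rate / transpiration_rate


# Hypothetical readings for illustration only.
print(water_use_efficiency(12.0, 4.0))  # 3.0
```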

Short internode in the mutant is due to fewer cells longitudinally
WI7201 displays a compact phenotype with significantly reduced internode length, with the petioles, flowers and tendrils all clustered on the main stem (Li et al. 2011a, b, c; Fig. 1A). When the cotyledons were fully expanded, the hypocotyl of compact plants (~2.44 cm) was significantly shorter than that of WT (~3.76 cm). The pedicels of male and female flowers (Fig. 1A1, A2) and the seeds (Fig. S1) of the mutant were also shorter or smaller than those of the WT. At the adult stage, the difference in plant height between WT and mutant was even more obvious since the mutant had extremely short internodes, but the fruit set of WI7201 was not affected (Fig. 1A3-A5).
We counted cell numbers and measured cell size of WT and cp mutant stem by histological analysis (Fig. 1C-E).
Longitudinally, the most notable difference was that WT had more cortex cells than the mutant while the average cell size was only ~ 50% that of the mutant. Given the fact that WT was much taller with more cells per unit area than the mutant, the much-reduced internode length must be due to reduced longitudinal cell proliferation. The larger cortex cells in the mutant were probably the result of lower division rate which potentially provided some compensation of cell elongation because of longer cell cycle.

Fine mapping placed the cp locus into an 8.8-kb region harboring only one gene
Using segregating populations derived from WI7200 (CpCp) × WI7201 (cpcp), we previously mapped the cp locus into a 178-kb region on Chr4 (Li et al. 2011a, b, c; Yong et al. 2013). Due to severe segregation distortion of markers in this region in this population, we developed a new F2 population from the cross between WI7201 and WI7123 (WT) for fine mapping and cloning of the cp locus. The F1 plants had the same vine length as WI7123. Among 9545 F2 plants, 7302 were WT and 2243 were mutant, which was consistent with the expected 3:1 ratio (P = 0.026 in χ2 test). We genotyped the 2243 mutant F2 plants with UW084680 and UW084870, which flank the cp locus and are physically 340 kb apart (Gy14v2.0; Fig. 2A). Sixty recombinants were identified between the two markers.
Based on alignment of resequencing reads from the two parental lines against the Gy14v2.0 draft genome, 10 new polymorphic markers located in this region were developed to genotype the 60 recombinants. Combined genotypic and phenotypic data of these recombinants allowed us to delimit the cp locus to a 13.2-kb region flanked by markers NWSTS009 and NWSTS014-indel, while NWSTS012-dCAPS co-segregated with the cp locus (Fig. 2B). Since there were still 6 recombinants between the two flanking markers, 4 new CAPS/dCAPS markers were developed, which further narrowed down the cp locus to an 8.8-kb region defined by markers NWSTS30-dCAPS and NWSTS035-CAPS. Interestingly, NWSTS035-CAPS was located inside the cp candidate gene, indicating intragenic recombination (Fig. 2C).

The LRR receptor-like serine/threonine-protein kinase gene (CsERECTA) is a candidate for cp
In the Gy14v2.0 genome, only one gene, CsGy4G024070, was annotated in this 8.8-kb region (Fig. 2C). It was predicted to encode a leucine-rich repeat (LRR) receptor-like serine/threonine-protein kinase. CsGy4G024070 has the highest homology to the Arabidopsis ERECTA (ER) gene (AT2G26330). As such, the cp candidate gene will be referred to as CsERECTA (CsER) hereinafter. CsER was predicted to have 27 exons (Figs. 2D, S2A). We cloned the full-length CDS of CsER from WI7200 (WT), WI7201 (cpcp) and WI7123 (WT). Sanger sequencing indicated that CsER from the two WT lines had the same length (2976 bp), but it was 72 bp shorter in the mutant (2904 bp) (Fig. 2E). Alignment of the CDS between WT and cp indicated that the 22nd exon was missing in the mutant. In addition, there were 15 SNPs inside CsER among the three lines, only three of which were unique to the mutant compared to the two WT lines, and all three were synonymous (Fig. S2B), suggesting that the deletion of the 22nd exon in the mutant is the causal polymorphism.
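The consequence of the CDS length difference reported above can be checked with simple arithmetic. The sketch below uses the two CDS lengths from the text; the function name is hypothetical.

```python
def describe_cds_deletion(wt_cds_bp, mutant_cds_bp):
    """Return (deleted bp, whether the deletion is in frame, codons lost)."""
    deleted_bp = wt_cds_bp - mutant_cds_bp
    in_frame = deleted_bp % 3 == 0
    # Codons lost is only meaningful when the reading frame is preserved.
    codons_lost = deleted_bp // 3 if in_frame else None
    return deleted_bp, in_frame, codons_lost


# CDS lengths reported for the WT lines (2976 bp) and the cp mutant (2904 bp).
print(describe_cds_deletion(2976, 2904))  # (72, True, 24)
```

Because 72 is a multiple of 3, skipping the 22nd exon removes 24 codons without shifting the downstream reading frame, so the kinase-encoding portion of the transcript remains in frame.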

Loss of the 22nd exon in the mutant is due to an LTR retrotransposon insertion
To find the reason why there was a 72-bp deletion in the mutant cDNA, we tried to clone the CsER genomic DNA (gDNA) sequences from the two WT and cp plants. Cloning from the two WT lines yielded the expected sequences, but cloning from the mutant was unsuccessful. We thus designed different combinations of PCR primers to clone CsER gDNA from cp plants. All PCR amplifications were successful except for the one using the primer pair targeting the 22nd exon. We performed genome walking from one side toward the unknown region, which revealed a 5548-bp insertion in the 22nd exon of CsER in the mutant. Annotation of this 5548-bp insertion suggested that it belongs to a Copia-type long terminal repeat retrotransposon (LTR-RT) (Wicker et al. 2007) that had 5 typical conserved domains: RNase_HI_RT_Ty1, RVT_2, rve, gag_pre-integrs and Retrotran_gag 2 and 3 super families (Fig. S2A). The sequence "GTATCT" in CsER was the target site duplication (TSD) and the LTR-RT was located between the two duplicates.
To check if this LTR-RT insertion in the CsER gene also occurred in natural cucumber populations, two pairs of primers targeting the insertion sequences were designed (Table S1). The forward primer of the "cp-locus" primer pair was within the LTR-RT and the reverse primer was in the 23rd exon. Primer pair "cp-CK" was used to amplify a band as the positive control. The cp-CK band was 889 bp and the amplicon of "cp-locus" was 633 bp in length. Thus, any cucumber line carrying this insertion would have two PCR bands in agarose gel electrophoresis, while the WT would only have the cp-CK band (Fig. S5). The two markers were used to genotype 300 cucumber accessions (Bo et al. 2016) with the two compact lines, PI 308915 (WI7201) and PI 308916 (WI7202), and the WI7123 × WI7201 F1 as the controls. Only the three controls amplified the LTR-RT-specific band and the rest only had cp-CK bands (Fig. S5). In addition to Gy14v2.0 and 9930v3.0, draft genomes of 10 other cucumber lines are also publicly available (https://www.cucurbitgenomics.org/). We examined the promoter, gene body and 3' UTR sequences of CsER from these 10 genome assemblies. None of the 10 lines (all WT at the Cp locus) carried the LTR-RT insertion. These observations further support the conclusion that the LTR-RT insertion was responsible for the compact vining type.
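The band-scoring logic of this two-primer-pair assay can be sketched as follows. The amplicon sizes come from the text; the function name and the size tolerance are hypothetical choices for illustration.

```python
CP_CK_BP = 889     # positive-control amplicon ("cp-CK")
CP_LOCUS_BP = 633  # LTR-RT-specific amplicon ("cp-locus")


def call_insertion(band_sizes_bp, tolerance_bp=20):
    """Score one accession from the band sizes read off the gel.

    tolerance_bp is an assumed allowance for gel sizing error.
    """
    def has_band(expected_bp):
        return any(abs(size - expected_bp) <= tolerance_bp
                   for size in band_sizes_bp)

    if not has_band(CP_CK_BP):
        # Without the control band the reaction itself cannot be trusted.
        return "failed PCR"
    return "insertion present" if has_band(CP_LOCUS_BP) else "no insertion"


print(call_insertion([889, 633]))  # insertion present
print(call_insertion([889]))       # no insertion
```

Because the cp-locus band only reports presence of the insertion, this assay behaves as a dominant marker: the F1 control gives the same two-band pattern as the homozygous compact lines.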
In addition to the missing 24 amino acids in the cp mutant, the alignment of the WT and mutant protein sequences suggested that the 72 bp deletion also resulted in an amino acid change (Fig. 2E). The deletion of the amino acids did not seem to affect its conserved domains such as the leucine-rich repeat (LRR) and the protein kinase domain (Fig. S2C). It is well known that ERECTA is a transmembrane protein with extra-cellular receptor domain and intracellular protein kinase domain (Torii et al. 1996;Shpak et al. 2003). We predicted the topological structure of WT and mutant proteins using Protter and Phobius (Käll et al. 2007;Omasits et al. 2013). It was interesting to see that the WT protein was predicted to have a typical single transmembrane domain with an extra-cellular amino terminal LRR domain and intra-cellular carboxy terminal kinase domain (Fig S6), while the mutant is predicted to have two transmembrane domains changing the location of the kinase domain from intra-cellular to extra-cellular. This suggests that cp may be a loss-of-function mutation of CsER.
To characterize CsER as a typical LRR receptor-like serine/threonine-protein kinase, subcellular localization of CsER was examined by infiltration of A. tumefaciens strain GV3101 carrying the 35S::gCsER::eGFP plasmid into tobacco (Nicotiana benthamiana) leaves. Since the introns in AtER gDNA are essential for stabilizing its expression in Arabidopsis (Karve et al. 2011), we assumed that the same could hold for CsER. The construct therefore contained the full-length gDNA from the start codon up to, but not including, the stop codon, with all the introns. The 35S::gCsER-eGFP protein was found to be located on the plasma membrane (Fig. 5A), which is consistent with its predicted function of transducing extracellular signals into the cell.

ER seems to have conserved functions in plant development
The LRR receptor-like kinase proteins are a large family that has been reported to regulate cell division and differentiation, in which proteins with similar structures often have redundant or overlapping functions (Shpak et al. 2004; Hord et al. 2006; Chen et al. 2013). We conducted phylogenetic analysis of ER proteins and other proteins that are structurally or functionally similar to ER, including ER-like (ERLs), CLAVATA1, and BAMs, from nine plant species (Table S2). The resulting phylogram is presented in supplemental Fig. S7. In Arabidopsis, the ER protein family has three members, AtER, AtERL1 and AtERL2, while only two have been annotated in cucumber so far: CsER and CsERL1. To find more potential ER family proteins in cucumber, we used the protein sequence of CsER in a BLASTP search against the 9930v3.0 genome. Based on the similarity of protein sequences, CsaV3_6G044540 was the closest protein to CsER and CsERL1, but it belonged to the CLAVATA protein family. This suggested that there are only two members in the CsER protein family. Recently, CsCLAVATA1, which encodes an LRR receptor-like kinase protein homologous to BAM proteins in Arabidopsis, was suggested as a putative candidate gene for the dwarf mutation Csdw in cucumber (Xu et al. 2018). Interestingly, although the authors did not mention that loss-of-function of AtBAM proteins results in dwarf plants, the bam1bam2 double mutant and the bam1bam2bam3 triple mutant were significantly shorter than the wild-type Ler-1. These results suggest that LRR receptor-like kinase family genes may perform similar functions that are important to plant growth, especially stem elongation.

CsER is predominantly expressed in shoot apical meristem (SAM)
We compared the spatiotemporal expression of CsER in WT and the mutant. We first performed qPCR in different organs (cotyledon, hypocotyl, true leaf, main stem) and SAM (from growing tips) at 7 and 21 days after germination (DAG) (Fig. 3A). At 7 DAG, the expression of CsER in SAM was slightly higher than that in the cotyledon or hypocotyl. At 21 DAG, CsER expression exhibited a significant increase in the SAM. Using the expression level in the cotyledon at 7 DAG as the reference, CsER expression in SAM at 21 DAG was the highest among all the organs examined, 66-fold higher than the reference. However, there was no difference in the expression level of CsER in any of the organs examined between WT and mutant. mRNA in situ hybridization showed that transcripts of CsER strongly and peripherally accumulated in young organs such as leaf primordia, floral buds and especially in SAM (Fig. 3B). These observations suggest that CsER plays a very important role in development of meristematic tissue, which is critical for morphogenesis of aerial organs. To test whether the CsER protein level differs in SAM between WT and mutant, we conducted western blotting to detect the abundance of CsER protein in SAM of the WT and mutant. The result revealed that the CsER protein signal in the mutant was much weaker than that in the WT (Fig. 3C).
Fig. 3D (legend fragment): ProER-GUS is strongly expressed in the stem, leaf veins, pedicels and flower buds (D3), and is also highly expressed in stomata, indicated by red arrows in D4 and D5. Scale bars = 0.5 mm.
To further characterize the expression pattern of CsER, we performed a GUS assay using a CsER promoter-driven GUS gene in Arabidopsis. We cloned the promoter (−2053 to −243 bp upstream of the start codon) of CsER from WT plants and inserted it into pCambia1391Z to express a GUS reporter gene. To visualize promoter activity, whole transgenic Arabidopsis plants carrying PROERECTA::GUS were stained at the cotyledon (Fig. 3D1), 4-true-leaf (Fig. 3D2), and inflorescence (Fig. 3D3) stages. The GUS gene was expressed in leaf veins, stomata (Fig. 3D4-5), stem, flower, and most remarkably in SAM at all three stages. In general, the GUS activity at the cotyledon stage was weaker than that at the two later stages. Taken together, these data suggest that CsER may predominantly control SAM development, where stem cells proliferate and differentiate.

Ectopic expression of CsER in Arabidopsis rescues er105 dwarf phenotype
The Arabidopsis er105 mutant carries loss-of-function mutations in the AtER gene (AT2G26330, homolog of CsER), which exhibits reduced plant height and pedicel, smaller rosette radius and more compact inflorescence than the WT (Col background) (https:// www. arabi dopsis. org/). Since the cucumber cp mutant exhibits stronger compact phenotype than er105, we asked if CsER and AtER perform similar functions, and WT CsER could rescue the er105 phenotype. Two gDNA sequences of CsER with 9812 and 8940 bp were ligated into the binary vector pCambia3301, respectively. One (PRO ERECTA -CsER) included the sequence from the 2053-bp promoter to the 896-bp of 3'UTR, and the second (PRO ERECTA -CsERf) was fused with a flag tag but without the native 3'UTR. The two constructs were transformed into Arabidopsis Col-0 (WT) and er105, respectively. Two representative lines (L1 and L2) were phenotypically characterized. CsERer105 (L2) could completely rescue the plant height of er105 (the same as Col-0), while that of CsERer105 (L1) and CsERfer105 (L1 and L2) was only partially recovered (Fig. 4A, B). We examined the expression of both endogenous AtER and exogenous CsER genes in nontransgenic and transgenic Arabidopsis (Fig. 4C). The plant height and expression level among these plants showed a positive correlation. No significant difference was observed between non-transgenic Col-0 and transgenic Col-0 in the plant height. Interestingly, we observed that the expression level of native AtER was down-regulated, which might be due to the accumulation of the exogenous CsER in transgenic Col-0. However, the total expression of ER (AtER + CsER) was the highest in transgenic Col-0 plants (Fig. 4C). This might imply that there is an expression cap of ER, beyond which it would not increase the effect of ER, behaving a 'response' cap manner in transgenic Arabidopsis. CsER could also partially rescue the smaller rosette radius and increase the length of pedicels (Fig. S8) in er105. 
In addition, the PRO ERECTA -CsER and PRO ERECTA -CsERf transgenic plants did not show any significant differences (Fig. 4A-C), which may suggest that the effect of the CsER protein is not affected by the fused flag tag. We also overexpressed CsER in er105 and Col-0 using constructs driven by the 35S promoter. The resulting transgenic plants had a phenotype similar to that of the plants expressing CsER under the CsER promoter (Fig. S9).

Mutated Cser does not affect homo-dimerization of ER proteins
Like most mammalian receptor kinases, an ER protein forms dimers to perform its functions. We asked whether the mutation in CsER would affect the self-association of the CsER protein. The full-length CDS of CsER from WT and WI7201 was cloned into BiFC vectors to test self-interaction in tobacco leaves. We found that the CsER protein could associate with itself on the cytomembrane. The mutated CsER protein from WI7201 still associated with both WT CsER and mutant CsER proteins, so the mutation did not seem to affect self-association (Fig. 5B).

Transcriptome profiling reveals hormone biosynthesis/signaling and photosynthesis pathways associated with CsER-dependent regulatory network
Because CsER showed the highest expression in the SAM, we performed BSR-Seq with pooled SAM tissues from WT (CpCp) and mutant (cpcp) F2 plants to explore the CsER-involved regulatory network in cucumber plant development. The complete transcriptome datasets have been deposited into NCBI under BioProject accession PRJNA858107. Differentially expressed genes (DEGs) in the mutant vs WT were identified using P value ≤ 0.05 and |log2(fold change)| ≥ 0.5 as the cutoffs. Among the 1796 DEGs detected, 1042 and 754 were up- and down-regulated in the mutant, respectively (Table S3). Nearly 11% (193 of 1796) of these DEGs were transcription factors, which are known to play important roles in plant growth and development (Fig. 6A; Table S3). KEGG enrichment analysis revealed that the DEGs for basic and secondary metabolic pathways were the most enriched (Fig. 6B). In addition, GO enrichment analysis identified highly enriched biological processes (response to stimulus, hormone-related pathways, photosynthesis pathways), cellular components (nucleus, cell periphery, cell membrane, extracellular region and photosystem), and molecular functions (DNA binding, transporter activity, transcription regulators, chlorophyll binding) in the mutant as compared with the WT (Fig. S10). Many DEGs were involved in major hormone biosynthesis/signaling pathways such as auxin/IAA, ethylene (ET), cytokinin (CK), BR, and GA. The DEGs for auxin/IAA, ET and CK biosynthesis/signaling were particularly enriched. Cyclin-dependent kinase genes controlling cell cycle transition and progression were also down-regulated and enriched. Some representative DEGs are illustrated in Fig. S11.
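The DEG cutoffs stated above can be illustrated with a minimal Python sketch. It only shows how the two thresholds (P value ≤ 0.05 and |log2(fold change)| ≥ 0.5) combine to label a gene as up- or down-regulated in the mutant; it is not the pipeline used in this study. The function names and the toy records are hypothetical, and the sign convention (positive log2 fold change means higher expression in the mutant) is an assumption.

```python
import math

def classify_deg(p_value, log2_fc, p_cutoff=0.05, lfc_cutoff=0.5):
    """Label one gene 'up', 'down', or None using the two cutoffs from the text.

    Assumption: log2_fc is mutant vs WT, so a positive value means
    higher expression in the mutant.
    """
    if math.isnan(p_value) or math.isnan(log2_fc):
        return None  # genes without a usable test result are not called
    if p_value <= p_cutoff and abs(log2_fc) >= lfc_cutoff:
        return "up" if log2_fc > 0 else "down"
    return None

def summarize(records):
    """Count up- and down-regulated genes in (gene_id, p_value, log2_fc) records."""
    counts = {"up": 0, "down": 0}
    for _gene_id, p_value, log2_fc in records:
        label = classify_deg(p_value, log2_fc)
        if label is not None:
            counts[label] += 1
    return counts

# Hypothetical toy records, not data from this study.
toy_records = [
    ("geneA", 0.001, 1.2),   # passes both cutoffs, up in mutant
    ("geneB", 0.04, -0.7),   # passes both cutoffs, down in mutant
    ("geneC", 0.20, 2.0),    # fails the P value cutoff
    ("geneD", 0.01, 0.3),    # fails the fold-change cutoff
]
toy_counts = summarize(toy_records)

# Consistency check on the counts reported in the text.
reported_up, reported_down, reported_tf = 1042, 754, 193
reported_total = reported_up + reported_down
tf_percent = 100 * reported_tf / reported_total

print(toy_counts, reported_total, round(tf_percent, 1))
```

The last lines simply confirm that the reported up- and down-regulated counts sum to the reported DEG total and that the transcription factor share rounds to the "nearly 11%" given in the text.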
Since RNA-seq clearly indicated the involvement of major phytohormone biosynthesis/signaling pathways downstream of CsER, we investigated the effects of exogenous application of phytohormones on plant growth in both WT and mutant plants. Both were treated with IAA, 6-benzylaminopurine (6-BA), GA and BR. While each hormone significantly increased plant height in the WT, no such effect was found in the mutant (Fig. 7), suggesting that the signal transduction of these hormones may be impaired in the mutant. We also observed that 6-BA could induce branch development in WT plants (Fig. 7). Interestingly, the cp mutant was hyper-sensitive to a high concentration of BR. Although both the WT and the mutant showed defects in root development after BR treatment, the mutant did not show any root development at all, suggesting that CsER might play an important role in the BR signaling pathway in a dominant-negative manner.
Many photosynthetic pathway genes were up-regulated in the mutant, including those encoding chlorophyll a-b binding proteins (Fig. S11C; Table S3). This seems consistent with the dark green leaves of the mutant as seen in the field (Fig. 1A4). We speculated that photosynthesis might be enhanced in the mutant, and therefore measured the chlorophyll contents and photosynthetic parameters in the WT and mutant (Table S4). Indeed, the mutant had higher chlorophyll a and chlorophyll b contents than the WT. The mutant had a photosynthetic rate of 8.76 ± 0.40 μmol m⁻² s⁻¹, which was significantly higher than that of the WT (3.97 ± 0.30 μmol m⁻² s⁻¹). The stomatal conductance and transpiration rate were also higher in the mutant (Table S4). We further noticed that the stomatal density was higher in the mutant (283 ± 13.47/mm²) than in the WT (130 ± 11.54/mm²) (Fig. S12). Collectively, these data may suggest that CsER has a negative effect on chlorophyll content and stomatal density, which further affects photosynthesis in cucumber.
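To put the reported means in proportion, the short Python sketch below computes the mutant-to-WT ratio for photosynthetic rate and stomatal density from the values quoted above. It is only an arithmetic illustration: the helper name is hypothetical, and the reported errors (± values) are ignored, so the ratios say nothing about statistical significance.

```python
def fold_difference(mutant_mean, wt_mean):
    """Return the mutant-to-WT ratio of two reported means."""
    if wt_mean == 0:
        raise ValueError("WT mean must be non-zero")
    return mutant_mean / wt_mean

# Means quoted in the text (photosynthetic rate in umol m^-2 s^-1,
# stomatal density per mm^2).
photosynthetic_rate_ratio = fold_difference(8.76, 3.97)
stomatal_density_ratio = fold_difference(283, 130)

print(f"photosynthetic rate, mutant/WT: {photosynthetic_rate_ratio:.2f}")
print(f"stomatal density, mutant/WT: {stomatal_density_ratio:.2f}")
```

Both ratios come out slightly above two, so the increase in photosynthetic rate is of a similar magnitude to the increase in stomatal density.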

CsER as the candidate gene for cp in cucumber
In this study, through fine mapping with nearly 10,000 F2 plants, we delimited the cp gene to an 8.8-kb region on Chr4 that contains only a single gene, CsER (Fig. 2C). Further analysis revealed that a 5.5-kb LTR-RT insertion in the 22nd exon of CsER caused a deletion of 24 amino acid residues and one amino acid substitution in the CsER protein (Fig. 2D-E). A marker specific for the LTR-RT insertion showed complete co-segregation with the compact phenotype in the very large F2 population and in a natural population of 300 accessions. The results showed that the LTR-RT insertion in CsER was present only in the mutant WI7201 and in PI308916, which is allelic to WI7201 (Kauffman and Lower 1976). Moreover, ectopic expression of CsER in Arabidopsis rescued the er105 mutant phenotype (Fig. 4). All these data provide convincing evidence to support CsER as the candidate gene for cp.

Structure and pleiotropic effects of ERECTA gene in plants
Previously reported ERECTA mutants exhibited defects in plant architecture. For example, Arabidopsis er mutants have varying degrees of short stems, pedicels and siliques, and a compact inflorescence (Torii et al. 1996;Lease et al. 2001;Shpak et al. 2004). The CRISPR-generated rice mutant oser1, which carries a premature stop codon after the 46th amino acid residue, shows increased spikelet number and reduced grain length but no other significant changes in plant architecture (Guo et al. 2020). The BdERECTA mutant vasc of Brachypodium distachyon, which encodes a protein with only 20 amino acid residues (978 in WT), exhibits severe sterility and short internode length, especially in the lower internodes (Sakai et al. 2021). The melon semi-dwarf si mutant is due to a SNP resulting in a premature stop codon in the kinase domain of the CmER protein (Yang et al. 2020). Recently, a loss-of-function cucumber mutant, cser, generated by gene editing in the first exon, displayed reduced plant height (Xin et al. 2022). In the present study, the cp mutant showed significantly reduced internode length and an extremely dwarf stature, which was caused by the insertion of a 5548-bp LTR-RT in the 22nd exon of CsER (Fig. 2D-E). These observations suggest that the main conserved function of ER in plants is regulation of stem elongation. However, the effects of allelic mutations in different plants vary. For example, the melon si mutant was semi-dwarf, while the plant height (vine length) of the cucumber cser and cp mutants was ~34% and 20% shorter than that of their WT, respectively. While these differences could be due to the locations of the mutations inside the ER gene in different plant species, it is also possible that ER is regulated through different mechanisms, as in the melon si and cucumber cp mutants: cell size was significantly reduced in the stem of si, whereas reduced cell proliferation with compensatory cell expansion was found in the stem of cp (Fig. 1C).
In addition to the effect on stem elongation, ER may also play an important role in regulating seed size. The seeds of WI7201 were significantly smaller than those of the WT (Fig. S1), but germination appeared not to be affected. In Arabidopsis, the seeds of er105 were only slightly smaller than those of Col-0. The rice oser1 mutant also had reduced grain length but increased width compared with the WT (Guo et al. 2020), indicating that ERECTA homologs among species might have a conserved positive effect on seed shape/size. Interestingly, ectopic expression of CsER genomic DNA could fully rescue the plant height but only partially rescue the compact inflorescence of Arabidopsis er105 (Fig. S8), even though CsER was strongly expressed in the inflorescence as reflected by the GUS signal intensity (Fig. 3D). This may suggest divergent functions of the ER gene in different plant species. The regulatory mechanisms of ER may also be species specific. All ER proteins share an extracellular receptor domain, a transmembrane domain and a kinase domain (Torii et al. 1996). In the cucumber cp mutant, most of the 24 amino acid residues deleted from the mutant CsER protein were hydrophobic (Baeza-Delgado et al. 2013) and were predicted by TMHMM v2.0 to lie in a transmembrane structure (data not shown). In the putative topology predicted for the CsER protein of the mutant WI7201, the kinase domain is relocated to the extracellular side. Given that a receptor-like kinase perceives extracellular signals and transduces them to downstream proteins through its intracellular kinase domain (Shpak 2013), and that overexpression of a truncated AtER without the kinase domain results in an even more severe dwarf phenotype than that of the er mutant (Shpak et al. 2003;Villagarcia et al. 2012), we speculate that the very compact plant architecture of WI7201 is due to dominant-negative disruption of the ER-mediated signaling pathway, which again implies that the intracellular kinase domain is very important for signal transduction.
Several studies revealed transcript-level regulation of ER-mediated plant growth and development (Shpak et al. 2003;Qu et al. 2017;Yang et al. 2020). For example, in melon, Yang et al. (2020) found a positive correlation between plant height and the transcript abundance of CmERECTA. In our work, no expression difference was found between the WT and the cp mutant in various organs. Instead, the CsER protein level in the SAM of WI7201 was lower than that of the WT (Fig. 3C), indicating that CsER may regulate plant height at the translational level, although further evidence is needed to confirm this.
Previous studies suggested that ER interacts with receptor-like kinases and cytomembrane proteins for its function (Shpak et al. 2003;Lee et al. 2012;Wang et al. 2017a, b;Yang et al. 2020). During its self-association or interaction with other LRR receptor-like proteins (for instance, the TOO MANY MOUTHS protein, TMM, in Arabidopsis), the LRR region seems to be of primary importance for forming heterodimers, since the TMM protein does not have a kinase domain. We found that CsER can interact with itself in both WT and mutant forms (Fig. 5B), implying that the putative topological change did not seem to affect the interactions, at least with itself. This again emphasizes the importance of the LRR region for dimerization. Since the expression level of CsER was similar in the WT and the mutant, we speculate that the transcripts of the WT and mutant alleles of CsER are in a ~1:1 ratio in heterozygous plants. However, western hybridization showed that the protein level in the mutant was much lower than that in the WT (Fig. 3C). This might imply that in heterozygous plants the WT CsER protein is the more abundant form. Although the Cser protein could interact with the WT CsER protein and potentially produce non-functional dimers, the CsER-CsER homodimers would still be the predominant form, which might explain why the cp mutant behaves in a recessive manner even though Cser can interact with WT CsER.

CsER-dependent phytohormone signaling for stem elongation
Phytohormones, especially auxin, cytokinin, GA and BR, and their crosstalk are critical for cell division and cell expansion (Vandenbussche and Van Der Straeten 2007;Pacifici et al. 2015), and thus affect stem elongation and overall plant architecture (Nakaya et al. 2002;Domingo et al. 2009;Li et al. 2011a, b, c;Plett et al. 2014). In the present study, we found that the plant height of the cp mutant was not responsive to treatment with any of the four hormones, while all the treated WT plants were significantly taller (Fig. 7), suggesting that the mutant may have lost its ability to transduce signals rather than lacking sufficient phytohormones. In Arabidopsis, ERECTA physically interacts with BKI1 (BRI1 KINASE INHIBITOR 1, a key regulator in BR signaling) to control plant architecture (Wang et al. 2017a, b). We found that the cp mutant was hyper-sensitive to a high concentration of BR, hinting at a possible connection of CsER with the BR signaling pathway. Indeed, in the WT and mutant transcriptomes, we found that eight members of the BR-responsive xyloglucan endotransglucosylase/hydrolase genes were expressed at higher levels in the mutant. The enhanced BR signaling pathway could probably explain the hyper-sensitivity of the mutant to high levels of BR. Also, one gene (CsaV3_5G038650) encoding a brassinosteroid-6-oxidase 2 for BR biosynthesis was down-regulated in the mutant. Studies have shown possible feedback regulation of the BR biosynthesis genes DWF4 and CPD in the BR signaling pathway (Chung and Choe 2013), which is probably one of the reasons that the expression level of CsaV3_5G038650 was lower in the mutant. In the melon si mutant, many DEGs are involved in auxin signaling, and it was proposed that the short internode in si was due to defects in auxin transport or signaling in the main stem (Yang et al. 2020). Unlike the melon si mutant, which has significantly smaller cells in the stem than the WT, the cucumber cp mutant had larger cells and fewer cells compared with the WT (Fig. 1C).
This may suggest that different hormone signaling pathways are involved in stem elongation in the two mutants. Transcriptome analysis showed that many genes involved in cytokinin biosynthesis/signaling and cell cycle regulation pathways were down-regulated in the cp mutant (Table S3). Examples include genes for the well-known "APRRs", LONELY GUY-like (LOGL) genes encoding cytokinin riboside 5'-monophosphate phosphoribohydrolase, and genes for cyclin D proteins and cyclin-dependent kinases. An altered cytokinin pathway may therefore explain why the cucumber cp mutant has fewer but larger cells, in contrast to the melon si mutant.

Potential use of compact mutant in cucumber improvement
A number of compact, super compact or dwarf mutants have been reported in cucumber (see Introduction). However, most of them may have limited use in cucumber breeding for manipulation of plant architecture because of various defects associated with the mutations. For example, scp-1 and scp-2 are male and female sterile and have difficulty setting fruit without hormone treatment (Hou et al. 2017;Wang et al. 2017a, b). In contrast, the cp mutant plants could grow in both greenhouse and field conditions with relatively normal fruit set (Fig. 1). In addition, it has been shown that a truncated Arabidopsis ERECTA gene driven by its native promoter could be transformed into tomato to enhance drought tolerance (Villagarcia et al. 2012). In this study, we found, unexpectedly, that the chlorophyll content and some photosynthesis parameters such as internal CO2 concentration (ICC), transpiration rate (TR), net photosynthetic rate (NPR) and water use efficiency (WUE) were all enhanced in the cp mutant (Table S4). Considering the extremely compact architecture of the mutant, a reasonable explanation is that the mutant plants might have a smaller surface area in contact with the air, so that the total loss of water is decreased. Nevertheless, how these parameters affect productivity and yield in cucumber merits further investigation.