Experimental design for microarray dataset
Rice leaf ontogeny, i.e., the developmental process from initiation to maturation, is described in Itoh et al. (2005) . Briefly, according to the staging system based on plastochron numbers (Pn), the P1 leaf primordium protrudes from the SAM and then grows to surround the SAM at stage P2. During the P1 and P2 stages, the leaf primordium consists of undifferentiated cells with no morphological characters. During the P3 stage, the boundary between the blade and sheath is established, and the future blade and sheath parts can be distinguished. In addition, the ligule primordium is formed in the boundary region at this stage. Although most of the P3 leaf primordium is comprised of undifferentiated cells, the outermost cells on the distal side of the primordium begin to differentiate into epidermal cells. During stage P4, the leaf blade elongates rapidly, and the difference between the blade and sheath becomes more pronounced. The P4 leaf primordium exhibits a clear gradient of cell states along its longitudinal axis; cells in the proximal region remain undifferentiated, whereas those in the distal region are differentiated. During stage P5, the leaf sheath elongates rapidly, and the growth and maturation of the leaf are completed by the P6 stage, whereas bending of the lamina joint occurs between stages P5 and P6.
To obtain a comprehensive transcriptome of leaf development in rice, we sampled 12 leaf parts representing various stages and components along the longitudinal axis (Fig. 1b–d; Table 1). Rice seedlings at the four-leaf stage were dissected into 12 parts: shoot apex containing the SAM and P1 and P2 leaf primordia (Fig. 1d); entire P3 leaf primordium (Fig. 1c); apical, middle, and proximal parts of P4 leaf blade; P4 leaf sheath; and the leaf blade, sheath, and boundary region of P5 and P6 leaves. Three biological replicates were prepared for each part, and their RNA was hybridized to a 44K rice microarray (Agilent Technologies, Santa Clara, CA) [26, 29]. Out of 43,144 probes corresponding to 29,864 genes on the rice 44K microarray platform, 31,996 probes corresponding to 24,022 genes were expressed in at least one sample. Normalized expression levels of these 31,996 probes corresponding to 24,022 genes were used in this study. Pearson correlation analysis showed strong correlations among the three replicates, indicating that our dataset was highly reproducible (Supplemental Fig. 1).
Transcriptome dynamics during rice leaf development
To elucidate the transcriptome dynamics that occur during leaf development, principal component analysis (PCA) was performed using all samples. The first, second, and third principal components (PC1, PC2, and PC3) explained 60.9%, 13.2%, and 8.1% of the total variance among samples, respectively (Supplemental Fig. 2). Plotting the samples within the three-dimensional space defined by PC1, PC2, and PC3 allowed the relationships among the samples to be visualized, reflecting the two properties of tissue differentiation state and tissue identity (Supplemental Fig. 2a). However, these properties were poorly represented by each principal component (Supplemental Fig. 2b and c) due to the fact that PC2 was excessively attracted toward P4Bm, which was located distant from the other samples. Generally, PCA is not robust to outliers, and principal components tend to be attracted toward them, interfering with the detection of the overall dataset structure. To avoid excessive attraction of PC2 toward the outlier P4Bm, we modified PC1, PC2, and PC3 while retaining positional relationships between the samples, as follows. PC1, PC2, and PC3 scores were treated as variables, and PCA was again applied using all samples except P4Bm, resulting in modified principal components (mPC1, mPC2, and mPC3) that were not excessively affected by P4Bm. Altogether, mPC1, mPC2, and mPC3 explained 60.9%, 11.9%, and 9.5% of the total variance among samples, respectively, and successfully captured the characteristic patterns of the dataset (Fig. 2).
Two groups were separated by mPC1, namely, immature tissues represented by SA, P3, P4S, and P4Bb, mature tissues represented by P4Ba, and samples derived from P5 and P6 stage leaves. This result suggests that mPC1 represented the differences between immature and mature tissues (Fig. 2b). Conversely, mPC2 characterized samples with intermediate tissue differentiation, most notably P4Bm, suggesting that mPC2 represented the transient state of the transcriptome during tissue differentiation (Fig. 2b). Thus, an arrow fitting the distribution of the samples in the space defined by mPC1 and mPC2 would represent the change in transcriptome dynamics associated with tissue differentiation from the immature state through the transient state to the mature state. Collectively, mPC1 and mPC2 explained 72.8% of the total variance among samples, suggesting that tissue differentiation state has profound effects on the leaf transcriptome. Moreover, samples derived from the P4 leaf exhibited large transcriptomic variations, whereas all P4-stage leaf samples including sheath samples were aligned along the arrow. This result indicates that the shift in the transcriptome associated with leaf maturation is found throughout the leaf during P4, coinciding with intensive basipetal tissue differentiation at stage P4 .
In addition, mPC3 separated samples of the leaf bladeࣧP4Bm, P4Ba, P5B, and P6Bࣧfrom those of the leaf sheath and blade-sheath boundary regionࣧP5S, P6S, P5BS, and P6BS, indicating that mPC3 represented differences between the leaf blade and sheath (Fig. 2c). On the other hand, only slight differences were observed among immature leaf samples such as P3, P4S, and P4Bb, suggesting that the transcriptomic difference between the leaf blade and sheath becomes more pronounced during maturation.
Overall, our results suggest that the transcriptome of each part of the leaf changes with the progression of tissue differentiation and the acquisition of tissue identity.
Gene expression patterns during leaf development and their associations with gene function and transcriptional regulation
To uncover the major gene expression patterns during rice leaf development, we conducted cluster analysis of genes based on their expression patterns. Prior to cluster analysis, analysis of variance (ANOVA) was applied to detect differentially expressed genes among different parts of the leaf. Of 31,996 probes corresponding to 24,022 genes, 31,043 probes corresponding to 23,350 genes were extracted (p-value = 0.001 when adjusted for the false discovery rate [FDR]). K-means clustering was performed on these probes, and 28 clusters with distinct expression patterns were obtained (Supplemental Fig. 3; Supplemental Table 2). Consistent with the results of PCA, the expression pattern of each cluster was affected by differences in tissue differentiation state among samples. For Clusters 1 to 8, higher expression levels were found in samples containing immature tissues (Supplemental Fig. 3a). For Clusters 9 to 14, transiently elevated expression was found in samples with intermediate tissue differentiation states (Supplemental Fig. 3b). Genes in Clusters 15 to 24 were most strongly expressed in samples containing mature tissues (Supplemental Fig. 3c). For some of the clusters, there were large differences in expression levels among samples and characteristic expression patterns, suggesting that some groups of genes undergo similar changes in gene expression, and that such changes are associated with events during leaf development.
To evaluate how the gene expression patterns and dynamics of these gene clusters are related to the functions of the genes, we conducted Gene Ontology (GO) enrichment analysis on each cluster (Supplemental Fig. 4). In addition, given the importance of transcriptional regulation to development, transcription factors and transcriptional regulators were extracted from each cluster (Supplemental Fig. 5). These analyses identified the characteristic functions of genes within each cluster, including several genes that may be involved in specific processes during rice leaf development (Fig. 3; Table 2).
Cluster 1, a group of genes that was specifically expressed in the shoot apex, was enriched in genes involved in transcriptional regulation (regulation of transcription, p = 4.8e−07). Within this cluster were class I KNOX genes, which are important for SAM maintenance , and OsNAM/OsCUC3 genes, which may be involved in organ boundary formation . Thus, Cluster 1 was predicted to include genes related to SAM function and leaf initiation. This cluster also included genes in the BBM clade of the PLETHORA family , which are expressed in crown root primordia , and OsTB1, which is expressed in axillary buds . Because the shoot apex tissue used in this study contained stem tissue as well as the SAM and leaf primordia, the presence of root- and axillary bud-related genes in this cluster was not surprising.
Cluster 2 contains genes that were highly expressed in tissues undergoing active cell proliferation. In this cluster, GO terms associated with cell division and cytokinesis (microtubule-based movement, p = 8.6e−05) were detected. Moreover, it contained ANT clade genes of the PLETHORA family , GRF family genes , and OsGIF1/MKB3 , which have been described as promoters of cell proliferation in leaf primordia. Thus, Cluster 2 was expected to contain important genes related to cell proliferation in leaf primordia.
Cluster 9 genes were highly expressed in the middle parts of the P4 leaf blade and P5 leaf sheath. GO analysis revealed that this cluster contains class III peroxidases (response to oxidative stress, p = 4.2e−07). Some class III peroxidases regulate reactive oxygen species homeostasis in the apoplast, thereby affecting cell-wall stiffness [36, 37]. GO terms for cell-wall remodeling enzymes including XTHs (carbohydrate metabolism, p = 1.3e−3)  were also enriched in Cluster 9. Thus, Cluster 9 appears to be enriched in genes involved in the control of cell-wall extensibility and cell elongation.
Cluster 12 genes were mostly expressed at high levels in the mature leaf blade. GO terms for genes involved in photosynthesis (photosynthesis, p = 7.9e−21; carbon utilization by fixation of carbon dioxide, p = 8.5e−10; and electron transport, p = 3.3e−07) were enriched. This cluster included C2C2-CO-like family genes  and phytochrome-interacting bHLH factors (PIFs) . Thus, Cluster 12 was expected to contain a high concentration of genes associated with photosynthesis and light-mediated signal transduction.
Cluster 7 genes were highly expressed in immature samples and samples from mature tissues from the sheath and blade-sheath boundary region. This cluster was enriched in genes involved in carbohydrate metabolism (main pathways of carbohydrate metabolism, p = 3.7e−05). The leaf sheath is believed to act as sink tissue for carbohydrates prior to heading . In addition, this cluster includes OsBOP genes that are important for sheath development . Thus, Cluster 7 was enriched in genes involved in the carbohydrate sink function and sheath development.
Cluster 15 consisted of genes that were preferentially expressed in the mature sheath and blade-sheath boundary region. This cluster was enriched in genes related to GO terms for protein kinases (protein amino acid phosphorylation, p = 6.2e−05). The blade-sheath boundary contains the lamina joint, which bends between stages P5 and P6. Various phytohormones and environmental stressors affect the bending process [42, 43], and many protein kinases exhibit temporal changes in expression during lamina-joint bending . Thus, we assumed that Cluster 15 contains genes involved in lamina-joint bending.
Taken together, these analyses revealed genome-wide gene expression patterns during rice leaf development, and relationships between expression pattern and function were found in certain gene clusters.
Identification of genes with localized expression during leaf development
To identify novel genes that play important roles in early development and subsequent morphogenesis and tissue formation in the rice leaf, we selected genes from Clusters 1 to 8 using K-means clustering analysis. These clusters contain genes that tended to be highly expressed during the early stages and weakly expressed during the later stages, and thus were expected to include candidate genes. Forty-nine genes, most of them involved in transcriptional regulation, were selected from the gene clusters, and their spatial expression patterns around the shoot apex were examined through in situ hybridization. Examples of these gene expression patterns are described below.
Cluster 2 included the PLATZ family transcription factor Os02g0172800, which is a co-ortholog of ORESARA15 . This gene was expressed in the basal part of immature leaves, and its expression was strongest in the abaxial side of the leaf primordium (Fig. 4a, b).
Tissue-specific expression patterns of three genes in Cluster 3 were detected. Expression of OsbHLH080, a member of the bHLH gene family, was observed in the abaxial base of leaf primordia and developing ligules (Fig. 4c). Another bHLH gene, OsbHLH166, was expressed mainly in the presumptive lysigenous aerenchyma areas of leaf primordia (Fig. 4d, e). Os05g0363500, which encodes a WD40 repeat-containing protein, was expressed in the developing diaphragms of leaf sheaths (Fig. 4f).
OsGH3-4, a member of the GH3 family involved in auxin conjugation, was placed in Cluster 7 and exhibited elevated expression levels mainly in the leaf sheath and blade-sheath boundary region. Expression of OsGH3-4 was detected in both the central domain of the SAM and tissues adaxially adjacent to vascular bundles in the leaf sheath and midrib, where bundle sheath extension cells differentiate (Fig. 4g, h).
Additionally, Cluster 7 contained four OsARF paralogs belonging to the ARF6/8 subfamily . These genes, OsARF6/12/17/25, were expressed at the basal part of the leaf primordium around stage P3, with especially high expression levels at the margin (Fig. 5a–d, arrowheads). Moreover, these four OsARFs were also expressed in the developing ligule and marginal parts of the blade-sheath boundary region that were expected to differentiate into auricles (Fig. 5a–d). In addition to these common expression patterns among the four OsARFs, the characteristic expression of each gene was also observed. OsARF6 was expressed in the epidermis of the basal region of P4 leaf blades (Fig. 5a), whereas OsARF12 was expressed throughout the sheath and basal region of the blades (Fig. 5b). OsARF17 expression was detected in the adaxial epidermis at the blade-sheath boundary at stage P3 and in the lamina joint of P4 leaves (Fig. 5c), whereas OsARF25 was strongly expressed in the lamina joint of P4 leaves and at the base of sheaths in stages P4 and P5 (Fig. 5d).
In addition, ARF6/8 orthologs including the four OsARFs listed above are known targets of miR167, which is an evolutionarily conserved microRNA in seed plants . To clarify the relationships between the expression patterns of the four OsARFs and miR167, the accumulation patterns of mature miR167 in the shoot apex were examined through in situ hybridization using a probe containing BNANC, which is a bridged nucleic acid derivative of a locked nucleic acid (Fig. 5e). Signals specific to mature miR167 were detected at higher levels in the distal part and lower levels in the basal part, indicating that the accumulation of miR167 and the expression of the four identified OsARFs were largely exclusive.
In addition to the genes described above, we explored a number of genes with localized expression during leaf development based on the list obtained through K-means clustering (Supplemental Fig. 6). Therefore, our strategy of using K-means clustering facilitated the selection of genes that were differentially expressed among stages or tissues. In addition, the genes identified here should be further analyzed to uncover their functions in rice leaf development.