Application of the funRiceGenes database in rice and other crops
Since its publication in December 2017, the funRiceGenes database has been visited by more than 49,000 users, with more than 129,000 sessions and over 490,000 page views (Figure 1) (Yao et al., 2017). On average, 3 to 4 pages were accessed per session. The numbers of users and page views have risen steadily over the last five years. Most of the sessions came from China, the United States, India, Japan, and South Korea. Notably, users from China, India, Japan, and South Korea accounted for more than 50% of all visits, consistent with the fact that rice is a staple food in Asia. Nevertheless, the United States accounted for 13.3% of all visits, ranking second among all countries. A closer look at the visitor statistics revealed that the funRiceGenes database was frequently visited from several cities, including Beijing, New Delhi, Hangzhou, Wuhan, Seoul, and Shanghai. This is probably attributable to the well-known rice research institutions located in these cities, including the Institute of Genetics and Developmental Biology of the Chinese Academy of Sciences in Beijing; the China National Rice Research Institute in Hangzhou; the National Key Laboratory of Crop Genetic Improvement at Huazhong Agricultural University in Wuhan; and the Institute of Plant Physiology and Ecology of the Chinese Academy of Sciences in Shanghai.
As of November 2021, the funRiceGenes database and the corresponding publication in GigaScience had been cited 89 times (Yao et al., 2017). funRiceGenes provides the most accurate count of functionally characterized rice genes, which has been referenced by many studies (Wing et al., 2018; Yang et al., 2020). The datasets in funRiceGenes have been used to train models in bioinformatics studies (Gupta et al., 2021). Information on the more than 4100 genes collected in funRiceGenes has frequently been used to annotate the functions of gene sets obtained in rice studies (Kim et al., 2020; Qin et al., 2021; Fan Zhang et al., 2021). Furthermore, the funRiceGenes database has been used to infer the potential functions of homologous genes in non-model plants, including wheat, rye, sorghum, barley, and pacaya palm (Hosni et al., 2021; Pang et al., 2020; Li et al., 2021; Wittern et al., 2021; Dhaka et al., 2020; Wang et al., 2021).
Features update of the funRiceGenes database
A static website (https://funricegenes.github.io/) was developed for browsing and searching the functional rice genes deposited in funRiceGenes. The static website provides eight menus: HOME, GENE, FAMS, KEYS, NEWS, DOCS, CITE, and LINK. The symbols of more than 4100 functionally characterized rice genes are listed in the GENE menu. Clicking on a gene symbol displays detailed information on that gene, including published articles related to the gene, its MSU and RAPdb genomic loci, its GenBank accession number, key information on its function, and information on related genes. Information on more than 6000 gene family members is recorded in the FAMS menu. Clicking the name of a specific gene family displays all of its members and the related publications. Based on the published literature, we extracted more than 400 commonly used keywords describing various functions of rice genes. All the keywords and the genes related to each keyword can be viewed in the KEYS menu, allowing rapid identification of all the functionally characterized genes related to a specific agronomic trait or keyword. The detailed update records of the funRiceGenes database since its release in 2014 can be found in the NEWS menu. More than 7000 publications on rice functional genomics are listed in the DOCS menu. To promote the usage of funRiceGenes, publications citing the funRiceGenes database are listed in the CITE menu. The LINK menu provides links to useful plant databases and bioinformatics applications. Finally, key information collected in funRiceGenes, including the gene list, gene family list, keyword list, and other datasets, can be downloaded from the HOME page of the database. To make the static website more convenient to use, site search powered by Google and Bing was added to the HOME page.
Real-time visitor statistics of the static website, recorded by RevolverMaps (https://www.revolvermaps.com/) since October 29, 2021, are also displayed on the HOME page.
In our previous studies, we also developed an R/Shiny web application for interactive queries of the funRiceGenes database (Jia et al., 2021; Yao et al., 2017). The URL of the interactive web application was moved from http://funricegenes.ncpgr.cn/ to https://venyao.xyz/funRiceGenes/. In addition to queries by gene symbol or keyword, the web application can be searched by MSU or RAPdb genomic locus. In this study, we developed a new functionality under the ‘Download’ menu of the web application, which retrieves data from the funRiceGenes database in batch for multiple user-input MSU or RAPdb genomic loci, or for a user-input genomic region. The IDConversion functionality of the web application converts between MSU and RAPdb genomic loci and identifies orthologous genes between indica and japonica rice. In this study, we also deployed a BLAST interface in the web application to enable sequence similarity searches against the genic, CDS, or protein sequences of all collected functional genes. To further expand the application of the datasets collected in funRiceGenes, we developed a new feature under the ‘Annotation’ menu of the web application for functional annotation of gene sets obtained in experimental or high-throughput studies in rice. For an input gene set, a word cloud is created from the key messages on the functions of the input genes deposited in funRiceGenes.
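The region-based batch retrieval described above can be illustrated with a minimal Python sketch. It is not the implementation of the R/Shiny application: the record layout (`GeneRecord`), the region string format (`chrom:start-end`), the placeholder gene names and coordinates, and the choice to return every gene that overlaps the region (rather than only genes fully contained in it) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneRecord:
    """Hypothetical record layout; the real database schema may differ."""
    symbol: str
    chrom: str
    start: int
    end: int


def parse_region(region):
    """Parse an assumed 'chrom:start-end' string into (chrom, start, end)."""
    chrom, span = region.split(":")
    start_text, end_text = span.split("-")
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError(f"region start {start} is greater than end {end}")
    return chrom, start, end


def genes_in_region(records, region):
    """Return records on the same chromosome that overlap the region."""
    chrom, start, end = parse_region(region)
    return [
        r for r in records
        if r.chrom == chrom and r.start <= end and r.end >= start
    ]


# Placeholder genes with made-up coordinates, for illustration only.
toy_records = [
    GeneRecord("geneA", "chr1", 1000, 2000),
    GeneRecord("geneB", "chr1", 4500, 6000),
    GeneRecord("geneC", "chr2", 1500, 2500),
]

hits = genes_in_region(toy_records, "chr1:1500-5000")
print([r.symbol for r in hits])
```

With overlap semantics, a gene that only partly falls inside the queried interval is still reported. A containment test (`r.start >= start and r.end <= end`) would be the stricter alternative.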
Data update of the funRiceGenes database
Since its publication in December 2017, the funRiceGenes database has grown by more than 1300 newly cloned rice genes. More than 1950 publications on these genes were deposited in funRiceGenes (Figure 2A). The keywords rice, gene, regulates, protein, grain, and tolerance were enriched in the titles of these publications (Figure 2B). Similarly, the keywords gene, plants, expression, stress, and grain were used most frequently in their abstracts (Figure 2C). These enriched keywords reflect the research focus of rice functional genomics studies in recent years. A list of representative functionally characterized rice genes is shown in Figure 3 (Yiming Yu et al., 2019).
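As an illustration of how frequently used keywords can be extracted from titles or abstracts, the following Python sketch counts word frequencies after removing common stop words. The exact procedure behind Figure 2 is not described here, so plain frequency counting, the small stop word list, and the toy titles (which are invented and not taken from the database) are all assumptions.

```python
import re
from collections import Counter

# A deliberately small, hypothetical stop word list.
STOPWORDS = frozenset(
    {"a", "an", "and", "the", "of", "in", "to", "by", "is", "for", "on", "with"}
)


def keyword_frequencies(texts, stopwords=STOPWORDS):
    """Count how often each non-stop word occurs across all texts."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and keep alphabetic runs as words.
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(word for word in words if word not in stopwords)
    return counts


# Invented example titles, used only to exercise the function.
toy_titles = [
    "A rice gene regulates grain size",
    "The protein kinase gene regulates drought tolerance in rice",
    "Grain weight is controlled by a rice transcription factor",
]

frequencies = keyword_frequencies(toy_titles)
print(frequencies.most_common(4))
```

The resulting counts could then be passed to a plotting or word cloud library of choice. Raw counts favour generic words such as "rice" and "gene", which is consistent with the keywords reported above.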
Understanding the genetic mechanisms underlying grain size and weight is critical to improving rice yield. As of November 2021, 162 genes regulating grain size and weight in rice had been collected in funRiceGenes, including Grain Size and Abiotic stress tolerance 1 (GSA1), a positive regulator of grain size (Dong et al., 2020). Overexpression of GSA1 led to increased grain size and weight, as well as improved resistance to abiotic stresses, including high salt, drought, and high temperature. Unlike abiotic stress resistance genes, disease resistance genes usually lead to reduced yield in rice. Among the 185 disease-related genes collected in funRiceGenes, Ideal Plant Architecture 1 (IPA1) is the first gene identified in rice that can promote both yield and disease resistance by maintaining the balance between growth and disease resistance (Wang et al., 2018). After infection by the fungus Magnaporthe oryzae, IPA1 binds to the promoter of the pathogen defense gene WRKY45, leading to enhanced blast disease resistance (Haitao Zhang et al., 2016). Within 48 hours after infection, IPA1 binds to the promoter of DENSE AND ERECT PANICLE1 (DEP1), a yield-related gene, to support the growth needed for high yield (Huang et al., 2009). In addition to blast disease, blight disease caused by Xanthomonas oryzae pv. oryzae (Xoo) also threatens rice production. A recent study reported that ALEX1, a long noncoding RNA whose expression is highly induced by Xoo infection, activates the JA signaling pathway and contributes to blight disease resistance in rice (Yang Yu et al., 2020).
Plant hormones, including auxin, gibberellins (GA), abscisic acid (ABA), cytokinins (CK), salicylic acid (SA), ethylene (ET), jasmonates (JA), brassinosteroids (BR), and strigolactones, are biochemicals that play critical roles in regulating all aspects of plant growth and development. A total of 623 hormone-related genes were collected in funRiceGenes, including 171 auxin-related genes, 142 GA-related genes, 126 ethylene-related genes, 105 SA-related genes, 101 JA-related genes, 89 BR-related genes, 68 CK-related genes, and 21 strigolactone-related genes. Among these 623 genes, ACE1 (ACCELERATOR OF INTERNODE ELONGATION 1) and DEC1 (DECELERATOR OF INTERNODE ELONGATION 1) are GA-responsive genes identified in recent years that act antagonistically in the regulation of stem internode elongation in rice (Nagai et al., 2020).
In the last five years, the funRiceGenes database has also recorded notable progress in the identification of genes related to various phenotypic traits in rice, including asexual reproduction and herbicide resistance. BABY BOOM1 (BBM1) is an AP2/ERF transcription factor expressed in sperm cells that can induce parthenogenesis in rice (Khanday et al., 2019). By ectopically expressing BBM1 in the egg cell and replacing meiosis with mitosis, an apomixis system can be established to achieve asexual reproduction through rice seeds. As homologous BBM-like genes and MiMe (Mitosis instead of Meiosis) genes are also present in other cereal crops, including maize (Lowe et al., 2016; Horstman et al., 2014), this method can be applied to other cereal crops to maintain heterosis through seed propagation. Benzobicyclon (BBC) is a β-triketone herbicide extensively used in weed control. However, some crops are also susceptible to BBC, which discourages its use and diminishes its value in weed control. A recently cloned gene, HIS1 (HPPD INHIBITOR SENSITIVE 1), encodes an oxidase that detoxifies BBC herbicides by catalyzing their hydroxylation, resulting in enhanced resistance to BBC and other β-triketone herbicides in rice, which would be useful for breeding herbicide-resistant plants (Maeda et al., 2019).
Updated interaction networks of functionally characterized rice genes
Based on the more than 4100 functionally characterized rice genes, a total of 219 interaction networks comprising 1825 genes were constructed using the approach proposed in our previous study (Yao et al., 2017). These networks were built by detecting the co-occurrence of the symbols of two or more genes in the same sentence of published literature. In total, 2819 connections between genes, supported by 6041 pieces of evidence, were extracted from the titles and abstracts of published literature. The number of genes in the largest network increased from 762 to 1281 (Figure 4). As expected, genes involved in the same biological pathway or with similar functions clustered together. We also found that genes associated with flowering remain the key component of the largest interaction network. The approach and the resulting interaction networks may shed light on future studies of gene functions in other crops.
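The sentence-level co-occurrence idea can be sketched in Python as follows. This is a simplified illustration, not the pipeline of Yao et al. (2017): the tokenization by alphanumeric runs, the exact-match lookup of gene symbols, and the function names are assumptions, and the example sentences are paraphrased from the text above. Each sentence that mentions two known symbols counts as one piece of evidence for a connection, and each connected component of the resulting graph is treated as one network.

```python
import re
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_edges(sentences, gene_symbols):
    """Map each co-occurring gene pair to its number of supporting sentences."""
    symbols = set(gene_symbols)
    evidence = defaultdict(int)
    for sentence in sentences:
        # Assumed tokenization: runs of letters and digits, matched exactly.
        tokens = set(re.findall(r"[A-Za-z0-9]+", sentence))
        found = sorted(tokens & symbols)
        for pair in combinations(found, 2):
            evidence[pair] += 1
    return dict(evidence)


def connected_components(edges):
    """Group genes into networks; the largest component comes first."""
    adjacency = defaultdict(set)
    for gene_a, gene_b in edges:
        adjacency[gene_a].add(gene_b)
        adjacency[gene_b].add(gene_a)
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first traversal collecting every gene reachable from start.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return sorted(components, key=len, reverse=True)


symbols = {"IPA1", "WRKY45", "DEP1", "ACE1", "DEC1", "GSA1"}
sentences = [
    "IPA1 binds to the promoter of the pathogen defense gene WRKY45.",
    "IPA1 binds to the promoter of DEP1, a yield-related gene.",
    "ACE1 and DEC1 act antagonistically in internode elongation.",
    "GSA1 is a positive regulator of grain size.",
]

edges = build_cooccurrence_edges(sentences, symbols)
networks = connected_components(edges)
print(len(edges), [sorted(network) for network in networks])
```

In this sketch, a gene that never co-occurs with another symbol (GSA1 here) forms no connection and therefore belongs to no network. A production pipeline would also need to handle symbol synonyms and ambiguous symbols, which this example ignores.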