ExtRamp Online: Enabling ramp sequence calculations via an intuitive web interface.

Motivation: Ramp sequences are an understudied, evolutionarily conserved mechanism for regulating protein translational efficiency. Slowly translated codons concentrated at the 5' end of genes form ramp sequences that counterintuitively increase overall translational efficiency by evenly spacing ribosomes at initiation, which limits downstream ribosomal collisions. We previously developed ExtRamp, which is the only algorithm to identify translational ramp sequences in single genes. ExtRamp currently lacks a web interface to facilitate wider adoption and application for non-programmers. Additionally, ExtRamp currently identifies ramp sequences using only species-wide codon efficiencies that may lack the specificity of tissue and cell type-specific codon usage biases.
Results: We present an online interface for ExtRamp to facilitate wider adoption and application for non-programmers, along with a significant improvement to the underlying algorithm to calculate tissue and cell type-specific ramp sequences (https://ramps.byu.edu/ExtRampOnline). ExtRamp Online contains all options available in the original ExtRamp algorithm with additional pre-set default values to enable researchers to calculate human tissue-specific or genome-wide ramp sequences on any web browser. Human tissue and cell type-specific codon usage biases have been precomputed and can be applied with a simple drop-down menu. Hover-over hints provide users with detailed information on all available options, which will help facilitate future creative analyses using ramp sequences.
Availability: ExtRamp Online is publicly available at https://ramps.byu.edu/ExtRampOnline. All associated scripts are publicly available at https://github.com/ridgelab/ExtRampOnline.


Motivation
Ramp sequences are an important evolutionarily conserved mechanism that plays an essential role in regulating translational efficiency. Synonymous codon usage biases directly affect protein translation efficiency (i.e., expression) because codons with more common cognate anticodons are generally translated faster than suboptimal codons with rare cognate anticodons (Dana and Tuller, 2014).
Surprisingly, suboptimal codons are over-represented at the 5' end of highly expressed genes, forming a ramp sequence that limits downstream ribosomal collisions by evenly spacing the ribosomes at translation initiation (Tuller, et al., 2010). We previously developed ExtRamp, which is the only algorithm to identify gene-specific translational ramp sequences in silico. Notably, ExtRamp has been used to identify ramp sequence conservation across all domains of life (McKinnon, et al., 2021) and population-specific ramp sequences within different human populations (Hodgman, et al., 2020). However, ExtRamp is currently limited to command-line use, which requires users to navigate various options that are tailored toward genome-wide analyses. Here, we present ExtRamp Online, an intuitive web interface that allows researchers to calculate translational ramp sequences on any web browser. We fully integrate tissue and cell type-specific codon optimality into ExtRamp Online to facilitate creative ramp sequence analyses that capitalize on underlying differences in tissue-specific codon usage biases (Dittmar, et al., 2006; Kames, et al., 2020) caused by local tRNA pools and tissue-specific RNA binding proteins (Dittmar, et al., 2006; Payne and Alvarez-Ponce, 2019). Therefore, we anticipate that ExtRamp Online will allow researchers to easily calculate both genome-wide and tissue-specific ramp sequences, compare the effects of tissue-specific translational ramp sequences on gene expression using their own datasets, and utilize ExtRamp more fully without requiring any bioinformatics expertise.

Implementation
We built ExtRamp Online using an ASP.NET Core 3.1 (https://dotnet.microsoft.com/apps/aspnet) framework on a Microsoft Windows server hosted at Brigham Young University. ExtRamp Online runs a modified version of ExtRamp within the web browser using Pyodide (https://pyodide.org/en/stable/). All parameters are customizable and preset to a default test case that allows users to see the expected output. Additionally, scroll-over hints teach users how to change parameters and effectively use those parameters in ramp sequence calculations. All associated scripts and information about the website are available at https://github.com/ridgelab/ExtRampOnline.

Tissue and cell type-specific codon efficiencies
The ExtRamp algorithm requires the relative adaptiveness of each codon, which is generally calculated from a set of highly expressed genes. ExtRamp determines the relative synonymous codon usage (i.e., codon adaptiveness) by counting the occurrences of each codon that encodes the same amino acid. The relative synonymous codon usage is a metric from 0 to 1 that divides the number of occurrences of a specific codon by the number of occurrences of the most common codon that encodes the same amino acid. It is usually calculated from a FASTA file of highly expressed genes because codon usage biases from highly expressed genes correspond with tRNA levels (Post, et al., 1979).
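The ratio described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ExtRamp source code: the function name `relative_codon_usage` is hypothetical, the standard genetic code is assumed, stop codons are excluded, and codons for amino acids that never occur in the input are omitted from the result.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built in TCAG order (assumption: no alternative codes).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def relative_codon_usage(sequences):
    """Hypothetical helper: map each sense codon to a value from 0 to 1.

    Each value is the codon's count divided by the count of the most
    common codon that encodes the same amino acid.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper().replace("U", "T")
        # Walk the sequence in frame, ignoring a trailing partial codon.
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1

    # Highest codon count observed for each amino acid.
    max_per_aa = defaultdict(int)
    for codon, n in counts.items():
        aa = CODON_TABLE[codon]
        max_per_aa[aa] = max(max_per_aa[aa], n)

    return {
        codon: counts[codon] / max_per_aa[aa]
        for codon, aa in CODON_TABLE.items()
        if aa != "*" and max_per_aa[aa] > 0
    }
```

For example, `relative_codon_usage(["ATGGCTGCTGCCTAA"])` sees alanine encoded twice by GCT and once by GCC, so GCT receives 1.0, GCC receives 0.5, and the unused alanine codons GCA and GCG receive 0.0.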
ExtRamp Online provides options to integrate pre-computed genome-wide relative synonymous codon usages, as well as tissue and cell type-specific relative synonymous codon usages. We used the Human Protein Atlas (Ponten, et al., 2011; Thul and Lindskog, 2018; Uhlén, et al., 2015), GTEx Project (Consortium, 2015), and FANTOM5 datasets (Lizio, et al., 2019; Lizio, et al., 2015), as well as a consensus file combining information from each database, to calculate the relative synonymous codon usage within each tissue. All isoforms for genes with multiple annotated isoforms were required to have expression values within the same quartile for a given tissue. In total, we included 18,388 genes that spanned 45 tissues from the FANTOM5 dataset, 34 tissues from GTEx, 43 tissues from the Human Protein Atlas, and 62 tissues from the consensus dataset. We also downloaded 66 tissue-stratified cell type-specific expression datasets with annotated expression levels for each gene from the Human Protein Atlas database in April 2021.
For each tissue in each dataset, the relative synonymous codon usage was calculated from a FASTA file containing the longest GRCh38 isoform of highly expressed genes (i.e., genes with expression in the fourth quartile). Each tissue file contained at least 4,000 highly expressed genes with an average of 9,307 genes per file. Highly expressed genes were previously labeled by the Human Protein Atlas for cell type-specific data, and those labels were used here. We removed genes with uncertain reliability scores, and we removed cell types containing fewer than 900 highly expressed genes. All relative synonymous codon usages for both tissue-specific and cell type-specific analyses are fully integrated in ExtRamp Online with an interactive drop-down menu.
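The fourth-quartile selection for a single tissue could look roughly like the sketch below. It is only an illustration: the function name `fourth_quartile_genes` and the input layout (a dictionary of gene name to expression value) are hypothetical, and the quartile method used for the published files is not stated here, so the "inclusive" method of Python's `statistics.quantiles` is an assumption.

```python
import statistics


def fourth_quartile_genes(expression):
    """Hypothetical helper: return genes whose expression exceeds the third quartile.

    `expression` maps gene name -> expression value for one tissue.
    """
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
    _, _, q3 = statistics.quantiles(expression.values(), n=4, method="inclusive")
    return sorted(gene for gene, value in expression.items() if value > q3)
```

The coding sequences of the returned genes would then be written to a FASTA file and used as the input for the relative synonymous codon usage calculation for that tissue.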

ExtRamp Online Usage
Users can access all features and options from the original ExtRamp software using ExtRamp Online through an intuitive interface that does not require programming experience. Users can either paste a FASTA-formatted gene header and sequence into a text box or upload a small FASTA file of coding sequences. Because ramp sequences are computed using tRNA abundances and codon usage biases, users can upload species tRNA adaptation index values or use provided tissue or cell type-specific relative synonymous codon usage values. By default, relative synonymous codon usage values are calculated from the human reference genome GRCh38, but users can upload any reference genome. All parameters are set to the default values recommended by ExtRamp (see https://github.com/ridgelab/ExtRamp). ExtRamp Online generates a variety of output files, including a FASTA file containing the identified ramp sequences, a list or CSV of codon efficiency values, the headers of genes that did not contain a ramp sequence, the headers of genes that could not be processed by ExtRamp, and the segments of genes after identified ramp sequences. Examples of these files are available at ExtRamp Online, with checkboxes allowing users to select which files are downloaded. Because ExtRamp Online runs within the browser, the user's computer specifications, Internet connection, and web browser may affect performance. For more intensive calculations (e.g., parallel comparisons or genome-wide analyses), we provide an automatic command generator that gives users the command to run ExtRamp from the command line on their own machines in a Linux, Mac, or Windows environment.
Ramp sequences created or destroyed by altered tRNA levels could significantly change protein expression and become pathogenic. These types of tissue-specific analyses are now possible for anyone, regardless of programming experience, with ExtRamp Online.
We expect that researchers will use ExtRamp Online to investigate the role ramp sequences play in variant pathogenicity and subsequent disease development in specific tissues or cell types. Pathogenic variants identified in genome-wide association studies can easily be run through ExtRamp Online to identify whether they create, destroy, or do not affect ramp sequences. If a genetic variant alters a ramp sequence, it could significantly alter protein expression levels of that gene, which may indicate a functional mechanism for its disease association. Users can also analyze how pathogenic variants affect ramp sequences in different tissues or cell types, which may indicate cell type-specific pathogenicity that can be targeted through personalized therapeutics. Additionally, analyzing the effects of variants of unknown clinical significance on ramp sequences may provide additional insights into their potential pathogenicity by identifying a mechanism that may alter gene expression. Therefore, ExtRamp Online not only enables researchers to understand why some variants are pathogenic, but also to prioritize variants of unknown clinical relevance for further investigation.
ExtRamp Online significantly improves the accessibility of ramp sequence calculations, which will enable future innovative adaptations of ramp sequences in diverse analyses. By employing pre-computed tissue and cell type-specific relative synonymous codon usage values, ExtRamp Online can identify ramp sequences more precisely than when using whole-genome relative synonymous codon usage values. Additionally, ExtRamp Online computes the codon landscape by calculating translational efficiencies along the gene, which may support more complete protein folding predictions, since varied translational rates assist in protein folding (Zhang, et al., 2009) and mutations that alter the codon landscape can significantly alter protein functionality (Kimchi-Sarfaty, et al., 2007). We developed ExtRamp Online as a hypothesis-generating tool that can be used for a variety of studies ranging from species-wide ramp sequence calculations to single-cell disease pathogenicity analyses.
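One way to picture a codon landscape is as a smoothed profile of per-codon efficiencies along the coding sequence. The sketch below is an illustration under stated assumptions rather than the ExtRamp implementation: the function name `codon_landscape`, the use of a geometric mean, and the default window of 9 codons are all assumptions made for this example, and codons that are missing from the efficiency table or have an efficiency of zero are skipped.

```python
import math


def codon_landscape(sequence, efficiency, window=9):
    """Hypothetical helper: sliding-window geometric mean of codon efficiencies.

    `efficiency` maps codon -> value in (0, 1]; `window` is measured in codons.
    Returns one smoothed value per window position along the gene.
    """
    sequence = sequence.upper()
    codons = [sequence[i:i + 3] for i in range(0, len(sequence) - len(sequence) % 3, 3)]
    # Keep only codons with a positive efficiency so the logarithm is defined.
    values = [efficiency[c] for c in codons if efficiency.get(c, 0) > 0]

    landscape = []
    for start in range(len(values) - window + 1):
        chunk = values[start:start + window]
        # Geometric mean computed in log space for numerical stability.
        landscape.append(math.exp(sum(math.log(v) for v in chunk) / window))
    return landscape
```

In a profile like this, a ramp sequence would appear as a stretch of low values near the 5' end followed by higher values downstream. The efficiency table could be the relative synonymous codon usage for a chosen tissue or cell type.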