AMBAR - Interactive alteration annotations for molecular tumor boards

Background: Providing suitable treatment strategies that take into account cancer-specific alterations is a crucial task for successful cancer treatment. To this end, molecular tumor boards (MTBs), which bring together clinicians and scientists with diverse expertise, are increasingly established in the clinical routine for therapeutic interventions. Molecular profiling from sequencing data is an integral part of the decision-making process of an MTB. To debate variant calling results from next generation sequencing (NGS) analyses, detailed information about the detected mutations is mandatory. Further, these results need to be combined with knowledge and up-to-date evidence from databases. At the moment, few tools are available that aim at managing this amount of required information. As a result, the whole process of analysis and documentation of patient data becomes time-consuming and difficult to manage for MTBs. Results: To overcome these limitations, we developed an interactive web application, AMBAR (Alteration annotations for Molecular tumor BoARds), to visualize not only annotated mutations but also evidence for possible therapeutic drug targets. Detected mutations can be evaluated, discussed and exported to clinical information systems. The application is based on R shiny and allows customization, interactive filtering and visualization. Conclusion: AMBAR is an interactive application that not only supports MTBs in decision making, but also acts as an interface between results of NGS analyses, result visualization and export into clinical information systems.


Background
Despite decades of research, cancer is still one of the leading causes of death worldwide [1]. Understanding the underlying mechanisms of carcinogenesis and finding suitable treatments is the most challenging question for successful cancer therapy [2]. In this context, analysis of next generation sequencing (NGS) data is becoming more and more common practice in oncology all over the world, especially in light of personalized treatment approaches [3]. Moreover, lower sequencing costs and secure high-performance compute servers have ensured the feasibility of using even large gene panels for targeted sequencing and in-house analysis with NGS pipelines [4,5,6,7]. Due to these improvements, an increasing number of databases for genes and molecular targets as possible drug targets in cancer therapy have been published [8,9,10,11].

* Correspondence: hans.kestler@uni-ulm.de
1 Institute of Medical Systems Biology, Ulm University, Albert-Einstein-Allee 11, 89081 Ulm, Germany
Full list of author information is available at the end of the article

In NGS analysis, after sequence alignment, variant calling and/or copy number variation analysis are necessary steps to search for mutations and aberrations. Annotations of patients' alterations, such as single nucleotide variations (SNVs) and copy number variations (CNVs), are then combined with databases for cancer treatments. Taken together, this process is crucial for personalized cancer treatment [12,13]. Nevertheless, all the information derived from these analyses has to be reviewed and interpreted. Therefore, a group of clinicians, scientists, pathologists and geneticists discuss these patients in molecular tumor boards (MTBs) in the context of clinical data (see Figure 1). Clinical interpretation of genetic variants is often done manually in a very time-consuming process.
Moreover, available databases and tools are often queried one by one and the results are assembled for a report. Hence, even if the primary focus of molecular diagnostics is to propose a suitable personalized therapy, time-consuming analyses and limited staff make this process costly and inefficient. In addition, printed standardized forms have to be documented and transferred in digital format into the clinical information system. As a result, documentation and progress are prone to mistakes. Existing tools to generate alteration reports are often not open source [14] or only generate static (PDF) reports [15] that are not interactive and cannot be transferred into clinical information system entries right away.

Figure 1 Simplified workflow in a molecular tumor board (panel labels: Sequencing, Alteration Calling, Databases, AMBAR, Molecular Tumor Board, Personalized Report). AMBAR uses alteration analyses and public databases as input to enrich data with annotations and map mutations to possible drug targets. Members of molecular tumor boards can filter, refine and discuss results. Visualizations and annotated data can be exported for personalized patient reports.

To tackle these problems, we developed, in close cooperation with clinicians and pathologists, an interactive application, AMBAR (Alteration annotations for Molecular tumor BoARds), to enable meaningful and feature-rich alteration annotation analysis and visualization with various export formats. Hence, our tool is intended to help MTBs in discussing and analyzing patient data with the final aim of providing personalized treatment suggestions.

Implementation
We developed an interactive R shiny application, AMBAR (Alteration annotations for Molecular tumor BoARds), to visualize, annotate, enrich and refine variants of patients with publicly available databases. AMBAR consists of a frontend module, a processing module, a visualization module and an export module. All modules can be customized and extended. In particular, scripts to import and update databases are available.
Evidence-driven treatment options in molecular tumor boards can be taken from databases like the Gene Drug Knowledge Database (GDKD) [8], the Clinical Interpretation of Variants in Cancer database (CIViC) [9], and the Tumor Alterations Relevant for Genomics-driven Therapy database (TARGET) [10]. By linking and integrating different layers of information from these databases, we form a basis for variant annotations in a cancer context. Using functions introduced by [15] in the MTB-Report (https://github.com/jperera-bel/MTB-Report), we preprocess these public databases for use in our application.
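The linking step can be illustrated with a minimal sketch. AMBAR itself is written in R and relies on the MTB-Report functions; the Python code below is not that implementation. The record layout, the field names and all example entries are hypothetical and only show the idea of a gene-level join between detected variants and a preprocessed evidence table.

```python
from collections import defaultdict


def index_evidence(evidence_rows):
    """Group evidence rows by gene symbol for fast lookup."""
    by_gene = defaultdict(list)
    for row in evidence_rows:
        by_gene[row["gene"]].append(row)
    return by_gene


def annotate_variants(variants, evidence_rows):
    """Attach every evidence row that shares the variant's gene symbol."""
    by_gene = index_evidence(evidence_rows)
    annotated = []
    for variant in variants:
        matches = by_gene.get(variant["gene"], [])
        annotated.append({**variant, "evidence": matches})
    return annotated


# Hypothetical, hand-written records (not taken from GDKD, CIViC or TARGET).
evidence = [
    {"gene": "GENE_A", "source": "db1", "drug": "drug_x", "level": "A"},
    {"gene": "GENE_A", "source": "db2", "drug": "drug_y", "level": "B"},
    {"gene": "GENE_B", "source": "db1", "drug": "drug_z", "level": "C"},
]
variants = [
    {"gene": "GENE_A", "change": "p.X1Y"},
    {"gene": "GENE_C", "change": "p.X2Y"},
]
result = annotate_variants(variants, evidence)
```

Variants without a match keep an empty evidence list, so they remain visible in the variant table while contributing nothing to an evidence table.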
High-throughput sequencing data is processed by different tools, for example for quality control (like ClinQC [16]), alignment (like bwa [17]), variant calling (like MuTect2 [18]) or copy number alteration analysis (like SeqCNV [19]). Variant calling data is uploaded for analysis in Variant Call Format (VCF), processed by annovar [20] and parsed by maftools [21]. Copy number variations (CNVs) are processed by pureCN [22]. To use these NGS pipeline results in AMBAR, we integrated parsers and methods to process and prepare the data (Table ).
Besides representing data in tables, we also implemented methods for visualising alterations, following principles from [23]. An overview of positions and regions where mutations appear can be visualised in an ideogram plot [24]. Using the packages ggbio [25], GenomicRanges [26], BSgenome [27] and VariantAnnotation [28], we generate markers for positions and regions within the human genome. generate_visu_variants_ideo() returns a ggplot object containing one bar per chromosome with markers for centromeres and alterations. This allows a quick outline of mutation hot spots at the chromosomal level.
Large-scale analyses have shown a spectrum of mutational signatures across human cancer types. Such a mutational signature provides information about the underlying causes of mutations [29]. We integrated this analysis in AMBAR by using MutationalPatterns [30] and the Mutational Signatures (v3, May 2019) from COSMIC [31]. In generate_mutsig() we construct an object containing different mutation signature plots and a mutation contribution table. For visualization in the application, four parts of our mutation signature object are used: mutsigspectrumplot holds a combined bar plot of the single nucleotide substitution landscape. mutsigprofileplot holds a profile bar plot in which trinucleotide mutation counts are displayed. contribution_table contains a table of the different contributions to the COSMIC Mutational Signatures (v3). A bar plot of the contributions to the different mutation signatures is built and saved into contribution_barplot.
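To illustrate what a single nucleotide substitution spectrum summarises, the following sketch computes the relative contribution of the six substitution classes. In AMBAR this analysis is performed by MutationalPatterns in R. The Python code is a simplified stand-in, written under the common convention that every substitution is reported relative to the pyrimidine (C or T) reference base. It ignores the trinucleotide context and does not fit COSMIC signatures. The function names and the example variants are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
SUBSTITUTION_CLASSES = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def substitution_class(ref, alt):
    """Return the substitution class with a pyrimidine (C or T) reference base."""
    ref, alt = ref.upper(), alt.upper()
    if ref in ("A", "G"):
        # Purine reference: report the change on the complementary strand.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def substitution_spectrum(snvs):
    """Relative contribution of each of the six substitution classes."""
    counts = Counter(substitution_class(ref, alt) for ref, alt in snvs)
    total = sum(counts.values())
    return {
        cls: (counts[cls] / total if total else 0.0)
        for cls in SUBSTITUTION_CLASSES
    }


# Hypothetical (ref, alt) pairs of four single nucleotide variants.
snvs = [("C", "T"), ("G", "A"), ("T", "C"), ("A", "C")]
spectrum = substitution_spectrum(snvs)
```

Collapsing purine references onto the complementary strand is what reduces the twelve possible base changes to six classes. A trinucleotide profile extends the same idea by additionally recording the two flanking bases.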
Various export formats are provided to save, archive and version-control results. If provided, annotated variants based on the uploaded VCF file can be downloaded as a tab-separated value (TSV) file. Copy number variations are split by gene symbol and can be downloaded in comma-separated value (CSV) format. With recent developments in health care records, Fast Healthcare Interoperability Resources (FHIR) are increasingly becoming a standard in health care systems. They allow linkage and integration of results from different sites and centers for exchanging electronic health records [32]. We implemented an export function generate_fhir() for FHIR's genetic variant assessment in Extensible Markup Language (XML) format. We implemented subfunctions to build the different parts of the FHIR XML schema entries. Besides core data information, non-synonymous variants and their annotations are parsed and converted to XML entries. IDs, symbols and names are extracted from the HUGO Gene Nomenclature Committee [33,34]. For every variant, Logical Observation Identifiers Names and Codes (LOINC) numbers and descriptions are appended.
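The structure of such an XML entry can be sketched as follows. This is not the code of generate_fhir(), which is implemented in R; it is a minimal Python illustration of a FHIR Observation carrying a LOINC coding. The function names, the gene symbol and the variant description are hypothetical. The LOINC code 69548-6 with the display text "Genetic variant assessment" is used here as an assumption and should be checked against the LOINC database before use.

```python
import xml.etree.ElementTree as ET

FHIR_NS = "http://hl7.org/fhir"      # XML namespace of FHIR resources
LOINC_SYSTEM = "http://loinc.org"    # coding system URI for LOINC


def add_value(parent, tag, value):
    """Append a FHIR primitive element, which stores its content in a 'value' attribute."""
    return ET.SubElement(parent, tag, {"value": value})


def build_variant_observation(gene_symbol, hgvs_change, loinc_code, loinc_display):
    """Build a minimal Observation element for one variant (illustrative only)."""
    obs = ET.Element("Observation", {"xmlns": FHIR_NS})
    add_value(obs, "status", "final")
    code = ET.SubElement(obs, "code")
    coding = ET.SubElement(code, "coding")
    add_value(coding, "system", LOINC_SYSTEM)
    add_value(coding, "code", loinc_code)
    add_value(coding, "display", loinc_display)
    add_value(code, "text", f"{gene_symbol} {hgvs_change}")
    return obs


# Hypothetical variant; the LOINC code and display text are assumptions.
obs = build_variant_observation(
    "GENE_A", "c.100A>G", "69548-6", "Genetic variant assessment"
)
xml_text = ET.tostring(obs, encoding="unicode")
```

A complete export would add further elements such as the subject reference and components for the gene and the variant, which are omitted here to keep the sketch short.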
To make all functions, results and visualisations publicly available, we developed an R shiny web application. It is responsive, interactive and customizable. Our user interface allows input of the patient's core data and upload of a variant calling file (VCF) as well as copy number variations (CNVs). VCFs in standard 4.1 or 4.2 format can be parsed and, if optional genotype fields are present, values such as depth and allelic frequency can be annotated as well. CNVs are parsed from OncoScan and are annotated with gene symbols, gain, loss or loss of heterozygosity (LOH), and cytoband. A table of non-synonymous variants and a table of CNVs are displayed. Combined with the linked knowledge databases (based on [15]), a third table with the evidence level of possible drug targets, their effect prediction, status and publication is shown. Marking, filtering, sorting, searching and export are possible in all result tables. Additionally, SNVs and CNVs are visualized in a separate tab as an ideogram plot for a quick overview of the patient's mutational landscape. In a third tab, the program also generates a Mutation Signature (v3) of the patient data [31]. The user interface is responsive, so it can even be used on a tablet or smartphone. Results can be exported and downloaded in various formats (for example Excel, CSV, TSV or FHIR XML). Customisation of the user interface and highlighting of data fields as well as integration of other knowledge sources are possible.
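How optional genotype fields can be read from a VCF data line is sketched below. AMBAR parses VCFs in R; this Python function is only an illustration under stated assumptions. It assumes a tab-separated data line with the standard column order, uses the first sample column, and reads the FORMAT keys DP (read depth) and AF (allele fraction, as emitted by some variant callers). The function name and the example line are hypothetical.

```python
def parse_vcf_record(line, sample_index=0):
    """Parse one tab-separated VCF data line into a small dictionary."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _variant_id, ref, alt = fields[:5]
    record = {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alt": alt.split(","),   # several ALT alleles are comma-separated
        "depth": None,
        "allele_freq": None,
    }
    # Columns: CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample1 ...
    sample_column = 9 + sample_index
    if len(fields) > sample_column:
        keys = fields[8].split(":")
        values = fields[sample_column].split(":")
        genotype = dict(zip(keys, values))
        if genotype.get("DP", ".") != ".":
            record["depth"] = int(genotype["DP"])
        if genotype.get("AF", ".") != ".":
            record["allele_freq"] = [float(x) for x in genotype["AF"].split(",")]
    return record


# Hypothetical data lines: one with genotype fields, one without.
line = "chr1\t12345\t.\tA\tG\t.\tPASS\t.\tGT:AD:AF:DP\t0/1:30,10:0.25:40"
record = parse_vcf_record(line)
no_sample = parse_vcf_record("chr2\t500\t.\tC\tT\t.\tPASS\t.")
```

Because the genotype fields are optional, missing values stay None instead of raising an error, so a variant table can still be built from a VCF without sample columns.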

Results
To
The application automatically switches to the result table tab called Summary. Depending on the input, the basic patient data, the SNP variant table, the CNV table and the Found Evidence table are generated and shown (Figures 4, 5 and 6). All tables allow pagination, filtering, searching, sorting and exporting of either all or only the filtered data.
Uploaded data is visualized as an ideogram on the tab Ideogram. Using methods from the GenomicRanges package, annotated variants and copy number variations are converted into GRanges objects. These objects are passed to the ideogram plot function. Each chromosome is marked with a centromere region in gray. Positions of SNPs are denoted as blue lines inside the corresponding chromosome, and regions of copy number alterations are filled in green for gain, in red for loss, and in orange for loss of heterozygosity (see Figure 7).
Information about the underlying causes of mutations can be described and summarised by a mutational signature. As this can influence the choice of treatment drugs, we show overview plots and a signature contribution table in the tab Mutation Signature. The first plot in the tab is a spectrum plot, a stacked bar plot with the relative contributions of single nucleotide mutations (see Figure 8). The second bar plot displays the relative contributions of trinucleotide mutation counts of the patient's variants (see Figure 9). The last two objects on the tab are a table and a stacked bar plot with the signature contributions (>0), based on COSMIC Mutational Signatures (v3, May 2019) (see Figures 10 and 11).
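The colour scheme of the ideogram can be summarised in a small sketch. AMBAR builds GRanges objects and plots them with ggbio in R; the Python code below does not reproduce that and only groups positions and regions per chromosome with the colours described above. The function name, the tuple layout and the example coordinates are hypothetical.

```python
from collections import defaultdict

# Colours as described for the ideogram: SNPs in blue, CNV regions by type.
SNV_COLOR = "blue"
CNV_COLORS = {"gain": "green", "loss": "red", "loh": "orange"}


def build_ideogram_tracks(snvs, cnvs):
    """Group SNV positions and CNV regions by chromosome with their display colours."""
    tracks = defaultdict(lambda: {"markers": [], "regions": []})
    for chrom, pos in snvs:
        tracks[chrom]["markers"].append((pos, SNV_COLOR))
    for chrom, start, end, kind in cnvs:
        color = CNV_COLORS[kind.lower()]
        tracks[chrom]["regions"].append((start, end, color))
    for track in tracks.values():
        # Sort by genomic coordinate so that drawing order follows the chromosome.
        track["markers"].sort()
        track["regions"].sort()
    return dict(tracks)


# Hypothetical input: (chromosome, position) and (chromosome, start, end, type).
snvs = [("chr1", 300), ("chr1", 120), ("chr2", 50)]
cnvs = [("chr1", 100, 500, "gain"), ("chr2", 10, 90, "LOH"), ("chr2", 200, 400, "loss")]
tracks = build_ideogram_tracks(snvs, cnvs)
```

Keeping point markers and filled regions in separate lists mirrors the two kinds of annotation in the plot: lines for single positions and coloured blocks for copy number regions.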

Conclusions
We developed an interactive, highly customizable and configurable R shiny application, AMBAR, to support molecular tumor boards in analyzing patients' molecular data and to optimize personalized cancer treatment. AMBAR allows annotation, visualization and refinement of imported alterations and export of results in various formats. Integration into clinical information systems is possible, especially through adoption of the new FHIR standards for genetic variants. As AMBAR is a web-based application, it is platform-independent. In addition, modern web technologies make AMBAR responsive and reactive. Hence, displaying and sharing results on monitors or tablets is easy and can be helpful during interdisciplinary meetings of MTBs. As no external connection to web resources is necessary after the preparation steps, self-hosting AMBAR within a restricted and secure hospital network is possible.
Different issues have to be faced when personalized medicine approaches are implemented. First, when analyzing patients' alteration data, available databases and tools are often queried one by one. Moreover, the alteration report is frequently written by hand or filled out in standardized forms that then have to be digitalized. With AMBAR, we provide a tool that allows annotation and visualization of patients' NGS data together with clinical and molecular parameters, taking into account the information available in the databases. Moreover, results in AMBAR can be explored interactively and filtered further. Finally, alteration reports are exported and presented in a digitalized form. Altogether, we presented our R shiny application, AMBAR, which tackles significant issues encountered by molecular tumor boards when discussing patient data, with the goal of advancing precision medicine treatment approaches.