A CRI-led paper published in PLOS ONE in August 2015 introduced the scientific community to ExScalibur, a new set of state-of-the-art pipelines for the bioinformatics analysis of Whole Exome Sequencing (WES) data.

WES analysis is a powerful tool for detecting the genetic variants, including insertions and deletions, that are responsible for many human diseases and disorders. In particular, the analysis of somatic mutations from tumor/normal pairs has been instrumental in the study of cancer genomics. ExScalibur is designed specifically to identify germline and somatic variants, bringing together the entire analysis workflow from raw reads to variant calling and annotation.

Previously existing software could detect these variants but was generally limited in scope: integrating the variants detected by different methods required deep software knowledge and programming skills, as well as access to a high-performance computing (HPC) environment. ExScalibur provides an unprecedented, fully automated framework for integrated analysis using multiple alignment and variant calling algorithms, usable by researchers who may not otherwise have access to the expertise and hardware needed to run these analyses. Its additional features include a high degree of scalability, fine control over parameters, real-time progress monitoring, robust and intuitive error handling and documentation, and a data analysis report utility offering interactive exploration and visualization of results.

ExScalibur is available as an open source project for implementation across platforms, as well as in a ready-to-use virtual image deployable on Amazon EC2, making this analysis available to researchers working without access to HPC resources. The release of this suite of pipelines represents an important contribution to the bioinformatics community: it improves access to WES analysis, simplifies the work involved, and supports the vital biomedical research that this analysis enables.

The ExScalibur paper was authored by CRI bioinformaticians Riyue Bao, Kyle Hernandez, and Lei Huang, scientific programmer Wenjun Kang, director of bioinformatics Jorge Andrade, and director Sam Volchenboum, as well as former CRI bioinformatician Elizabeth Bartom and Kenan Onel of the department of pediatrics.

The full paper is available to read at PLOS ONE: "ExScalibur: A High-Performance Cloud-Enabled Suite for Whole Exome Germline and Somatic Mutation Identification."