Mutation signatures are the hallmarks of mutagenic processes in cancer and can provide clues about the biochemical mechanisms by which DNA is altered. Extracting such signatures from next-generation sequencing data has traditionally been formulated as an unsupervised learning problem and solved using non-negative matrix factorization. We present a novel approach to genomic data analysis based on convolutional filtering, inspired by techniques used in computer vision and image processing. We show that our approach achieves state-of-the-art performance relative to standard methods. It also generalizes to larger sequence contexts through deep layering of convolutional networks, providing a tool that could potentially reveal the impact of high-level genome structure on mutational density.
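For orientation, the traditional formulation mentioned above can be sketched in a few lines. This is a minimal illustration of non-negative matrix factorization with multiplicative updates, not the convolutional method presented here. The function name `nmf`, the 96-channel layout (a common convention for trinucleotide mutation contexts), and the synthetic counts are all assumptions made for the example.

```python
import numpy as np

def nmf(V, k, n_iter=500, seed=0, eps=1e-12):
    """Factor a non-negative matrix V (channels x samples) as W @ H.

    W (channels x k) holds candidate signatures, H (k x samples) holds
    per-sample exposures. Uses multiplicative updates that minimize the
    Frobenius reconstruction error. Hypothetical helper for illustration.
    """
    rng = np.random.default_rng(seed)
    n_channels, n_samples = V.shape
    W = rng.random((n_channels, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        # Each update keeps entries non-negative because it only
        # multiplies by ratios of non-negative quantities.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    # Rescale so each signature (column of W) sums to 1; the product
    # W @ H is unchanged because H absorbs the scale.
    scale = np.maximum(W.sum(axis=0), eps)
    return W / scale, H * scale[:, None]

# Synthetic example (assumed data): 96 mutation channels, 20 samples,
# generated from 3 hidden signatures.
rng = np.random.default_rng(1)
true_W = rng.random((96, 3))
true_H = rng.random((3, 20))
V = true_W @ true_H

W, H = nmf(V, k=3)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Each column of `W` can be read as a signature over mutation channels and each column of `H` as one sample's exposure to those signatures. The number of signatures `k` has to be chosen in advance in this formulation.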
Researchers: