Our research is primarily concerned with the development of statistical and machine learning methodologies with a particular focus on applications in the biomedical sciences.
Tumour heterogeneity describes the genetic diversity both within tumours (intra-tumour heterogeneity) and between tumours (inter-tumour heterogeneity). Genetic differences within and between tumours give rise to different disease outcomes and patient responses to therapies. Understanding and characterising tumour heterogeneity is therefore important in developing clinical approaches that are tailored to a patient (individualised medicine).
Our group is interested in developing statistical methods to analyse genome sequencing data that comes from heterogeneous tumour samples using advanced machine learning.
Read more about our cancer research and media coverage of our work:
- Premalignant SOX2 overexpression in the fallopian tubes of ovarian cancer patients: Discovery and validation studies - EBioMedicine
- Two research discoveries offer hope for managing ovarian cancer - University of Oxford
- Ovarian cancer test on horizon as scientists find earliest signs of disease - The Telegraph
- Ovarian cancer can be detected BEFORE it becomes deadly: Scientists identify key enzyme which makes the disease spread - The Daily Mail
- Researchers identify how ovarian cancer hijacks natural cell process to survive - Ovarian Cancer Action
- Research finds cancer hijacks natural cell process to survive - University of Birmingham
Single Cell Sequencing
Advances in single cell technology now allow large-scale experimentation on single cells in a high-throughput fashion, providing new insight into cellular function. Heterogeneity, both biological and technical, confounds the simple interpretation of single cell data, and sophisticated statistical methods are required to handle different sources of noise and signal.
Our group is working on approaches using Bayesian hierarchical modelling to integrate information from multiple sources across different spatiotemporal scales in order to better understand cellular function and dynamics.
- ZIFA: Dimensionality reduction for zero-inflated single-cell gene expression analysis - Genome Biology
- Order Under Uncertainty: Robust Differential Expression Analysis Using Probabilistic Models for Pseudotime Inference - PLOS Computational Biology
- Uncovering pseudotemporal trajectories with covariates from single cell and bulk expression data - Nature Communications
Bayesian Statistical Machine Learning
Bayesian methods are now ubiquitous in statistical modelling applications across a wide range of disciplines. A major challenge for Bayesian approaches is the significant computation required for exact inference in large models applied to big datasets (terabyte scale or larger). For this type of data, Markov Chain Monte Carlo (MCMC) simulation approaches are not feasible despite recent advances in massively parallel computational hardware (e.g. graphics processing units or GPUs). Instead, it is necessary to develop approximate methods that are able to give “good” answers that, whilst not guaranteed to be exact or optimal, are sufficient for downstream decision processes and further scientific inquiry.
Our group is developing approximate inference methods to fit complex statistical models to large datasets that respect the practical computational and time limitations that govern real-life scientific studies. We are also interested in using decision-theoretic ideas to produce meaningful results for scientists via loss functions that are tailored to the task at hand.
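As a generic illustration of the decision-theoretic idea (a minimal sketch, not the group's own method), the snippet below picks the point summary that minimises Monte Carlo expected posterior loss. The gamma-distributed "posterior samples", the candidate grid, the 9:1 cost ratio, and the function names `bayes_action`, `squared_loss` and `asymmetric_loss` are all assumptions made for the example. It shows that changing the loss function changes the reported answer: squared loss recovers the posterior mean, whereas a loss that penalises underestimation more heavily pushes the answer towards an upper quantile.

```python
import numpy as np

def bayes_action(samples, loss, candidates):
    """Return the candidate that minimises Monte Carlo expected posterior loss."""
    # For each candidate action, average the loss over the posterior samples.
    expected_loss = np.array([loss(a, samples).mean() for a in candidates])
    return candidates[np.argmin(expected_loss)]

def squared_loss(a, theta):
    # Symmetric loss; its Bayes action is the posterior mean.
    return (a - theta) ** 2

def asymmetric_loss(a, theta, under_cost=9.0, over_cost=1.0):
    # Hypothetical task-specific loss: underestimating theta costs 9x more
    # than overestimating it. Its Bayes action is the
    # under_cost / (under_cost + over_cost) = 0.9 posterior quantile.
    return np.where(theta > a, under_cost * (theta - a), over_cost * (a - theta))

# Stand-in posterior samples (an assumption); in practice these would come
# from an exact or approximate inference procedure.
rng = np.random.default_rng(0)
posterior_samples = rng.gamma(shape=2.0, scale=1.0, size=20_000)

# Candidate actions on a grid with spacing 0.01.
candidates = np.linspace(0.0, 10.0, 1001)

a_sq = bayes_action(posterior_samples, squared_loss, candidates)
a_asym = bayes_action(posterior_samples, asymmetric_loss, candidates)

print("squared loss action:   ", a_sq)
print("asymmetric loss action:", a_asym)
```

The grid search is used only to keep the sketch transparent; for these two losses the minimisers are available in closed form as the sample mean and the 0.9 sample quantile.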
Our recent work in this area has featured in the leading statistical and machine learning journals and conferences:
- Probabilistic Boolean Tensor Decomposition - International Conference on Machine Learning
- Testing and learning on distributions with symmetric noise invariance - Advances in Neural Information Processing Systems
- The Hamming Ball Sampler - The Journal of the American Statistical Association
- Hamming Ball Auxiliary Sampling for Factorial Hidden Markov Models - Advances in Neural Information Processing Systems (NIPS)
- Statistical Inference in Hidden Markov Models Using k-Segment Constraints - The Journal of the American Statistical Association