Viruses indirectly shape much of the world's ecosystems by infecting microbial cells. Modern metagenomics provides the raw data to investigate viral dynamics and patterns of co-occurrence with microbial hosts, but extracting signal from these datasets at large scale remains challenging. Marker gene-based approaches such as SingleM have shown great promise for microbial data: they convert metagenomic datasets into community profiles by concentrating on reads that derive from conserved sections of prevalent genes.
This project will investigate how viral metagenome datasets can be profiled by extending SingleM to target marker genes encoded in viral genomes. The first step is to determine which genes are phylogenetically conserved across substantial numbers of viruses; the optimised workflows and cloud-run assets already developed for SingleM will then be applied to profile more than 200,000 public metagenomes.
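As a rough illustration of the first step, the sketch below ranks gene families by how many viral genomes encode them and keeps those above a prevalence cut-off. It is a toy example only, not part of SingleM: the function name, the input layout (genome name mapped to a set of gene family labels), the 0.75 threshold, and the genome and gene labels are all assumptions made for illustration, and a real analysis would also need to consider phylogenetic conservation and copy number.

```python
from collections import Counter


def candidate_markers(genome_to_families, min_prevalence=0.5):
    """Return {family: prevalence} for gene families found in at least
    `min_prevalence` (a fraction between 0 and 1) of the genomes.

    `genome_to_families` is a hypothetical input: genome name -> set of
    gene family labels annotated in that genome.
    """
    n_genomes = len(genome_to_families)
    if n_genomes == 0:
        return {}

    counts = Counter()
    for families in genome_to_families.values():
        # Count each family at most once per genome.
        counts.update(set(families))

    return {
        family: count / n_genomes
        for family, count in counts.items()
        if count / n_genomes >= min_prevalence
    }


# Toy data; the genome and gene family labels are illustrative only.
toy_genomes = {
    "virus_A": {"terminase_large", "major_capsid", "portal"},
    "virus_B": {"terminase_large", "major_capsid"},
    "virus_C": {"terminase_large", "integrase"},
    "virus_D": {"major_capsid", "portal"},
}

markers = candidate_markers(toy_genomes, min_prevalence=0.75)
print(sorted(markers))
```

In this toy set, only the two families present in three of the four genomes pass the cut-off.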
These viral community profiles can then be interpreted in the context of the microbial profiles of those same samples, providing the means to elucidate global patterns of viruses and their hosts through the application of bioinformatic and machine learning approaches.
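One simple way to picture this kind of comparison is a presence/absence co-occurrence score between each virus and each microbe across samples. The sketch below uses the Jaccard index as a stand-in measure; it is an assumption chosen for illustration, not the method the project will necessarily use, and the function names, sample identifiers, and taxon labels are hypothetical.

```python
def jaccard(samples_a, samples_b):
    """Jaccard index of two sets of sample identifiers."""
    union = samples_a | samples_b
    return len(samples_a & samples_b) / len(union) if union else 0.0


def cooccurrence(viral_presence, microbial_presence):
    """Score every (virus, microbe) pair by the overlap of the samples
    in which each was detected.

    Both arguments are hypothetical inputs: taxon name -> set of sample
    identifiers where that taxon appears in the community profile.
    """
    return {
        (virus, microbe): jaccard(virus_samples, microbe_samples)
        for virus, virus_samples in viral_presence.items()
        for microbe, microbe_samples in microbial_presence.items()
    }


# Toy profiles; all names are illustrative only.
viral = {
    "virus_X": {"s1", "s2", "s3"},
    "virus_Y": {"s4"},
}
microbial = {
    "microbe_P": {"s1", "s2", "s3", "s4"},
    "microbe_Q": {"s4", "s5"},
}

scores = cooccurrence(viral, microbial)
best_pair = max(scores, key=scores.get)
print(best_pair, scores[best_pair])
```

A high score only indicates that two taxa tend to appear in the same samples; it does not by itself establish a virus-host relationship.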
You may be eligible to apply for a research scholarship.
Contact the supervisor for more information.