Patterns in the Unknown

Lisa Crossman (University of East Anglia & SequenceAnalysis.co.uk, UK)

10:50 - 10:55 Wednesday 15 April Morning


Abstract

Metagenomics is the study of all microbiota members from a given habitat, where whole genome shotgun approaches sequence all the DNA extracted from samples of that habitat. Metagenomic studies often rely on taxonomic classification to describe microbial diversity; however, in some circumstances this approach can introduce significant bias. Alpha diversity (within-sample) and beta diversity (between-sample) measures often rely on taxonomic classification at the outset. Traditional taxonomic calls depend on reference databases that, although as comprehensive as possible, remain incomplete, unevenly annotated, and skewed towards particular taxa and geographical areas. As a result, our view of microbial life is still dominated by known knowns, while the unknown unknowns remain at the edge of our detection. These biases can obscure patterns among datasets and lead to incomplete views of community structure and function, particularly in highly complex and poorly characterised metagenomes.

Here we describe an alternative approach that focusses on measuring sequence diversity directly, without assigning reads to predefined taxa. This strategy reduces heavy reliance on reference databases and avoids assumptions about evolutionary relationships or taxonomic boundaries. We have implemented the strategy as a pipeline that supports less biased investigations of diversity across sample datasets and provides data visualisations.

By emphasising inherent sequence variation rather than taxa labels, these pipeline methods reveal patterns between communities and can better capture the breadth of microbial diversity present in complex samples.
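As a purely illustrative sketch of what taxonomy-free diversity measurement can look like, the Python below computes alpha and beta diversity from k-mer profiles of raw reads. The abstract does not state which sequence features or metrics the pipeline uses, so the choice of k-mers, Shannon entropy and Bray-Curtis dissimilarity, along with the function names `kmer_counts`, `shannon` and `bray_curtis`, are assumptions made for this example and not a description of the author's method.

```python
from collections import Counter
from math import log

DNA = set("ACGT")


def kmer_counts(reads, k=4):
    """Count k-mers across all reads, skipping any k-mer with a non-ACGT base."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= DNA:
                counts[kmer] += 1
    return counts


def shannon(counts):
    """Alpha diversity: Shannon entropy (natural log) of the k-mer distribution."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * log(c / total) for c in counts.values())


def bray_curtis(a, b):
    """Beta diversity: Bray-Curtis dissimilarity on relative k-mer frequencies."""
    total_a, total_b = sum(a.values()), sum(b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both samples need at least one valid k-mer")
    # Relative frequencies keep sequencing depth from dominating the comparison.
    shared = sum(min(a[kmer] / total_a, b[kmer] / total_b)
                 for kmer in a.keys() & b.keys())
    return 1.0 - shared


if __name__ == "__main__":
    # Toy reads standing in for two samples (hypothetical data).
    sample_a = kmer_counts(["ACGTACGTAC", "TTGACGTACG"])
    sample_b = kmer_counts(["GGGCCCGGGC", "ACGTACGTAC"])
    print("alpha A:", shannon(sample_a))
    print("alpha B:", shannon(sample_b))
    print("beta A vs B:", bray_curtis(sample_a, sample_b))
```

No reference database is consulted at any point: both measures are derived only from the sequences themselves, which is the property the abstract emphasises.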
