aiSourcePro: A Scalable Framework for Machine Learning–Based Source Attribution in Bacterial Genomics

Ben Pascoe (University of Oxford, UK)

12:20 - 12:30 Thursday 16 April Morning

Abstract

Machine learning (ML) and artificial intelligence (AI) are transforming the way microbiologists interpret complex genomic datasets, enabling new insights into pathogen ecology, evolution, and transmission. Trained AI models are increasingly important for attributing the source of human infections. Using genotypic data to identify the reservoir or origin of bacterial isolates is an essential step for tracking infection sources and guiding control strategies. Most advances have focused on Campylobacter, a pathogen with strong host–environment associations. Recent ML approaches, such as aiSource, trained on core-genome multilocus sequence typing (cgMLST) data, have surpassed the accuracy of classical probabilistic methods (e.g., iSource, STRUCTURE) and other ML frameworks using seven-locus MLST or k-mer data. However, extending these approaches to new organisms or to alternative prediction tasks (e.g., geography, disease type, or sampling time) remains a major challenge.

We present aiSourcePro, a Python library and web interface that simplifies ML-based source attribution and general genomic prediction tasks. The platform automates data cleaning, label harmonisation, and hyperparameter optimisation, while enabling reproducible model evaluation and export of all intermediate outputs. Distributed as a ready-to-use container for local or high-performance computing environments, aiSourcePro lowers technical barriers to experimentation. By unifying model training and deployment, it empowers researchers to explore diverse AI applications in microbiology, from outbreak tracing and pathogen surveillance to predictive modelling of microbial transmission.
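
As a rough illustration of the kind of workflow the abstract describes (data cleaning, label harmonisation, hyperparameter optimisation, and evaluation), the sketch below runs those steps on a toy set of allele profiles. It does not use or reproduce the aiSourcePro API, which is not described here. Every name in it (`LABEL_MAP`, `harmonise`, `clean`, `tune_k`) is hypothetical, the isolates are made-up toy values, and a k-nearest-neighbour vote over Hamming distance stands in for whatever models the platform actually trains.

```python
from collections import Counter

# Hypothetical synonym table: free-text host labels -> harmonised source classes.
LABEL_MAP = {
    "chicken": "poultry", "broiler": "poultry", "poultry": "poultry",
    "cattle": "ruminant", "cow": "ruminant", "sheep": "ruminant",
}

def harmonise(label):
    """Normalise case and whitespace, then map synonyms; None if unrecognised."""
    return LABEL_MAP.get(label.strip().lower())

def clean(records):
    """Keep isolates with a recognised label and a complete allele profile."""
    cleaned = []
    for profile, label in records:
        source = harmonise(label)
        if source is not None and all(allele is not None for allele in profile):
            cleaned.append((tuple(profile), source))
    return cleaned

def hamming(a, b):
    """Number of loci at which two allele profiles differ."""
    return sum(x != y for x, y in zip(a, b))

def predict(train, profile, k):
    """Majority source among the k training isolates closest to `profile`."""
    nearest = sorted(train, key=lambda rec: hamming(rec[0], profile))[:k]
    votes = Counter(source for _, source in nearest)
    return votes.most_common(1)[0][0]

def loo_accuracy(data, k):
    """Leave-one-out accuracy: predict each isolate from all the others."""
    hits = 0
    for i, (profile, source) in enumerate(data):
        train = data[:i] + data[i + 1:]
        hits += predict(train, profile, k) == source
    return hits / len(data)

def tune_k(data, candidates=(1, 3)):
    """Tiny grid search: the candidate k with the best leave-one-out accuracy."""
    return max(candidates, key=lambda k: loo_accuracy(data, k))

# Toy isolates (made up for illustration): (allele profile, raw source label).
records = [
    ([1, 1, 1, 1], "Chicken"),
    ([1, 1, 1, 2], "broiler "),
    ([1, 1, 2, 1], "poultry"),
    ([5, 5, 5, 5], "Cattle"),
    ([5, 5, 5, 6], "cow"),
    ([5, 5, 6, 5], "Sheep"),
    ([5, None, 5, 5], "cattle"),   # missing allele call: dropped by clean()
    ([9, 9, 9, 9], "unknown"),     # unrecognised label: dropped by clean()
]

data = clean(records)
best_k = tune_k(data)
print(len(data), best_k, predict(data, (1, 1, 2, 2), best_k))
```

Real cgMLST schemes cover on the order of a thousand or more loci and real pipelines would use a held-out test set rather than leave-one-out on the tuning data; the sketch only shows how the steps fit together.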
