Result: Leveraging Genomic and Phenotypic Data to Model AMR in Mycobacterium tuberculosis

Title:
Leveraging Genomic and Phenotypic Data to Model AMR in Mycobacterium tuberculosis
Publisher Information:
BioCARLA
Publication Year:
2024
Collection:
Zenodo
Document Type:
Academic journal text
Language:
English
DOI:
10.5281/zenodo.14999424
Rights:
Creative Commons Attribution 4.0 International ; cc-by-4.0 ; https://creativecommons.org/licenses/by/4.0/legalcode
Accession Number:
edsbas.C4D6D933
Database:
BASE

Further Information

Mycobacterium tuberculosis (MTB), the bacterium responsible for tuberculosis, is a major global public health threat, causing millions of infections and deaths each year. A significant challenge in treating MTB is its antimicrobial resistance (AMR), which allows the bacterium to survive exposure to many antibiotics, thereby reducing the number of effective treatment options. Moreover, effective and timely antibiotic treatment depends on accurate and rapid in silico predictions. Researchers are leveraging advances in high-throughput DNA sequencing to facilitate the rapid and precise identification and characterization of emerging pathogens. Additionally, integrating sequence data with machine learning methods can significantly improve the prediction of AMR profiles. In this work, we propose a computational framework to predict antibiotic resistance in MTB from genomic and phenotypic data, based on generalized linear models (GLMs) with continuous and/or categorical predictors. To this end, we downloaded whole-genome sequencing (WGS) data for 10,510 MTB isolates from different countries from the open-access GenBank database. Additionally, we obtained the corresponding lineage and phenotypic data from the CRyPTIC Consortium and the 100,000 Genomes Project. All descriptive information about the data, including attributes such as genome name, genome status, country name, and isolation source, was processed. Then, to detect genetic features associated with AMR, we annotated the resistance profiles using the Comprehensive Antibiotic Resistance Database. Phenotypic information was converted into a binary variable representing 'resistant' and 'susceptible'. Finally, we used GLMs to predict AMR in MTB. The GLM approach allowed us to model the relationship between the binary phenotypic outcome and the genomic features.
For this approach, we implemented the algorithms with scikit-learn in Python. Our results highlight mutations in ...
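
As an illustration of the modelling step described in the abstract, the following is a minimal sketch of a binomial GLM (logistic regression) fitted with scikit-learn. It is not the authors' pipeline. The data are synthetic, the feature matrix stands in for presence/absence of annotated resistance features, and the choice of logistic regression, the variable names, the effect sizes, and the train/test split are all assumptions made for this example; the abstract does not state which GLM family or link function was used.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT from the study): 400 isolates, 20 binary
# genomic features (e.g. presence/absence of an annotated variant).
n_isolates, n_features = 400, 20
X = rng.integers(0, 2, size=(n_isolates, n_features))

# Hypothetical ground truth: only features 0 and 1 influence resistance.
logit = -2.0 + 3.0 * X[:, 0] + 2.0 * X[:, 1]
prob_resistant = 1.0 / (1.0 + np.exp(-logit))
labels = np.where(rng.random(n_isolates) < prob_resistant,
                  "resistant", "susceptible")

# Convert the phenotype labels into a binary outcome (1 = resistant).
y = (labels == "resistant").astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Logistic regression is the GLM for a binary outcome with a logit link.
# scikit-learn applies L2 regularization by default.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Evaluate on held-out isolates and inspect the largest coefficients.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
ranked = np.argsort(-np.abs(model.coef_[0]))
print(f"held-out ROC AUC: {auc:.2f}")
print("features ranked by |coefficient| (top 5):", ranked[:5])
```

In a sketch like this, the fitted coefficients are log-odds ratios, so features with large positive coefficients are the ones the model associates with resistance. With real data, categorical predictors such as lineage or country would need to be encoded (for example, one-hot) before fitting.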