colbyford / pfhrp_mlmodel

Modeling Plasmodium falciparum Diagnostic Test Sensitivity using Machine Learning with Histidine-Rich Protein 2 Variants

Home Page: https://www.frontiersin.org/articles/10.3389/fitd.2021.707313

R 1.03% Python 2.82% PostScript 95.74% Jupyter Notebook 0.41%
malaria malaria-detection machine-learning model-explanation plasmodium-falciparum bioinformatics azure-machine-learning azure-ml

Modeling Plasmodium falciparum Diagnostic Test Sensitivity using Machine Learning with Histidine-Rich Protein 2 Variants

Colby T. Ford, Gezahegn Solomon Alemayehu, Kayla Blackburn, Karen Lopez,
Cheikh Cambel Dieng, Eugenia Lo, Lemu Golassa, and Daniel Janies

Abstract

Malaria, predominantly caused by Plasmodium falciparum, poses one of the largest and most durable health threats in the world. Previously, simplistic regression-based models have been created to characterize malaria rapid diagnostic test performance, though these models often include only a couple of genetic factors. Specifically, the Baker et al., 2005 model uses two types of particular repeats in histidine-rich protein 2 (PfHRP2) to describe a P. falciparum infection, though the efficacy of this model has waned in recent years due to genetic mutations in the parasite.

In this work, we use a dataset of 100 P. falciparum PfHRP2 genetic sequences collected in Ethiopia and derive a larger set of motif repeat matches for use in generating a series of diagnostic machine learning models. Here we show that the use of additional and different motif repeats in more sophisticated machine learning methods proves effective in characterizing PfHRP2 diversity. Furthermore, we use machine learning model explainability methods to highlight which of the repeat types are most important with regard to rapid diagnostic test sensitivity, thereby showcasing a novel methodology for identifying potential targets for future versions of rapid diagnostic tests.

Important Supplementary Data

  • Model metrics for all trained models are in the /models folder. Note: The top-performing models' .pkl files are also available there.
  • PfHRP2 sample sequences, motif matches, and metadata are available in the pfHRP2_withMeta.csv file.
  • The histidine-based motif repeat finder is provided in the H_motif_finder.R R script.
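The motif repeat matching itself is done by the R script listed above. As a rough illustration of the idea only, the Python sketch below counts occurrences of two repeat motifs in an amino acid sequence. The motif strings are the commonly cited Type 2 and Type 7 repeats and are an assumption here; the function name `count_motifs`, the `MOTIFS` dictionary, and the example sequence are hypothetical and are not taken from the repository.

```python
import re

# Assumed, illustrative motif set; H_motif_finder.R may use a different
# or larger set of histidine-based motifs.
MOTIFS = {
    "type_2": "AHHAHHAAD",
    "type_7": "AHHAAD",
}


def count_motifs(sequence, motifs=MOTIFS):
    """Count non-overlapping substring matches of each motif in a sequence."""
    seq = sequence.upper()
    return {
        name: len(re.findall(re.escape(motif), seq))
        for name, motif in motifs.items()
    }


# Made-up sequence: type 2, type 7, type 2, type 7 concatenated.
example = "AHHAHHAADAHHAADAHHAHHAADAHHAAD"
counts = count_motifs(example)
print(counts)
```

Each motif is counted independently, so a shorter motif that is contained in a longer one is counted inside it as well: in the example, `AHHAAD` is matched four times because it is also the tail of each `AHHAHHAAD`. Whether such nested matches should be counted is a design choice, and this sketch does not claim to reproduce the behavior of the R script.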

Paper and Citation

Frontiers in Tropical Diseases: frontiersin.org/articles/10.3389/fitd.2021.707313

@article {Ford2021,
	author = {Ford, Colby T. and Alemayehu, Gezahegn Solomon and Blackburn, Kayla and Lopez, Karen and Dieng, Cheikh Cambel and Lo, Eugenia and Golassa, Lemu and Janies, Daniel},
	title = {Modeling Plasmodium falciparum Diagnostic Test Sensitivity using Machine Learning with Histidine-Rich Protein 2 Variants},
	publisher = {Frontiers},
	journal = {Frontiers in Tropical Diseases},
	volume = {2},
	pages = {28},
	month = {October},
	year = {2021},
	paper = {707313},
	doi = {10.3389/fitd.2021.707313},
	url = {https://www.frontiersin.org/article/10.3389/fitd.2021.707313},
	issn = {2673-7515}
}
