borgwardtlab / maldi_amr

Code for the paper "Antimicrobial resistance prediction in clinical isolates through machine learning on MALDI-TOF mass spectra"

License: BSD 3-Clause "New" or "Revised" License

Direct Antimicrobial Resistance Prediction from MALDI-TOF mass spectra profile in clinical isolates through Machine Learning

This code accompanies the paper “Direct Antimicrobial Resistance Prediction from MALDI-TOF mass spectra profile in clinical isolates through Machine Learning” by Caroline Weis et al.

This repository is a work in progress. See below for some details about how to reproduce some of the figures of our preprint and stay tuned for more information!

Installation

It is recommended to use poetry to install and interact with the code provided in this repository. This ensures that all required dependencies are installed correctly. If you have installed poetry (using your local package manager or the installation instructions on its official website), the following commands are sufficient to install everything:

poetry install
poetry shell

Example: plotting E. coli AMR prediction results

To reproduce a part of Figure 3 in the paper (AUROC and AUPRC curves for antimicrobial resistance prediction using logistic regression), it is sufficient to issue the following commands:

poetry shell # Not necessary if you are already in the virtual environment
python plot_fig4_curves_per_species_and_antibiotic_2panels.py

Afterwards, the output file fig4.png will be created, which reproduces the E. coli panel of Figure 3 in the paper:

E. coli AMR prediction results

You can also call the script with the --help option, i.e. python plot_fig4_curves_per_species_and_antibiotic_2panels.py --help in order to see which other options are available.
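
For readers unfamiliar with the two metrics named above, here is a minimal, dependency-free sketch of how AUROC and AUPRC can be computed from labels and scores. It is not the code used in this repository: the function names `auroc` and `average_precision` and the toy inputs are assumptions made for illustration, and average precision is only one common estimator of the area under the precision-recall curve.

```python
def auroc(y_true, y_score):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Average precision: mean of the precision at each positive hit.

    Simplification: assumes there are no tied scores.
    """
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Toy example (hypothetical values): 1 = resistant, 0 = susceptible.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(f"AUROC: {auroc(y_true, y_score):.4f}")
print(f"AUPRC: {average_precision(y_true, y_score):.4f}")
```

On the toy input, three of the four positive/negative pairs are ordered correctly, so the AUROC is 0.75. AUPRC is sensitive to class imbalance, which is one reason to read both metrics together.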

Example: creating performance tables

To get a glimpse of the performance of AMR prediction in certain scenarios, use the script collect_results.py. Instead of implementing its own matching procedure, the script relies on your shell's glob expansion to select the result files. For example, to analyse the results of all trained classifiers for E. coli, use the following commands:

poetry shell # Not necessary if you are already in the virtual environment
python collect_results.py ../results/fig4_curves_per_species_and_antibiotics/*/*Escherichia*

This will result in the following output:

                                                        accuracy       auprc      auroc
                                                            mean   std  mean  std  mean  std
species          antibiotic                  model
Escherichia coli Amoxicillin-Clavulanic acid lightgbm      77.06  0.82 43.83 1.83 67.02 1.41
                                             lr            75.93  0.76 40.96 2.86 65.81 1.41
                                             rf            75.71  0.15 41.13 1.29 66.27 1.76
                                             svm-linear    54.09 10.54 30.84 1.63 56.93 2.08
                                             svm-rbf       62.48 13.59 39.91 1.84 64.23 1.63
                 Cefepime                    lightgbm      88.99  0.63 69.85 2.90 88.17 1.47
                                             lr            87.54  0.82 63.18 3.07 85.59 1.22
                                             rf            84.91  0.41 66.99 2.65 86.92 1.75
                                             svm-linear    71.46 19.17 47.35 4.90 76.04 3.49
                                             svm-rbf       58.30 35.40 64.24 1.94 85.24 1.51
                 Ceftriaxone                 lightgbm      88.42  0.84 79.41 2.13 89.55 1.36
                                             lr            86.65  0.74 74.38 2.20 87.36 1.26
                                             rf            84.21  0.85 77.01 2.24 87.63 1.52
                                             svm-linear    77.61  2.04 61.04 3.26 79.17 2.31
                                             svm-rbf       83.68  1.70 74.83 2.03 86.81 1.43
                 Ciprofloxacin               lightgbm      82.20  1.02 77.61 1.59 85.32 0.94
                                             lr            79.56  1.14 70.58 2.14 81.00 1.27
                                             rf            77.67  0.74 75.65 1.93 84.25 1.54
                                             svm-linear    67.32  2.13 55.95 3.62 71.40 3.01
                                             svm-rbf       56.58 23.14 67.63 2.98 79.60 2.23
                 Piperacillin-Tazobactam     lightgbm      92.59  0.40 21.12 2.78 71.54 3.90
                                             lr            92.75  0.20 22.01 3.77 71.18 3.54
                                             rf            92.71  0.00 18.46 1.62 69.83 3.10
                                             svm-linear    87.55  0.88 16.41 3.69 66.77 3.32
                                             svm-rbf       38.76 41.48 26.42 5.10 70.77 3.81
                 Tobramycin                  lightgbm      87.10  0.70 35.21 3.78 75.05 2.90
                                             lr            87.13  0.64 32.68 3.87 73.14 2.97
                                             rf            86.99  0.00 35.25 3.34 74.12 2.93
                                             svm-linear    73.80  2.95 23.47 3.11 65.06 3.01
                                             svm-rbf       66.29 28.26 33.33 3.66 71.09 2.89

Feel free to experiment with other settings and other scenarios. The script is quite 'smart' and supports different reporting types out of the box.
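
The table above has the shape of a grouped summary: one row per (species, antibiotic, model), with a mean and a standard deviation per metric. A minimal sketch of such an aggregation, using only the standard library, might look as follows. The record layout, the field names, the `summarise` helper, and the placeholder scores are all hypothetical; they do not reflect how collect_results.py stores or reads its inputs.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical per-run records with placeholder scores (not results from the paper).
runs = [
    {"species": "Escherichia coli", "antibiotic": "Ceftriaxone", "model": "lr",
     "auroc": 0.80, "auprc": 0.60},
    {"species": "Escherichia coli", "antibiotic": "Ceftriaxone", "model": "lr",
     "auroc": 0.90, "auprc": 0.70},
    {"species": "Escherichia coli", "antibiotic": "Ceftriaxone", "model": "rf",
     "auroc": 0.75, "auprc": 0.50},
]


def summarise(runs, metrics=("auroc", "auprc")):
    """Group runs by (species, antibiotic, model) and report mean/std in percent."""
    groups = defaultdict(list)
    for run in runs:
        groups[(run["species"], run["antibiotic"], run["model"])].append(run)

    table = {}
    for key, members in sorted(groups.items()):
        row = {}
        for metric in metrics:
            values = [100.0 * member[metric] for member in members]
            # Sample standard deviation; undefined for a single run, so report 0.0.
            spread = stdev(values) if len(values) > 1 else 0.0
            row[metric] = (mean(values), spread)
        table[key] = row
    return table


for (species, antibiotic, model), row in summarise(runs).items():
    cells = "  ".join(f"{m}: {avg:.2f} +/- {sd:.2f}" for m, (avg, sd) in row.items())
    print(f"{species} | {antibiotic} | {model} | {cells}")
```

Reporting the spread next to the mean matters here: in the table above, several svm-rbf rows have accuracy standard deviations above 20 points, which a mean alone would hide.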

Contact

This code is developed and maintained by members of the Machine Learning and Computational Biology Lab of Prof. Dr. Karsten Borgwardt and the Applied Microbiology Lab of Prof. Dr. Adrian Egli.

maldi_amr's People

Contributors

acuenod111, cvweis, pseudomanifold

maldi_amr's Issues

is there a quick start manual?

First of all, thank you for this large and detailed body of work. I was excited to read about your research findings in the paper.

When I tried to run the code, I ran into some difficulties.

How do I run the code that corresponds to the paper? Is there a quick start manual?

There is a lot of code, and I don't know how to run it.

For example, I can't find plot_fig4_curves_per_species_and_antibiotic_2panels.py.

Issue with histogram function

Not really an issue, but I noticed you are using my uniplot library and added a comment that something was not working as you expected.

uniplot.histogram(
    np.max(y_score, axis=1),
    bins=10,
    # The histogram function has some issues with plotting
    # everything properly under certain circumstances.
    x_min=0.49,
    x_max=0.99,
    title='Prediction Probabilities'
)

I was just curious what that problem might have been. I'd be interested in potentially improving the histogram function.
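
For context on what the quoted call plots, here is a rough standard-library sketch of the binning involved. It assumes that `y_score` holds per-class probabilities from a two-class classifier, in which case the row-wise maximum is always at least 0.5. The `bin_counts` helper, the sample probabilities, and the choice to drop values outside the range are all assumptions for illustration and say nothing about how uniplot itself behaves.

```python
def bin_counts(values, x_min, x_max, bins):
    """Count values per equal-width bin on [x_min, x_max].

    Values outside the range are dropped (a choice made for this sketch).
    """
    width = (x_max - x_min) / bins
    counts = [0] * bins
    for value in values:
        if not (x_min <= value <= x_max):
            continue
        # Clamp so that value == x_max lands in the last bin.
        index = min(int((value - x_min) / width), bins - 1)
        counts[index] += 1
    return counts


# Hypothetical two-class probabilities, one row per sample.
y_score = [[0.9, 0.1], [0.45, 0.55], [0.2, 0.8], [0.5, 0.5], [0.02, 0.98]]

# Pure-Python equivalent of np.max(y_score, axis=1): the winning class probability.
confidence = [max(row) for row in y_score]

counts = bin_counts(confidence, x_min=0.49, x_max=0.99, bins=10)
for i, count in enumerate(counts):
    left = 0.49 + i * 0.05
    print(f"{left:.2f}-{left + 0.05:.2f} | {'#' * count}")
```

With these limits, any sample whose winning probability exceeds 0.99 falls outside the plotted range.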
