SPLDExtraTrees: Robust machine learning approach for predicting kinase inhibitor resistance

License: MIT License

Jupyter Notebook 98.84% Python 1.16%


This is the repository for the SPLDExtraTrees project.

Contact e-mail: [email protected]

Please 🌟star🌟 the repo if you like our work, thank you.

Abstract

Drug resistance is a major threat to global health and a significant concern throughout the clinical treatment of diseases and drug development. Mutations in proteins that are involved in drug binding are a common cause of adaptive drug resistance. Therefore, quantitative estimates of how mutations affect the interaction between a drug and its target protein are of vital significance for drug development and clinical practice. Computational methods that rely on molecular dynamics simulations, Rosetta protocols, and machine learning have been shown to be capable of predicting ligand affinity changes upon protein mutation. However, severely limited sample sizes and heavy noise lead to overfitting and poor generalization, which have impeded wide adoption of machine learning for studying drug resistance. In this paper, we propose a robust machine learning method, termed SPLDExtraTrees, which can accurately predict ligand binding affinity changes upon protein mutation and identify resistance-causing mutations. Specifically, the proposed method ranks training data following a scheme that starts with easy-to-learn samples and gradually incorporates harder and more diverse samples into training, iterating between sample weight recalculation and model updates. In addition, we calculate physics-based structural features to provide the machine learning model with valuable domain knowledge on proteins for this data-limited predictive task. The experiments substantiate the capability of the proposed method for predicting kinase inhibitor resistance under three scenarios, and the method achieves predictive accuracy comparable to that of molecular dynamics and Rosetta methods at much lower computational cost.
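
The training scheme described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the repository's implementation. It assumes the threshold rule commonly used in the self-paced learning with diversity literature (within each group, the sample with loss rank r is kept if its loss is below lam + gamma / (sqrt(r) + sqrt(r - 1))), a warm start on all samples, and geometric growth of the two pace parameters. The names spld_select, self_paced_fit, fit_weighted_mean, predict_mean, lam, gamma, growth and rounds are all hypothetical, and the grouping of samples is left to the caller.

```python
import math
from collections import defaultdict


def spld_select(losses, groups, lam, gamma):
    """Return 0/1 sample weights using a per-group, rank-dependent threshold.

    Assumed rule: in each group, sort samples by loss and keep the sample of
    rank r if loss < lam + gamma / (sqrt(r) + sqrt(r - 1)).
    """
    by_group = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)

    weights = [0] * len(losses)
    for members in by_group.values():
        members.sort(key=lambda i: losses[i])  # easiest samples first
        for rank, i in enumerate(members, start=1):
            threshold = lam + gamma / (math.sqrt(rank) + math.sqrt(rank - 1))
            if losses[i] < threshold:
                weights[i] = 1
    return weights


def self_paced_fit(X, y, groups, fit, predict,
                   lam=1.0, gamma=1.0, growth=1.5, rounds=5):
    """Alternate between sample weight recalculation and model updates.

    `fit(X, y, weights)` returns a model; `predict(model, x)` returns a float.
    """
    weights = [1] * len(y)          # assumption: warm start on all samples
    model = fit(X, y, weights)
    for _ in range(rounds):
        losses = [(predict(model, x) - t) ** 2 for x, t in zip(X, y)]
        new_weights = spld_select(losses, groups, lam, gamma)
        if sum(new_weights) > 0:    # keep the old model if nothing is selected
            weights = new_weights
            model = fit(X, y, weights)
        lam *= growth               # admit harder samples in later rounds
        gamma *= growth
    return model, weights


# Hypothetical stand-in model: predicts the weighted mean of the targets.
def fit_weighted_mean(X, y, weights):
    total = sum(weights)
    return sum(w * t for w, t in zip(weights, y)) / total


def predict_mean(model, x):
    return model
```

The stand-in model only keeps the sketch runnable without third-party packages. A tree ensemble such as scikit-learn's ExtraTreesRegressor, whose fit method accepts a sample_weight argument, could be plugged in through the fit and predict callbacks.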

If you find this code useful in your research then please cite:

@article{yang2021spldextratrees,
  title={SPLDExtraTrees: Robust machine learning approach for predicting kinase inhibitor resistance},
  author={Zi-Yi Yang and Zhao-Feng Ye and Yi-Jia Xiao and Chang-Yu Hsieh},
  year={2021},
}

Installation

Follow the steps in the order given to avoid conflicts.

  1. Clone the repository. The models.zip file must also be downloaded and unzipped.
git clone https://github.com/yangziyi1990/SPLDExtraTrees.git
  2. Create and activate the conda environment:
conda env create -f requirements_env.yml

conda activate SPLDExtraTrees

Usage

In this repo, we compare the proposed method, SPLDExtraTrees, with two other machine learning methods, ExtraTrees and SPLExtraTrees, in three scenarios. In the first scenario, we trained the methods on the Platinum dataset and tested them on the TKI dataset to evaluate each model's extrapolation capability. In the second scenario, a small part of the TKI dataset was used together with the Platinum dataset to train the models, and the rest of the TKI dataset was used for testing. In the third scenario, the methods were trained and tested on the TKI dataset only, so that we could evaluate each model's interpolation capability.
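
The three scenarios amount to three ways of building train and test sets from the two datasets. The sketch below shows one way to express them. It is illustrative only: the function names, the tki_train_fraction value, the random seed, and the use of k-fold splitting for the third scenario are assumptions, not details taken from the notebooks.

```python
import random


def scenario_1(platinum, tki):
    """Train on Platinum, test on all of TKI (extrapolation)."""
    return list(platinum), list(tki)


def scenario_2(platinum, tki, tki_train_fraction=0.1, seed=0):
    """Train on Platinum plus a small random part of TKI, test on the rest.

    The fraction and seed are hypothetical defaults.
    """
    rng = random.Random(seed)
    shuffled = list(tki)
    rng.shuffle(shuffled)
    n_train = max(1, int(len(shuffled) * tki_train_fraction))
    return list(platinum) + shuffled[:n_train], shuffled[n_train:]


def scenario_3(tki, n_folds=5, seed=0):
    """Train and test within TKI (interpolation), here via k-fold splits.

    Using k-fold cross-validation is an assumption of this sketch.
    """
    rng = random.Random(seed)
    shuffled = list(tki)
    rng.shuffle(shuffled)
    for k in range(n_folds):
        test = shuffled[k::n_folds]  # every n_folds-th sample, offset k
        train = [s for i, s in enumerate(shuffled) if i % n_folds != k]
        yield train, test
```

Each function accepts any sequence of samples, for example rows loaded from the files in Data, and returns plain lists, so the same splits can be fed to ExtraTrees, SPLExtraTrees and SPLDExtraTrees for a like-for-like comparison.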

We provide notebooks named S*.ipynb, which contain the analysis performed in the manuscript, for anyone interested in reproducing our results. The analysis was done in Python. Input data for the machine learning methods is provided in Data.

Results_State.ipynb plots the experimental versus calculated $\Delta\Delta G$ values as scatter plots.
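
Scatter plots of experimental versus calculated values are usually summarized with an error measure and a correlation coefficient. The sketch below computes the root-mean-square error and the Pearson correlation in plain Python. That the notebook reports exactly these two statistics is an assumption, and the function names are hypothetical.

```python
import math


def rmse(experimental, calculated):
    """Root-mean-square error between paired values."""
    n = len(experimental)
    squared = sum((e - c) ** 2 for e, c in zip(experimental, calculated))
    return math.sqrt(squared / n)


def pearson_r(experimental, calculated):
    """Pearson correlation coefficient between paired values."""
    n = len(experimental)
    mean_e = sum(experimental) / n
    mean_c = sum(calculated) / n
    cov = sum((e - mean_e) * (c - mean_c)
              for e, c in zip(experimental, calculated))
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    return cov / math.sqrt(var_e * var_c)
```

Both functions expect two equal-length sequences of floats, for example the measured and predicted $\Delta\Delta G$ values for the test set of one scenario. pearson_r is undefined when either sequence is constant, because the denominator is then zero.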
