martin-sicho / qsprpred

This project forked from cddleiden/qsprpred

A tool for creating Quantitative Structure Property Relationship (QSPR) models.

Home Page: https://cddleiden.github.io/QSPRpred/docs/

License: MIT License

Shell 0.11% Python 25.08% Jupyter Notebook 74.82%

QSPRpred

QSPRpred is an open-source software library for building Quantitative Structure Property Relationship (QSPR) models, developed by Gerard van Westen's Computational Drug Discovery group. It provides a unified interface for building QSPR models based on different types of descriptors and machine learning algorithms. We developed this package to support our research, recognizing the need to reduce repetition in our model building workflow and to improve the reproducibility and reusability of our models. In making this package available here, we hope that it may be of use to other researchers as well. QSPRpred is still in active development, and we welcome contributions and feedback from the community.

QSPRpred is designed to be modular and extensible, so that new functionality can be easily added. A command line interface is available for basic use cases, making it quick to explore different scenarios. For more advanced use cases, the Python API offers extra flexibility and control, allowing more complex workflows and additional features.

Internally, QSPRpred relies heavily on the RDKit and scikit-learn libraries. For saving and loading scikit-learn models, it uses ml2json, which provides safer and more interpretable model serialization. QSPRpred is also interoperable with Papyrus, a large-scale curated dataset aimed at bioactivity prediction, for data collection. Models developed with QSPRpred are compatible with the group's de novo drug design package DrugEx.

Quick Start

Installation

QSPRpred can be installed with pip (requires Python >= 3.10):

pip install git+https://github.com/CDDLeiden/QSPRpred.git@main

Note that this installs only the basic dependencies. To include the optional dependencies, install the package with an option:

pip install git+https://github.com/CDDLeiden/QSPRpred.git@main#egg=qsprpred[<option>]

The following options are available:

  • extra : include extra dependencies for PCM models and extra descriptor sets from packages other than RDKit
  • deep : include deep learning models (torch and chemprop)
  • pyboost : include the pyboost model (requires cupy; install it with pip install cupy-cudaX, replacing X with your CUDA version)
  • full : include all optional dependencies (requires cupy; install it with pip install cupy-cudaX, replacing X with your CUDA version)
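
Before installing, it can help to confirm that the interpreter meets the version requirement and to assemble the install target for the chosen option. The following is a minimal sketch using only the Python standard library. The helper names `python_ok` and `install_target` are hypothetical and not part of QSPRpred; the repository URL and option names are copied from the instructions above.

```python
import sys

# Repository URL and option names are taken from the installation notes above.
REPO = "git+https://github.com/CDDLeiden/QSPRpred.git@main"
OPTIONS = ("extra", "deep", "pyboost", "full")


def python_ok(version=sys.version_info):
    """Return True if the interpreter satisfies the stated Python >= 3.10 requirement."""
    return tuple(version[:2]) >= (3, 10)


def install_target(option=None):
    """Build the argument for `pip install` (hypothetical helper, not a QSPRpred API)."""
    if option is None:
        return REPO
    if option not in OPTIONS:
        raise ValueError(f"unknown option {option!r}; expected one of {OPTIONS}")
    return f"{REPO}#egg=qsprpred[{option}]"
```

For example, `install_target("deep")` returns the same string as the second pip command above with `<option>` replaced by `deep`. When pasting the result into a shell such as zsh, wrapping it in quotes keeps the square brackets from being treated as a glob pattern.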

Multiple Sequence Alignment Provider for Protein Descriptors

If you plan to use QSPRpred to calculate protein descriptors for PCM, make sure to also install Clustal Omega. You can get it via conda:

conda install -c bioconda clustalo

or install MAFFT instead:

conda install -c biocore mafft

This is needed to provide multiple sequence alignments for the PCM descriptors. At the moment, we do not support protein descriptor calculation for PCM on Windows.
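
To check whether an alignment tool is available before attempting PCM descriptor calculation, something like the following sketch may be useful. It assumes the conda packages above put executables named `clustalo` and `mafft` on the PATH; the function names are hypothetical and this is not how QSPRpred itself locates the tools.

```python
import platform
import shutil

# Executable names assumed to be provided by the conda packages above.
MSA_TOOLS = {"Clustal Omega": "clustalo", "MAFFT": "mafft"}


def find_msa_tools(which=shutil.which):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: which(exe) for name, exe in MSA_TOOLS.items()}


def msa_ready(system=None, which=shutil.which):
    """Return True if at least one tool is found and the OS is not Windows."""
    system = system or platform.system()
    if system == "Windows":
        # Protein descriptor calculation for PCM is not supported on Windows.
        return False
    return any(find_msa_tools(which).values())
```

Calling `msa_ready()` with no arguments checks the current machine; the `system` and `which` parameters exist only so the check can be exercised without the tools installed.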

Use

After installation, you will have access to various command line features, but you can also use the Python API directly (see Documentation). For a quick start, you can also check out the Jupyter notebook tutorials, which document the use of the Python API to build different types of models. One tutorial shows how a QSAR model can be trained; another shows how to use a QSAR model to predict the bioactivity of a set of molecules. The tutorials and the documentation are still a work in progress, and we welcome contributions wherever they are lacking.

To train the same QSAR model as in the tutorial from the command line, run the following from the tutorial folder:

python -m qsprpred.data_CLI -i ./data/parkinsons_pivot.tsv -o qspr/data -pr GABAAalpha -pr NMDA -r true -sp random -sf 0.15 -fe Morgan
python -m qsprpred.model_CLI -dp ./qspr/data/GABAAalpha_REGRESSION_df.pkl -o ./qspr/models -m PLS -o bayes -nt 5 -me -s
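
The two steps above can also be chained from a script so that model training only starts if data preparation succeeds. The sketch below copies the arguments verbatim from the commands above and uses only the standard library; `build_commands` and `run_pipeline` are hypothetical helper names, not part of QSPRpred.

```python
import shlex
import subprocess
import sys

# Arguments copied verbatim from the two commands above.
DATA_ARGS = (
    "-m qsprpred.data_CLI -i ./data/parkinsons_pivot.tsv -o qspr/data "
    "-pr GABAAalpha -pr NMDA -r true -sp random -sf 0.15 -fe Morgan"
)
MODEL_ARGS = (
    "-m qsprpred.model_CLI -dp ./qspr/data/GABAAalpha_REGRESSION_df.pkl "
    "-o ./qspr/models -m PLS -o bayes -nt 5 -me -s"
)


def build_commands(python=sys.executable):
    """Return both commands as argument lists, using the given interpreter."""
    return [[python, *shlex.split(args)] for args in (DATA_ARGS, MODEL_ARGS)]


def run_pipeline(commands, runner=subprocess.run):
    """Run the commands in order; check=True stops at the first failing step."""
    for cmd in commands:
        runner(cmd, check=True)
```

Calling `run_pipeline(build_commands())` from the tutorial folder would run data preparation followed by model training, raising `subprocess.CalledProcessError` if either step exits with a non-zero status.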

Contributions

Contributions and issue reports are more than welcome. Pull requests can be made directly to the main branch and we will transfer them to contrib when scheduled for the next release.

Workflow

[Figure: QSPRpred workflow diagram]

Current Development Team

Contributors

hellevdm, martin-sicho, david-araripe, sohviluukkonen, lindeschoenmaker, olivierbeq, gorostiolam
