marcamil30 / infinity1.0

This is the official page for the InFinity 1.0 ligand-affinity centered protein re-engineering pipeline, as developed by the Imperial College London team for the 2022 iGEM competition. https://2022.igem.wiki/imperial-college-london/software

License: Creative Commons Attribution 4.0 International

Python 57.34% Shell 1.45% Jupyter Notebook 37.21% Roff 1.75% HTML 2.25%

InFinity1.0

Description

InFinity 1.0 is a protein reengineering framework that employs five in silico steps to conduct high-throughput mutagenesis, screening of sampled mutants for their affinity to a selected ligand, followed by ranking and multiple sequence alignment to identify potential motifs capable of altering ligand specificity/affinity. What sets this tool apart is its innovative use of several highly efficient recently released plug-and-play tools, which improve accessibility. With the exception of the first step, each of the stages in the current framework employs published or pre-published open-source tools, and therefore credit should be given to the respective authors. Preliminary testing has shown similar scoring power to established scoring functions, although we recognize that further improvements are necessary, particularly in terms of energy minimization and fold prediction following mutagenesis, if we are to directly screen the top mutants generated from this pipeline. Nevertheless, InFinity 1.0 serves as an additional tool in the synthetic biologist's toolkit, aiding in rational design and potentially narrowing down screening efforts.

To use the framework, users need access to a computing cluster with GPU capability and 1 TB of storage per 1,000,000 mutants to be screened. The current release is designed for PBS-type HPC schedulers, but it can easily be adapted for SLURM schedulers as well.
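As a rough planning aid, the storage figure above can be turned into a quick estimate. The sketch below assumes storage scales linearly with the number of mutants, and that a full library samples all 20 canonical amino acids at every chosen position; neither assumption is stated by the pipeline itself, and both function names are illustrative.

```python
TB_PER_MILLION_MUTANTS = 1.0  # figure quoted above: 1 TB per 1,000,000 mutants


def estimate_storage_tb(n_mutants):
    """Scale the quoted storage figure linearly (an assumption) with library size."""
    if n_mutants < 0:
        raise ValueError("n_mutants must be non-negative")
    return n_mutants * TB_PER_MILLION_MUTANTS / 1_000_000


def full_library_size(n_positions, alphabet_size=20):
    """Assumption: every mutated position samples all 20 canonical amino acids."""
    return alphabet_size ** n_positions


if __name__ == "__main__":
    for k in (2, 3, 4, 5):
        n = full_library_size(k)
        print(f"{k} positions -> {n} mutants -> ~{estimate_storage_tb(n):.2f} TB")
```

Under these assumptions the library grows by a factor of 20 per added position, so the storage requirement is dominated by the number of positions chosen rather than by sequence length.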

Figure 1: Overview of proposed framework for computational protein engineering. The framework will benefit from advances in structural modelling and molecular docking. Adapting these for use in computational protein engineering could allow for high-throughput screening of mutants, aiding in design and testing to be carried out in the lab.

Figure 2: Overview of the implemented InFinity 1.0 pipeline. The process begins with a CSV file (input_trial.csv) in which users enter the sequence and mutations of choice. For each stage, a script has been created to interface between the user and the incorporated tool. Aside from the initial sequence information and computational resource specification, the pipeline requires minimal user input to run.

Installation

First, clone InFinity to a directory on the computing cluster. Next, install MGLTools if it is not already on your system (MGLTools can be downloaded from the Center for Computational Structural Biology's webpage). MSMS can then be downloaded from the same webpage and installed as follows:

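# Assumes msms_i86_64Linux2_2.6.1.tar.gz has been downloaded into this directory
cd [USER_DIR]/InFinity/Scoring/Delta_LinF9_XGB/software/
mkdir msms
tar -zxvf msms_i86_64Linux2_2.6.1.tar.gz -C msms
cd msms
# Copy the platform-specific binary to the generic name "msms"
cp msms.x86_64Linux2.2.6.1 msms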

Next, install AlphaSpace2 as follows:

cd [USER_DIR]/InFinity/Scoring/Delta_LinF9_XGB/software/
tar -zxvf AlphaSpace2_2021.tar.gz
cd AlphaSpace2_2021
pip install -e ./

With this done, make sure the pipeline knows what to run and where. Edit InFinity/Docking/EquiBind/configs_clean/inference.yml, replacing [USER_DIR] in the highlighted filepaths with the filepath of InFinity on your cluster. Similarly, replace [USER_DIR] in the following files:

InFinity/Scoring/Delta_LinF9_XGB/script/runXGB.py
InFinity/Scoring/Delta_LinF9_XGB/script/calc_vina_features.py
InFinity/Scoring/Delta_LinF9_XGB/script/prepare_betaAtoms.py
InFinity/Scoring/Delta_LinF9_XGB/software/msms/pdb_to_xyzr
InFinity/Scoring/Delta_LinF9_XGB/script/calc_bridge_wat.py
InFinity/Scoring/Delta_LinF9_XGB/script/featureSASA.py
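These edits can also be scripted rather than made by hand. The following is a minimal sketch, assuming the placeholder appears literally as the text [USER_DIR] in each file and that every occurrence should be replaced; the function name fill_placeholder is hypothetical and not part of InFinity. Inspect the files first (or keep a backup), since the sketch overwrites them in place.

```python
from pathlib import Path

PLACEHOLDER = "[USER_DIR]"

# Files named in the installation notes, relative to the InFinity checkout.
FILES_TO_EDIT = [
    "Docking/EquiBind/configs_clean/inference.yml",
    "Scoring/Delta_LinF9_XGB/script/runXGB.py",
    "Scoring/Delta_LinF9_XGB/script/calc_vina_features.py",
    "Scoring/Delta_LinF9_XGB/script/prepare_betaAtoms.py",
    "Scoring/Delta_LinF9_XGB/software/msms/pdb_to_xyzr",
    "Scoring/Delta_LinF9_XGB/script/calc_bridge_wat.py",
    "Scoring/Delta_LinF9_XGB/script/featureSASA.py",
]


def fill_placeholder(infinity_root, user_dir, files=FILES_TO_EDIT):
    """Replace PLACEHOLDER with user_dir in each file.

    Returns a dict mapping each relative path to the number of replacements
    made, or None if the file was not found.
    """
    root = Path(infinity_root)
    counts = {}
    for rel in files:
        path = root / rel
        if not path.is_file():
            counts[rel] = None
            continue
        text = path.read_text()
        n = text.count(PLACEHOLDER)
        if n:
            path.write_text(text.replace(PLACEHOLDER, user_dir))
        counts[rel] = n
    return counts
```

For example, fill_placeholder("/home/me/InFinity", "/home/me") would substitute /home/me for the placeholder in each listed file and report how many occurrences were changed per file.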

The final installation step is to set up the conda environments. For compatibility reasons we advise using Python 3.7. Navigate to InFinity and create the two conda environments from the supplied environment files with conda env create -f environment_docking.yml and conda env create -f environment_scoring.yml.

Usage

Step 1: Combinatorial mutagenesis

  1. First navigate to the input_trial.csv file in the main folder. Insert the protein sequence in the SEQUENCE column (all in capital letters) and the positions to be mutated in the POSITION TO MUTATE column, separating each with a comma. Currently the software only accepts inputs with more than one position to mutate. An example is shown in Figure 3.

Figure 3: CSV file of InFinity 1.0 for user input. The CSV is divided into three sections. The two most important are the amino acid sequence and the positions you want to mutate in the sequence.

  2. Open modified_permutation.py and set the variable WT to your wild-type sequence as a comma-separated list of single residues, for example WT = ['A', 'A', 'A', 'A']. In the same file, pass your own .pdb file to cmd.load(), for example cmd.load('3b5r.pdb'), and place that .pdb file in the same directory as modified_permutation.py.

  3. Navigate to the InFinity directory and edit the array job argument in mutate.sh (e.g. #PBS -J 0-19) according to the computational resources available.

  4. Run combinatorial and structural mutagenesis:

    qsub -v file_dir="[USER_DIR]/InFinity",limit="1000000" mutate.sh
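To make the combinatorial step concrete, here is a minimal sketch of reading the input CSV and enumerating mutants. It is not the code in modified_permutation.py: it assumes positions are 1-based, that all 20 canonical amino acids are sampled at each position, that the comma-separated positions sit in a single quoted CSV cell, and that limit caps the number of combinations considered. The function names read_input and enumerate_mutants are hypothetical; only the column names SEQUENCE and POSITION TO MUTATE come from the description above.

```python
import csv
import io
from itertools import islice, product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues


def read_input(handle):
    """Return (sequence, positions) from the first data row of the CSV."""
    row = next(csv.DictReader(handle))
    sequence = row["SEQUENCE"].strip().upper()
    positions = [int(p) for p in row["POSITION TO MUTATE"].split(",") if p.strip()]
    return sequence, positions


def enumerate_mutants(sequence, positions, limit=None):
    """Yield mutant sequences, varying only the given 1-based positions.

    `limit` caps the number of residue combinations examined (assumption);
    the wild-type combination is skipped if it falls within that cap.
    """
    for pos in positions:
        if not 1 <= pos <= len(sequence):
            raise ValueError(f"position {pos} is outside the sequence")
    combos = product(AMINO_ACIDS, repeat=len(positions))
    for combo in islice(combos, limit):
        residues = list(sequence)
        for pos, aa in zip(positions, combo):
            residues[pos - 1] = aa
        mutant = "".join(residues)
        if mutant != sequence:  # skip the wild type itself
            yield mutant


if __name__ == "__main__":
    # Toy input standing in for input_trial.csv (sequence is illustrative only).
    demo = io.StringIO('SEQUENCE,POSITION TO MUTATE\nMKTAYIAK,"2,5"\n')
    seq, pos = read_input(demo)
    mutants = list(enumerate_mutants(seq, pos))
    print(len(mutants), mutants[:3])
```

Because the generator is lazy, a large library can be streamed to disk or to the structural mutagenesis stage without holding every sequence in memory.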
    

Step 2: Structural Mutagenesis

  1. In the framework's current implementation, structural mutagenesis is performed immediately after combinatorial mutagenesis, as part of the same script. The step is listed separately to allow for future improvements, where splitting the two jobs may be more appropriate.
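Since both mutagenesis stages run inside one array job, each array task needs its own share of the mutants. How mutate.sh actually divides the work is not described here; the sketch below shows one common approach, an even contiguous split keyed on the array index. The function name chunk_bounds is hypothetical. PBS_ARRAY_INDEX is the variable PBS Professional sets for each sub-job of a -J array.

```python
import os


def chunk_bounds(n_items, n_tasks, task_index):
    """Return the half-open [start, stop) slice owned by one array task.

    Items are split as evenly as possible; the first (n_items % n_tasks)
    tasks each receive one extra item.
    """
    if not 0 <= task_index < n_tasks:
        raise ValueError("task_index must be in range(n_tasks)")
    base, extra = divmod(n_items, n_tasks)
    start = task_index * base + min(task_index, extra)
    stop = start + base + (1 if task_index < extra else 0)
    return start, stop


if __name__ == "__main__":
    # Default to index 0 when run outside the scheduler.
    index = int(os.environ.get("PBS_ARRAY_INDEX", "0"))
    print(chunk_bounds(n_items=1_000_000, n_tasks=20, task_index=index))
```

With #PBS -J 0-19 and a limit of 1,000,000, this split would give each of the 20 tasks 50,000 mutants.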

Step 3: Docking

  1. Navigate to InFinity/Docking/EquiBind and run:
    qsub -v file_dir="[USER_DIR]/InFinity/Docking/EquiBind" runeq.sh
    

Step 4: Scoring

  1. Navigate to InFinity/Scoring/Delta_LinF9_XGB and run:

    qsub -v file_dir="[USER_DIR]/InFinity" move.sh
    
  2. Edit the #PBS -J 0-99 argument of scoring.sh according to the computational resources available.

  3. Perform scoring by running:

    qsub -v processors="100" scoring.sh
    

    Edit processors to suit the configuration

  4. The previous step generates n score CSV files, where n is the number of processors used to run the step. Concatenate them with cat scores*.csv > final_scores.csv, then rank them by the second column with sort -k2 -n -t, final_scores.csv (ascending numeric order).
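The concatenate-and-rank step can also be done in Python, which makes it easier to skip malformed rows. This is a sketch under assumptions: each score file is taken to be comma-separated with the score in the second column, and merge_and_rank is a hypothetical helper, not part of the pipeline. Unlike sort -n, it drops rows whose second column is not numeric instead of treating them as zero.

```python
import csv
import glob


def merge_and_rank(pattern="scores*.csv", out_path="final_scores.csv"):
    """Concatenate score files and sort rows by the numeric second column."""
    rows = []
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as handle:
            for row in csv.reader(handle):
                if len(row) < 2:
                    continue  # skip blank or malformed lines
                try:
                    score = float(row[1])
                except ValueError:
                    continue  # skip header rows or non-numeric scores
                rows.append((score, row))
    rows.sort(key=lambda item: item[0])  # ascending, like `sort -n`
    with open(out_path, "w", newline="") as handle:
        csv.writer(handle).writerows(row for _, row in rows)
    return [row for _, row in rows]
```

Whether the best candidates sit at the top or the bottom of the ascending list depends on the sign convention of the score, so check that before selecting mutants.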

Step 5: Multiple Sequence Alignment

  1. Using the scores generated in the previous step, mutants with a desired alteration in ligand affinity/specificity can be uploaded to and screened with an MSA tool such as Clustal Omega, with enriched motifs serving as a starting point for further rational design or screening analysis.
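MSA tools such as Clustal Omega accept FASTA input, so the selected mutants need to be written out in that format. The sketch below assumes you already have a ranked list of mutant names and a mapping from name to sequence; both inputs and the function names select_top and write_fasta are hypothetical, as the pipeline's own output layout is not described here.

```python
def select_top(ranked_names, sequences, n):
    """Pair the first n ranked names that have a known sequence with it."""
    known = [(name, sequences[name]) for name in ranked_names if name in sequences]
    return known[:n]


def write_fasta(records, out_path, width=60):
    """Write (name, sequence) pairs as FASTA, wrapping sequence lines at `width`."""
    with open(out_path, "w") as handle:
        for name, sequence in records:
            handle.write(f">{name}\n")
            for start in range(0, len(sequence), width):
                handle.write(sequence[start:start + width] + "\n")


if __name__ == "__main__":
    # Toy data for illustration only.
    ranked = ["mut_3", "mut_1", "mut_2"]
    seqs = {"mut_1": "MATAAIAK", "mut_2": "MATACIAK", "mut_3": "MATADIAK"}
    write_fasta(select_top(ranked, seqs, n=2), "top_mutants.fasta")
```

Including the wild-type sequence as an extra record makes enriched substitutions easier to spot in the resulting alignment.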

Contributions

Because the individual steps are interchangeable, we encourage community contributions to test other tools and help us continuously improve the framework.

Authors and acknowledgments

The framework was developed by Marc Amil and Rasmus Hildebrandt. With the exception of the tool performing combinatorial mutagenesis, the remaining steps rely heavily on adaptations of several published tools, and as such we thank the authors for making their work freely available:

Liu, Zhihai; Su, Minyi; Han, Li; Liu, Jie; Yang, Qifan; Li, Yan; Wang, Renxiao, 2017, ‘Forging the Basis for Developing Protein-Ligand Interaction Scoring Functions’, Accounts of Chemical Research, 50(2), pp. 302–309. Available at: http://www.pdbbind.org.cn/index.php

Schrödinger, L. & DeLano, W., 2020. PyMOL, Available at: http://www.pymol.org/pymol.

Stärk, H. et al., 2022, ‘EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction’. Available at: https://doi.org/10.48550/arXiv.2202.05146.

Yang, C. and Zhang, Y., 2022, ‘Delta Machine Learning to Improve Scoring-Ranking-Screening Performances of Protein–Ligand Scoring Functions’, Journal of Chemical Information and Modeling, 62(11), pp. 2696–2712. Available at: https://doi.org/10.1021/acs.jcim.2c00485.
