This project forked from veghen/prorefiner

Code for ProRefiner: An Entropy-based Refining Strategy for Inverse Protein Folding with Global Graph Attention

ProRefiner: An Entropy-based Refining Strategy for Inverse Protein Folding with Global Graph Attention

This repository contains demo code for the paper ProRefiner: An Entropy-based Refining Strategy for Inverse Protein Folding with Global Graph Attention. You can also run the demo online through Colab or Code Ocean, which avoids local environment setup.

File structure

The ProRefiner implementation is in the folder model. The code provided by ProteinMPNN is in the folder ProteinMPNN. run.py contains the sequence design pipeline.

Environment setup

The program is written in Python. Install Python first, then run the following commands in a terminal to set up the environment. They install the latest versions of the packages.

pip install torch torchvision torchaudio
pip install biopython
pip install fairseq

We recommend running on Linux. The code has been tested with the latest versions of the above dependencies. Setup should complete within a few minutes.
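To confirm that the installation worked before running the demo, a small check like the one below can help. It is only a sketch and not part of the repository: the helper name missing_packages and the REQUIRED table are made up for illustration. The one detail worth knowing is that the biopython package is imported under the name Bio.

```python
import importlib.util

# pip package name -> import name. Only biopython differs ("Bio").
# This table is an illustrative assumption based on the pip commands above.
REQUIRED = {
    "torch": "torch",
    "torchvision": "torchvision",
    "torchaudio": "torchaudio",
    "biopython": "Bio",
    "fairseq": "fairseq",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import name cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

The check only looks the modules up without importing them, so it runs quickly even when torch is installed.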

Run protein sequence design

This demo performs sequence design with ProteinMPNN as the base model. Both full and partial sequence design are supported. Designing one protein takes a few seconds on a CPU.

Full sequence design

Run the following command to start the design.

python run.py PDB_CODE CHAIN

For example:

python run.py 8flh A

The program downloads the PDB file for the given PDB code and runs sequence design on the specified chain (only single-chain design is supported). Here is an example output of the above command:

Design 265 residues from 8flh chain A (ignore residues without coordinates)

native sequence:
YGSWEIDPKDLTFLKELGTGQFGVVKYGKWRGQYDVAIKMIKEGSMSEDEFIEEAKVMMNLSHEKLVQLYGVCTKQRPIFIITEYMANGCLLNYLREMRHRFQTQQLLEMCKDVCEAMEYLESKQFLHRDLAARNCLVNDQGVVKVSDFGLSRYVLDDEYTSSGSKFPVRWSPPEVLMYSKFSSKSDIWAFGVLMWEIYSLGKMPYERFTNSETAEHIAQGLRLYRPHLASEKVYTIMYSCWHEKADERPTFKILLSNILDVMDE

sequence by ProteinMPNN: (recovery: 43.774	nssr: 58.113)
LKPYEIDPKDLTIEEHLGTGGGGTVWKGLYKGKTPVAIKELKPGRFDEDALIAYMEEKMNIKHPNIVQLFGISSSGTPILKVKEYCAKGGLLAYLRDASRNLTPAQLLQLCIDIAKGMAYLESKNILHRDLKTGNCLVDENDVAKVADYGGILFVKDPEARTVGSKFPVRWSPLEVLENGDYSFASDVWSFGVTMYEIFSRGATPFAGMTDEEIRAYIAAGGTLTRPPLASPAMWAIADSCLARDPSDRPTFAEILAALEAEAAA

sequence by ProRefiner + ProteinMPNN: (recovery: 55.472	nssr: 71.321)
MGEWEINPKDLTFLEHLGTGALGVVYKGLYKGKKKVAVKELKEGAFDIESLIADSKVRMNLKHENLVQLYGICTSSSPILLVVEYMANGNLLDYLRDKSRNFSTEQLLQMCLDVAKAMAYLESKNELHRDLKSENCLVDENGVVKVSDYGLIRFVKNEEARTVGSKFPVRWSPPEVLENNDYSFKSDVWSFGVTMWEIFSLGATPFEDMSDEETAEWIRAGKTLTRPALASDAVWAILSSCLQRDASKRPTFAELLKQLREVQKK

Note that the program reports an error when an invalid chain code is provided. For example, the output of python run.py 8flh F is:

Chain F not found in 8flh (chains: ['A'])
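To see in advance which chains a structure contains, the chain IDs can be read directly from a PDB file. The sketch below is an illustration only and does not reproduce how run.py performs its check. The function names chains_in_pdb and check_chain are hypothetical, and the one-line record is hand-written rather than taken from 8flh. It relies on the fixed-column PDB format, where the chain identifier of an ATOM record sits in column 22.

```python
def chains_in_pdb(pdb_text):
    """Collect chain IDs from ATOM records, in order of first appearance.

    In the fixed-column PDB format the chain identifier is column 22,
    which is index 21 in a 0-based Python string.
    """
    chains = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and len(line) > 21:
            chain_id = line[21]
            if chain_id not in chains:
                chains.append(chain_id)
    return chains


def check_chain(pdb_code, chain, pdb_text):
    """Return an error message if the chain is absent, otherwise None."""
    chains = chains_in_pdb(pdb_text)
    if chain not in chains:
        return f"Chain {chain} not found in {pdb_code} (chains: {chains})"
    return None


# Hand-written, truncated ATOM record for illustration (chain "A" at column 22).
example = f"ATOM  {1:>5}  N   TYR A   1"
print(check_chain("8flh", "F", example))
# Chain F not found in 8flh (chains: ['A'])
```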

Partial sequence design

Run the following command for partial design, where the indexes of the residues to design are separated by commas (indexes start from 1, not 0).

python run.py PDB_CODE CHAIN INDEX1,INDEX2,INDEX3

Note that there are no spaces between indexes. For example, to design the first 10 residues of chain A, run:

python run.py 8flh A 1,2,3,4,5,6,7,8,9,10
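Typing a long index list by hand is error-prone, so it can be convenient to generate the argument. The sketch below is an assumption-labeled helper and not code from run.py: format_indexes and parse_indexes are hypothetical names, and the parsing only mirrors the rules stated above (comma-separated, no spaces, starting from 1).

```python
def format_indexes(start, end):
    """Build the comma-separated, space-free argument for residues start..end (1-based, inclusive)."""
    return ",".join(str(i) for i in range(start, end + 1))


def parse_indexes(arg):
    """Convert the argument to 0-based positions, rejecting spaces and indexes below 1."""
    if " " in arg:
        raise ValueError("indexes must be separated by commas without spaces")
    indexes = [int(token) for token in arg.split(",")]
    if any(i < 1 for i in indexes):
        raise ValueError("indexes start from 1, not 0")
    return [i - 1 for i in indexes]


print(format_indexes(1, 10))   # 1,2,3,4,5,6,7,8,9,10
print(parse_indexes("1,2,3"))  # [0, 1, 2]
```

The string printed by format_indexes(1, 10) is the third argument of the command for the first 10 residues.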

The program outputs sequences for the designable residues only. Example output for the above command:

Design 10 residues from 8flh chain A (ignore residues without coordinates)

native sequence:
YGSWEIDPKD

sequence by ProteinMPNN: (recovery: 40.000	nssr: 50.000)
LEPYEIDISD

sequence by ProRefiner: (recovery: 50.000	nssr: 90.000)
MGAWEVNPED
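The recovery values in the output above are consistent with the percentage of designed positions whose residue matches the native one. The sketch below assumes that definition and is not taken from run.py; the function name recovery is hypothetical. The nssr score is not covered here.

```python
def recovery(native, designed):
    """Percentage of positions where the designed residue equals the native residue."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(native, designed))
    return 100.0 * matches / len(native)


# Sequences copied from the partial design output above.
native = "YGSWEIDPKD"
print(f"{recovery(native, 'LEPYEIDISD'):.3f}")  # 40.000
print(f"{recovery(native, 'MGAWEVNPED'):.3f}")  # 50.000
```

Both results agree with the recovery figures printed for ProteinMPNN and ProRefiner in the 10-residue example.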
