olivert1 / immunebuilder

This project forked from brennanaba/immunebuilder

Predict the structure of immune receptor proteins

License: BSD 3-Clause "New" or "Revised" License

Python 13.04% Jupyter Notebook 86.96%


ImmuneBuilder: Deep-Learning models for predicting the structures of immune proteins


Abstract

Immune receptor proteins play a key role in the immune system and have shown great promise as biotherapeutics. The structure of these proteins is critical for understanding what antigen they bind. Here, we present ImmuneBuilder, a set of deep learning models trained to accurately predict the structure of antibodies (ABodyBuilder2), nanobodies (NanoBodyBuilder2) and T-Cell receptors (TCRBuilder2). We show that ImmuneBuilder generates structures with state-of-the-art accuracy while being much faster than AlphaFold2. For example, on a benchmark of 34 recently solved antibodies, ABodyBuilder2 predicts CDR-H3 loops with an RMSD of 2.81Å, a 0.09Å improvement over AlphaFold-Multimer, while being over a hundred times faster. Similar results are also achieved for nanobodies (NanoBodyBuilder2 predicts CDR-H3 loops with an average RMSD of 2.89Å, a 0.55Å improvement over AlphaFold2) and TCRs. By predicting an ensemble of structures, ImmuneBuilder also gives an error estimate for every residue in its final prediction.

Colab

To try the method without installing it, you can use this Google Colab.

Install

Requirements

This package requires PyTorch. If you do not already have PyTorch installed, you can do so following these instructions.

It also requires OpenMM and pdbfixer for the refinement step. For details on how to install OpenMM please follow these instructions.
Alternatively, OpenMM and pdbfixer can be installed via conda using:

$ conda install -c conda-forge openmm pdbfixer

It also uses ANARCI for trimming and numbering sequences. We recommend installing ANARCI from here, but it can also be installed from bioconda (this package is maintained by a third party):

$ conda install -c bioconda anarci

Install ImmuneBuilder

Once you have all dependencies installed within one environment, you can install ImmuneBuilder from PyPI with:

$ pip install ImmuneBuilder
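
After installing, one way to confirm that every dependency can be imported from the same environment is a small check like the one below. This is a minimal sketch, not part of ImmuneBuilder: `check_imports` and `format_report` are hypothetical helper names, and the module names `torch`, `openmm`, `pdbfixer`, `anarci` and `ImmuneBuilder` are assumed to be the import names of the packages listed above (older OpenMM releases were imported as `simtk.openmm`).

```python
import importlib

# Assumed import names for the packages listed in the install section.
REQUIRED_MODULES = ("torch", "openmm", "pdbfixer", "anarci", "ImmuneBuilder")


def check_imports(module_names=REQUIRED_MODULES):
    """Try to import each module and return {name: error message or None}."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None  # imported cleanly
        except ImportError as error:
            # Covers missing packages and library problems raised as ImportError.
            results[name] = str(error)
    return results


def format_report(results):
    """Turn the result of check_imports into one readable line per module."""
    lines = []
    for name, error in results.items():
        status = "ok" if error is None else f"FAILED ({error})"
        lines.append(f"{name}: {status}")
    return "\n".join(lines)
```

Calling `print(format_report(check_imports()))` prints one line per module, so a missing or broken dependency shows up before any prediction is attempted.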

Usage

Antibody structure prediction

To predict an antibody structure using the Python API, you can do the following:

from ImmuneBuilder import ABodyBuilder2
predictor = ABodyBuilder2()

output_file = "my_antibody.pdb"
sequences = {
  'H': 'EVQLVESGGGVVQPGGSLRLSCAASGFTFNSYGMHWVRQAPGKGLEWVAFIRYDGGNKYYADSVKGRFTISRDNSKNTLYLQMKSLRAEDTAVYYCANLKDSRYSGSYYDYWGQGTLVTVS',
  'L': 'VIWMTQSPSSLSASVGDRVTITCQASQDIRFYLNWYQQKPGKAPKLLISDASNMETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQQYDNLPFTFGPGTKVDFK'}

antibody = predictor.predict(sequences)
antibody.save(output_file)

ABodyBuilder2 can also be used via the command line. To do this you can use:

ABodyBuilder2 --fasta_file my_antibody.fasta -v

You can get information about different options by using:

ABodyBuilder2 --help

I would recommend using the Python API if you intend to predict many structures, as you only have to load the models once.
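
The point about loading the models once can be sketched as a loop that reuses a single predictor for many inputs. This is an illustrative sketch only: `predict_many` and the job names are hypothetical, and it assumes nothing about the predictor beyond the `predict(sequences)` and `save(path)` calls shown in the example above.

```python
from pathlib import Path


def predict_many(predictor, jobs, out_dir):
    """Predict one structure per job with a single, already-loaded predictor.

    predictor: object with predict(sequences) returning something with
               save(path), as ABodyBuilder2() does in the example above.
    jobs:      mapping of a name (hypothetical, used for the file name) to a
               sequences dict such as {"H": ..., "L": ...}.
    out_dir:   directory that receives one <name>.pdb file per job.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, sequences in jobs.items():
        structure = predictor.predict(sequences)
        out_path = out_dir / f"{name}.pdb"
        structure.save(str(out_path))
        written.append(out_path)
    return written


# Usage sketch (requires ImmuneBuilder to be installed):
# from ImmuneBuilder import ABodyBuilder2
# predictor = ABodyBuilder2()  # models are loaded once, here
# predict_many(predictor, {"my_antibody": sequences}, "predictions")
```

Because the predictor is passed in rather than created inside the loop, the same helper would work with NanoBodyBuilder2 or TCRBuilder2 predictors.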

Happy antibodies!!

Nanobody structure prediction

The Python API for nanobodies is very similar to the one for antibodies.

from ImmuneBuilder import NanoBodyBuilder2
predictor = NanoBodyBuilder2()

output_file = "my_nanobody.pdb"
sequence = {'H': 'QVQLVESGGGLVQPGESLRLSCAASGSIFGIYAVHWFRMAPGKEREFTAGFGSHGSTNYAASVKGRFTMSRDNAKNTTYLQMNSLKPADTAVYYCHALIKNELGFLDYWGPGTQVTVSS'}

nanobody = predictor.predict(sequence)
nanobody.save(output_file)

And it can also be used from the command line:

NanoBodyBuilder2 --fasta_file my_nanobody.fasta -v

TCR structure prediction

The API is much the same for TCRs:

from ImmuneBuilder import TCRBuilder2
predictor = TCRBuilder2()

output_file = "my_tcr.pdb"
sequences = {
  "A": "AQSVTQLGSHVSVSEGALVLLRCNYSSSVPPYLFWYVQYPNQGLQLLLKYTSAATLVKGINGFEAEFKKSETSFHLTKPSAHMSDAAEYFCAVSEQDDKIIFGKGTRLHILP",
  "B": "ADVTQTPRNRITKTGKRIMLECSQTKGHDRMYWYRQDPGLGLRLIYYSFDVKDINKGEISDGYSVSRQAQAKFSLSLESAIPNQTALYFCATSDESYGYTFGSGTRLTVV"}

tcr = predictor.predict(sequences)
tcr.save(output_file)

And it can also be used from the command line:

TCRBuilder2 --fasta_file my_tcr.fasta -v
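
The three command-line tools are invoked the same way, so a script that calls them can build the command in one place. The following is a minimal sketch, assuming the executables are on `PATH` and using only the `--fasta_file` and `-v` options shown above; `build_command` and `run_builder` are hypothetical helper names, not part of ImmuneBuilder.

```python
import subprocess

# Command-line tools named in this README.
BUILDERS = ("ABodyBuilder2", "NanoBodyBuilder2", "TCRBuilder2")


def build_command(builder, fasta_file, verbose=True):
    """Return the argument list for one of the ImmuneBuilder command-line tools."""
    if builder not in BUILDERS:
        raise ValueError(f"unknown builder {builder!r}; expected one of {BUILDERS}")
    command = [builder, "--fasta_file", str(fasta_file)]
    if verbose:
        command.append("-v")
    return command


def run_builder(builder, fasta_file, verbose=True):
    """Run the tool and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_command(builder, fasta_file, verbose), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.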

Fasta formatting

If you wish to run the model on a sequence from a fasta file it must be formatted as follows:

>H
YOURHEAVYCHAINSEQUENCE
>L
YOURLIGHTCHAINSEQUENCE

If you are running it on TCRs the chain labels should be A for the alpha chain and B for the beta chain. On nanobodies the fasta file should only contain a heavy chain labelled H.
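
The labelling rules above can be checked before a file is passed to any of the tools. The sketch below is not ImmuneBuilder's own parser: `read_chains`, `check_labels` and the `kind` names ("antibody", "nanobody", "tcr") are hypothetical, and the only assumption taken from this README is which chain labels each model expects.

```python
# Chain labels each model expects, as described above.
EXPECTED_LABELS = {
    "antibody": {"H", "L"},
    "nanobody": {"H"},
    "tcr": {"A", "B"},
}


def read_chains(fasta_path):
    """Parse a FASTA file into {label: sequence}, joining wrapped sequence lines."""
    chains = {}
    label = None
    with open(fasta_path) as handle:
        for raw_line in handle:
            line = raw_line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                label = line[1:].strip()
                chains[label] = ""
            elif label is None:
                raise ValueError("sequence line found before any '>' header")
            else:
                chains[label] += line.upper()
    return chains


def check_labels(chains, kind):
    """Raise ValueError unless the chain labels match what `kind` expects."""
    expected = EXPECTED_LABELS[kind]
    if set(chains) != expected:
        raise ValueError(
            f"{kind} input needs chains {sorted(expected)}, got {sorted(chains)}"
        )
    return chains
```

The dictionary returned by `read_chains` has the same shape as the `sequences` dictionaries used in the Python API examples, so a checked file can also be fed to `predictor.predict`.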

Issues and Pull requests

Please submit issues and pull requests on this repo.

Known issues

  • Installing OpenMM from conda will automatically download the latest version of cudatoolkit, which may not be compatible with your device. For more information, please check out the following issue.
  • After following the install instructions, you may get an ImportError: `GLIBCXX_3.4.30' not found. This is an issue with OpenMM and can be solved by running conda install -c conda-forge libstdcxx-ng. See issue here.

Citing this work

The code and data in this package are based on the ImmuneBuilder paper. If you use it, please cite:

@article{Abanades2023,
	author = {Abanades, Brennan and Wong, Wing Ki and Boyles, Fergus and Georges, Guy and Bujotzek, Alexander and Deane, Charlotte M.},
	doi = {10.1038/s42003-023-04927-7},
	issn = {2399-3642},
	journal = {Communications Biology},
	number = {1},
	pages = {575},
	title = {ImmuneBuilder: Deep-Learning models for predicting the structures of immune proteins},
	volume = {6},
	year = {2023}
}

Contributors

brennanaba, fboyles, prihoda, jadolfbr, npqst
