
bjornwallner / dockq


DockQ is a single continuous quality measure for protein docked models based on the CAPRI evaluation protocol

License: MIT License

Python 92.24% Shell 3.70% Cython 4.06%

dockq's People

Contributors

bjornwallner, clami66, nemo8130

dockq's Issues

Chain mismatch, KeyError: 'A'

Hi, I ran into a "chain mismatch" problem with almost all of the complexes I tested.
I've been stuck for days and would very much appreciate it if anyone could help me out.

After I ran ./DockQ.py model/7P79.pdb native/7P79.pdb, it printed:

7P79.pdb
chain mismatch A B H C
chain mismatch A B H C
Traceback (most recent call last):
File "./DockQ.py", line 732, in <module>
main()
File "./DockQ.py", line 660, in main
info=calc_DockQ(model,native,use_CA_only=use_CA_only,capri_peptide=capri_peptide) #False):
File "./DockQ.py", line 234, in calc_DockQ
if key in chain_res[chain]: # if key is present in sample
KeyError: 'A'

Any tips for fixing this?
Thank you so much.
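
Before running DockQ, a quick way to look into a mismatch like this is to list the chain IDs that each file actually contains and compare them. The sketch below reads the chain ID from column 22 of ATOM/HETATM records, as defined by the fixed-column PDB format. `chains_in_pdb` is a hypothetical helper written for this note, not part of DockQ, and the two records are made-up stand-ins for a real file.

```python
def chains_in_pdb(lines):
    """Return chain IDs in order of first appearance, read from ATOM/HETATM records."""
    seen = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chain = line[21]  # PDB format: chain ID sits in column 22 (index 21)
            if chain not in seen:
                seen.append(chain)
    return seen


# Made-up records; with real files, pass open("model/7P79.pdb") instead.
example = [
    "ATOM".ljust(21) + "H" + "   1",
    "ATOM".ljust(21) + "C" + "   1",
]
print(chains_in_pdb(example))
```

Running it on the model and on the native file shows whether the two share the same chain letters, which is what the "chain mismatch A B H C" message appears to be complaining about.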

biopython version

I have already installed Biopython 1.79 with pip.
But when I launch DockQ.py, it says:
Biopython version (1.59) is too old need at least >=1.61
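
One possible explanation, not confirmed in this thread, is that the Python interpreter that runs DockQ.py is not the one pip installed Biopython 1.79 into. The sketch below prints which interpreter is running and which version of a distribution that interpreter can see. `installed_version` and `as_tuple` are hypothetical helper names, and `importlib.metadata` needs Python 3.8 or newer.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Version of a distribution as seen by THIS interpreter, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def as_tuple(version):
    """'1.79' -> (1, 79); non-numeric parts are ignored."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


print(sys.executable)                  # which Python is running this code
print(installed_version("biopython"))  # None if this Python cannot see Biopython
print(as_tuple("1.59") >= as_tuple("1.61"))  # False: 1.59 is older than 1.61
```

Running the same lines with the interpreter named in the first line of DockQ.py and with the one behind `pip` should show whether they disagree.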

fnat error

When I try to run my native and model pdb files in my terminal, I am getting an error relating to the absence of an object called fnat in the larger DockQ directory. How do I go about correcting this? I tried moving the contents of the src folder out into the primary DockQ directory and attempted to compile fnat.c, but this failed.

Multichain functionality in readme

I tried to run the README example for multichain functionality and got this:

Traceback (most recent call last):
File "./DockQ.py", line 731, in <module>
main()
File "./DockQ.py", line 568, in main
native=make_two_chain_pdb_perm(native,nat_group1,nat_group2)
File "./DockQ.py", line 452, in make_two_chain_pdb_perm
exec_path=os.path.dirname(Path.abspath(sys.argv[0]))
AttributeError: type object 'Path' has no attribute 'abspath'
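
The traceback says that the name `Path` in scope is a class with no `abspath` attribute. Assuming it is `pathlib.Path` (the import is not shown in this thread), `abspath` lives in `os.path` instead. The sketch below shows two standard-library ways to get the script directory. The function names are hypothetical and this is not a patch taken from the DockQ repository.

```python
import os
import sys
from pathlib import Path


def script_dir_os(argv0):
    """Directory containing the script, using os.path only."""
    return os.path.dirname(os.path.abspath(argv0))


def script_dir_pathlib(argv0):
    """Same idea with pathlib; absolute() does not resolve symlinks or '..'."""
    return str(Path(argv0).absolute().parent)


print(hasattr(Path, "abspath"))    # False: pathlib.Path has no abspath
print(script_dir_os(sys.argv[0]))
```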

Readme:./DockQ.py <model> <native>

I want to use DockQ to score complexes predicted by AlphaFold-Multimer. However, I have no idea which file is the "model" and which one is the "native".

inconsistent results

Thanks for updating DockQ and making it more user-friendly.
But I got different results when computing the DockQ score of the example structure.

Running this in a Python script:
from DockQ.DockQ import load_PDB, run_on_all_native_interfaces
model = load_PDB("examples/1A2K_r_l_b.model.pdb")
native = load_PDB("examples/1A2K_r_l_b.pdb")
chain_map = {"A":"A", "B":"B"}
run_on_all_native_interfaces(model, native, chain_map=chain_map)[0]
I got:

{('A', 'B'): {'DockQ_F1': 0.4408235819178569, 'DockQ': 0.4350515761458511, 'F1': 0.38095238095238093, 'irms': 1.3939127448327777, 'Lrms': 10.304622346268628, 'fnat': 0.36363636363636365, 'nat_correct': 4, 'nat_total': 11, 'fnonnat': 0.6, 'nonnat_count': 6, 'model_total': 10, 'clashes': 1, 'len1': 198, 'len2': 106, 'class1': 'receptor', 'class2': 'ligand', 'chain1': 'A', 'chain2': 'B', 'chain_map': {'A': 'A', 'B': 'B'}}}

But if I run
DockQ 1A2K_r_l_b.model.pdb 1A2K_r_l_b.pdb
in the terminal, I get:
Model : 1A2K_r_l_b.model.pdb
Native : 1A2K_r_l_b.pdb
Total DockQ over 3 native interfaces: 0.653 with BAC:ABC model:native mapping
Native chains: A, B
Model chains: B, A
DockQ: 0.994
irms: 0.000
Lrms: 0.000
fnat: 0.983
fnonnat: 0.008
clashes: 0.000
F1: 0.987
DockQ_F1: 0.996

DockQ scores are quite different. Do you know why?
Thanks.
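
Judging only from the output quoted above, the command line settled on a `BAC:ABC` model:native mapping and paired native chains A, B with model chains B, A, whereas the script pinned `{"A": "A", "B": "B"}`. The two runs may therefore not be scoring the same chain pairing. The sketch below turns such a mapping string into a dictionary. Treating keys as native chains and values as model chains is an assumption to check against the DockQ README, and `mapping_to_chain_map` is a hypothetical helper, not a DockQ function.

```python
def mapping_to_chain_map(mapping):
    """Convert 'MODEL:NATIVE' chain strings, e.g. 'BAC:ABC', to {native: model}."""
    model_chains, native_chains = mapping.split(":")
    if len(model_chains) != len(native_chains):
        raise ValueError("model and native sides must list the same number of chains")
    return dict(zip(native_chains, model_chains))


print(mapping_to_chain_map("BAC:ABC"))  # {'A': 'B', 'B': 'A', 'C': 'C'}
```

Passing a map built this way as `chain_map` would be one way to test whether the chain pairing explains the gap between 0.435 and 0.994.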

Bug in aligning model to native (cython branch)

In the cython branch, an instance of Align.PairwiseAligner is created and configured, but not used:

DockQ/src/DockQ/DockQ.py

Lines 305 to 310 in 01f9703

aligner = Align.PairwiseAligner()
aligner.match = 5
aligner.mismatch = 0
aligner.open_gap_score = -10
aligner.extend_gap_score = -0.5
aln = Align.PairwiseAligner().align(model_sequence, native_sequence)[0]

Also, extracting the alignment is not accurate here:

DockQ/src/DockQ/DockQ.py

Lines 311 to 312 in 01f9703

alignment["seqA"] = aln.format().split("\n")[0] #aln.seqA
alignment["seqB"] = aln.format().split("\n")[2] #aln.seqB

and should be:

alignment['seqA'] = aln[0, :]
alignment['seqB'] = aln[1, :]
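
A minimal sketch of the first point: keep one configured aligner and call `align` on that same instance instead of on a freshly constructed default one. The scores are copied from the snippet quoted above. `match_score` and `mismatch_score` are the attribute names documented by Biopython, and `make_aligner` and `align_sequences` are hypothetical helper names. The import is guarded only so that the sketch can be loaded without Biopython installed.

```python
try:
    from Bio import Align  # Biopython; optional for this sketch
except ImportError:
    Align = None


def make_aligner():
    """Build one configured aligner; scores copied from the quoted snippet."""
    aligner = Align.PairwiseAligner()
    aligner.match_score = 5
    aligner.mismatch_score = 0
    aligner.open_gap_score = -10
    aligner.extend_gap_score = -0.5
    return aligner


def align_sequences(model_sequence, native_sequence):
    """Align with the configured instance rather than a fresh default one."""
    aligner = make_aligner()
    return aligner.align(model_sequence, native_sequence)[0]


if Align is not None:
    aln = align_sequences("ACDEFG", "ACDEFG")
    print(aln.score)  # six matches at 5 points each
```

With identical sequences, the score only reflects the match setting if the configured instance is the one doing the alignment, which makes it a cheap regression check.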

Running Readme Example

I'm following the instructions in the README: I cloned the repo, moved into its directory, and ran the make command. When I then try to run the provided example, an error comes up.
The example command:
bash ./DockQ.py examples/model.pdb examples/native.pdb
The error that shows up:
import-im6.q16: attempt to perform an operation not allowed by the security policy 'PS' @ error/constitute.c/IsCoderAuthorized/408.
from: can't read /var/mail/Bio
./DockQ.py: line 6: syntax error near unexpected token 'ignore','
./DockQ.py: line 6: 'warnings.simplefilter('ignore', BiopythonWarning)'
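
Messages such as `import-im6.q16` and `from: can't read /var/mail/Bio` are what a shell prints when it tries to execute Python `import` and `from` lines as shell commands. That is consistent with launching the script through `bash` rather than through Python. The sketch below checks whether a script starts with a Python `#!` line and runs it with the current interpreter. Both helper names are hypothetical, and the demo writes a throwaway script rather than touching DockQ.py.

```python
import os
import subprocess
import sys
import tempfile


def has_python_shebang(path):
    """True if the file's first line is a '#!' line that mentions python."""
    with open(path, "rb") as handle:
        first = handle.readline()
    return first.startswith(b"#!") and b"python" in first


def run_with_python(script, *args):
    """Run a script with the interpreter executing this code, not via a shell."""
    return subprocess.run([sys.executable, script, *args],
                          capture_output=True, text=True)


# Demo on a temporary stand-in script.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.py")
    with open(demo, "w") as handle:
        handle.write("#!/usr/bin/env python\nprint('ok')\n")
    demo_has_shebang = has_python_shebang(demo)
    demo_output = run_with_python(demo).stdout.strip()

print(demo_has_shebang, demo_output)
```

On the command line, the equivalent is to call `python ./DockQ.py ...` or `./DockQ.py ...` instead of prefixing the command with `bash`.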

np.dot(ligand_atoms_sample, rot) error.

Hello,
Most of my cases run smoothly, but when I tried 'DockQ B.cif A.pdb', it gave me this error:
rotated_sample_atoms = np.dot(ligand_atoms_sample, rot) + tran
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: shapes (0,) and (3,3) not aligned: 0 (dim 0) != 3 (dim 0)

case.zip

Do you know why? Thanks.

Trying to run Readme example, issues with make command

When I try to run the make command, I get this:
cc -O3 -funroll-loops -Isrc/ -c src/molecule.c -lm
process_begin: CreateProcess(NULL, cc -O3 -funroll-loops -Isrc/ -c src/molecule.c -lm, ...) failed.
make (e=2): The system cannot find the file specified.
make: *** [Makefile:14: molecule.o] Error 2
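
The `CreateProcess(NULL, cc ...) failed` line together with "The system cannot find the file specified" indicates that make could not launch a program called `cc`. The sketch below checks whether a C compiler is reachable on PATH. The candidate names are assumptions for illustration, and `find_compiler` is a hypothetical helper.

```python
import shutil


def find_compiler(candidates=("cc", "gcc", "clang")):
    """Return (name, path) for the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return name, path
    return None


print(find_compiler())
```

If this prints `None`, the Makefile has no compiler to call, and a C toolchain would need to be installed or added to PATH before `make` can succeed.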

PermissionError: [WinError 32]

When I try to follow the example steps to check that the installation is OK, the text file 'renumber_pdb' just pops up on my screen.

I'm not sure why this is popping up and why the program isn't running. Any help would be appreciated.
(I'm trying to run the software on Windows, in the Anaconda terminal.)

[cython branch] Value error when using -perm1 -perm2 flags

When using the cython branch, after running

DockQ.py <model> <native> -native_chain1 A B -perm1 -perm2

I get an error (see below). Note that it works fine running
DockQ.py <model> <native>
or
DockQ.py <model> <native> -native_chain1 A -model_chain1 A

Best
Samuel

The error:

Traceback (most recent call last):
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/DockQ.py", line 965, in <module>
main()
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/DockQ.py", line 851, in main
test_info = run_on_groups(
^^^^^^^^^^^^^^
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/DockQ.py", line 635, in run_on_groups
info = calc_DockQ(
^^^^^^^^^^^
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/DockQ.py", line 139, in calc_DockQ
ref_res_distances = get_residue_distances(
^^^^^^^^^^^^^^^^^^^^^^
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/DockQ.py", line 533, in get_residue_distances
model_res_distances = residue_distances(
^^^^^^^^^^^^^^^^^^
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/operations_nocy.py", line 29, in residue_distances
atom_distances = get_distances_across_chains(atom_coordinates1, atom_coordinates2)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/operations_nocy.py", line 6, in get_distances_across_chains
distances = ((model_A_atoms[:, None] - model_B_atoms[None, :]) ** 2).sum(-1)
~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
ValueError: operands could not be broadcast together with shapes (4712,1,3) (1,0)
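
The shape `(1, 0)` in the last line is what `array[None, :]` produces when `array` has shape `(0,)`, i.e. a one-dimensional empty array. That suggests, though the thread does not confirm it, that no atom coordinates were collected for the second group. The sketch below applies NumPy's broadcasting rule to plain shape tuples in pure Python, so it runs without NumPy. `broadcast_shape` is a hypothetical helper.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Apply NumPy's broadcasting rule to two shapes; return None if incompatible."""
    result = []
    # Compare dimensions from the right; missing leading dimensions count as 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            return None
    return tuple(reversed(result))


print(broadcast_shape((4712, 1, 3), (1, 0)))     # None: the shapes from the traceback
print(broadcast_shape((4712, 1, 3), (1, 0, 3)))  # (4712, 0, 3): empty array of shape (0, 3)
```

An empty coordinate array with shape `(0, 3)` would broadcast and yield an empty distance matrix, so checking that both chain groups contain atoms before computing distances is one way to turn this failure into a clearer error message.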

Pip Install Issue

Why does the code report an error like this:

Traceback (most recent call last):
  File "a.py", line 1, in <module>
    from DockQ.DockQ import load_PDB, run_on_all_native_interfaces
ModuleNotFoundError: No module named 'DockQ'

This happens after I ran pip install .
How can I fix it?
Thanks for your response.
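
A first diagnostic step is to ask the interpreter that runs `a.py` where, if anywhere, it finds the package. The sketch below uses only the standard library. `where_is` is a hypothetical helper, and the package name `DockQ` is taken from the import line in the traceback.

```python
import importlib.util
import sys


def where_is(module_name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


print(sys.executable)     # interpreter running this script
print(where_is("DockQ"))  # None means this interpreter cannot see the package
```

If the second line prints `None`, comparing `sys.executable` with the interpreter behind the `pip` command that was used (for example by running `python -m pip install .` with the same `python`) is a reasonable next check.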

I got KeyError: 'A'

First I tried:

./DockQ.py examples/1a14_pred.pdb examples/1a14.pdb
Multi-chain model need sets of chains to group
use -native_chain1 and/or -model_chain1 if you want a different mapping than 1-1
Model chains  : ['A', 'H']
Native chains : ['N', 'H', 'L', 'A']

Then I tried this:

./DockQ.py examples/1a14_pred.pdb examples/1a14.pdb -native_chain1 A H -model_chain1 A H

Traceback (most recent call last):
  File "/home/fkt/Downloads/abdockgen/DockQ/./DockQ.py", line 732, in <module>
    main()    
    ^^^^^^
  File "/home/fkt/Downloads/abdockgen/DockQ/./DockQ.py", line 569, in main
    native=make_two_chain_pdb_perm(native,nat_group1,nat_group2)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fkt/Downloads/abdockgen/DockQ/./DockQ.py", line 447, in make_two_chain_pdb_perm
    f.write(change_chain(pdb_chains[c],"A"))
                         ~~~~~~~~~~^^^
KeyError: 'A'

Please help.

Chain size limitations?

I am having trouble running DockQ with moderately large homo-dimers. Is it a known issue that the tools here fail when there are many residues in a chain?

I ran DockQ successfully for most of the models and references in a given benchmark set but the largest files failed.
The smallest file where I could observe a failure was when comparing the attached (4u59_2_files.zip) 4u59_2_model.pdb with 4u59_2.pdb (i.e. simple call ./DockQ.py 4u59_2_model.pdb 4u59_2.pdb).

Here the model covers more than the reference and so ./DockQ.py 4u59_2.pdb 4u59_2.pdb works (3076 residues in 4u59_2) while ./DockQ.py 4u59_2_model.pdb 4u59_2_model.pdb fails (3294 residues in 4u59_2_model).

The traceback of the error looks as follows when run with Python 3:

Traceback (most recent call last):
  File ".../DockQ.py", line 730, in <module>
    main()    
  File ".../DockQ.py", line 658, in main
    info=calc_DockQ(model,native,use_CA_only=use_CA_only,capri_peptide=capri_peptide) #False):
  File ".../DockQ.py", line 112, in calc_DockQ
    fnat_out = os.popen(cmd_fnat).read()
  File ".../python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xef in position 852: invalid continuation byte

and as follows with Python 2:

Traceback (most recent call last):
  File "../DockQ.py", line 730, in <module>
    main()    
  File "../DockQ.py", line 658, in main
    info=calc_DockQ(model,native,use_CA_only=use_CA_only,capri_peptide=capri_peptide) #False):
  File "../DockQ.py", line 118, in calc_DockQ
    assert fnat!=-1, "Error running cmd: %s\n" % (cmd_fnat)
AssertionError: Error running cmd: .../fnat 4u59_2_model.pdb 4u59_2_model.pdb 5 -all

The latter error indicates an issue in the fnat binary, which indeed produces wrong-looking characters before segfaulting. Here are the last few lines of the output of fnat 4u59_2_model.pdb 4u59_2_model.pdb 5:

NATIVE: 25259?b 1629C 0.107644
Fnat 85805 13756 6.237642
Fnonnat -72049 13756 -5.237642
Segmentation fault

As an additional note, I observed plenty of compile-time warnings when compiling with GCC 10.3.0; it may be worth checking them, as they could indicate overflows or similar problems.

The specific files do not matter and I could reproduce the same failures when downloading moderately large homo-dimers from the PDB (e.g. https://files.rcsb.org/download/6EQO.pdb).

Given that large complexes and multi-domain proteins are interesting and challenging prediction problems, it would be good to fix the issue described here so that DockQ can be applied to benchmarks for such problems.
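
The Python 3 traceback above fails while decoding the helper's output, which hides the underlying crash. The sketch below captures a child process's output as bytes, decodes it with replacement characters, and returns the exit code so that a crash is reported as such. The child here is a stand-in Python one-liner that emits an invalid UTF-8 sequence, not the fnat binary, and `run_and_decode` is a hypothetical helper.

```python
import subprocess
import sys


def run_and_decode(cmd):
    """Run a command, returning (returncode, text) with undecodable bytes replaced."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    return proc.returncode, proc.stdout.decode("utf-8", errors="replace")


# Stand-in child: byte 0xef followed by 'A' is not valid UTF-8.
child = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(b'Fnat \\xef\\x41\\n')"]
code, text = run_and_decode(child)
print(code, repr(text))
```

On POSIX systems `subprocess` reports a process killed by a signal with a negative return code, so a segfault in a helper binary would surface there rather than as a `UnicodeDecodeError`.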

DockQ error

Hi, I'm using DockQ on several protein-peptide complexes obtained from a redocking procedure. On some complexes it works fine, while on others it gives me the following error:

            UserWarning: WARNING: It looks like cython is not working,
                     falling back on native python. This will make DockQ slower
              warnings.warn(
            Traceback (most recent call last):
            File "C:\mypath\Python\Python312\Lib\site-packages\Bio\File.py", line 72, in as_handle
              with open(handleish, mode, **kwargs) as fp:
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
          TypeError: expected str, bytes or os.PathLike object, not TextIOWrapper
          
          During handling of the above exception, another exception occurred:

          Traceback (most recent call last):
            File "C:\mypath\DockQ\DockQ.py", line 585, in load_PDB
              structure = pdb_parser.get_structure(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^
            File "C:\mypath\DockQ\parsers.py", line 251, in get_structure
              self._parse(lines, chains)
            File "C:\mypath\DockQ\parsers.py", line 264, in _parse
              self.trailer = self._parse_coordinates(coords_trailer, chains)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            File "C:\mypath\DockQ\parsers.py", line 401, in _parse_coordinates
              except PDBConstructionException as message:
                     ^^^^^^^^^^^^^^^^^^^^^^^^
          NameError: name 'PDBConstructionException' is not defined. Did you mean: 'PDBConstructionWarning'?
          
          During handling of the above exception, another exception occurred:
          
          Traceback (most recent call last):
            File "C:\mypath\DockQ\DockQ.py", line 936, in <module>
              main()
            File "C:\mypath\DockQ\DockQ.py", line 762, in main
              model_structure = load_PDB(args.model, chains=model_chains)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            File "C:\mypath\DockQ\DockQ.py", line 593, in load_PDB
              structure = pdb_parser.get_structure(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^
            File "C:\mypath\DockQ\parsers.py", line 25, in get_structure
              self._build_structure(structure_id, chains)
            File "C:\mypath\DockQ\parsers.py", line 38, in _build_structure
              atom_serial_list = mmcif_dict["_atom_site.id"]
                                 ~~~~~~~~~~^^^^^^^^^^^^^^^^^
          KeyError: '_atom_site.id'

I think the problem is in the PDB file, but I can't find it, since the file was generated in the same way as other files that do not raise this error. Do you have any ideas or suggestions?

The command that I gave in the Windows command prompt is: python DockQ.py --capri_peptide model.pdb reference.pdb

Many thanks.

Andrea

Receptor as the smaller molecule

I've noticed that the receptor is chosen as the molecule with the larger number of residues. Is there a way to bypass this behavior?

Thanks in advance for your help, congratulations on this useful software!

DockQ.py incorrectly calculating iRMS

I’m attempting to get the DOCKQ score of a model of CAPRI target #47, from the score_set dataset. The model is named Target47_0064.pdb and the correct crystal structure is named Target47_3u4e.pdb. However, the DOCKQ script seems to be getting the interface RMSD wrong. Both files are attached here (with the extension .txt added, as GitHub doesn't allow pdb files to be uploaded):

Target47_0064.pdb.txt
Target47_3u4e.pdb.txt

In the DockQ folder, I’ve run

scripts/fix_numbering.pl /path/to/Target47_0064.pdb /path/to/Target47_3u4e.pdb

then

python3 DockQ.py /path/to/Target47_0064.pdb.fixed /path/to/Target47_3u4e.pdb

which produces the following output (truncated to just show Fnat through DockQ):

Fnat 0.833 45 correct of 54 native contacts
Fnonnat 0.297 19 non-native of 64 model contacts
iRMS 4.576
LRMS 1.827
DockQ 0.629 

However when I calculate the three components of DockQ myself (using protein structure tools from the C++ library Mosaist), I find:

Fnat: 0.846
iRMS: 1.019
lRMS: 1.819
DockQ: 0.829

The slight differences in Fnat and lRMS aren’t very concerning to me (I assume they come down to some slight difference in atom-matching between the structures), but the iRMS is significantly off. Examining the structure visually, it seems like the iRMS should be around 1 Å, so I think this probably comes down to a bug in the DOCKQ script where it sometimes gets iRMS wrong. This happens reproducibly with a number of other models for Target 47 as well, all having larger iRMS values than they should, lowering their DockQ scores considerably (usually a difference of ~0.2). Here's a visual of the model and crystal structure aligned in pymol, with the crystal in yellows, model in greens, and different binding partners in either lighter or darker shades. My apologies if I'm wrong, but it appears the iRMS should be much lower than 4.5 Å, and an iRMS of around 1.0 Å would be reasonable. I figured I should bring the discrepancy here, in case it represents some buggy edge case.

[Screenshot: model and crystal structure aligned in PyMOL]
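
To see how much of the score gap comes from the iRMS term alone, the two sets of numbers above can be put through the DockQ formula, which averages Fnat with scaled iRMS and LRMS terms using scaling constants of 1.5 Å and 8.5 Å. `dockq_score` is a hypothetical helper written for this note, not code from DockQ.py.

```python
def dockq_score(fnat, irms, lrms, d_irms=1.5, d_lrms=8.5):
    """Mean of fnat and the two scaled RMS terms 1 / (1 + (rms / d)^2)."""
    def scaled(rms, d):
        return 1.0 / (1.0 + (rms / d) ** 2)
    return (fnat + scaled(irms, d_irms) + scaled(lrms, d_lrms)) / 3.0


print(round(dockq_score(0.833, 4.576, 1.827), 3))  # 0.629, values reported by DockQ.py
print(round(dockq_score(0.846, 1.019, 1.819), 3))  # 0.829, values from the independent calculation
```

Both reported totals are reproduced, and the scaled iRMS term drops from about 0.68 to about 0.10 between the two cases. That accounts for nearly all of the roughly 0.2 difference, so the question reduces to which residues enter the interface superposition.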

IndexError: list index out of range

I’m attempting to get the DOCKQ score of a model of CAPRI target #50, from the score_set dataset. The model is named Target50_0000.pdb and the correct crystal structure is named Target50_3r2x.pdb. Both files are attached here (with the extension .txt added, as GitHub doesn't allow pdb files to be uploaded):

Target50_0000.pdb.txt
Target50_3r2x.pdb.txt

Running

scripts/fix_numbering.pl /path/to/Target50_0000.pdb /path/to/Target50_3r2x.pdb

works fine, but running

python3 DockQ.py /path/to/Target50_0000.pdb.fixed /path/to/Target50_3r2x.pdb -native_chain1 A B -native_chain2 C -model_chain1 A B -model_chain2 C

results in the following error:

Traceback (most recent call last):
  File "/dartfs/rc/lab/G/Grigoryanlab/home/coy/DockQ/DockQ.py", line 732, in <module>
    main()    
  File "/dartfs/rc/lab/G/Grigoryanlab/home/coy/DockQ/DockQ.py", line 510, in main
    model_chains=get_pdb_chains(model)
  File "/dartfs/rc/lab/G/Grigoryanlab/home/coy/DockQ/DockQ.py", line 387, in get_pdb_chains
    pdb_struct = pdb_parser.get_structure("reference", pdb)[0]
  File "/dartfs-hpc/rc/home/4/f002v94/.conda/envs/myenv/lib/python3.9/site-packages/Bio/PDB/PDBParser.py", line 100, in get_structure
    self._parse(lines)
  File "/dartfs-hpc/rc/home/4/f002v94/.conda/envs/myenv/lib/python3.9/site-packages/Bio/PDB/PDBParser.py", line 123, in _parse
    self.trailer = self._parse_coordinates(coords_trailer)
  File "/dartfs-hpc/rc/home/4/f002v94/.conda/envs/myenv/lib/python3.9/site-packages/Bio/PDB/PDBParser.py", line 198, in _parse_coordinates
    resseq = int(line[22:26].split()[0])  # sequence identifier
IndexError: list index out of range
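
The failing statement, `int(line[22:26].split()[0])`, raises `IndexError` when columns 23 to 26 of a coordinate record are blank, because `split()` then returns an empty list. The sketch below scans a file for such records so they can be inspected. `lines_with_bad_resseq` is a hypothetical helper, and the demo lines are made up.

```python
def lines_with_bad_resseq(lines):
    """Return (line_number, line) pairs whose residue-number field is not an integer."""
    bad = []
    for number, line in enumerate(lines, start=1):
        if not line.startswith(("ATOM", "HETATM")):
            continue
        field = line[22:26].strip()  # columns 23-26: residue sequence number
        try:
            int(field)
        except ValueError:
            bad.append((number, line.rstrip("\n")))
    return bad


good = "ATOM".ljust(22) + "  12"
blank = "ATOM".ljust(22) + "    "
print(lines_with_bad_resseq([good, blank]))
```

With a real file, passing `open("/path/to/Target50_0000.pdb.fixed")` as `lines` would list the offending records by line number.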

Script fix_numbering.pl does not output model.pdb.fixed

Hi,

Thank you for developing this useful software. When I attempt to run the script to align the protein lengths of the separate chains of the predicted model, I am met with the error that needle is not installed. However, when I run the script without any input, it says needle is in fact installed. How do I circumvent this?

Thanks in advance.
