Comments (4)
I am having the exact same issue. Even when starting from structures of different proteins (a coiled-coil dimer and a helix tetramer), I get sequences that are nearly 30% K or E.
from proteinmpnn.
@bwllc I know this is a little late, but I've seen this happen when intra-residue geometry is not ideal (my own method, AttnPacker, does this). It's my guess that conserved bond lengths and angles are off.
If you have access to Rosetta, you can run relax with coordinate constraints to fix the geometry while minimizing the RMSD between pre-relaxed and relaxed structures. There is also the [Idealize protocol](https://www.rosettacommons.org/docs/latest/scripting_documentation/RosettaScripts/Movers/movers_pages/IdealizeMover), which is designed for exactly this, but I haven't tried it.
As a first step, you can try running inference with the v_48_020.pt model (the weights trained with 0.20 Å of backbone noise). If the distribution of AA types looks better, that's a good indication that this is your issue.
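To compare the AA-type distribution between two runs, a minimal sketch like the following may help. It pools residue counts over a set of designed sequences and reports the combined K + E fraction. The function names (`aa_composition`, `read_fasta_sequences`, `ke_fraction`) are hypothetical helpers, not part of ProteinMPNN, and the sketch assumes the designs are available as plain strings or in a FASTA file.

```python
from collections import Counter


def aa_composition(sequences):
    """Return {residue: fraction} pooled over all sequences, most common first."""
    counts = Counter()
    for seq in sequences:
        # Keep letters only, so chain separators such as "/" and gap
        # characters such as "-" are not counted as residues.
        counts.update(ch for ch in seq.upper() if ch.isalpha())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {aa: n / total for aa, n in counts.most_common()}


def read_fasta_sequences(path):
    """Read sequences from a FASTA file (hypothetical helper)."""
    sequences, current = [], []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A header line closes the previous record, if any.
                if current:
                    sequences.append("".join(current))
                current = []
            elif line:
                current.append(line)
    if current:
        sequences.append("".join(current))
    return sequences


def ke_fraction(sequences):
    """Combined fraction of lysine (K) and glutamate (E) residues."""
    comp = aa_composition(sequences)
    return comp.get("K", 0.0) + comp.get("E", 0.0)
```

Running `ke_fraction(read_fasta_sequences(path))` on the output of each model gives two numbers to compare. If the output file also lists the input sequence as a record, drop that record first so it does not dilute the result.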
GL
Thanks for your reply, @MattMcPartlon.
I think you are saying that a force-field relaxation (energy minimization) step sometimes needs to be applied to the output of RFdiffusion before passing it to ProteinMPNN. Do I understand that correctly?
If Rosetta has been open-sourced, I can use its minimizer.
I already have GROMACS, and it also has a relaxation algorithm which I can investigate. I'm not sure if it behaves differently than the Rosetta minimizer. I'm not sure whether that would matter. I could probably specify constraints on some atoms in GROMACS, but that sounds fussy, and I'd prefer to avoid that if I can.
Please let me know if I'm barking up the wrong tree. Thanks.
@bwllc That's exactly what I mean :).
Before spending too much time on this, you can check (for example) that consecutive C-alpha atoms are 3.8 ± 0.1 Å apart. If you see distances outside this range, then relaxing with a force field should solve your problem.
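The C-alpha check can be sketched in plain Python as below. This is only an illustration under a few assumptions: the input is a standard fixed-column PDB file, only `ATOM` records are read, alternate locations other than blank or `A` are skipped, and residues appear in chain order. The function names are hypothetical. Note that a genuine chain break (missing residues) will also be flagged, since the two flanking C-alpha atoms are far apart.

```python
import math


def ca_coords_by_chain(pdb_lines):
    """Collect C-alpha coordinates per chain ID from PDB-format lines."""
    chains = {}
    for line in pdb_lines:
        # Fixed-column PDB fields: atom name in columns 13-16, altloc in 17,
        # chain ID in 22, x/y/z in columns 31-54 (1-based).
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        if line[16] not in (" ", "A"):
            continue  # skip secondary alternate locations
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        chains.setdefault(line[21], []).append(xyz)
    return chains


def ca_distance_outliers(pdb_lines, ideal=3.8, tol=0.1):
    """Return (chain, index, distance) for consecutive CA pairs outside ideal +/- tol.

    `index` is the 0-based position of the first atom of the pair within
    its chain, not the PDB residue number.
    """
    outliers = []
    for chain, coords in ca_coords_by_chain(pdb_lines).items():
        for i in range(len(coords) - 1):
            d = math.dist(coords[i], coords[i + 1])
            if abs(d - ideal) > tol:
                outliers.append((chain, i, d))
    return outliers


# Example usage (the file name is a placeholder):
# with open("design.pdb") as handle:
#     for chain, i, d in ca_distance_outliers(handle):
#         print(f"chain {chain}, pair {i}-{i + 1}: {d:.2f} A")
```

An empty result suggests the backbone spacing is fine and the K/E bias has another cause; a long list of flagged pairs points toward the geometry problem described above.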
I only recommend Rosetta's minimizer because it can explicitly minimize the RMSD between the relaxed and input structures.
GROMACS or AMBER should also work fine. Good luck!