Comments (8)
from rmsd.
Hi @tccyl ,
Where is the PDB file from? From rcsb.org?
I am not a heavy user of the .pdb file format. I've read the ATOM section of the file format documentation (http://www.wwpdb.org/documentation/file-format-content/format33/sect9.html#ATOM), and it seems PDB is based on fixed column widths, not on whitespace splitting (split) as currently implemented.
You are very welcome to open a pull request that fixes this parsing, including a .pdb file where rmsd fails.
from rmsd.
Dear Jimmy,
I sometimes use calculate_rmsd.py on data derived from single-crystal diffraction,
natively deposited as .cif, a format openbabel can read as well.
I can second @tccyl, as well as the documentation in and around the script, that not
all .pdb files are equally well suited to pass the Kabsch test. I tentatively attribute
some of the issues to differences in formatting as well as in content.
Converting .cif to .pdb with openbabel yields files generally unsuitable for
calculate_rmsd.py, which is why I typically either
- convert them further to .xyz with openbabel, which then pass successfully; or
- use Olex2 to write either .pdb or .xyz; both types work well with calculate_rmsd.py.
Because I did not perceive this as an obstacle, I didn't spend additional time on the issue.
Some of the attached documentation may illustrate the experience.
2019-Jun-07_calculate_rmsd_pdb_corrected.zip
from rmsd.
Probably the issue is the missing space between the y- and z-components of the coordinates. Instead of a manual correction, such an omission may be corrected on the CLI with openbabel (openbabel.org) in a pattern of
babel -ipdb notworking.pdb -opdb now_working.pdb
where -ipdb defines the input format as .pdb and, similarly, -opdb specifies the output format as .pdb. Depending on the version of calculate_rmsd.py, the .pdb generated by openbabel might not work well. In that case, if you do not need crystallographic information like space group symmetry, you may be better off with the least complex file type instead, .xyz. If so, a call from the terminal in a pattern of
obabel *.pdb -oxyz -m
will convert, in a batch, all .pdb files in your directory into .xyz files. Give this a try, and if it does not work, post again.
Norwid
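The batch conversion to .xyz suggested above could also be scripted. The following is only a sketch, assuming an `obabel` executable is on PATH and that the `obabel <infile> -O <outfile>` form is used; the helper names `build_obabel_command` and `convert_directory` are hypothetical and not part of calculate_rmsd.py or openbabel.

```python
import shutil
import subprocess
from pathlib import Path


def build_obabel_command(pdb_path):
    """Return the argv list for converting one .pdb file to .xyz with obabel."""
    pdb_path = Path(pdb_path)
    xyz_path = pdb_path.with_suffix(".xyz")
    # "-O" names the output file; obabel infers both formats from the extensions.
    return ["obabel", str(pdb_path), "-O", str(xyz_path)]


def convert_directory(directory="."):
    """Convert every .pdb file in `directory`; return the .xyz paths requested."""
    if shutil.which("obabel") is None:
        raise RuntimeError("obabel was not found on PATH")
    written = []
    for pdb_path in sorted(Path(directory).glob("*.pdb")):
        command = build_obabel_command(pdb_path)
        # check=True raises CalledProcessError if obabel exits with an error.
        subprocess.run(command, check=True)
        written.append(Path(command[-1]))
    return written
```

Calling `convert_directory(".")` would then attempt the conversion for every .pdb file in the current directory; whether the resulting .xyz files are chemically sensible still has to be checked by hand.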
…
On Wed, 05 Jun 2019 08:56:32 -0700 tccyl wrote:
if x_column == None:
    try:
        # look for x column
        for i, x in enumerate(tokens):
            if "." in x and "." in tokens[i + 1] and "." in tokens[i + 2]:
                x_column = i
                break
    except IndexError:
        exit("error: Parsing coordinates for the following line: \n{0:s}".format(line))
If the pdb line is like 'ATOM 383 C6 C B 122 -2.217 -2.542-103.749' (the values of y and z are fused, with no space between them), the code will exit, and the coordinates cannot be obtained.
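The failure described above can be reproduced in isolation. This is a minimal sketch that restates the quoted heuristic as a standalone function; the name `find_x_column` and the well-spaced comparison line are made up for illustration only.

```python
def find_x_column(tokens):
    """Index of the first of three consecutive tokens containing '.', else None."""
    try:
        for i, token in enumerate(tokens):
            if "." in token and "." in tokens[i + 1] and "." in tokens[i + 2]:
                return i
    except IndexError:
        # Fewer than three coordinate-like tokens remain after position i.
        return None
    return None


# Hypothetical well-spaced line versus the fused line quoted above.
well_spaced = "ATOM 383 C6 C B 122 -2.217 -2.542 -13.749"
fused = "ATOM 383 C6 C B 122 -2.217 -2.542-103.749"

print(find_x_column(well_spaced.split()))  # 6
print(fused.split()[-1])                   # -2.542-103.749 (y and z in one token)
print(find_x_column(fused.split()))        # None
```

Splitting on whitespace leaves only two tokens containing a decimal point in the fused line, so the lookup of a third one runs past the end of the token list.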
Hi, nbehrnd,
Thanks for your nice suggestion. However, converting the pdb file with openbabel still does not solve this issue.
As @charnley said, the pdb format is based on column widths, not on whitespace splitting as currently implemented. In both the pdb files from rcsb and those written by openbabel, the x, y, z coordinates sit in the same columns and follow this format:
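# V is the coordinate list built by the surrounding parser; np is numpy
try:
    x = line[30:38]
    y = line[38:46]
    z = line[46:54]
    V.append(np.asarray([x, y, z], dtype=float))
except ValueError:
    exit("error: Parsing coordinates for the following line: \n{0:s}".format(line))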
Maybe the x, y, z coordinates could be read directly with the code above instead of by looking for x_column.
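To illustrate the fixed-column idea, here is a small self-contained sketch. It assumes the column layout of the wwPDB format 3.3 ATOM record (x in columns 31-38, y in 39-46, z in 47-54); the function name `parse_atom_xyz` is hypothetical, and the example record is assembled from the values quoted in this thread rather than taken from a real file.

```python
def parse_atom_xyz(line):
    """Read x, y, z from the fixed columns of an ATOM/HETATM record."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record: {!r}".format(line))
    try:
        # 0-based slices for the 1-based columns 31-38, 39-46 and 47-54.
        return float(line[30:38]), float(line[38:46]), float(line[46:54])
    except ValueError:
        raise ValueError("cannot parse coordinates in line: {!r}".format(line))


# Hypothetical example record, built field by field so every column is exact.
example = "{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   {:8.3f}{:8.3f}{:8.3f}".format(
    "ATOM", 383, " C6", "", "C", "B", 122, "", -2.217, -2.542, -103.749
)

print(parse_atom_xyz(example))  # (-2.217, -2.542, -103.749)
```

Because each coordinate is read from its own eight-character field, a value such as -103.749 that fills its whole field no longer merges with its neighbour.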
from rmsd.
Yes, the PDB file is from rcsb.org.
from rmsd.
Hi @tccyl,
it seems my earlier reply by email didn't get through. Meanwhile, there has been some
work on the script, aiming to enable .pdb files written by the popular openbabel to pass
the Kabsch test, because the current version 1.3.2 (released in January 2019) does not
work successfully with .pdb files written by openbabel.
For a small test molecule (benzamide) consisting of C, H, N, and O, adding some keywords
to the instructions in the script now allows it to work with such files successfully. The
modified script is deposited here and also as pull request #58, including additional test
data (.pdb newly written by openbabel) known to work, too. It is still labeled as version
1.3.2 (Jan 2019), awaiting action by Jimmy.
Meanwhile, give it a try; perhaps your (test) data reveal additional keywords that should
be added, too. You are welcome to deposit your two files in question here. Be aware that
there are multiple 'dialects' of .pdb files around, which contributes to the issues here
(which is why .xyz is a fallback, at some expense, of course).
from rmsd.
@nbehrnd Thank you so much~
The two example files where rmsd failed are below:
two_fragment_files.zip
from rmsd.
Hi @tccyl
in short, after passing the .pdb files through openbabel, calculate_rmsd.py determines an
RMSD of about 0.7983 for each variant of the Kabsch test. Attached below are the copy of
the script used and the documentation (two .zip).
The detailed story:
An initial inspection of the files in an editor revealed that both describe the same number
of atoms per atom type. A subsequent check in avogadro revealed that the mutual distances
of these atoms exceed the van der Waals radii, so in this sense the atoms are not adjacent
to each other.
In their original form, the two files are suitable for a Kabsch test neither with the current
version of calculate_rmsd.py (1.3.2, January 2019) nor with my changes from last week.
I passed your .pdb files to openbabel (version 2.4.1 by November 2018) to be rewritten:
babel -ipdb 4L81_10_CPCN.pdb -opdb 4L81_10_CPCN_babel.pdb
babel -ipdb 4L81_11_CPCN.pdb -opdb 4L81_11_CPCN_babel.pdb
In both instances, openbabel indicated difficulties working with the original data. This
suggests the export from the original source file should be revised, which obviously is not
the topic of this thread. One of the error logs is included as error.log.
However, atom label, (x,y,z), and atom type seem to pass into the newly written .pdb, which
indeed still lacks a space between the y- and z-components of the coordinates. This may be
characteristic of protein data, as opposed to small-molecule data.
The openbabel-written .pdb files then passed smoothly each of the three variants of the
Kabsch test with the --reorder option engaged (default / classical Kabsch test,
--use-reflections, --use-reflections-keep-stereo), with the same numerical RMSD of about
0.7983. As a comparison, the .pdb files were converted with babel into .xyz; again, the
Kabsch tests report an RMSD of about 0.7983.
With the .xyz files in hand, it may be interesting to inspect the 'best alignment' of the two
selected sets of atoms. Using 4L81_10_CPCN_babel.xyz as fixed model_A, and 4L81_11_CPCN_babel.xyz
as model_B to be aligned with respect to model_A, the new coordinates of model_B were harvested by
python3 calculate_rmsd.py --reorder -p 4L81_10_CPCN_babel.xyz 4L81_11_CPCN_babel.xyz > new_alignment_11.xyz
Both model_A and the updated model_B (new_alignment_11.xyz) were read by jmol. Their corresponding
selections of atoms were connected manually ('connect strut' instruction after selecting the atoms
in question) with struts colored either red (model_A) or blue (new_alignment_11 / updated model_B)
and labelled accordingly, in an otherwise cpk color scheme. They were exported as a static .png and
an interactive .wrl (e.g., for view3dscene) to walk around the superposition.
The labeling in jmol's display of the superposition is worth a word:
Except for the two opposite termini, the atoms were labeled in a pattern of C1/1.1 #1, where C1
stands for the first carbon atom in the first model of the first file read (1.1). In the same way,
2.1 refers to the first model in the second file read by jmol. #1 then refers to the first atom in
this model, a count independent of the atom type or atom label met in the file read.
rmsd-babel_issue.zip
reporting.zip
from rmsd.