
JSON input molecule about cmiles HOT 11 CLOSED

openforcefield avatar openforcefield commented on August 17, 2024
JSON input molecule

from cmiles.

Comments (11)

ChayaSt avatar ChayaSt commented on August 17, 2024 1

@j-wags, not sure if you are using that snippet in openforcefield/openff-toolkit#141 (comment), but it might be picking up spurious problems.


jchodera avatar jchodera commented on August 17, 2024

Be sure to loop in @j-wags who is working on our Open Forcefield serialized representations of molecules and topologies!


dgasmith avatar dgasmith commented on August 17, 2024

QCSchema is always in Bohr without exception. If it is not in Bohr it is an error in the input.
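
For tools that expect Ångström instead, the conversion has to happen explicitly on the way out. A minimal sketch, assuming a flat list of coordinates; the helper name `bohr_to_angstrom` is hypothetical, and the factor is the CODATA 2018 Bohr radius rather than anything taken from cmiles or QCSchema:

```python
# CODATA 2018 value of the Bohr radius, expressed in Angstrom.
BOHR_TO_ANGSTROM = 0.529177210903


def bohr_to_angstrom(geometry_bohr):
    """Convert a flat list of coordinates from Bohr to Angstrom."""
    return [x * BOHR_TO_ANGSTROM for x in geometry_bohr]


# Two atoms, flattened as [x0, y0, z0, x1, y1, z1] (illustrative values).
geometry = [0.0, 0.0, 0.0, 0.0, 0.0, 1.4]
print(bohr_to_angstrom(geometry))
```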


ChayaSt avatar ChayaSt commented on August 17, 2024

Update on this:

I was not able to get OpenEye to perceive connectivity and stereochemistry from JSON geometry. It does perceive it if it gets read in from a file but we don't want to do that.
RDKit does a fine job perceiving stereochemistry from the JSON geometry, but it also needs the connectivity table to generate a molecular graph. So currently cmiles uses RDKit and the connectivity table to create a molecule.

Also, cmiles assumes QCSchema, so all coordinates are assumed to be in Bohr. @j-wags, once the Open Forcefield serialized representation is complete, we should add the ability to generate molecular IDs from that representation. We will need a way to keep track of units then.
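
To make the inputs concrete, here is a rough sketch of pulling the atoms and the connectivity table out of a QCSchema-style molecule before handing them to a toolkit. The field names `symbols`, `geometry` and `connectivity` follow the QCSchema molecule layout; the function name `split_qcschema_molecule` is hypothetical and this is not the actual cmiles code:

```python
def split_qcschema_molecule(mol):
    """Return (atoms, bonds) from a QCSchema-style molecule dict.

    atoms: list of (symbol, (x, y, z)), coordinates left in Bohr.
    bonds: list of (i, j, order) taken from the connectivity table.
    """
    symbols = mol["symbols"]
    geometry = mol["geometry"]
    if len(geometry) != 3 * len(symbols):
        raise ValueError("geometry must hold 3 coordinates per atom")

    connectivity = mol.get("connectivity")
    if not connectivity:
        raise ValueError("connectivity table is required to build the graph")

    # The geometry is a flat list, so slice it into one triple per atom.
    atoms = [
        (symbol, tuple(geometry[3 * i:3 * i + 3]))
        for i, symbol in enumerate(symbols)
    ]

    bonds = []
    for i, j, order in connectivity:
        if not (0 <= i < len(symbols) and 0 <= j < len(symbols)):
            raise ValueError("bond ({}, {}) refers to a missing atom".format(i, j))
        bonds.append((i, j, order))
    return atoms, bonds


# Hydrogen molecule with illustrative coordinates in Bohr.
h2 = {
    "symbols": ["H", "H"],
    "geometry": [0.0, 0.0, 0.0, 0.0, 0.0, 1.4],
    "connectivity": [[0, 1, 1]],
}
atoms, bonds = split_qcschema_molecule(h2)
```

Failing early when the connectivity table is absent mirrors the point above: the geometry alone is not enough to build the graph.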


j-wags avatar j-wags commented on August 17, 2024

Thanks for the ping, @ChayaSt. Right now, all quantities in the SMIRNOFF Molecule class are unit-wrapped numpy arrays (simtk.unit.Quantity objects). This should make unit conversion straightforward when we get there.

I was not able to get OpenEye to perceive connectivity and stereochemistry from JSON geometry. It does perceive it if it gets read in from a file but we don't want to do that.

I was wrestling with OE and stereochemistry two weeks ago. One helpful function had been oechem.OE3DToInternalStereo(oemol), though I mucked around with a bunch of other functions. If you're up for it, we could do a screenshare today to look into this.


ChayaSt avatar ChayaSt commented on August 17, 2024

I was wrestling with OE and stereochemistry two weeks ago. One helpful function had been oechem.OE3DToInternalStereo(oemol), though I mucked around with a bunch of other functions. If you're up for it, we could do a screenshare today to look into this.

Thanks! At first I wasn't even able to get OpenEye to perceive connectivity from coordinates, then found the function oemol.SetDimension(3), which allows oechem.OEDetermineConnectivity(oemol) to work. I then called a bunch of functions on the molecule:

oechem.OEFindRingAtomsAndBonds(molecule)
oechem.OEPerceiveBondOrders(molecule)
oechem.OEAssignImplicitHydrogens(molecule)
oechem.OEAssignFormalCharges(molecule)
oechem.OEAssignAromaticFlags(molecule)

These functions are always called when OpenEye reads a molecule from an xyz file. Once I did this, the molecule had the stereochemistry information and I didn't need to perceive it separately.


j-wags avatar j-wags commented on August 17, 2024

That's great. Thanks for looking into this!

We had talked about finding undefined stereochemistry the other day, and I wanted to document that somewhere: Basically, I couldn't find a built-in OE function to check for undefined stereochemistry. Here's what I'm using currently in SMIRNOFF:

from openeye import oechem
from openforcefield.topology.molecule import Molecule

oechem.OEPerceiveChiral(oemol)

# Check that all stereo is specified
unspec_chiral = False
unspec_db = False
problematic_atoms = list()
problematic_bonds = list()

for oeatom in oemol.GetAtoms():
    if oeatom.IsChiral():
        if not (oeatom.HasStereoSpecified()):
            unspec_chiral = True
            problematic_atoms.append(oeatom)
for oebond in oemol.GetBonds():
    if oebond.IsChiral():
        if not (oebond.HasStereoSpecified()):
            unspec_db = True
            problematic_bonds.append(oebond)


ChayaSt avatar ChayaSt commented on August 17, 2024

Thank you!

This is really helpful!


ChayaSt avatar ChayaSt commented on August 17, 2024

@j-wags, it turns out the code snippet above also finds spurious missing stereochemistry.
Example SMILES: N[C@@H](CCCNC(N)=N)C(O)=O

It finds this bond to be missing stereochemistry: [(7, 'C', 9, 'N', 2)] (that's C=N) which is a terminal bond so it's not chiral.

It turns out that atom.HasStereoSpecified(oechem.OEAtomStereo_Tetrahedral) and bond.HasStereoSpecified(oechem.OEBondStereo_CisTrans) return true for chiral atoms and bonds even when the specific chirality is not defined. You have to retrieve the handedness and check whether it is undefined.

Here is the code I'm using now.

from openeye import oechem


def is_stereochemistry_defined(molecule):
    unspec_chiral = False
    unspec_db = False
    problematic_atoms = list()
    problematic_bonds = list()
    for atom in molecule.GetAtoms():
        if atom.IsChiral() or atom.HasStereoSpecified(oechem.OEAtomStereo_Tetrahedral):
            # Check if handedness is specified
            v = []
            for nbr in atom.GetAtoms():
                v.append(nbr)
            stereo = atom.GetStereo(v, oechem.OEAtomStereo_Tetrahedral)
            if stereo == oechem.OEAtomStereo_Undefined:
                unspec_chiral = True
                problematic_atoms.append((atom.GetIdx(), oechem.OEGetAtomicSymbol(atom.GetAtomicNum())))
    for bond in molecule.GetBonds():
        if bond.IsChiral() or bond.HasStereoSpecified(oechem.OEBondStereo_CisTrans):
            v = []
            for neigh in bond.GetBgn().GetAtoms():
                if neigh != bond.GetEnd():
                    v.append(neigh)
                    break
            for neigh in bond.GetEnd().GetAtoms():
                if neigh != bond.GetBgn():
                    v.append(neigh)
                    break
            stereo = bond.GetStereo(v, oechem.OEBondStereo_CisTrans)
            if stereo == oechem.OEBondStereo_Undefined:
                unspec_db = True
                a1 = bond.GetBgn()
                a2 = bond.GetEnd()
                a1_idx = a1.GetIdx()
                a2_idx = a2.GetIdx()
                a1_s = oechem.OEGetAtomicSymbol(a1.GetAtomicNum())
                a2_s = oechem.OEGetAtomicSymbol(a2.GetAtomicNum())
                bond_order = bond.GetOrder()
                problematic_bonds.append((a1_idx, a1_s, a2_idx, a2_s, bond_order))
    if unspec_chiral or unspec_db:
        raise ValueError("Stereochemistry is unspecified. Problematic atoms {}, problematic bonds {}".format(problematic_atoms,
                                                                                                             problematic_bonds))
    else:
        return True

edit: fixed code snippet.


j-wags avatar j-wags commented on August 17, 2024

Very interesting! I was using that snippet there. Thanks for making the connection, I'll make sure it gets passed on to the other PR.

Thanks for doing the research above. I've got a few things to finish up with Topology right now, but I'll try out your code soon.


ChayaSt avatar ChayaSt commented on August 17, 2024

Addressed by #13

