alchex's Introduction

Alchex ⚛⚗😎

(Alk-Ex)

Alchex uses GROMACS to robustly exchange one molecule for another using a method based on Exchange Lipids and removes clashes using the Alchembed method. Documentation is coming soon, but in the meantime, try it out for yourself.

Get started

1. Install Alchex

git clone https://github.com/tomnewport/alchex.git

2. Test installation

import alchex

References

Alchembed

Elizabeth Jefferys, Zara A. Sands, Jiye Shi, Mark S. P. Sansom, and Philip W. Fowler

Exchange Lipids

Heidi Koldsø

alchex's Issues

Support for .gro files

The .gro format is fixed-width and highly finicky, so reading and writing it needs care.

  • Read editable residue from gro string
  • Write editable residue to gro string
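
As a rough illustration of the two tasks above, here is a minimal sketch of reading and writing a single atom line. It assumes the common fixed-width layout (four 5-character fields followed by 8-character coordinate columns) and ignores velocities; `GroAtom`, `parse_gro_atom` and `format_gro_atom` are hypothetical names, not part of Alchex.

```python
from dataclasses import dataclass


@dataclass
class GroAtom:
    resid: int
    resname: str
    name: str
    atomid: int
    x: float
    y: float
    z: float


def parse_gro_atom(line: str) -> GroAtom:
    """Parse one atom line by column position (assumes 8-character coordinate columns)."""
    return GroAtom(
        resid=int(line[0:5]),
        resname=line[5:10].strip(),
        name=line[10:15].strip(),
        atomid=int(line[15:20]),
        x=float(line[20:28]),
        y=float(line[28:36]),
        z=float(line[36:44]),
    )


def format_gro_atom(atom: GroAtom) -> str:
    """Write one atom line; positions only, three decimal places."""
    return "%5d%-5s%5s%5d%8.3f%8.3f%8.3f" % (
        atom.resid, atom.resname, atom.name, atom.atomid,
        atom.x, atom.y, atom.z,
    )
```

Slicing by column rather than splitting on whitespace matters because adjacent fields (for example a residue number and a residue name) are not separated by spaces.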

Add better default configs

Include
- cg-default
- cg-user

cg-default should be immutable

cg-user should be persistent

  • Save and load a ResidueParameters object
  • Save and load a ResidueStructure object
  • Save and load an ExchangeMap object
  • Save and load a GromacsMDPFile object
  • Figure out a folder structure
  • Save and load a config object

A nice folder structure might be:

/config-name
    /compositions
        /mitochondrial_inner.json
        /eukaryotic.json
    /exchange_maps
         /POPC
             /POPC.json
             /POPG.json
         /POPG
            /POPC.json
            /POPG.json
    /parameters
         /POPC.itp
         /POPG.itp
    /structures
        /POPC
            /<anything>.gro
        /POPG
            /<anything>.gro
            /<anything>.gro
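
A minimal sketch of how the layout above could be created and used for one kind of object (compositions). The helper names are hypothetical, and the composition is assumed to be a plain mapping of residue name to proportion stored as JSON.

```python
import json
from pathlib import Path

# Sub-folders taken from the proposed layout above.
SUBFOLDERS = ("compositions", "exchange_maps", "parameters", "structures")


def create_config(root, name):
    """Create /config-name with its four sub-folders and return its path."""
    base = Path(root) / name
    for sub in SUBFOLDERS:
        (base / sub).mkdir(parents=True, exist_ok=True)
    return base


def save_composition(base, name, composition):
    """Write a composition (assumed: dict of residue name -> proportion) as JSON."""
    path = Path(base) / "compositions" / (name + ".json")
    path.write_text(json.dumps(composition, indent=2, sort_keys=True))
    return path


def load_composition(base, name):
    """Read a composition back from its JSON file."""
    path = Path(base) / "compositions" / (name + ".json")
    return json.loads(path.read_text())
```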

Gromacs wrapper

The GROMACS wrapper should abstract process calls to GROMACS. Several existing tools were considered but rejected, mainly because they are experimental, difficult to install, or bundle tools which aren't required.

At present, the only way to call GROMACS is via text-based subprocess calls.

It is also important to be able to control the working directory in which processes are run. The API should behave as follows when complete:

gromacs = GromacsWrapper()
gromacs.cd("/tmp/test")
gromacs.mdrun(deffnm="em")

gromacs = GromacsWrapper("/usr/local/gromacs/bin/gmx")

Tasks:

  • Create basic container with a folder and ability to use bash
  • Generate prefix/suffix based on gromacs binaries directory
  • Make bash calls
  • With arguments
  • Capturing stdout
  • Capturing stderr
  • Gracefully capturing exceptions
  • Automatically taking unfamiliar gromacs.gwhatever(...) and running gmx whatever
  • Deleting that stupid GROMACS header
  • Including additional defaults for mdrun to make debugging easier
  • Providing callbacks and progress updates for mdrun steps
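
A minimal sketch of the core of such a wrapper, covering the working directory, keyword arguments turned into flags, captured stdout/stderr, and automatic dispatch of unknown method names to `gmx <name>`. The `build_command` and `run` methods are assumptions made for illustration; header stripping, exception handling and mdrun callbacks are left out.

```python
import subprocess


class GromacsWrapper:
    def __init__(self, gmx="gmx"):
        self.gmx = gmx    # path to the gmx binary
        self.cwd = None   # working directory for subprocess calls

    def cd(self, path):
        self.cwd = path

    def build_command(self, tool, **flags):
        """Turn mdrun(deffnm="em") into [gmx, "mdrun", "-deffnm", "em"]."""
        argv = [self.gmx, tool]
        for flag, value in flags.items():
            argv += ["-" + flag, str(value)]
        return argv

    def run(self, tool, **flags):
        """Run the command in self.cwd and return (exit code, stdout, stderr)."""
        result = subprocess.run(
            self.build_command(tool, **flags),
            cwd=self.cwd, capture_output=True, text=True,
        )
        return result.returncode, result.stdout, result.stderr

    def __getattr__(self, tool):
        # Only reached for names that are not real attributes,
        # so gromacs.whatever(...) runs "gmx whatever".
        if tool.startswith("_"):
            raise AttributeError(tool)

        def call(**flags):
            return self.run(tool, **flags)
        return call
```

Keeping command construction separate from execution makes the wrapper testable on machines without GROMACS installed.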

Support for .itp files

These files are nasty. The current GromacsITPFile setup can pick out bonds, angles and atoms for a single residue. For the moment this container isn't going to modify the ITP, just break it down into sections.

We'll ignore particle types for the moment as well and just focus on molecule definitions. These can be split off from the file by using [ moleculetype ] as a header. Most ITPs are well enough behaved to be able to pick out the block for a molecule of interest.

  • Split ITP into blocks for each molecule
  • Combine and write ITP blocks together with comments
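
The splitting step could look roughly like this. It is a sketch that assumes each molecule starts at a `[ moleculetype ]` header and runs until the next one; the function names are hypothetical and preprocessor directives are not handled.

```python
import re

# Matches a "[ moleculetype ]" header with any internal spacing.
HEADER = re.compile(r"^\s*\[\s*moleculetype\s*\]", re.IGNORECASE)


def split_itp_blocks(text):
    """Return (preamble, blocks): text before the first header, then one block per molecule."""
    preamble, blocks = [], []
    current = preamble
    for line in text.splitlines():
        if HEADER.match(line):
            current = []
            blocks.append(current)
        current.append(line)
    return "\n".join(preamble), ["\n".join(block) for block in blocks]


def moleculetype_name(block):
    """Return the molecule name: first field of the first data line after the header."""
    for line in block.splitlines()[1:]:
        content = line.split(";", 1)[0].strip()
        if content and not content.startswith("["):
            return content.split()[0]
    return None
```

Comments inside a block are kept verbatim, so the blocks can be written back out together with their original annotations.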

Make atom container with geometry routines

  • Build container
    • Residue Container
    • Add atoms
      • x, y, z
      • Residue name
      • Atom name
      • Look up additional fields
        • Add additional fields
    • Save to PDB
    • Open as MDAnalysis
  • Build in simple modification routines
    • Molecule align
    • Direct overlay
    • Fragment align
    • Distortion
  • Build in complicated modification routines
    • Simple Bridge

declash may split polymers in two

declash should be able to cut out entire proteins but not individual residues of proteins.

Fixing this will require declash to load a topology and determine clashes by moltype rather than by residue.

Alchex output .gro files in the wrong order

For example:

Alchex replacement component DPPC (declash 0)
 1896
    1DPPC   C3B    1  9.8480  4.2843  5.7217  0.0000  0.0000  0.0000
    1DPPC   C2B    2  9.7713  4.4030  6.2603  0.0000  0.0000  0.0000
    1DPPC   C4B    3 10.2800  4.3940  5.3530  0.0000  0.0000  0.0000
    1DPPC   NC3    4  9.4480  5.9010  7.3740  0.0000  0.0000  0.0000
    1DPPC   GL2    5  9.8050  5.1800  7.0150  0.0000  0.0000  0.0000
    1DPPC   PO4    6  9.5440  5.5070  7.1880  0.0000  0.0000  0.0000
    1DPPC   C1A    7  9.9020  5.6870  6.5850  0.0000  0.0000  0.0000
    1DPPC   GL1    8 10.0120  5.4860  6.9470  0.0000  0.0000  0.0000
    1DPPC   C3A    9 10.2070  5.9710  5.8980  0.0000  0.0000  0.0000
    1DPPC   C2A   10 10.0360  5.9840  6.3060  0.0000  0.0000  0.0000
    1DPPC   C1B   11  9.8610  4.7660  6.6320  0.0000  0.0000  0.0000

Coordinates are all there, however the atoms are in the wrong order leading to ver

Alchex fails when loading .top file

_get_xdg_config_dir())
 ⚗ 2016-09-13 12:41:33,009 INFO : Done importing
 ⚗ 2016-09-13 12:41:34,133 INFO : Config complete
 ⚗ 2016-09-13 12:41:34,429 INFO : Preparing to replace molecules...
Traceback (most recent call last):
  File "exchange1.py", line 31, in <module>
    "composition":{"POPC":85,"POPC":15}})
  File "/sansom/s40/stansfeld/Projects/TMEM16A/16B/MD/exchanges/alchex/replacement.py", line 417, in auto_replace
    self._replace(replaceable_entities)
  File "/sansom/s40/stansfeld/Projects/TMEM16A/16B/MD/exchanges/alchex/replacement.py", line 261, in _replace
    original_topology.from_file(self.simulations.resolve_path("/"+name+"/input.top"))
  File "/sansom/s40/stansfeld/Projects/TMEM16A/16B/MD/exchanges/alchex/gromacs_interface.py", line 473, in from_file
    elif table_columns is not None:
UnboundLocalError: local variable 'table_columns' referenced before assignment

Figure out what to do with three clashing molecules

If A clashes with B, B with C and C with A then alchembed won't work. Here are some ideas:

  • Do a simple (A alchembed B) alchembed C?
    • Could be difficult, particles will wander in vacuum
    • Dummy particles to prevent diffusion?
  • Combine B and C and hope for the best?
  • Do A alchembed B then alchembed C and just don't worry about diffusion
  • Reduce cutoff distance so clash threshold is higher

Simulation container object

Most processes one might want to do involve:

  • Modify a structure, topology and parameters.
  • grompp a structure (gro), topology (top) and parameters (mdp) to get a preprocessed binary file (tpr)
  • mdrun the tpr to get a trajectory and a structure

The simulation container will be based around the grompp step. Several GROMACS commands produce files in the working directory.

structure = GROFile("structure.gro")
topology = TOPFile("topol.top")
em_mdp = MDPFile("em.mdp")
md_mdp = mdp.MD_MDP

s = SimulationContainer()
s.add(structure)
s.add(topology)
s.add(em_mdp)

em = s.folder("em")
em.gromacs.grompp(c="/structure.gro", p="/topol.top", f="/em.mdp", o="em.tpr")
em.gromacs.mdrun(deffnm="em")

md = s.folder("md")
s.md.gromacs.grompp(c="/em/em.gro", f="/md.mdp", p="/topol.top")
s.md.bash("echo llama >> llama.txt")
s.md.bash(["echo", "llama"])

In the future this container could also keep a json dictionary to store:

  • All commands performed in the container (in order, including output, host, user, pid and exit code)
  • All files which have been added to the container from elsewhere

It should also be possible to save and load a container. Later work could allow the container to resume after a crash, or synchronise with a server using SAGA.

Tasks:

  • Simple container with a GROMACS instance
  • Add file objects to the container
  • Create container folder object
  • Resolve file paths within container
  • Run GROMACS and bash within container (or folder)
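
A minimal sketch of the path-resolution task. It assumes that a leading `/` means "relative to the container root" (as in the `c="/structure.gro"` calls above) and that other paths are relative to the current folder. The class layout is hypothetical.

```python
import posixpath


class SimulationContainer:
    def __init__(self, root):
        self.root = posixpath.normpath(root)

    def folder(self, name):
        return ContainerFolder(self, name)

    def resolve_path(self, path, folder=""):
        """Map a container path onto the real filesystem."""
        if path.startswith("/"):
            relative = path.lstrip("/")              # container-absolute
        else:
            relative = posixpath.join(folder, path)  # folder-relative
        resolved = posixpath.normpath(posixpath.join(self.root, relative))
        # Refuse paths that climb out of the container with "..".
        if resolved != self.root and not resolved.startswith(self.root + "/"):
            raise ValueError("path escapes container: %r" % path)
        return resolved


class ContainerFolder:
    def __init__(self, container, name):
        self.container = container
        self.name = name

    def resolve_path(self, path):
        return self.container.resolve_path(path, folder=self.name)
```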

Support for .mdp files

This can be simply set up by:

  • Loading MDP to key-value pairs
  • Saving MDP from key-value pairs
  • Test loading
  • Test saving
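
A minimal sketch of the load and save steps, assuming one `key = value` pair per line with `;` starting a comment. The function names are hypothetical, and comments are dropped rather than preserved.

```python
def load_mdp(text):
    """Parse MDP text into an ordered dict of string key -> string value."""
    params = {}
    for line in text.splitlines():
        content = line.split(";", 1)[0].strip()  # drop comments
        if not content or "=" not in content:
            continue
        key, value = content.split("=", 1)
        params[key.strip()] = value.strip()
    return params


def save_mdp(params):
    """Write key-value pairs back out, one aligned 'key = value' per line."""
    return "".join("%-24s = %s\n" % (key, value) for key, value in params.items())
```

A round trip through `save_mdp` and `load_mdp` should return the same dictionary, which gives a simple basis for the load and save tests.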

Build alchembed wrapper

Using alchembed could work in a couple of ways:

  • Load a structure with topology
    • Use alchembed to embed a selection in the rest of the topology
    • OR
    • Use alchembed to embed a new structure and topology
a = Alchembed(structure="input.gro", topology="input.top")
a.add(structure="input2.gro", topology="input2.top")
a.add(selection="resname POPC")

The big challenge here is understanding GROMACS moltypes. Let's look at how the Protein moltype works:

[ moleculetype ]
; Name         Exclusions
Protein           1

[ atoms ]
    1    Qd     1   ALA    BB     1  1.0000 ; C
    2    P5     2   VAL    BB     2  0.0000 ; C
    3   AC2     2   VAL   SC1     3  0.0000 ; C
    4    P4     3   ALA    BB     4  0.0000 ; C
    5    P5     4   ASP    BB     5  0.0000 ; C
    6    Qa     4   ASP   SC1     6 -1.0000 ; C
    7    Nd     5   LYS    BB     7  0.0000 ; 1
    8    C3     5   LYS   SC1     8  0.0000 ; 1
    9    Qd     5   LYS   SC2     9  1.0000 ; 1
...
 2346    N0  1148   ARG   SC1  2346  0.0000 ; C
 2347    Qd  1148   ARG   SC2  2347  1.0000 ; C
 2348    Qa  1149   VAL    BB  2348 -1.0000 ; C
 2349   AC2  1149   VAL   SC1  2349  0.0000 ; C

[ bonds ]
; Backbone bonds
    1     2      1   0.35000  1250 ; ALA(C)-VAL(C)
    2     4      1   0.35000  1250 ; VAL(C)-ALA(C)
    4     5      1   0.35000  1250 ; ALA(C)-ASP(C)
...

[ constraints ]
    5     7      1   0.33000 ; ASP(C)-LYS(1)
    7    10      1   0.31000 ; LYS(1)-ALA(1)
   10    11      1   0.31000 ; ALA(1)-ASP(1)
   11    13      1   0.31000 ; ASP(1)-ASN(1)
   13    15      1   0.31000 ; ASN(1)-ALA(H)
   15    16      1   0.31000 ; ALA(H)-PHE(H)
   16    20      1   0.31000 ; PHE(H)-MET(H)
   20    22      1   0.31000 ; MET(H)-MET(H)
   22    24      1   0.31000 ; MET(H)-ILE(H)

[ angles ]
; Backbone angles
    1     2     4      2    127    20 ; ALA(C)-VAL(C)-ALA(C)
    2     4     5      2    127    20 ; VAL(C)-ALA(C)-ASP(C)
    4     5     7      2    127    20 ; ALA(C)-ASP(C)-LYS(1)
    5     7    10      2    127    20 ; ASP(C)-LYS(1)-ALA(1)
    7    10    11      2     96   700 ; LYS(1)-ALA(1)-ASP(1)
   10    11    13      2     96   700 ; ALA(1)-ASP(1)-ASN(1)
   11    13    15      2     96   700 ; ASP(1)-ASN(1)-ALA(H)
   13    15    16      2     96   700 ; ASN(1)-ALA(H)-PHE(H)
   15    16    20      2     96   700 ; ALA(H)-PHE(H)-MET(H)
   16    20    22      2     96   700 ; PHE(H)-MET(H)-MET(H)
   20    22    24      2     96   700 ; MET(H)-MET(H)-ILE(H)
   22    24    26      2     96   700 ; MET(H)-ILE(H)-CYS(H)

[ dihedrals ]
; Backbone dihedrals
    7    10    11    13      1   -120   400     1 ; LYS(1)-ALA(1)-ASP(1)-ASN(1)
   10    11    13    15      1   -120   400     1 ; ALA(1)-ASP(1)-ASN(1)-ALA(H)
   11    13    15    16      1   -120   400     1 ; ASP(1)-ASN(1)-ALA(H)-PHE(H)
   13    15    16    20      1   -120   400     1 ; ASN(1)-ALA(H)-PHE(H)-MET(H)
   15    16    20    22      1   -120   400     1 ; ALA(H)-PHE(H)-MET(H)-MET(H)
   16    20    22    24      1   -120   400     1 ; PHE(H)-MET(H)-MET(H)-ILE(H)
   20    22    24    26      1   -120   400     1 ; MET(H)-MET(H)-ILE(H)-CYS(H)
   22    24    26    28      1   -120   400     1 ; MET(H)-ILE(H)-CYS(H)-THR(H)
  • The residue name seems pretty much irrelevant
  • EVERYTHING ends up in this single itp file
  • I'll need a decent ITP parser...
  • Protein needs to be included

Build combined topology files

A topology file defines a number of molecules:

[ molecules ]
Protein 1
POPG 229
POPC 117
W 7601
NA+ 440
CL- 213

These are usually included in .itp files. Grompp can produce a preprocessed output using the -pp flag. I would like to propose the following steps:

  1. Collect: input.mdp, input.top, input.gro, moltype_name
  2. Grompp inputs to produce a combined.top for the topology
  3. Create a blank combined.itp file with a moltype specification for moltype_name
  4. For each moltype in the molecules section of combined.top:
    1. Find the moltype definition in combined.top
    2. Renumber all atoms in all tables
    3. Add all tables to tables in combined.itp
  5. Add combined.itp as an #include in combined.top
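
For the renumbering in step 4, each molecule instance needs to know where its atoms start in the combined moltype. The sketch below assumes every copy listed in the molecules section is written out in order; the function name and the per-moltype atom counts passed in are hypothetical.

```python
def molecule_offsets(molecules, atoms_per_moltype):
    """Return ([(moltype, copy index, atom offset), ...], total atom count).

    molecules: ordered (name, count) pairs from the [ molecules ] section.
    atoms_per_moltype: number of atoms in one copy of each moltype.
    """
    offsets = []
    next_atom = 0
    for name, count in molecules:
        size = atoms_per_moltype[name]
        for copy in range(count):
            offsets.append((name, copy, next_atom))
            next_atom += size
    return offsets, next_atom
```

The offset for an instance is the value added to every atom id in that instance's tables before they are appended to combined.itp.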

Tables

.itp files can contain a number of tables. The following are the only ones which will be considered:

Thanks to GromacsWrapper.

[ moleculetype ]

This creates a new subsection within the file and defines a new moltype. Fields are:

  • name - name of the molecule
  • nrexcl - bond-based exclusions

[ atoms ]

This defines atoms. Fields are:

  • id - atom id within the moltype
  • type - atom type (from atomtypes table)
  • resnr - residue number
  • residu - residue name
  • atom - atom name
  • cgnr - charge group number
  • charge - coulomb charge on atom

[ bonds ]

  • ai - from atom id
  • aj - to atom id
  • funct - bond function
  • c0 - bond length
  • c1 - force constant

[ angles ]

ai aj ak funct c0 c1

[ dihedrals ]

ai aj ak al funct c0 c1 c2

[ pairs ]

ai aj funct c0 c1

Adding to a single moleculegroup

  1. Fetch uncommented parts of lines
    1. if ; is in the line:
      1. Take the bit of the line before the ;
    2. strip whitespace
    3. If line != "":
      1. split line on whitespace
      2. if that produces the expected number of parts
      3. resolve any atom references (a[ijkl])
    4. add greatest existing atom_id to all the atom ids
    5. add greatest existing cgnr to all the atom cgnrs

It would also be nice to be able to do a pass over the table to check the .gro and .itp match up.
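
The per-line procedure above can be sketched as follows. This is a simplified illustration: it assumes the leading columns of each table are atom references, as in the field lists above, and it offsets only atom ids and charge group numbers, not residue numbers. `ATOM_REFERENCES` and `offset_table_row` are hypothetical names.

```python
# Number of leading atom-id columns in each supported table.
ATOM_REFERENCES = {"bonds": 2, "pairs": 2, "constraints": 2, "angles": 3, "dihedrals": 4}


def offset_table_row(table, line, atom_offset, cgnr_offset=0):
    """Strip the comment, split on whitespace and shift atom ids (and cgnr for atoms).

    Returns the list of fields, or None for blank and comment-only lines.
    """
    content = line.split(";", 1)[0].strip()
    if not content:
        return None
    fields = content.split()
    if table == "atoms":
        fields[0] = str(int(fields[0]) + atom_offset)   # id
        fields[5] = str(int(fields[5]) + cgnr_offset)   # cgnr
    else:
        for index in range(ATOM_REFERENCES[table]):
            fields[index] = str(int(fields[index]) + atom_offset)
    return fields
```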

gmx grompp renumbers residues

grompp-ing results in renumbering of residues from 1. This needs to be fixed:

  • Can gromacs not renumber residues?
  • Can residue ids be copied from the pre-grompped file?

Method to use alchembed on two separate .gro files

This will need as inputs:

  • An alchex config object
  • A simulation container
  • Topology for system1
  • Topology for system2
  • Structure for system1
  • Structure for system2
  • Additional parameters
  • A working directory

(note that system1 must include parameters for system2)

Next we need to:

  1. For both systems:
    1. Obtain preprocessed topologies using grompp
    2. Build moltypes using topcat
  2. Update resids in combined topologies
  3. Modify system1 topology to include system1, system2 as moltypes
  4. Write the alchembed.mdp file with any modifications, and couple-moltype set to system2
  5. GROMPP files
  6. MDRUN files
  7. Build combined topology and structure
    • Re-order residues
    • Renumber atoms
    • Write order to topology file

Support for .top files

Currently the following options are required

  • Load from topology file
    • Which ITP files are included?
    • What are the molecule counts?
    • What is the system name?
  • Save to topology file
    • ITP files in ordered list
    • Molecule counts
    • System name
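
A minimal sketch covering the load and save options listed above. `TopFile` here is a hypothetical stand-in rather than the existing Alchex class, and it deliberately ignores everything except `#include` lines, the `[ system ]` name and the `[ molecules ]` counts.

```python
import re
from dataclasses import dataclass, field

INCLUDE = re.compile(r'^\s*#include\s+"([^"]+)"')
SECTION = re.compile(r"^\[\s*(\w+)\s*\]$")


@dataclass
class TopFile:
    includes: list = field(default_factory=list)   # ITP files, in order
    system: str = ""                               # system name
    molecules: list = field(default_factory=list)  # (name, count) pairs, in order

    @classmethod
    def from_text(cls, text):
        top = cls()
        section = None  # initialised before the loop so it is always bound
        for line in text.splitlines():
            include = INCLUDE.match(line)
            if include:
                top.includes.append(include.group(1))
                continue
            content = line.split(";", 1)[0].strip()
            if not content or content.startswith("#"):
                continue
            header = SECTION.match(content)
            if header:
                section = header.group(1).lower()
            elif section == "system":
                top.system = content
            elif section == "molecules":
                name, count = content.split()[:2]
                top.molecules.append((name, int(count)))
        return top

    def to_text(self):
        lines = ['#include "%s"' % path for path in self.includes]
        lines += ["", "[ system ]", self.system, "", "[ molecules ]"]
        lines += ["%s %d" % (name, count) for name, count in self.molecules]
        return "\n".join(lines) + "\n"
```

Molecule counts are kept as an ordered list rather than a dictionary because the order must match the structure file and the same moltype can appear more than once.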
