shirtsgroup / physical_validation

Physical validation of molecular simulations

Home Page: https://physical-validation.readthedocs.io

License: MIT License

Python 99.58% TeX 0.42%
physical-validation molecular-simulation molecular-dynamics molecular-mechanics python

physical_validation's Introduction

physical_validation: A Python package to assess the physical validity of molecular simulation results

physical_validation is a package that tests results obtained from molecular simulations for their physical validity.

Please cite

Merz PT, Shirts MR (2018) Testing for physical validity in molecular simulations. PLoS ONE 13(9): e0202764. https://doi.org/10.1371/journal.pone.0202764
Merz et al., (2022). physical_validation: A Python package to assess the physical validity of molecular simulation results. Journal of Open Source Software, 7(69), 3981, https://doi.org/10.21105/joss.03981

physical_validation incorporates the functionality of checkensemble.

This software is developed in the Shirts group at the University of Colorado Boulder.

Documentation

Please check https://physical-validation.readthedocs.io for the full reference.

physical_validation's People

Contributors

cwalker7, dependabot[bot], lgtm-com[bot], matkie, mattwthompson, mrshirts, pre-commit-ci[bot], ptmerz, tlfobe, wehs7661

physical_validation's Issues

Think about pymbar dependency

physical_validation has a dependency on pymbar. This is only needed to access the timeseries submodule of pymbar. It seems a bit of an overkill to require the full pymbar package just to decorrelate a few timeseries...
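
If the dependency were dropped, the decorrelation step would need a replacement. The following is a minimal sketch of one way to do that with NumPy alone, not pymbar's implementation. It estimates a statistical inefficiency g from the normalized autocorrelation function, truncating the sum at the first non-positive value (a simplifying assumption), and then subsamples with stride ceil(g). The function names are hypothetical.

```python
import numpy as np


def statistical_inefficiency(series):
    """Estimate g = 1 + 2 * sum_t (1 - t/N) C(t) for a 1D time series.

    The sum over the normalized autocorrelation C(t) is truncated at the
    first lag where C(t) is no longer positive (simplifying assumption).
    """
    x = np.asarray(series, dtype=float)
    n = x.size
    dx = x - x.mean()
    var = np.mean(dx ** 2)
    if var == 0.0:
        # A constant series carries no correlation information.
        return 1.0
    g = 1.0
    for t in range(1, n - 1):
        c = np.mean(dx[:-t] * dx[t:]) / var
        if c <= 0.0:
            break
        g += 2.0 * c * (1.0 - t / n)
    return max(g, 1.0)


def decorrelate(series):
    """Return every ceil(g)-th sample of the series."""
    stride = int(np.ceil(statistical_inefficiency(series)))
    return np.asarray(series)[::stride]
```

Whether a reduced routine like this is accurate enough for the validation tests would need to be checked against the pymbar results before replacing the dependency.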

Streamline use of Units throughout code

The UnitData, part of every SimulationData object, is not consistently used throughout the code. This is a reminder to check the use cases, and if possible simplify the UnitData object.

Make the variables in the axes consistent

Right now, in the graphs, it prints "Energy" in the distribution no matter what the variable is (enthalpy, energy, pressure); this should be improved (could borrow code from checkensemble)

Plot of sampled distribution and p-value from kinetic energy validation

Hi,

I am using the Maxwell-Boltzmann ensemble validation function to check a system of mine. When executed, my simulation result fits the Boltzmann distribution quite nicely, but the p-value is low (3.95e-08) and thus the null hypothesis is rejected. This seems to be contradictory. I am attaching the Gromacs files I have read in for this function, as well as PDF of the plot produced. Any insight on this would be appreciated.

Thanks,
Ray Matsumoto

Gromacs files: pyrrol_valid.zip

Plot: KE_val.pdf

Use fancy repo badges

Once we have our CI up and running, we'll want to have a lot of fancy repo badges 😎

Review code-level documentation

This concerns the documentation of the code itself, not accompanying tutorials / manual. Check for completeness, correctness, and accessibility to new developers.

TopologyData name adjustment

Perhaps we should change TopologyData to MolecularData? Topology data suggests something much more complicated than the relatively simple data that is needed for the tests.

Add example for usage with OpenMM

Usage with OpenMM should work by feeding physical_validation python data structures. We should make sure this works and have examples in the documentation.

Parsing molecular masses from [default][atomtypes]

masses = []
for atom in topology[molecule]['atoms']:
    if len(atom.split()) >= 8:
        masses.append(float(atom.split()[7]))
    else:
        code = atom.split()[1]
        masses.append(float(topology['defaults']['atomtypes'][code].split()[3]))

Hello everyone,

I ran into problems with the kinetic_energy.equipartition() method and traced them back to the wrong masses being parsed from my topology. It seems the masses are parsed from [moleculetype][atoms] if possible, with a fallback to the [defaults][atomtypes] section. In that fallback (the else branch above), the fourth value in each line of [defaults][atomtypes] is parsed as the mass. At least for my topology, however, the fourth value is the charge, and I actually want to parse the third value ([...].split()[2]).

As I'm using an in-house FF and all of the topology is written by me or colleagues this might be specific to our syntax. However, I looked at some of the ffnonbonded.itp of the amber FF and they also should have run into the same problem (from amber94.ff/ffnonbonded.itp):

; name  at.num  mass   charge  ptype  sigma        epsilon
Br      35      79.90  0.0000  A      3.95559e-01  1.33888e+00 ; Converted from parm99.dat
C        6      12.01  0.0000  A      3.39967e-01  3.59824e-01
CA       6      12.01  0.0000  A      3.39967e-01  3.59824e-01
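
Because the number of columns before the mass can differ between force fields, a fixed index is fragile. One possible approach, sketched below, locates the ptype column and reads the mass two columns before it. This assumes that ptype is a single letter from A, S, V, D and that mass and charge immediately precede it. The function name is hypothetical, and this is not the parser used in physical_validation.

```python
PTYPES = {"A", "S", "V", "D"}  # assumed set of particle type letters


def mass_from_atomtype_line(line):
    """Return the mass from one [atomtypes] line.

    The ptype column is used as an anchor: mass is assumed to sit two
    columns before it, and charge one column before it.
    """
    fields = line.split(";", 1)[0].split()  # drop trailing comment
    for idx, field in enumerate(fields):
        # idx >= 3 skips the name and any optional leading columns
        # that could coincide with a ptype letter.
        if idx >= 3 and field in PTYPES:
            return float(fields[idx - 2])
    raise ValueError("no ptype column found in line: {!r}".format(line))


line = "Br 35 79.90 0.0000 A 3.95559e-01 1.33888e+00 ; Converted from parm99.dat"
print(mass_from_atomtype_line(line))
```

For the amber94 lines quoted above, this reads the third column (the mass) rather than the fourth (the charge).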

Python issue that occurs when importing library

When I import the physical validation package, i.e.,

import physical_validation as pv

I get the error:

Traceback (most recent call last):
  File "ana_flatfile.py", line 2, in <module>
    import physical_validation as pv
  File "/Users/cri/anaconda3/lib/python2.7/site-packages/physical_validation/__init__.py", line 43, in <module>
    from . import kinetic_energy
  File "/Users/cri/anaconda3/lib/python2.7/site-packages/physical_validation/kinetic_energy.py", line 36, in <module>
    from .util import kinetic_energy as util_kin
  File "/Users/cri/anaconda3/lib/python2.7/site-packages/physical_validation/util/__init__.py", line 31, in <module>
    from . import ensemble
  File "/Users/cri/anaconda3/lib/python2.7/site-packages/physical_validation/util/ensemble.py", line 774
    '[0/{:d}]'.format(bs_repetitions), end='')
                                          ^
SyntaxError: invalid syntax

Note, on this machine I am running: Python 2.7.15 :: Anaconda, Inc.
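
For context: the end= keyword is only accepted by the print function. On Python 2.7, print is a statement unless the module starts with from __future__ import print_function, so the line shown in the traceback fails to parse. A minimal sketch of a progress marker that parses on both interpreters is shown below. The function name is hypothetical and this is not the code in util/ensemble.py.

```python
import sys


def write_progress(done, total, stream=sys.stdout):
    """Write a marker such as '[0/200]' without a trailing newline."""
    stream.write('[{:d}/{:d}]'.format(done, total))
    stream.flush()


write_progress(0, 200)
```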

Make sure that documentation is beginner-friendly

Ideally, the documentation should allow users having moderate knowledge of molecular simulations to run tests on their systems within a few minutes. It should then have additional information for users interested in the details of theory or implementation. It would be very helpful if group members that have little or no knowledge of the physical validation package could verify if the current documentation is beginner-friendly, and suggest improvements!

Brainstorming ideas/tasks for making physical validation a group project

  • Adding continuous integration testing framework (from MolSSI cookie cutter)
  • Adding pytest tests to the CI
  • adding different nontrivial examples
    • checking replica exchange simulations
    • Checking a vapor phase simulation?
    • What else?
  • Improving the documentation at read the docs
  • Adding support for different molecular simulation tools
    • AMBER (just needs to be tested once we are given it by SiliconTherapeutics)
    • LAMMPS
    • HOOMD (can add angular kinetic energy validation)
  • perform pylint/black reformatting
  • Finish python3 conversion.
  • List the things from JOSS needed.
  • Make it conda-installable?

conda-forge to-do list

The PR is open at conda-forge/staged-recipes#11370, but we should do a few things, and make a new release, before we can expect that PR to be merged. This issue documents those things.

  • General cleanup - largely handled by #41
  • Tests - #59
  • License - #62 not a blocker but if it is going to be changed, it may be easier to have only the new license on forge
  • New release (including new license file packaged)
  • Add conda instructions on "Installation" page.
  • Update release docs https://github.com/shirtsgroup/physical_validation/wiki
  • Update license in conda recipe - we moved to MIT
  • Tests have additional dependencies, see devtools/conda-envs/test_env.yaml. Do we need to include them in the conda recipe? Can we reuse that .yaml file to reduce the number of places that list dependencies?

This is all that comes to mind - any other suggestions?

Add developer guide to documentation

Should include an explanation of the workflow, requirements for contributions, a style guide, how-tos for maintenance of the project, and other useful topics.

Related to #45.
#42 #43 #44 should probably be made a part of this documentation.

naming conventions for better readability?

I wonder if it would be better to make the function names more readable, for example

physical_validation instead of physicalvalidation
kinetic_energy instead of kineticenergy

Generating machine-readable results

One nice suggestion was to generate machine-readable results for the output of physical_validation. The steps would be

  1. Determine a JSON schema for the results
  2. The functions fill in the schema.
  3. The logfile is actually the output of running the JSON internal representation through an interpreter.

This is lower priority, but a good idea overall - would make it easier to have GROMACS take return values, or to have other programs incorporate results in an automated way.
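
The three steps above could look roughly like the following sketch. The record fields (test, p_value, threshold, passed), the pass criterion, and the function names are all hypothetical. No schema has been agreed on.

```python
import json


def make_result(test, p_value, threshold):
    """Step 2: fill in a result record (hypothetical schema)."""
    return {
        "test": test,
        "p_value": p_value,
        "threshold": threshold,
        "passed": p_value >= threshold,  # assumed pass criterion
    }


def render_log(record_json):
    """Step 3: produce the human-readable log line from the JSON record."""
    record = json.loads(record_json)
    verdict = "PASSED" if record["passed"] else "FAILED"
    return "{test}: p = {p_value:.3g} (threshold {threshold:.3g}) -> {v}".format(
        v=verdict, **record
    )


# The JSON string is what another program would consume directly.
record_json = json.dumps(make_result("kinetic_energy_check", 3.95e-08, 0.05))
print(render_log(record_json))
```

Keeping the log as a rendering of the JSON record means the two outputs cannot drift apart.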

Retire py2 support

Python 2 is finally out of support 🥳

There are workarounds in different places of the code to make it compatible with both py2 and py3. We should find these and remove any hacks not needed anymore.

Catch the NaN error that occurs when histograms are not well-populated

Migrated from shirtsgroup/checkensemble#2

mrshirts commented on Jun 20, 2016:

When distributions are chosen poorly, it's very likely the ratio of the histograms will be undefined, as the denominator will be zero. This needs to be caught with a proper warning.

mrshirts commented on Mar 6, 2017:

The issues show up as warnings:

checkensemble/checkensemble/checkensemble.py:910: RuntimeWarning: divide by zero encountered in log
    ratio = numpy.log(hlist[1]/hlist[0]) # this should have the proper exponential distribution
checkensemble/checkensemble/checkensemble.py:911: RuntimeWarning: invalid value encountered in divide
    dratio = numpy.sqrt((dhlist[0]/hlist[0])**2 + (dhlist[1]/hlist[1])**2)
checkensemble/checkensemble/checkensemble.py:910: RuntimeWarning: divide by zero encountered in divide
    ratio = numpy.log(hlist[1]/hlist[0]) # this should have the proper exponential distribution
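
One way to catch this, sketched here under the assumption that bins empty in either histogram can simply be excluded, is to mask those bins before dividing and to emit an explicit warning. The function name and the warning text are hypothetical. The two formulas are the ones from the warnings above.

```python
import warnings

import numpy as np


def log_histogram_ratio(h0, h1, dh0, dh1):
    """Compute log(h1/h0) and its uncertainty on populated bins only.

    Returns (ratio, dratio, valid), where valid is a boolean mask of the
    bins that are populated in both histograms.
    """
    h0, h1, dh0, dh1 = (np.asarray(a, dtype=float) for a in (h0, h1, dh0, dh1))
    valid = (h0 > 0) & (h1 > 0)
    if not valid.all():
        warnings.warn(
            "{:d} of {:d} bins are empty in at least one histogram and were "
            "excluded; consider different bins or distributions with more "
            "overlap.".format(int((~valid).sum()), valid.size)
        )
    ratio = np.log(h1[valid] / h0[valid])
    dratio = np.sqrt((dh0[valid] / h0[valid]) ** 2 + (dh1[valid] / h1[valid]) ** 2)
    return ratio, dratio, valid
```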

License discussion

From #40:

SB: was there a particular reason to go with LGPL rather than something like MIT?

PM: Part of the code is currently in the GROMACS repo - the package was shipped with GROMACS before it became a stand-alone package and has not yet been removed. (When we have a stable release, I'll make sure that GROMACS gets the package from pip rather than shipping an outdated version.) At the time, we chose the same license as GROMACS to keep things simple.

MS: Agreed [that licensing can certainly be rediscussed; it should probably be a separate issue where we can discuss and come to a conclusion]

so here's that issue!

Evaluate use of other packages reading molecular simulation result files

Connected to #51 #52 #53 #54 - is having our own parsers the best way to move forward? Could MDTraj / MDAnalysis / pytraj / other packages make our lives easier? Note that physical_validation has some very specific requirements of what information is needed, which were not fulfilled by any other package last we checked.

Related to-do: make a list of the information physical_validation needs to read from simulation result files.

Check / prove that integrator test is valid for temperature / pressure coupling

The integrator test analyzes the fluctuations of the constant of motion of a simulation. Theoretically, this is expected to be proportional to dt^2 (since the integrator preserves the constant of motion of the shadow Hamiltonian, but we are observing the constant of motion of the real Hamiltonian...).

We currently only advertise this test for NVE simulations, since we never cleanly proved that it would work with the constants of motion of extended systems using temperature or pressure coupling. It shouldn't be too hard to check this and update the documentation accordingly.
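
As a worked illustration of the expected scaling: if the fluctuations are proportional to dt^2, halving the time step should reduce them by a factor of four. The sketch below compares an observed ratio with this expectation. It assumes the fluctuations are measured as the standard deviation of the constant of motion, uses synthetic data, and the function name is hypothetical. It is not the test implemented in physical_validation.

```python
import numpy as np


def fluctuation_ratio(com_1, dt_1, com_2, dt_2):
    """Return (observed, expected) ratios of constant-of-motion fluctuations.

    observed: std(com_1) / std(com_2)
    expected: (dt_1 / dt_2)**2, from fluctuations proportional to dt^2
    """
    observed = np.std(com_1) / np.std(com_2)
    expected = (dt_1 / dt_2) ** 2
    return observed, expected


# Synthetic trajectories whose fluctuations scale exactly with dt^2.
rng = np.random.default_rng(1)
noise = rng.normal(size=1000)
dt_1, dt_2 = 0.002, 0.001
observed, expected = fluctuation_ratio(dt_1**2 * noise, dt_1, dt_2**2 * noise, dt_2)
print(observed, expected)
```

Running the same comparison on thermostatted or barostatted trajectories, using the conserved quantity of the extended system, would be one empirical way to approach the question raised in this issue.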
