lumol-org / lumol Goto Github PK

View Code? Open in Web Editor NEW

184.0 13.0 18.0 4.64 MB

Universal extensible molecular simulation engine

Home Page: http://lumol.org/

License: BSD 3-Clause "New" or "Revised" License

Rust 99.43% Shell 0.23% Python 0.34%

molecular-dynamics monte-carlo molecular-simulation computational-chemistry atomistic-simulations

lumol's Introduction

Lumol molecular simulation engine

Lumol is a classical molecular simulation engine that provides a solid base for developing new algorithms and methods. Using Lumol, you can customize the behavior of all the algorithms in a simulation. Adding a new force field, customizing Monte Carlo moves or molecular dynamics integrators is easy and well documented.

Lumol goals are to be flexible, reliable and extensible. For us, this means that this software should be:

flexible: the code can simulate all kind of systems, from proteins to crystals, using various methods: molecular dynamics, Monte Carlo, etc.
reliable: the code is well tested, both at the function level; and at the simulation level, checking thermodynamic properties of the systems;
extendable: the code is modular, object-oriented, well documented, open-source, and easy to read.

Lumol is actively developed, and should be considered as alpha software. If you are interested, have some questions or want to participate, you can open a Github issue or go to the project chat room.

Features

Pair, molecular and electrostatic interactions (with Ewald or Wolf methods);
Energy minimization;
Molecular dynamics simulations in the NVE, NVT and NPT ensembles;
Monte Carlo simulations in the NVT ensemble;
and many others! Have a look at the documentation for more information

Getting started

Lumol provides both a command line tool for running simulations; and a Rust library for writing your own simulations algorithms using the pre-existing building blocks.

Documentation

Documentation is hosted here, and separated in multiple parts:

The user manual contains information about the general concepts of systems and simulations used in Lumol. Additionally, it has tutorials on how to use and extend Lumol. Use this documentation if you want to know basic concepts and how they are used in Lumol.
The input reference contains information about - well, the input file system of Lumol. Use this document if you want to use Lumol as a command line tool without writing code.
To use Lumol as a library inside your own code, we have a developer documentation, which contains documentation for all the library public functions, and examples for most of them.

Installation as a command line tool

You will need a stable Rust compiler, grab one if you do not have one yet. Then, you can download the code, build it and install it by running:

cargo install --git https://github.com/lumol-org/lumol

This will produce the a lumol binary in ~/.cargo/bin.

Usage as a library

You can add Lumol as a dependency in your project's Cargo.toml:

[dependencies]
lumol = {git = "https://github.com/lumol-org/lumol"}

A tutorial about how to implement new algorithms in Lumol is coming. While waiting, you can ask your questions here.

Contributing

If you want to contribute to Lumol, there are several ways to go: improving the documentation and helping with language issues; testing the code on your systems to find bugs; adding new algorithms and potentials; providing feature requests. Please come by and talk with us a bit before staring new work, or open an issue to discuss improvements. We also have recommendations for contributors.

See the AUTHORS file for a list of contributors to the code.

License

This software is licensed under the BSD license, see the LICENSE file for legal text.

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, shall be licensed under the same BSD license, without any additional terms or conditions.

lumol's People

Contributors

Stargazers

Watchers

Forkers

g-bauer odarbelaeze antoinewdg dd10-e gjepson luthaf droundy ybyygu nielsle gaybro8777 nileshjchoudhary computational-chemistry-research xiongb2sh amartyabose jharrymoore bodomartin tomatoisfruit iq-scm

lumol's Issues

Implement Hybrid Monte-Carlo

This is easy: just use the functionalities around to build the hybrid move

Find a better name for the project

cymbalum was a temporary initial name for this project. I need ideas for a better name, which must be

not too long, to be usable as name for a command line tool;
easy to search in google, thus not too common;
fun !

Use a builder pattern to create MonteCarlo structs

When I first read through the code of the MonteCarlo propagator I got really confused because it makes no sense to sample a move whose frequency is smaller than the sampled float. Makes a lot more sense if they are cumulative frequencies. It turns out they are, but the attribute is named frequencies.

Hovewer, if we renamed the attribute to reflect that they are cumulative frequencies things get weird because in the beginning (before setup is called) they are frequencies.
It turns out the weirdness comes from the methods add and add_move_with_acceptance, since they are the ones dealing with frequencies before the setup.

All of this to ask: is there sometimes a situation where we need to initialize a MonteCarlo propagator without the associated moves and add the moves later ? Because if not one could pass a Vec of (Box<MCMove>, f64, Option<f64>) to the constructor. This way frequencies would always be cumulative frequencies, and more importantly we would not have methods that can panic, which is always good.

Examples in the API documentation

The API documentation is complete, but lacks examples. Most of the function could get some examples.

Replace the warn_once! macros by the log-once crate

I pulled the code from here to its own crate: log-once. We should remove the in-tree macro and replace it with the crate.

Clarify how to compute ideal gas contribution to pressure

This issue was discussed in #58 (comment) and targets computation of the ideal gas pressure:

impl Compute for PressureAtTemperature {
    type Output = f64;
    fn compute(&self, system: &System) -> f64 {
        :
        let natoms = system.size() as f64;
        return natoms * K_BOLTZMANN * self.temperature / volume + virial / (3.0 * volume);
             //  ^--- natoms, degrees of freedom, number of molecules?
    }
}

Implement Linear force-shift for pair potential

As defined in NIST reference data for LJ potential: https://www.nist.gov/mml/csd/chemical-informatics-research-group/lennard-jones-fluid-reference-calculations

And add corresponding tests in the NIST tests.

Real command line interface

The main function does really ad-hoc command line arguments parsing. We should use a real library such as clap for command line argument parsing.

Add computation for truncated potentials with tail corrections

We should add a struct and computation for truncated intermolecular pair potentials with long range corrections.

In Monte-Carlo (MC) simulations we often use potential functions that are truncated at a cutoff distance rc. Since we don't have to evaluate forces in MC, we don't have to shift the potential at the cutoff distance.

Under the assumption that the pair correlation function is unity beyond rc, we can compute a tail correction (for the energy and the pressure) by integration of the pair-potential from rc to infinity. For some potentials this can be solved analytically. The tail correction is a function of the cutoff distance rc and the ~~systems' density~~ densities of all components (ParticleKind).

Add a molecule regrowth move for Monte Carlo

To sample intramolecular configurations of molecules, we should implement a "regrowth" or "rebuild" move for Monte Carlo Simulations.

Within the move, an interaction site of a molecule is randomly chosen and translated to a new position. From that single interaction site, neighboring sites are sequentially added using Rosenbluth/configurational bias sampling, which allows for effective sampling of large molecules.

Implement Verlet list and Cell list

Rewrap molecules

Add a Control algorithm to rewrap molecules inside the unit cell, using the molecule center of mass.

Add Noose-Hover Barostat

Create a website presenting the project

It can be hosted using github pages at lumol-org.github.io for the moment, we can change the domain name later.

I created an empty repository here to contain the website source.

Create tests using NIST Standard Reference Simulation Calculations

We should build tests for the systems/states for the NIST Simulation Benchmarks.

These tests should include evaluations of single configurations

LJ reference calculations for system energy and virial
- Truncated without tail corrections.
- Truncated with tail corrections (needs implementation of corrections).
- Linear force shift (needs implementation of LFS). (#55)
Ewald-Summation for SPC/E water

as well as simulation runs in different ensembles and potentials:

LJ
Alkanes (TraPPE)
SPC/TIP4P Water

EDIT: (@Luthaf) Added links

Monte Carlo moves: adjustable maximum displacement

It would be a good idea to implement a demanded acceptance for MC moves that have a displacement (translation, volume or box change). That way, the maximum displacement for the move can be adjusted during the simulation to achieve that acceptance. With a fixed displacement as is, simulations can be very inefficient.

Add testing on Windows and OS X

Rust code should be cross-platform, but it is better to be sure.

For Windows using Appveyor
For OS X using Travis

It can also help to build release binaries, cf https://github.com/japaric/rust-everywhere

Profiling performance

Kind of related to #62 but not quite: I intend to do some serious profiling to see where the bottlenecks are, I open this issue to keep you updated. I think I'll do it on some examples and some of the benchmarks.

I'll be starting from this, this and this. If you have any suggestion on what code would be interesting to profile or how to do this (this is the first time I'm profiling Rust code) don't hesitate. @Luthaf I think you told me you already did some kind of profiling, anything I can reuse ?

Add support for improper dihedrals

This involve several small steps:

Create a representation of impropers dihedrals in the code
Add impropers to molecules and in the molecule detection algorithm
Get the improper angle (or distance ?) and derivatives
Add support for impropers in the energy computation

Catch typo in the input files

It would be great to suggest a fix to the user when the input file contains an error.

Some fixes I can think of:

Check for \t inside YAML files;
Give suggestion with typo: hamronic => harmonic
Check types of the data

The first one is easy, the others one require some kind of mechanism to register a keyword and associated data to some kind of linter.

Input files

This is a big issue, and I have been thinking a bit around it.

Requirements

We need input files, because not every one wants to write code just to run a simulation;
We need nice input files, which are easy to read and modify without going back and forth to the documentation;
We need to be able to use the same input files when using cymbalum as a library, either directly in Rust, or with some kind of scripting language (probably Python);
The input files should use a pre-existing format;
As a bonus, the users should also be able to register new potentials, moves, etc. for parsing.

1 and 3 should not be controversial, and 2 means that input files should look a bit more like

pairs: atoms = [O O], lj = sigma 3.5 A, epsilon = 122K

Than like this

&potp \
3 3 3.5 122 lj
;

About 4, as I plan to integrate a scripting language, there is no need for some ad-hoc scripting language to drive the simulation. A pre-existing language might be more familiar to users, easier to read in most of the cases, will have syntactic highlight in text editors and will be easier to validate without running the simulation.

At this point, JSON is ruled out because it does not support comments, and is a bit too clumsy with "" around the keys. INI is nice but not standardized. YAML is very good, but has two problems: it is whitespace sensitive (which can confuse the users, see #5 for a wanted workaround), and nest a bit deeply for molecular simulations.

I thing I'll go with TOML, which is clean, simple and readable.

Actual syntax

This gist showcases a possible syntax for TOML input files. it is mainly separated in two sections:

Interactions input

This is demonstrated here. We may want to remove the type section, and use something like this:

[[pairs]]
atoms = ["O", "O"]
lj = {sigma = "3.4 A", epsilon = "0.155 kJ/mol"}

This would allow for potentials to be inside inline tables:

pairs = [
    {atoms = ["O", "O"], lj = {sigma = "3.4 A", epsilon = "0.155 kJ/mol"}},
    {atoms = ["O", "H"], lj = {sigma = "2.1 A", epsilon = "0.085 kJ/mol"}},
]

Simulation and systems input

There are multiple examples for the syntax:

One of the key points is that systems and simulations entry are arrays. That allow to have more than one system to apply some simulation to (think replica-exchange simulations, or gibbs ensemble Monte-Carlo); or to apply multiple simulation to one system: like here.

I might want to remove the type keyword here too, to replace

[[simulations]]
nsteps = 10_000_000
outputs = [
    {type = "Trajectory", file = "filename.xyz", frequency = 100},
    {type = "Properties", file = "averages.dat", frequency = 100}
]

with

[[simulations]]
nsteps = 10_000_000
outputs = [
    {trajectory = {file = "filename.xyz", frequency = 100}},
    {properties = {file = "averages.dat", frequency = 100}}
]

Or this with

[[simulations.propagator]]
type = "MonteCarlo"
moves = [
    {translate = {delta = "1 A", proba = 50}},
    {rotate = {delta = "20 deg", proba = 50}},
    {resize = {proba = 1}},
]

We probably do not want to remove it from everywhere, for example

[[simulations.propagator.moleculardynamics]]
timestep = "1 fs"
integrator = {berendsen_barostat = {pressure = "100 bar", timestep = "10000 fs"}}
thermostat = {berendsen = {temperature = "400 K", timestep = "1000 fs"}}

Feels readable than the original version.

Unresolved questions

Is TOML good enough as input format?
Should the type section be removed? If yes, where? It appears in potentials, in lists of outputs or moves, and as keyword for propagator, thermostat, ...

Add MTTK Barostat

http://dx.doi.org/10.1080/00268979600100761

Langevin thermostat and barostat

Add traits for energy computation

The following traits should be added:

trait Energy {
    // Used in monte-carlo and energy computation
    fn energy(&mut self, system: &System) -> f64;
}

trait Forces {
    // Used in virial computation
    fn forces(&mut self, system: &System) -> Vec<Vector3D>;
}

trait Energetics: Energy + Forces {
    // Used in energy minimization
    fn forces_and_energy(&mut self, system: &System) -> (f64, Vec<Vector3D>) {
        (self.energy(system), self.forces(system))
    }
}

This would allow to:

Have a SimpleEnergetics mode where we do not re-compute periodic boundary conditions when needing both forces and energy;
Add other way of computing energy, like ThreadedEnergetics, VerletLists or GPUEnergetics

Better testing for bad inputs

Currently, bad input are tested by checking that the result of the input read is an error. But we do not control the source of the error. Allowing to specify the expected error inside the bad input file would guarantee that we do not get another error that the one wanted (bad TOML, IO error, ...).

Bad input files could use comments to specify the expected input error:

[input]
versn = 1
#^ Missing 'version' key in 'input' table

Use matrix to store interactions?

Right now, the interactions are stored inside BTreeMap associating particles types and a vector of interactions. Using a 2D (for pairs and bonds), 3D (for angles) or 4D (for dihedrals) matrix could make the interactions access faster.

This would use more memory, and fill the CPU cache with garbage. So benchmarks are needed!

Decide how many atomic types we would get in the worst and standard case.
Also check sparse matrix storage, which may mitigate the memory requirement;

Regularly run the benchmarks

The regression benchmarks (defined in #62) should run for each new commit in master, and a report of the run time should be available (maybe as a graph).

I think we can use Travis for that, even if it is not ideal (there is no way to be sure that the machines always have the same power).

Ability to run control algorithm not at all steps

Control algorithms like RemoveRotation and RemoveTranslation do not need to run at each simulation step. they could have separated frequency field to run them only every so often.

Drift in energy caused by energy cache

I observed a drift in energy for long simulations of LJ particles in the NVT ensemble.

When a move is accepted, I compared the energy delta (energy_new - energy_old) from cache via
du_cache = cache.move_particles_cost(...) versus the difference computed via system.total_energy(). Looking at the difference, I found that often the difference is negligible (1e-16) but there are cases when the difference is about 1e-7 to 1e-5.

I think the problem is either move_particles_cost or the update function of the cache. As far as I can tell, the initialization of the cache works correctly (tail correction is missing at the moment).

You can find the results from a sample simulation here.

`Trajectory::open` shows different behavior depending on the order of atoms

I try to read three configurations with a single water molecule from xyz files. Depending on the order, it either works or throws an error. I simply changed the order by copying the whole line (coordinates are consistent) but it should not matter.

To reproduce, copy .rs file from this gist into the example folder and .xyz into example/data.

Error:

$ RUST_BACKTRACE=1 cargo run --example=error_molid --release
thread 'main' panicked at 'assertion failed: new_molid < old_molid', lumol/src/core/src/sys/systems.rs:342
stack backtrace:
   1:        0x1095141e8 - std::sys::backtrace::tracing::imp::write::h6f1d53a70916b90d
   2:        0x10951637f - std::panicking::default_hook::{{closure}}::h137e876f7d3b5850
   3:        0x109515475 - std::panicking::default_hook::h0ac3811ec7cee78c
   4:        0x109515a26 - std::panicking::rust_panic_with_hook::hc303199e04562edf
   5:        0x1094c04f4 - std::panicking::begin_panic::h341b039f84d0b176
   6:        0x1094c9041 - lumol::sys::systems::System::add_bond::ha262c429a07c9463
   7:        0x1094ca985 - <chemfiles::frame::Frame as lumol::sys::chfl::ToLumol>::to_lumol::h86945c7802c5c377
   8:        0x1094cac6a - lumol::sys::chfl::Trajectory::read_guess_bonds::hb213dbb19c866ab4
   9:        0x1094bedd3 - error_molid::main::h83ddaaca3f875982
  10:        0x10951693a - __rust_maybe_catch_panic
  11:        0x109514fb6 - std::rt::lang_start::h538f8960e7644c80

Add combining rules for intermolecular interaction parameters

It would be very convenient to have combining rules. For example, specify in the input:

[input]
version = 1

[[pairs]]
atoms = ["A", "A"]
lj = {sigma = "... ", epsilon = ..."}

[[pairs]]
atoms = ["B", "B"]
lj = {sigma = "... ", epsilon = ..."}

[combining-rule]
atoms = ["A", "B"]
type = "LB" # LB = Lorentz-Berthelot

As an option, one would not need to specify A,B pair interactions. Instead, a combining rule is applied. Another possibility is to make the combining rule "global", i.e.:

[input]
version = 1

[[pairs]]
atoms = ["A", "A"]
lj = {sigma = "... ", epsilon = ..."}

[[pairs]]
atoms = ["B", "B"]
lj = {sigma = "... ", epsilon = ..."}

# the rule is applied to all intermolec. interactions
[combining-rule]
type = "LB" # LB = Lorentz-Berthelot

Add "smart MC" moves

A possible way to efficiently translate or rotate a molecule is to bias the displacement using the actual force/torque acting on a molecule (see this paper). The magnitude of displacement can be adjusted just like in regular translation/rotation moves while the direction is biased towards the force/torque to which an additional random (gaussian) displacement is added.

Implementation of the translation move needs the following steps:

select a molecule
compute the energy of the molecule with the rest of the system (store distances for all pairs)
compute the total[1] normalized force vector acting on the molecules center of mass
bias the translation using the normalized force, a maximum displacement and a random gaussian displacement vector
compute the energy and force acting on the molecule for the new configuration
accept/reject the move according to an unbiased acceptance criterion

While computations in this move are more involved than in a regular displacement/rotation it offers the possibility to perform multi-particle moves in an efficient manner. In these moves all particles translate/rotate at the same "time". Just like in molecular dynamics, force (and energy) computations can be carried out in parallel leading to an increased efficiency.

[1] it was mentioned in this paper that it is sufficient to approximate the total force by only computing short range contributions. Maybe even with a reduced cut-off distance.

Add restart files

If the simulation crashes, or to restart a simulation for more steps, a restart file is needed. It should be some private binary format, but with compatibility between multiple versions of the code. So a binary dump of the memory is not an option.

Using Chemfiles with a file format containing all the data (NetCDF ? TNG ? Another one ?) could be nice, but chemfiles only provides floats values for now (see chemfiles/chemfiles#34), and double would be better here.

Add custom output algorithm

A custom output where users can specify what they want to be printed to the output stream: temperature, pressure volume, …

Plugin system

I have been thinking about this for a while, here is a request for comment (RFC) to shape it a bit more and gather feedback. The implementation of this will take time and is not high priority, but I really want to get this right.

Motivations

A nice way for the code to be easy to extend would be to provide a plugin system. A plugin is some piece of code that is not part of the main lumol code, and is not compiled at the same time than the main binary, nor linked to it.

Having a plugin system would mean that people can write their own code and use it with lumol without recompiling the whole project. It is highly desirable that this plugin system works easily with the input files too. I'd like the loading/use of a plugin to be as easy as:

[input]
version = 1

[plugins]
foo = "path/to/foo/plugin.so"

[[pairs]]
atoms = ["A", "A"]
foo = {} # potential foo is defined inside the foo plugin

[[pairs]]
atoms = ["A", "B"]
bar = {} # potential bar is also defined inside the foo plugin

Types of plugins

I see three main kinds of plugins for the Lumol use case:

Compiled plugins written in Rust;
Compiled plugins written in C (or any C-exporting language: Fortran, ...);
Interpreted plugins written in Python.

(1) and (2) are fast, (3) is nice for writing quick prototypes. (1) is easier to implement, and (2) allow to call into other codes for potential computations (to use DFT level energy for example). (1) and (2) would be based on shared library and dynamic loading, while (3) will use some kind of future python binding.

Plugins should be usable to extend lumol in all the possibles means: not only writing new potentials, but also new integrators, new Monte-Carlo moves, ... Everything that is defined as a trait for extensibility in the code should be available as a plugin

Proposed implementation strategy

I will focus on Rust plugins for now, as they are way easier to implement, and the C plugins can be built on the top of the Rust plugins mechanisms.

Every plugin would be compiled as a shared library, and expose at least a single function: lumol_register, taking a single argument of type &mut PluginRegistry defined in lumol_input crate. This function returns a Result in case something goes wrong. The PluginRegistry will define a few method to register extension points

extern crate lumol_input;
use lumol_input::{PluginRegistry, PluginError};

struct Foo;
struct Bar;

impl PairPotential for Foo {...}
impl PairPotential for Bar {...}

pub fn lumol_register(reg: &mut PluginRegistry) -> Result<(), PluginError> {
    try!(reg.add_pair::<Foo>("foo"));
    try!(reg.add_pair::<Bar>("bar"));
    Ok(())
}

// add_pair is defined in lumol_input as
impl PluginRegistry {
    pub fn add_pair<T: PairPotential>(&mut self, name: &str) {...}
   // a given potential name can only be given once
}

Reading an input file will follow this new algorithm:

Create a new PluginRegistry;
Register the standard potentials in it;
Parse the input file;
Load the plugins;
Call lumol_register for all the plugins using the PluginRegistry;
Read the input file, and at every section ask the PluginRegistry if there is any potential/integrator/MC move with the given name, and use it.

Extra goodies

This architecture would allow for more modularity. The lumol_core crate could only contains the main structs (UnitCell, System, Simulation, ...) and the traits (Propagator, Potential, ...). Standard implementation would then live in another crate (LennardJones, MolecularDynamics, ...), that is use by lumol_input.

The problem with this is that traits like MCMove would live in a separated crate than the MonteCarlo structure.

Unresolved questions

As rustc does not guarantee a stable ABI for now, calling lumol_register with two different rustc version could break everything. This may be solved by using a C ABI, but I do not know if it supports passing and returning Rust types.

Thank you for reading this, now please comment if you have any feedback or if clarifications are needed!

Cache computed physical properties

Universe properties (Forces, KineticEnergy, ...) can be expensive to compute. It would be nice to cache them.

But there are two difficult things in CS: naming and cache invalidation. Because these properties can be accessed at any point in the simulation, the Universe::step field is a very bad cache indicator. A better one will be harder to compute and use, so I will have to think wether this is worth the gain or not.

Implement NPT Monte-Carlo

This needs a ResizeCell Monte-Carlo move.

Ewald summation for triclinic cells

Triclinic cells are useful for some simulations, but make Ewald sum a bit harder. There is an implementation in the ewald-triclinic branch, but the tests are failing.

Implement classical potentials

More usefull potentials should be added. I know about theses:

Morse potential
Buckingham potential
Cosine series

Are there any others?

Add support for * in interaction input file

Like in

pairs:
   - atoms: [*, *]
     type: Null

Add cut off distance to `resize` MC move to catch too small volumes

If the box gets too small within a NPT simulation, the cut off radius will be larger than half the minimum box length (rc > L_min/2). If that is the case, the simulation should stop with an error.

How to get the largest cut off out of PairInteractions?

Parrallelisation strategy

For now, the code is plain old serial, and does not exploit the possibilities offered by Rust.

We should discuss about how to make this code parallel. I believe the first section of the code we should parallelize is the energy evaluation (#19 idea is to have both parallel and serial energy computation algorithm). I can see multiple area of improvement:

shared memory parallelism (OpenMP world), using something like rayon. This should be relatively easy to do;
Distributed memory parallelism (MPI world). This will need more work, as there is not clear MPI replacement for Rust, and MPI may not be the optimal strategy here.

At the same time, we can already enable some trivial parallelism because the code do not hold any global variable: parallel tempering should be very easy to setup, as should be running the same calculation with a single varying parameter. Here the question is more how can this be added to the input files in an intuitive manner.

Benchmarking Lumol

We need two kind of benchmarks:

regression benchmarks check the speed of a given algorithm;
comparison benchmark check the speed of a given simulation for comparison with other codes.

Benchmarks live in benches, and can be run using a nighlty compiler with cargo bench.

We currently have one regression benchmark for Ewald summation here; an,d one comparison benchmark against LAMMPS for the md simulation of Helium.

Here are some additional ideas:

Regression benchmarks

Energy computation for a Lennard-Jones fluid
Energy computation for a molecular fluid without charges (butane?)
Energy computation for a charged system
Energy computation for a molecular charged system

We already have energy computation for a molecular fluid with charges (water)

Simulation benchmarks

It would be nice to have all the combinations of MD/MC -- Lennard-Jones/butane/water -- NVE/NVT/NPT/μVT. Here is a full list:

That is already 18 different simulations, that we should compare against already existing MD and MC codes.

Maybe we can also have tests for bio-molecules, like a small peptide, DNA strand and a bigger protein.

Please comment with more ideas, and open PR to add the benchmarks!

Add Noose-Hover Thermostat

Does the Berendsen Barostat scale intramolecular configurations?

Since I just dove into position scaling for Monte-Carlo volume resizing, I was wondering if one does the same when using the Berendsen Barostat in MD (i.e. only resize the centers of mass, leaving the intramolecular configuration unchanged). Looking at the integration here, I found that currently all particle positions are scaled.

Implement Grand-Cannonical Monte-Carlo

This is a three-part bullet point issue:

Add cache function to get the cost of adding a molecule;
Add cache function to get the cost of removing a molecule;
Implement the GCMC move

Use std::f64::to_radian/to_degree instead of rolling our own.

We have our own rad2deg and deg2rad functions, but the exact same functions exist in the standard library.

We should replace ours by the ones in the standard library.

Where should we put our particles

Currently, the particles are stored contiguously in a vector, and the molecules have an usize field pointing to the first particle of the molecule in this vector.

struct System {
    particles: Vec<Particle>
    molecules: Vec<Molecule>,
}

struct Molecule {
    first: usize
}

Pro of this design

good cache locality for particles
easy to iterate over all particles in the system. A double loop over pairs is:

for (i, pi) in system.iter().enumerate() {
    for pj in system.iter().skip(i + 1) {
        // Do work
    }
}

Cons of this design

hard to iterate over particles in a molecule, and corresponding double loops are confusing.

for molecule in system.molecules() {
    for i in &molecule {
        for j in &molecule {
            if j <= i {continue;}
            let pi = system[i];
            let pj = system[j];
            // Do work
        }
    }
}

Making the particles owned by molecules is not a solution, because it make it very hard to iterate over pairs in the whole system, and add an extra indirection layer.

A possible solution, with a POC implementation here would be to store a pointer to the first atom in the molecule, together with the molecule size. This would make it easier to work with molecular code, but add some unsafe usage.

In particular, the pointer of all molecules must be updated when adding a new particle to the system, because the vector may allocate additional memory.

So is it worth it to add complexity in the library code in order to simplify the user code?

Get properties for a subgroup of atoms

Users should be able to get some property for a subgroup of particles: a molecule, all waters, all oxygens, ...

EDIT: removed code example, it made no sense.

Versioning for input files

All the YAML files should have a version key for indicating the version of the input format. This will prevent strange bugs and incompatibility when updating input format.

version: 1.0

pairs:
      - atoms: [O, O] 
        ...

lumol-org / lumol Goto Github PK

lumol's Introduction

Lumol molecular simulation engine

Features

Getting started

Documentation

Installation as a command line tool

Usage as a library

Contributing

License

lumol's People

Contributors

Stargazers

Watchers

Forkers

lumol's Issues

Requirements

Actual syntax

Interactions input

Simulation and systems input

Unresolved questions

Motivations

Types of plugins

Proposed implementation strategy

Extra goodies

Unresolved questions

Regression benchmarks

Simulation benchmarks

Recommend Projects

Recommend Topics

Recommend Org

Jobs