Open-Catalyst-Project Models

Implements the following baselines that take arbitrary chemical structures as input to predict material properties:

Installation

[last updated September 1, 2020]

The easiest way to install the prerequisites is with conda. After installing conda, follow the steps below to create a new environment named ocp-models and install the dependencies:

Pre-install step

Install conda-merge:

pip install conda-merge

If you're using system pip, then you may want to add the --user flag to avoid using sudo. Check that you can invoke conda-merge by running conda-merge -h.
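The same check can be scripted before moving on. The following is a minimal sketch, not part of the repository: the helper name find_missing_tools is hypothetical, and it relies only on the standard library's shutil.which to test whether each command is on PATH.

```python
import shutil


def find_missing_tools(tools):
    """Return the command names from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # conda and conda-merge are the commands the steps below rely on.
    missing = find_missing_tools(["conda", "conda-merge"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools are available.")
```

An empty result means both commands can be invoked from the current shell.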

GPU machines

These instructions target PyTorch 1.6 with CUDA 10.1 specifically.

First, check that CUDA is in your PATH and LD_LIBRARY_PATH, e.g.

$ echo $PATH | tr ':' '\n' | grep cuda
/public/apps/cuda/10.1/bin
$ echo $LD_LIBRARY_PATH | tr ':' '\n' | grep cuda
/public/apps/cuda/10.1/lib64
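If a Python check is more convenient than the shell pipeline, the sketch below mirrors it: split the variable on ':' and keep the entries containing "cuda" (case-sensitive, like grep). The function name cuda_entries and its optional environ argument are illustrative assumptions, not part of this repository.

```python
import os


def cuda_entries(var_name, environ=None):
    """Return the ':'-separated entries of an environment variable that contain 'cuda'."""
    environ = os.environ if environ is None else environ
    value = environ.get(var_name, "")
    # Same logic as: echo $VAR | tr ':' '\n' | grep cuda
    return [entry for entry in value.split(":") if "cuda" in entry]


if __name__ == "__main__":
    for name in ("PATH", "LD_LIBRARY_PATH"):
        entries = cuda_entries(name)
        print(name, "->", entries if entries else "no CUDA entry found")
```

An empty list for either variable suggests that CUDA is not yet set up for the current shell.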

The exact paths may differ on your system. Then install the dependencies:

conda-merge env.common.yml env.gpu.yml > env.yml
conda env create -f env.yml

Activate the conda environment with conda activate ocp-models, then install this package in editable mode with pip install -e . (the trailing dot refers to the current directory). Finally, install the pre-commit hooks:

pre-commit install

CPU-only machines

Skip this section if you completed the GPU installation above.

conda-merge env.common.yml env.cpu.yml > env.yml
conda env create -f env.yml
conda activate ocp-models
pip install -e .
pre-commit install
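After either installation path, a quick sanity check of the PyTorch build can save debugging time later. This is a hedged sketch rather than a script shipped with the repository: describe_pytorch is a hypothetical helper, and it only reads torch.__version__, torch.version.cuda, and torch.cuda.is_available().

```python
def describe_pytorch():
    """Summarize the installed PyTorch build, or report that it is missing."""
    try:
        import torch
    except ImportError:
        return {"installed": False}
    return {
        "installed": True,
        "version": torch.__version__,  # these instructions target 1.6
        "cuda_build": torch.version.cuda,  # None on CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_pytorch().items():
        print(f"{key}: {value}")
```

On a GPU machine set up as described above, cuda_available would be expected to be True. On a CPU-only install, cuda_build is None and cuda_available is False.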

Usage

Download the datasets

For now, we are working with the following datasets:

  • ulissigroup_co: dataset of DFT results for CO adsorption on various slabs (shared by Junwoong Yoon) already in pytorch-geometric format.
  • gasdb: tiny dataset of DFT results for CO, H, N, O, and OH adsorption on various slabs (shared by Kevin Tran) in raw ase format.

To download the datasets:

cd data
./download_data.sh
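To confirm that a download produced the directory you expect, a small check like the one below may help. It is only a sketch: missing_dirs is a hypothetical helper, and the path data/data/gasdb is taken from the training example in this README, on the assumption that the script is run from the repository root.

```python
from pathlib import Path


def missing_dirs(paths):
    """Return the entries of `paths` that are not existing directories."""
    return [str(p) for p in paths if not Path(p).is_dir()]


if __name__ == "__main__":
    # Assumed location, copied from the "src" entry of the training example.
    expected = ["data/data/gasdb"]
    missing = missing_dirs(expected)
    print("Missing:", missing if missing else "none")
```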

Train models to predict energies from structures

To quickly get started with training a CGCNN model on the gasdb dataset with reasonable defaults, take a look at scripts/train_example.py (reproduced below):

from ocpmodels.trainers import SimpleTrainer

task = {
    "dataset": "gasdb",
    "description": "Binding energy regression on a dataset of DFT results for CO, H, N, O, and OH adsorption on various slabs.",
    "labels": ["binding energy"],
    "metric": "mae",
    "type": "regression",
}

model = {
    "name": "cgcnn",
    "atom_embedding_size": 64,
    "fc_feat_size": 128,
    "num_fc_layers": 4,
    "num_graph_conv_layers": 6,
}

dataset = {
    "src": "data/data/gasdb",
    "train_size": 800,
    "val_size": 100,
    "test_size": 100,
}

optimizer = {
    "batch_size": 10,
    "lr_gamma": 0.1,
    "lr_initial": 0.001,
    "lr_milestones": [100, 150],
    "max_epochs": 50,
    "warmup_epochs": 10,
    "warmup_factor": 0.2,
}

trainer = SimpleTrainer(
    task=task,
    model=model,
    dataset=dataset,
    optimizer=optimizer,
    identifier="my-first-experiment",
)

trainer.train()

predictions = trainer.predict("data/data/gasdb")
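The optimizer settings above combine a warmup phase with step decay. How the trainer interprets them is not spelled out here, so the following is a sketch under stated assumptions: milestones are counted in epochs, the learning-rate multiplier ramps linearly from warmup_factor to 1.0 over warmup_epochs, and it is then multiplied by lr_gamma once per milestone passed. The function lr_at_epoch is hypothetical and is not the trainer's API.

```python
from bisect import bisect

optimizer = {
    "lr_gamma": 0.1,
    "lr_initial": 0.001,
    "lr_milestones": [100, 150],
    "max_epochs": 50,
    "warmup_epochs": 10,
    "warmup_factor": 0.2,
}


def lr_at_epoch(epoch, cfg):
    """Learning rate at `epoch` under the ASSUMED warmup + step-decay semantics."""
    warmup = cfg["warmup_epochs"]
    if warmup > 0 and epoch <= warmup:
        # Linear ramp of the multiplier from warmup_factor (epoch 0) to 1.0.
        alpha = epoch / float(warmup)
        factor = cfg["warmup_factor"] * (1.0 - alpha) + alpha
    else:
        # Apply lr_gamma once for every milestone already reached.
        factor = cfg["lr_gamma"] ** bisect(cfg["lr_milestones"], epoch)
    return cfg["lr_initial"] * factor


if __name__ == "__main__":
    for epoch in (0, 5, 10, 49):
        print(epoch, lr_at_epoch(epoch, optimizer))
```

Under this reading, the rate starts at 0.0002 and reaches 0.001 at epoch 10. Because max_epochs (50) is below the first milestone (100), the decay step would never trigger in this example.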

For more advanced usage, and to dig deeper into the default parameters, take a look at BaseTrainer. For example, to train a CGCNN model with BaseTrainer on the ulissigroup_co CO adsorption data to predict binding energy (with default parameters):

python main.py --identifier my-first-experiment --config-yml configs/ulissigroup_co/cgcnn.yml

See configs/ulissigroup_co/base.yml and configs/ulissigroup_co/cgcnn.yml for dataset, model and optimizer parameters.

Acknowledgements

License

TBD

Contributors

abhshkdz, mshuaibii, anuroopsriram, txie-93, junwoony, ktran9891, wood-b, nianhant, calebho, weihua916, clz55
