materialsvirtuallab / matgl
Graph deep learning library for materials
License: BSD 3-Clause "New" or "Revised" License
Each pre-trained model directory should contain a README with key information about the models. I have written an outline for the M3GNet PES. Particularly important are the training datasets and the actual performance metrics (MAE in energies, forces, etc.). It would also be helpful if an actual script were provided to demonstrate the training protocol.
Please complete it and add similar files for the other models.
If I install pre-commit hooks and then try to commit, mypy aborts with 19 pre-existing errors:
- hook id: mypy
- exit code: 1
examples/trainer_beta/train.py:15: error: Module "matgl.utils" has no attribute "utils" [attr-defined]
examples/trainer_beta/train.py:43: error: "Tuple[Any, ...]" has no attribute "z_mean" [attr-defined]
examples/trainer_beta/train.py:43: error: "Tuple[Any, ...]" has no attribute "num_bond_mean" [attr-defined]
examples/trainer_beta/train.py:47: error: "Tuple[Any, ...]" has no attribute "mean" [attr-defined]
examples/trainer_beta/train.py:47: error: "Tuple[Any, ...]" has no attribute "std" [attr-defined]
examples/trainer_beta/train.py:56: error: "int" has no attribute "cpu" [attr-defined]
examples/trainer_beta/train.py:66: error: Function "collections.namedtuple" is not valid as a type [valid-type]
examples/trainer_beta/train.py:66: note: Perhaps you need "Callable[...]" or a callback protocol?
examples/trainer_beta/train.py:67: error: Function "collections.namedtuple" is not valid as a type [valid-type]
examples/trainer_beta/train.py:67: note: Perhaps you need "Callable[...]" or a callback protocol?
examples/trainer_beta/train.py:74: error: namedtuple? has no attribute "__iter__" (not iterable) [attr-defined]
examples/trainer_beta/train.py:80: error: namedtuple? has no attribute "z_mean" [attr-defined]
examples/trainer_beta/train.py:80: error: namedtuple? has no attribute "num_bond_mean" [attr-defined]
examples/trainer_beta/train.py:84: error: namedtuple? has no attribute "mean" [attr-defined]
examples/trainer_beta/train.py:84: error: namedtuple? has no attribute "std" [attr-defined]
examples/trainer_beta/train.py:90: error: "int" has no attribute "cpu" [attr-defined]
examples/trainer_beta/train.py:99: error: Function "collections.namedtuple" is not valid as a type [valid-type]
examples/trainer_beta/train.py:99: note: Perhaps you need "Callable[...]" or a callback protocol?
examples/trainer_beta/train.py:101: error: namedtuple? has no attribute "train" [attr-defined]
examples/trainer_beta/train.py:105: error: namedtuple? has no attribute "z_mean" [attr-defined]
examples/trainer_beta/train.py:105: error: namedtuple? has no attribute "num_bond_mean" [attr-defined]
examples/trainer_beta/train.py:181: error: Argument 1 to "run" has incompatible type "Namespace"; expected "ArgumentParser" [arg-type]
Found 19 errors in 1 file (checked 1 source file)
It would be a nicer developer experience if the linters passed out of the box.
v0.8.5 and v0.7.1
First off, thank you all so far for your help with getting me started on this. I think at this point my issue has become more appropriate for a bug fix.
I am trying to train a model using multiple GPUs and opened the discussion in #188.
I first had an issue using MatGL 0.8.5 and PyTorch 2.1.1 (for CUDA 11.8), where I encountered the same error as in #149:
Traceback (most recent call last):
File "/home/u2019/work/ml/GPU-test/train.py", line 45, in <module>
dataset = M3GNetDataset(
File "/home/u2019/miniconda3/envs/py310/lib/python3.10/site-packages/matgl/graph/data.py", line 255, in __init__
super().__init__(name=name)
File "/home/u2019/miniconda3/envs/py310/lib/python3.10/site-packages/dgl/data/dgl_dataset.py", line 112, in __init__
self._load()
File "/home/u2019/miniconda3/envs/py310/lib/python3.10/site-packages/dgl/data/dgl_dataset.py", line 203, in _load
self.process()
File "/home/u2019/miniconda3/envs/py310/lib/python3.10/site-packages/matgl/graph/data.py", line 278, in process
line_graph = create_line_graph(graph, self.threebody_cutoff) # type: ignore
File "/home/u2019/miniconda3/envs/py310/lib/python3.10/site-packages/matgl/graph/compute.py", line 146, in create_line_graph
l_g, triple_bond_indices, n_triple_ij, n_triple_i, n_triple_s = compute_3body(graph_with_three_body)
File "/home/u2019/miniconda3/envs/py310/lib/python3.10/site-packages/matgl/graph/compute.py", line 24, in compute_3body
first_col = g.edges()[0].numpy().reshape(-1, 1)
File "/home/u2019/miniconda3/envs/py310/lib/python3.10/site-packages/torch/utils/_device.py", line 62, in __torch_function__
return func(*args, **kwargs)
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
This error happens when the actual dataset is being created using M3GNetDataset. It was implemented in this commit: e825b75
Does this step even need to run on a GPU? Since the graph construction is hard-coded to run on the CPU, it seems that any torch.set_default_device('cuda') call should come after creating the dataset itself. However, looking through issue #94, Prof. Ong states that torch.device('cuda') must come first.
Nevertheless, it was suggested that I downgrade PyTorch and its dependencies to 2.0.1 (I also downgraded matgl to 0.7.1), and following the training example given by SmallBearC, I was able to get a single GPU to run.
However, when attempting to set up multi-GPU use, I got an error in the MGLDataLoader:
Traceback (most recent call last):
File "/home/myless/Potential_Training/V-Cr-Ti/Test_1/train.py", line 77, in <module>
train_loader, val_loader, test_loader = MGLDataLoader(
File "/home/myless/.mambaforge/envs/matgl-gpu/lib/python3.10/site-packages/matgl/graph/data.py", line 78, in MGLDataLoader
train_loader = GraphDataLoader(
File "/home/myless/.mambaforge/envs/matgl-gpu/lib/python3.10/site-packages/dgl/dataloading/dataloader.py", line 1451, in __init__
self.dist_sampler = _create_dist_sampler(
File "/home/myless/.mambaforge/envs/matgl-gpu/lib/python3.10/site-packages/dgl/dataloading/dataloader.py", line 1281, in _create_dist_sampler
return DistributedSampler(dataset, **dist_sampler_kwargs)
File "/home/myless/.mambaforge/envs/matgl-gpu/lib/python3.10/site-packages/torch/utils/data/distributed.py", line 68, in __init__
num_replicas = dist.get_world_size()
File "/home/myless/.mambaforge/envs/matgl-gpu/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 1196, in get_world_size
return _get_group_size(group)
File "/home/myless/.mambaforge/envs/matgl-gpu/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 576, in _get_group_size
default_pg = _get_default_group()
File "/home/myless/.mambaforge/envs/matgl-gpu/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 707, in _get_default_group
raise RuntimeError(
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
A maintainer suggested changing torch/utils/data/distributed.py from:
def __iter__(self) -> Iterator[T_co]:
    if self.shuffle:
        # deterministically shuffle based on epoch and seed
        g = torch.Generator()
        g.manual_seed(self.seed + self.epoch)
        indices = torch.randperm(len(self.dataset), generator=g).tolist()  # type: ignore[arg-type]
    else:
        indices = list(range(len(self.dataset)))  # type: ignore[arg-type]
To:
def __iter__(self) -> Iterator[T_co]:
    if self.shuffle:
        # deterministically shuffle based on epoch and seed
        # workaround added to make matgl work
        if torch.cuda.is_available():
            device = "cuda"
        else:
            device = "cpu"
        g = torch.Generator(device=device)
        g.manual_seed(self.seed + self.epoch)
        indices = torch.randperm(len(self.dataset), generator=g).tolist()  # type: ignore[arg-type]
    else:
        indices = list(range(len(self.dataset)))  # type: ignore[arg-type]
but the same error occurred. I did some digging and found that the error is raised when the GraphDataLoader tries to use DDP for the training data. It is likely an issue with how the model is being split and sent to the GPUs.
There seem to be two bugs or issues here. In version 0.8.5, I believe there is an issue with _compute_3body() in matgl/src/matgl/graph/compute.py:
n_atoms = g.num_nodes()
first_col = g.edges()[0].cpu().numpy().reshape(-1, 1)
all_indices = np.arange(n_atoms).reshape(1, -1)
Could this perhaps be fixed by moving things onto the GPU only after the dataset has been created?
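A minimal sketch of the device-safe conversion being suggested (the helper name is hypothetical; matgl's actual patch may differ):

import numpy as np
import torch

def edges_first_col(edges_src: torch.Tensor) -> np.ndarray:
    """Convert a DGL graph's edge source tensor to a NumPy column vector,
    copying it to host memory first if it lives on a GPU."""
    return edges_src.detach().cpu().numpy().reshape(-1, 1)

# works whether the tensor lives on the CPU or on cuda:0
src = torch.tensor([0, 1, 2], device="cuda" if torch.cuda.is_available() else "cpu")
first_col = edges_first_col(src)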
The bug with 0.7.1 seems to be related to the DDP strategy for loading the dataset onto multiple GPUs for training. This seems more difficult for the matgl team to fix, and since it is happening with an older package version, perhaps the bug in 0.8.5 is the better one to focus on?
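For reference, the RuntimeError above is raised because GraphDataLoader with use_ddp=True builds a DistributedSampler, which requires an existing default process group. A minimal sketch of initializing one by hand (assumptions: a single node and a single process; under a proper DDP launcher, or inside Lightning's ddp strategy, this is normally done for you):

import os
import torch
import torch.distributed as dist

if not dist.is_initialized():
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    backend = "nccl" if torch.cuda.is_available() else "gloo"
    dist.init_process_group(backend=backend, rank=0, world_size=1)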
I have pasted the code that I was using on matgl==0.7.1 and the output from slurm.
Thank you,
Myles
from __future__ import annotations
import os, json
import shutil
import warnings
import numpy as np
import pytorch_lightning as pl
from dgl.data.utils import split_dataset
from pytorch_lightning.loggers import CSVLogger
from pymatgen.io.vasp.outputs import Vasprun
import matgl
from matgl.ext.pymatgen import Structure2Graph, get_element_list
from matgl.graph.data import M3GNetDataset, MGLDataLoader, collate_fn_efs
from matgl.models import M3GNet
from matgl.utils.training import PotentialLightningModule
# To suppress warnings for clearer output
warnings.simplefilter("ignore")
import torch
torch.set_default_device('cuda')
AVAIL_GPUS = torch.cuda.device_count()
folder_path = './test_xml_data'
#folder_path = './data'
xml_files = [f for f in os.listdir(folder_path) if f.endswith(".xml")]
#initialize empty arrays
structures = []
energies = []
forces = []
stresses = []
errors = []
for xml_file in xml_files:
    xml_file_path = os.path.join(folder_path, xml_file)
    try:
        vrun = Vasprun(xml_file_path)
        # print(f"File: {xml_file} loaded")
        for i in range(len(vrun.ionic_steps)):
            structures.append(vrun.ionic_steps[i]['structure'])
            energies.append(vrun.ionic_steps[i]['e_fr_energy'])
            forces.append(vrun.ionic_steps[i]['forces'])
            stresses.append(vrun.ionic_steps[i]['stress'])
    except Exception as e:
        error_message = f"Error parsing {xml_file}: {str(e)}"
        errors.append(error_message)
        print(error_message)
with open('./bad_vasprun.txt', 'w') as file:
    for error_message in errors:
        file.write(f"{error_message}\n")
labels = {
    "energies": energies,
    "forces": forces,
    "stresses": stresses,
}
print(f"{len(structures)} downloaded from MP.")
#formatted_data = json.dumps(labels, indent=4)
#with open("labels.json","w") as json_file:
#json_file.write(formatted_data)
element_types = get_element_list(structures)
converter = Structure2Graph(element_types=element_types, cutoff=5.0)
dataset = M3GNetDataset(
    threebody_cutoff=4.0,
    structures=structures,
    converter=converter,
    energies=energies,
    forces=forces,
    stresses=stresses,  # changed when downgrading to 0.7.1
    # labels=labels,
)
train_data, val_data, test_data = split_dataset(
    dataset,
    frac_list=[0.8, 0.1, 0.1],
    shuffle=True,
    random_state=42,
)
train_loader, val_loader, test_loader = MGLDataLoader(
    train_data=train_data,
    val_data=val_data,
    test_data=test_data,
    collate_fn=collate_fn_efs,
    batch_size=16,
    num_workers=0,
    use_ddp=True,
    pin_memory=True,
    generator=torch.Generator("cuda"),
)
model = M3GNet(
    element_types=element_types,
    is_intensive=False,
    use_smooth=True,
)
lr = 1e-4
lit_module = PotentialLightningModule(model=model,lr=lr)
# If you wish to disable GPU or MPS (M1 mac) training, use the accelerator="cpu" kwarg.
logger = CSVLogger("logs", name="M3GNet_training")
# Inference mode = False is required for calculating forces, stress in test mode and prediction mode
trainer = pl.Trainer(max_epochs=1, accelerator="cuda", devices=4, strategy="ddp", logger=logger, inference_mode=False)
#trainer = pl.Trainer(max_epochs = 1, accelerator='auto', logger=logger, inference_mode=False)
trainer.fit(model=lit_module, train_dataloaders=train_loader, val_dataloaders=val_loader)
trainer.test(dataloaders=test_loader)
model_export_path = './trained_model/mgl.m3g_out'
model.save(model_export_path)
model = matgl.load_model(path = model_export_path)
Traceback (most recent call last):
File "/home/myless/Potential_Training/V-Cr-Ti/Test_2/old_train.py", line 85, in <module>
train_loader, val_loader, test_loader = MGLDataLoader(
File "/home/myless/.mambaforge/envs/matgl-gpu/lib/python3.10/site-packages/matgl/graph/data.py", line 78, in MGLDataLoader
train_loader = GraphDataLoader(
File "/home/myless/.mambaforge/envs/matgl-gpu/lib/python3.10/site-packages/dgl/dataloading/dataloader.py", line 1451, in __init__
self.dist_sampler = _create_dist_sampler(
File "/home/myless/.mambaforge/envs/matgl-gpu/lib/python3.10/site-packages/dgl/dataloading/dataloader.py", line 1281, in _create_dist_sampler
return DistributedSampler(dataset, **dist_sampler_kwargs)
File "/home/myless/.mambaforge/envs/matgl-gpu/lib/python3.10/site-packages/torch/utils/data/distributed.py", line 68, in __init__
num_replicas = dist.get_world_size()
File "/home/myless/.mambaforge/envs/matgl-gpu/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 1196, in get_world_size
return _get_group_size(group)
File "/home/myless/.mambaforge/envs/matgl-gpu/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 576, in _get_group_size
default_pg = _get_default_group()
File "/home/myless/.mambaforge/envs/matgl-gpu/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 707, in _get_default_group
raise RuntimeError(
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
(each DDP task prints the same traceback)
srun: error: gpu-rtx6000-02: task 2: Exited with exit code 1
srun: error: gpu-rtx6000-02: tasks 0-1: Exited with exit code 1
Hi, does this codebase include anything for doing material generation, as in the latest paper?
Thanks!
Dear everyone,
First of all, thank you for taking the time to convert m3gnet to PyTorch. I was wondering if there are any aspirations to build a PyTorch Lightning module for m3gnet, which would trivialize multi-GPU/multi-node and mixed-precision training?
best wishes,
Jonathan
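For what it's worth, matgl does ship Lightning wrappers (PotentialLightningModule and ModelLightningModule appear elsewhere on this page). A minimal sketch of the multi-GPU, mixed-precision setup being asked about (flag spellings assume pytorch_lightning 2.x; the element list here is only a placeholder):

import pytorch_lightning as pl
from matgl.models import M3GNet
from matgl.utils.training import PotentialLightningModule

model = M3GNet(element_types=("V", "Cr", "Ti"), is_intensive=False)  # placeholder elements
lit_module = PotentialLightningModule(model=model, lr=1e-4)
trainer = pl.Trainer(
    max_epochs=10,
    accelerator="gpu",
    devices=2,              # multi-GPU on one node
    strategy="ddp",         # the same strategy scales to multi-node with a launcher
    precision="16-mixed",   # mixed-precision training
)
# trainer.fit(lit_module, train_dataloaders=train_loader, val_dataloaders=val_loader)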
Hi,
Thanks for your great work! I see you are using seed 42 to split the MPF 2021 dataset. May I ask if this is the same split used in the paper [1]? I couldn't find this information in the paper, so I'm asking just in case.
Thank you and looking forward to your reply!
Best Regards
[1] Chen, Chi, and Shyue Ping Ong. "A universal graph deep learning interatomic potential for the periodic table." Nature Computational Science 2.11 (2022): 718-728.
Hi,
I am looking to extract the gradient with respect to atom positions from the band gap prediction model "MEGNet-MP-2019.4.1-BandGap-mfi".
By unpacking the wrapper around the MEGNet.predict_structure function, I can extract the gradients with respect to the edge attributes. Is there a way to convert this information into a gradient with respect to the atom positions in the structure (i.e. g.ndata["pos"] in the code below)?
Below I provide the unpacked method used to access the gradient within the predict_structure() function.
I can do the same within the forward method, but I am unclear which representation of the graph will provide me with the information I require for translation back to the structure.
Thank you for your help.
Code to reproduce:
import matgl
import torch
from mp_api.client import MPRester
from pymatgen.io.ase import AseAtomsAdaptor
from matgl.ext.pymatgen import Structure2Graph
from matgl.graph.compute import compute_pair_vector_and_distance
if __name__ == '__main__':
    # loading arbitrary structure from Materials Project
    with MPRester() as mpr:
        structure = mpr.get_structure_by_material_id("mp-1840", final=True)
    atoms_2 = AseAtomsAdaptor.get_atoms(structure)
    # selecting a band gap method
    state_feats = torch.tensor([3])
    band_gap_model_wrapper = matgl.load_model("MEGNet-MP-2019.4.1-BandGap-mfi")
    # unpacking forward method of MEGNet.predict_structure to access gradient information
    graph_converter = Structure2Graph(
        element_types=band_gap_model_wrapper.model.element_types,
        cutoff=band_gap_model_wrapper.model.cutoff,
    )
    g, state_feats_default = graph_converter.get_graph(structure)
    if state_feats is None:
        state_feats = torch.tensor(state_feats_default)
    bond_vec, bond_dist = compute_pair_vector_and_distance(g)
    g.edata["edge_attr"] = band_gap_model_wrapper.model.bond_expansion(bond_dist)
    # adding requires_grad to the edges to access gradients
    g.edata["edge_attr"].requires_grad_()
    model_output = band_gap_model_wrapper.model(g, g.edata["edge_attr"], g.ndata["node_type"], state_feats)
    gradient_wrt_edge_attr = torch.autograd.grad(
        model_output, g.edata["edge_attr"], create_graph=True, retain_graph=True,
    )
    model_output = model_output.detach()
    model_output_converted = band_gap_model_wrapper.transformer.inverse_transform(model_output)
    # asserting that the unpacked method gives the same answer as the provided wrapper method
    band_gap_value = band_gap_model_wrapper.predict_structure(
        structure=structure,
        state_feats=state_feats,
    )
    assert model_output_converted == band_gap_value
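One possible route to the position gradient (my assumption, mirroring how matgl's Potential.forward obtains forces by differentiating with respect to g.ndata["pos"]): mark the positions as requiring gradients before the bond vectors are computed, so autograd can reach them through compute_pair_vector_and_distance:

# hypothetical variation of the snippet above, differentiating w.r.t. positions
g.ndata["pos"].requires_grad_()  # must happen before building bond vectors
bond_vec, bond_dist = compute_pair_vector_and_distance(g)  # now depends on pos
g.edata["edge_attr"] = band_gap_model_wrapper.model.bond_expansion(bond_dist)
model_output = band_gap_model_wrapper.model(
    g, g.edata["edge_attr"], g.ndata["node_type"], state_feats
)
grad_wrt_pos = torch.autograd.grad(model_output, g.ndata["pos"], retain_graph=True)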
I don't see a consistent choice of primitive datatypes. My guess is that the default choice is intended to be torch.float32 and torch.int32. But different datatypes appear throughout the code, and in particular having some tensors be float64 leads to graph data taking an unwieldy amount of storage space.
For example, when working with atom graphs I find the following datatypes in their edge data:
bond_vec torch.float32
lattice torch.float64
pbc_offshift torch.float64
bond_dist torch.float32
pbc_offset torch.float64
And node data:
volume torch.float32
pos torch.float64
node_type torch.int64
I think primitive datatypes should be kept consistent.
Additionally, I was wondering if there are plans to implement a way to easily allow a global configuration of datatypes (similar to the config file in the original m3gnet code)?
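A minimal sketch of what a global default could look like with plain PyTorch (an assumption on my part; the matgl version discussed here exposes no such switch):

import torch

# newly created floating-point tensors become float32 process-wide
torch.set_default_dtype(torch.float32)

# existing graph data still has to be downcast explicitly, e.g.:
# g.ndata["pos"] = g.ndata["pos"].float()
# g.edata["pbc_offshift"] = g.edata["pbc_offshift"].float()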
This fails because the class MEGNetTrainer is defined in examples/trainer_beta/megnet.py, which is not included in the matgl package namespace. The class should be moved into the matgl directory.
I'm interested in setting up a PyTorch M3GNet model with the exact same architecture as MP-2021.
I tried potential = Potential(M3GNet(DEFAULT_ELEMENT_TYPES, is_intensive=False)). I then compared the parameters with the weights from the TF model: potential = Potential(model=M3GNet.load('MP-2021.2.8-EFS')). However, I obtain a different number of weights/parameters, and their shapes also do not match exactly.
>>> potential = Potential(model=M3GNet.load('MP-2021.2.8-EFS'))
>>> for weight in potential.weights:
...     print(weight.name, '|', weight.shape[::-1])
m3g_net/graph_featurizer/atom_embedding/atom_embedding/embeddings:0 | (64, 95)
m3g_net/graph_update_func/mlp/dense/kernel:0 | (64, 3)
m3g_net/three_d_interaction/mlp_1/dense_1/kernel:0 | (9, 64)
m3g_net/three_d_interaction/mlp_1/dense_1/bias:0 | (9,)
m3g_net/three_d_interaction/gated_mlp/dense_2/kernel:0 | (64, 9)
m3g_net/three_d_interaction/gated_mlp/dense_3/kernel:0 | (64, 9)
m3g_net/three_d_interaction_1/mlp_2/dense_4/kernel:0 | (9, 64)
m3g_net/three_d_interaction_1/mlp_2/dense_4/bias:0 | (9,)
m3g_net/three_d_interaction_1/gated_mlp_1/dense_5/kernel:0 | (64, 9)
m3g_net/three_d_interaction_1/gated_mlp_1/dense_6/kernel:0 | (64, 9)
m3g_net/three_d_interaction_2/mlp_3/dense_7/kernel:0 | (9, 64)
m3g_net/three_d_interaction_2/mlp_3/dense_7/bias:0 | (9,)
m3g_net/three_d_interaction_2/gated_mlp_2/dense_8/kernel:0 | (64, 9)
m3g_net/three_d_interaction_2/gated_mlp_2/dense_9/kernel:0 | (64, 9)
m3g_net/graph_network_layer/concat_atoms/gated_mlp_4/dense_15/kernel:0 | (64, 192)
m3g_net/graph_network_layer/concat_atoms/gated_mlp_4/dense_15/bias:0 | (64,)
m3g_net/graph_network_layer/concat_atoms/gated_mlp_4/dense_16/kernel:0 | (64, 64)
m3g_net/graph_network_layer/concat_atoms/gated_mlp_4/dense_16/bias:0 | (64,)
m3g_net/graph_network_layer/concat_atoms/gated_mlp_4/dense_17/kernel:0 | (64, 192)
m3g_net/graph_network_layer/concat_atoms/gated_mlp_4/dense_17/bias:0 | (64,)
m3g_net/graph_network_layer/concat_atoms/gated_mlp_4/dense_18/kernel:0 | (64, 64)
m3g_net/graph_network_layer/concat_atoms/gated_mlp_4/dense_18/bias:0 | (64,)
m3g_net/graph_network_layer/concat_atoms/dense_19/kernel:0 | (64, 3)
m3g_net/graph_network_layer/gated_atom_update/gated_mlp_3/dense_10/kernel:0 | (64, 192)
m3g_net/graph_network_layer/gated_atom_update/gated_mlp_3/dense_10/bias:0 | (64,)
m3g_net/graph_network_layer/gated_atom_update/gated_mlp_3/dense_11/kernel:0 | (64, 64)
m3g_net/graph_network_layer/gated_atom_update/gated_mlp_3/dense_11/bias:0 | (64,)
m3g_net/graph_network_layer/gated_atom_update/gated_mlp_3/dense_12/kernel:0 | (64, 192)
m3g_net/graph_network_layer/gated_atom_update/gated_mlp_3/dense_12/bias:0 | (64,)
m3g_net/graph_network_layer/gated_atom_update/gated_mlp_3/dense_13/kernel:0 | (64, 64)
m3g_net/graph_network_layer/gated_atom_update/gated_mlp_3/dense_13/bias:0 | (64,)
m3g_net/graph_network_layer/gated_atom_update/dense_14/kernel:0 | (64, 3)
m3g_net/graph_network_layer_1/concat_atoms_1/gated_mlp_6/dense_25/kernel:0 | (64, 192)
m3g_net/graph_network_layer_1/concat_atoms_1/gated_mlp_6/dense_25/bias:0 | (64,)
m3g_net/graph_network_layer_1/concat_atoms_1/gated_mlp_6/dense_26/kernel:0 | (64, 64)
m3g_net/graph_network_layer_1/concat_atoms_1/gated_mlp_6/dense_26/bias:0 | (64,)
m3g_net/graph_network_layer_1/concat_atoms_1/gated_mlp_6/dense_27/kernel:0 | (64, 192)
m3g_net/graph_network_layer_1/concat_atoms_1/gated_mlp_6/dense_27/bias:0 | (64,)
m3g_net/graph_network_layer_1/concat_atoms_1/gated_mlp_6/dense_28/kernel:0 | (64, 64)
m3g_net/graph_network_layer_1/concat_atoms_1/gated_mlp_6/dense_28/bias:0 | (64,)
m3g_net/graph_network_layer_1/concat_atoms_1/dense_29/kernel:0 | (64, 3)
m3g_net/graph_network_layer_1/gated_atom_update_1/gated_mlp_5/dense_20/kernel:0 | (64, 192)
m3g_net/graph_network_layer_1/gated_atom_update_1/gated_mlp_5/dense_20/bias:0 | (64,)
m3g_net/graph_network_layer_1/gated_atom_update_1/gated_mlp_5/dense_21/kernel:0 | (64, 64)
m3g_net/graph_network_layer_1/gated_atom_update_1/gated_mlp_5/dense_21/bias:0 | (64,)
m3g_net/graph_network_layer_1/gated_atom_update_1/gated_mlp_5/dense_22/kernel:0 | (64, 192)
m3g_net/graph_network_layer_1/gated_atom_update_1/gated_mlp_5/dense_22/bias:0 | (64,)
m3g_net/graph_network_layer_1/gated_atom_update_1/gated_mlp_5/dense_23/kernel:0 | (64, 64)
m3g_net/graph_network_layer_1/gated_atom_update_1/gated_mlp_5/dense_23/bias:0 | (64,)
m3g_net/graph_network_layer_1/gated_atom_update_1/dense_24/kernel:0 | (64, 3)
m3g_net/graph_network_layer_2/concat_atoms_2/gated_mlp_8/dense_35/kernel:0 | (64, 192)
m3g_net/graph_network_layer_2/concat_atoms_2/gated_mlp_8/dense_35/bias:0 | (64,)
m3g_net/graph_network_layer_2/concat_atoms_2/gated_mlp_8/dense_36/kernel:0 | (64, 64)
m3g_net/graph_network_layer_2/concat_atoms_2/gated_mlp_8/dense_36/bias:0 | (64,)
m3g_net/graph_network_layer_2/concat_atoms_2/gated_mlp_8/dense_37/kernel:0 | (64, 192)
m3g_net/graph_network_layer_2/concat_atoms_2/gated_mlp_8/dense_37/bias:0 | (64,)
m3g_net/graph_network_layer_2/concat_atoms_2/gated_mlp_8/dense_38/kernel:0 | (64, 64)
m3g_net/graph_network_layer_2/concat_atoms_2/gated_mlp_8/dense_38/bias:0 | (64,)
m3g_net/graph_network_layer_2/concat_atoms_2/dense_39/kernel:0 | (64, 3)
m3g_net/graph_network_layer_2/gated_atom_update_2/gated_mlp_7/dense_30/kernel:0 | (64, 192)
m3g_net/graph_network_layer_2/gated_atom_update_2/gated_mlp_7/dense_30/bias:0 | (64,)
m3g_net/graph_network_layer_2/gated_atom_update_2/gated_mlp_7/dense_31/kernel:0 | (64, 64)
m3g_net/graph_network_layer_2/gated_atom_update_2/gated_mlp_7/dense_31/bias:0 | (64,)
m3g_net/graph_network_layer_2/gated_atom_update_2/gated_mlp_7/dense_32/kernel:0 | (64, 192)
m3g_net/graph_network_layer_2/gated_atom_update_2/gated_mlp_7/dense_32/bias:0 | (64,)
m3g_net/graph_network_layer_2/gated_atom_update_2/gated_mlp_7/dense_33/kernel:0 | (64, 64)
m3g_net/graph_network_layer_2/gated_atom_update_2/gated_mlp_7/dense_33/bias:0 | (64,)
m3g_net/graph_network_layer_2/gated_atom_update_2/dense_34/kernel:0 | (64, 3)
m3g_net/pipe_24/graph_network_layer_3/graph_update_func_1/gated_mlp_9/dense_40/kernel:0 | (64, 64)
m3g_net/pipe_24/graph_network_layer_3/graph_update_func_1/gated_mlp_9/dense_40/bias:0 | (64,)
m3g_net/pipe_24/graph_network_layer_3/graph_update_func_1/gated_mlp_9/dense_41/kernel:0 | (64, 64)
m3g_net/pipe_24/graph_network_layer_3/graph_update_func_1/gated_mlp_9/dense_41/bias:0 | (64,)
m3g_net/pipe_24/graph_network_layer_3/graph_update_func_1/gated_mlp_9/dense_42/kernel:0 | (1, 64)
m3g_net/pipe_24/graph_network_layer_3/graph_update_func_1/gated_mlp_9/dense_42/bias:0 | (1,)
m3g_net/pipe_24/graph_network_layer_3/graph_update_func_1/gated_mlp_9/dense_43/kernel:0 | (64, 64)
m3g_net/pipe_24/graph_network_layer_3/graph_update_func_1/gated_mlp_9/dense_43/bias:0 | (64,)
m3g_net/pipe_24/graph_network_layer_3/graph_update_func_1/gated_mlp_9/dense_44/kernel:0 | (64, 64)
m3g_net/pipe_24/graph_network_layer_3/graph_update_func_1/gated_mlp_9/dense_44/bias:0 | (64,)
m3g_net/pipe_24/graph_network_layer_3/graph_update_func_1/gated_mlp_9/dense_45/kernel:0 | (1, 64)
m3g_net/pipe_24/graph_network_layer_3/graph_update_func_1/gated_mlp_9/dense_45/bias:0 | (1,)
>>> potential = Potential(M3GNet(DEFAULT_ELEMENT_TYPES, is_intensive=False))
>>> for parameter in potential.named_parameters():
...     print(parameter[0], '|', tuple(parameter[1].shape))
model.embedding.layer_node_embedding.weight | (89, 64)
model.embedding.layer_edge_embedding.layers.0.weight | (64, 9)
model.embedding.layer_edge_embedding.layers.0.bias | (64,)
model.three_body_interactions.0.update_network_atom.layers.0.weight | (9, 64)
model.three_body_interactions.0.update_network_atom.layers.0.bias | (9,)
model.three_body_interactions.0.update_network_bond.layers.0.weight | (64, 9)
model.three_body_interactions.0.update_network_bond.gates.0.weight | (64, 9)
model.three_body_interactions.1.update_network_atom.layers.0.weight | (9, 64)
model.three_body_interactions.1.update_network_atom.layers.0.bias | (9,)
model.three_body_interactions.1.update_network_bond.layers.0.weight | (64, 9)
model.three_body_interactions.1.update_network_bond.gates.0.weight | (64, 9)
model.three_body_interactions.2.update_network_atom.layers.0.weight | (9, 64)
model.three_body_interactions.2.update_network_atom.layers.0.bias | (9,)
model.three_body_interactions.2.update_network_bond.layers.0.weight | (64, 9)
model.three_body_interactions.2.update_network_bond.gates.0.weight | (64, 9)
model.graph_layers.0.conv.edge_update_func.layers.0.weight | (64, 192)
model.graph_layers.0.conv.edge_update_func.layers.0.bias | (64,)
model.graph_layers.0.conv.edge_update_func.layers.2.weight | (64, 64)
model.graph_layers.0.conv.edge_update_func.layers.2.bias | (64,)
model.graph_layers.0.conv.edge_update_func.layers.4.weight | (64, 64)
model.graph_layers.0.conv.edge_update_func.layers.4.bias | (64,)
model.graph_layers.0.conv.edge_update_func.gates.0.weight | (64, 192)
model.graph_layers.0.conv.edge_update_func.gates.0.bias | (64,)
model.graph_layers.0.conv.edge_update_func.gates.2.weight | (64, 64)
model.graph_layers.0.conv.edge_update_func.gates.2.bias | (64,)
model.graph_layers.0.conv.edge_update_func.gates.4.weight | (64, 64)
model.graph_layers.0.conv.edge_update_func.gates.4.bias | (64,)
model.graph_layers.0.conv.edge_weight_func.weight | (64, 9)
model.graph_layers.0.conv.node_update_func.layers.0.weight | (64, 192)
model.graph_layers.0.conv.node_update_func.layers.0.bias | (64,)
model.graph_layers.0.conv.node_update_func.layers.2.weight | (64, 64)
model.graph_layers.0.conv.node_update_func.layers.2.bias | (64,)
model.graph_layers.0.conv.node_update_func.layers.4.weight | (64, 64)
model.graph_layers.0.conv.node_update_func.layers.4.bias | (64,)
model.graph_layers.0.conv.node_update_func.gates.0.weight | (64, 192)
model.graph_layers.0.conv.node_update_func.gates.0.bias | (64,)
model.graph_layers.0.conv.node_update_func.gates.2.weight | (64, 64)
model.graph_layers.0.conv.node_update_func.gates.2.bias | (64,)
model.graph_layers.0.conv.node_update_func.gates.4.weight | (64, 64)
model.graph_layers.0.conv.node_update_func.gates.4.bias | (64,)
model.graph_layers.0.conv.node_weight_func.weight | (64, 9)
model.graph_layers.1.conv.edge_update_func.layers.0.weight | (64, 192)
model.graph_layers.1.conv.edge_update_func.layers.0.bias | (64,)
model.graph_layers.1.conv.edge_update_func.layers.2.weight | (64, 64)
model.graph_layers.1.conv.edge_update_func.layers.2.bias | (64,)
model.graph_layers.1.conv.edge_update_func.layers.4.weight | (64, 64)
model.graph_layers.1.conv.edge_update_func.layers.4.bias | (64,)
model.graph_layers.1.conv.edge_update_func.gates.0.weight | (64, 192)
model.graph_layers.1.conv.edge_update_func.gates.0.bias | (64,)
model.graph_layers.1.conv.edge_update_func.gates.2.weight | (64, 64)
model.graph_layers.1.conv.edge_update_func.gates.2.bias | (64,)
model.graph_layers.1.conv.edge_update_func.gates.4.weight | (64, 64)
model.graph_layers.1.conv.edge_update_func.gates.4.bias | (64,)
model.graph_layers.1.conv.edge_weight_func.weight | (64, 9)
model.graph_layers.1.conv.node_update_func.layers.0.weight | (64, 192)
model.graph_layers.1.conv.node_update_func.layers.0.bias | (64,)
model.graph_layers.1.conv.node_update_func.layers.2.weight | (64, 64)
model.graph_layers.1.conv.node_update_func.layers.2.bias | (64,)
model.graph_layers.1.conv.node_update_func.layers.4.weight | (64, 64)
model.graph_layers.1.conv.node_update_func.layers.4.bias | (64,)
model.graph_layers.1.conv.node_update_func.gates.0.weight | (64, 192)
model.graph_layers.1.conv.node_update_func.gates.0.bias | (64,)
model.graph_layers.1.conv.node_update_func.gates.2.weight | (64, 64)
model.graph_layers.1.conv.node_update_func.gates.2.bias | (64,)
model.graph_layers.1.conv.node_update_func.gates.4.weight | (64, 64)
model.graph_layers.1.conv.node_update_func.gates.4.bias | (64,)
model.graph_layers.1.conv.node_weight_func.weight | (64, 9)
model.graph_layers.2.conv.edge_update_func.layers.0.weight | (64, 192)
model.graph_layers.2.conv.edge_update_func.layers.0.bias | (64,)
model.graph_layers.2.conv.edge_update_func.layers.2.weight | (64, 64)
model.graph_layers.2.conv.edge_update_func.layers.2.bias | (64,)
model.graph_layers.2.conv.edge_update_func.layers.4.weight | (64, 64)
model.graph_layers.2.conv.edge_update_func.layers.4.bias | (64,)
model.graph_layers.2.conv.edge_update_func.gates.0.weight | (64, 192)
model.graph_layers.2.conv.edge_update_func.gates.0.bias | (64,)
model.graph_layers.2.conv.edge_update_func.gates.2.weight | (64, 64)
model.graph_layers.2.conv.edge_update_func.gates.2.bias | (64,)
model.graph_layers.2.conv.edge_update_func.gates.4.weight | (64, 64)
model.graph_layers.2.conv.edge_update_func.gates.4.bias | (64,)
model.graph_layers.2.conv.edge_weight_func.weight | (64, 9)
model.graph_layers.2.conv.node_update_func.layers.0.weight | (64, 192)
model.graph_layers.2.conv.node_update_func.layers.0.bias | (64,)
model.graph_layers.2.conv.node_update_func.layers.2.weight | (64, 64)
model.graph_layers.2.conv.node_update_func.layers.2.bias | (64,)
model.graph_layers.2.conv.node_update_func.layers.4.weight | (64, 64)
model.graph_layers.2.conv.node_update_func.layers.4.bias | (64,)
model.graph_layers.2.conv.node_update_func.gates.0.weight | (64, 192)
model.graph_layers.2.conv.node_update_func.gates.0.bias | (64,)
model.graph_layers.2.conv.node_update_func.gates.2.weight | (64, 64)
model.graph_layers.2.conv.node_update_func.gates.2.bias | (64,)
model.graph_layers.2.conv.node_update_func.gates.4.weight | (64, 64)
model.graph_layers.2.conv.node_update_func.gates.4.bias | (64,)
model.graph_layers.2.conv.node_weight_func.weight | (64, 9)
model.final_layer.gated.layers.0.weight | (64, 64)
model.final_layer.gated.layers.0.bias | (64,)
model.final_layer.gated.layers.2.weight | (64, 64)
model.final_layer.gated.layers.2.bias | (64,)
model.final_layer.gated.layers.4.weight | (64, 64)
model.final_layer.gated.layers.4.bias | (64,)
model.final_layer.gated.layers.6.weight | (1, 64)
model.final_layer.gated.layers.6.bias | (1,)
model.final_layer.gated.gates.0.weight | (64, 64)
model.final_layer.gated.gates.0.bias | (64,)
model.final_layer.gated.gates.2.weight | (64, 64)
model.final_layer.gated.gates.2.bias | (64,)
model.final_layer.gated.gates.4.weight | (64, 64)
model.final_layer.gated.gates.4.bias | (64,)
model.final_layer.gated.gates.6.weight | (1, 64)
model.final_layer.gated.gates.6.bias | (1,)
v0.8.6
This doesn't look right: it skips the release step even when it shouldn't. You probably want if: github.event_name == 'release' && needs.tests.result == 'success'.
matgl/.github/workflows/testing.yml
Line 51 in d9f2665
Hence https://github.com/materialsvirtuallab/matgl/releases/tag/v0.8.6 didn't make it to PyPI.
Hi, thanks for the repository -
I was wondering if there is any example yet of how training could be performed given a list of structures, forces, and energies, using a trainer object as in the original m3gnet repository? Or is this feature still in development?
Thanks in advance, Han.
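A condensed sketch of the Lightning-based workflow used elsewhere on this page (assuming the 0.7-era API shown in the scripts above; signatures vary between matgl versions):

import pytorch_lightning as pl
from dgl.data.utils import split_dataset
from matgl.ext.pymatgen import Structure2Graph, get_element_list
from matgl.graph.data import M3GNetDataset, MGLDataLoader, collate_fn_efs
from matgl.models import M3GNet
from matgl.utils.training import PotentialLightningModule

# structures: list of pymatgen Structures; energies/forces: matching label lists
element_types = get_element_list(structures)
converter = Structure2Graph(element_types=element_types, cutoff=5.0)
dataset = M3GNetDataset(threebody_cutoff=4.0, structures=structures,
                        converter=converter, energies=energies, forces=forces)
train_data, val_data, test_data = split_dataset(dataset, frac_list=[0.8, 0.1, 0.1],
                                                shuffle=True, random_state=42)
train_loader, val_loader, test_loader = MGLDataLoader(
    train_data=train_data, val_data=val_data, test_data=test_data,
    collate_fn=collate_fn_efs, batch_size=16, num_workers=0)
lit_module = PotentialLightningModule(
    model=M3GNet(element_types=element_types, is_intensive=False), lr=1e-4)
pl.Trainer(max_epochs=10, inference_mode=False).fit(
    lit_module, train_dataloaders=train_loader, val_dataloaders=val_loader)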
I am puzzled why this code assumes there is a labels.json file somewhere, when the other three files have explicit file names given. Also, in the preceding save method, labels.json is written using a file.write call. This is not how proper JSON is written: JSON should be written with json.dump to ensure it is in valid JSON format.
Also, there is no unit test for the labels.json loading.
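A minimal sketch of the suggested fix (file name and label contents illustrative only):

import json

labels = {"energies": [-1.23, -4.56], "forces": [[[0.0, 0.0, 0.0]], [[0.1, 0.0, 0.0]]]}

# write: serialize with json.dump rather than file.write(str(labels))
with open("labels.json", "w") as f:
    json.dump(labels, f)

# read back: round-trips as valid JSON
with open("labels.json") as f:
    assert json.load(f) == labels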
I would like to use MatGL on a system with periodic boundary conditions, but have been unable to figure out how to do this. The issue is that even when periodic boundaries are specified in the system definitions, MatGL doesn't seem to recognize them.
To illustrate the issue, here are two python scripts that simulate 4 atoms of copper at 2400K. One uses the standard ASE package (with the EMT potential); one uses MatGL.
(1) ASE script: (produces a .traj file)
from asap3 import EMT
from ase import units
from ase.io.trajectory import Trajectory
from ase.lattice.cubic import FaceCenteredCubic
from ase.md.langevin import Langevin
from pymatgen.core import Lattice, Structure
from pymatgen.io.ase import AseAtomsAdaptor
import warnings
warnings.simplefilter("ignore")
# Define the lattice geometry and atom coordinates
lattice = Lattice.cubic(3.61, (True, True, True)) # lattice constant in units of Angstrom ; (True, True, True) sets periodic boundaries
fractcoords = [[0, 0, 0], [0, 0.5, 0.5], [0.5, 0, 0.5], [0.5, 0.5, 0]]
struct = Structure(lattice, ["Cu", "Cu", "Cu", "Cu"], fractcoords)
ase_adaptor = AseAtomsAdaptor()
atoms = ase_adaptor.get_atoms(struct)
# Describe the interatomic interactions with the Effective Medium Theory
atoms.calc = EMT()
T = 2400 # Kelvin
dyn = Langevin(atoms, 5 * units.fs, T * units.kB, 0.002)
def printenergy(a=atoms):  # store a reference to atoms in the definition
    """Function to print the potential, kinetic and total energy."""
    epot = a.get_potential_energy() / len(a)
    ekin = a.get_kinetic_energy() / len(a)
    print('Energy per atom: Epot = %.3feV  Ekin = %.3feV (T=%3.0fK)  '
          'Etot = %.3feV' % (epot, ekin, ekin / (1.5 * units.kB), epot + ekin))
dyn.attach(printenergy, interval=50)
traj = Trajectory('Cu_ASE.traj', 'w', atoms)
dyn.attach(traj.write, interval=50)
# Now run the dynamics
printenergy()
dyn.run(2000)
(2) MatGL script: (produces .traj and .log files)
from __future__ import annotations
import warnings
from ase.md.velocitydistribution import MaxwellBoltzmannDistribution
from pymatgen.core import Lattice, Structure
from pymatgen.io.ase import AseAtomsAdaptor
import matgl
from matgl.ext.ase import M3GNetCalculator, MolecularDynamics, Relaxer
warnings.simplefilter("ignore")
pot = matgl.load_model("M3GNet-MP-2021.2.8-PES")
# Define the lattice geometry and atom coordinates
lattice = Lattice.cubic(3.61, (True, True, True)) # lattice constant in units of Angstrom ; (True, True, True) sets periodic boundaries
fractcoords = [[0, 0, 0], [0, 0.5, 0.5], [0.5, 0, 0.5], [0.5, 0.5, 0]]
struct = Structure(lattice, ["Cu", "Cu", "Cu", "Cu"], fractcoords)
# Prepare atoms for molecular dynamics
ase_adaptor = AseAtomsAdaptor()
atoms = ase_adaptor.get_atoms(struct)
# Initiate temperature distribution
MaxwellBoltzmannDistribution(atoms, temperature_K=2400)
# Define molecular dynamics settings
driver = MolecularDynamics(
    atoms,
    potential=pot,  # uses the M3GNet interatomic potential
    temperature=2400,
    timestep=1,  # 1 fs
    logfile="Cu_MatGL.log",
    trajectory="Cu_MatGL.traj",  # save trajectory
    loginterval=50,  # interval for recording the log
    ensemble='nvt',  # NVT ensemble
)
# Run molecular dynamics
driver.run(2000)
Also, to convert one of the .traj files to human-readable format, here is a little script: (produces .xyz trajectory file)
import ase
from ase.io import read, write
from ase.io.trajectory import Trajectory
traj = Trajectory("Cu_MatGL.traj")
#traj = Trajectory("Cu_ASE.traj")
atoms=traj[:]
writeme = ase.io.write("Cu_MatGL.xyz", atoms, "xyz")
#writeme = ase.io.write("Cu_ASE.xyz", atoms, "xyz")
Looking at the .xyz files, you'll notice that in the basic ASE simulation, none of the atom positions are larger than the specified lattice constant (3.61 Angstroms). This is as expected with periodic boundary conditions. However, in the MatGL simulation, the atom positions readily exceed the lattice constant value, and the atoms appear to drift through space.
For example, here are excerpts of the last recorded frame from the .xyz trajectory files:
(1) ASE trajectory excerpt
4
Cu 0.344162968499945 -0.148546966782993 -0.009870944387788
Cu -0.177349810618174 1.722229170919916 1.765656855202022
Cu 1.736425681826978 0.345596431729686 1.793462024164234
Cu 1.727657879060445 1.651865872484211 0.104391311992427
(2) MatGL trajectory excerpt
4
Cu 10.754732275176222 1.779757546738823 7.457469460754528
Cu 10.395401224745564 3.745542174112556 9.486116801335781
Cu 12.303262940663885 1.970363284378915 9.334171650233468
Cu 12.388181989458412 3.671734565914415 7.549590636835549
Do you have any recommendations for how to resolve this issue? How should one go about implementing periodic boundary conditions in a molecular dynamics simulation that leverages MatGL? Please advise, thank you.
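One thing worth checking (an observation, not a confirmed diagnosis): ASE reports unwrapped Cartesian positions, so atoms drifting past the lattice constant does not by itself prove the potential ignored the periodic boundaries. The coordinates can be folded back into the cell when converting the trajectory:

import ase.io
from ase.io.trajectory import Trajectory

traj = Trajectory("Cu_MatGL.traj")
frames = []
for atoms in traj:
    atoms.wrap()  # map positions back into the unit cell using the pbc flags
    frames.append(atoms)
ase.io.write("Cu_MatGL_wrapped.xyz", frames, "xyz")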
v0.8.5 and v0.7.1
Dear developers,
I'm trying to train an M3GNet potential using the same code as in the tutorial (https://matgl.ai/tutorials%2FTraining%20a%20M3GNet%20Potential%20with%20PyTorch%20Lightning.html).
Training the potential on a CPU went smoothly without any issues. However, when I switched to a GPU node for training, I ran into several errors.
I made the following adjustments to the code to enable training on a GPU node.
trainer = pl.Trainer(max_epochs=1, accelerator="gpu", devices=[0], logger=logger, inference_mode=False)
trainer.fit(model=lit_module_finetune, train_dataloaders=train_loader, val_dataloaders=val_loader)
Then the following error occurs:
I also tried to set the default device to one specific GPU, but I encountered another error:
Do you have any suggestions for fixing these errors? Thanks in advance.
Hi,
thank you for the great work. Do you plan to provide TorchScript support in the future?
Best regards!
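As a quick way to probe this in the meantime (an experiment, not a supported path; scripting will fail wherever the model relies on Python-only constructs):

import torch
import matgl

model = matgl.load_model("M3GNet-MP-2021.2.8-PES").model
try:
    scripted = torch.jit.script(model)  # attempt TorchScript compilation
    scripted.save("m3gnet_scripted.pt")
except Exception as exc:
    print(f"TorchScript compilation failed: {exc}")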
Dear developers,
Considering that Molecular Dynamics with matgl supports pressure, is it possible to add external pressure to the Relaxer? It would be of great benefit for studying systems over a range of pressures.
add an option to relaxer for external pressure
VASP supports setting PSTRESS in INCAR
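In the meantime, a sketch of one way to relax under an external pressure using ASE directly (an assumption, bypassing matgl's Relaxer; ExpCellFilter's scalar_pressure adds a p*V term, analogous in spirit to VASP's PSTRESS):

from ase.build import bulk
from ase.constraints import ExpCellFilter
from ase.optimize import FIRE
from ase.units import GPa
import matgl
from matgl.ext.ase import M3GNetCalculator

atoms = bulk("Cu", "fcc", a=3.61)
atoms.calc = M3GNetCalculator(potential=matgl.load_model("M3GNet-MP-2021.2.8-PES"))
ecf = ExpCellFilter(atoms, scalar_pressure=10 * GPa)  # relax cell + positions at 10 GPa
FIRE(ecf).run(fmax=0.05)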
Suppose we have a situation where we want to predict an atom-specific property, for example, the spectrum of a single absorbing site on a material. It would be nice to be able to label that atom as distinct.
For example, given a material with symmetrically unique elements Ti, Ti, O, O, O, O, perhaps we would like to distinguish the first Ti as "absorber" or "special" in some way. Thus it does not get the standard Ti label; it gets index len(element_types) + 1 or something like this.
In #164, I've allowed for an extra index as a "catch-all" for atoms not specified in the elements_list. I'd like to add an option for yet another index, a "special" atom. In other words, the following two materials would produce the same atom label featurization (see the sketch below):
- Ti, Ti, O, O, O, O with atom index 0 indicated as "special"
- Mg, Ti, O, O, O, O with the standard use of the code
I'm happy to do the coding on this.
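A minimal sketch of the proposed indexing (hypothetical names, not matgl API):

element_types = ("O", "Ti")
UNKNOWN = len(element_types)      # catch-all index from #164
SPECIAL = len(element_types) + 1  # proposed absorber/"special" index

def atom_index(symbol: str, is_special: bool = False) -> int:
    """Map an element symbol to its atom-label featurization index."""
    if is_special:
        return SPECIAL
    return element_types.index(symbol) if symbol in element_types else UNKNOWN

# first Ti flagged as the absorber
labels = [atom_index(s, is_special=(i == 0))
          for i, s in enumerate(["Ti", "Ti", "O", "O", "O", "O"])]
print(labels)  # [3, 1, 0, 0, 0, 0]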
Thanks for providing such wonderful work.
I am trying to use the pre-trained model "M3GNet-MP-2021.2.8-PES" to run an MD simulation.
I wanted to run it on a GPU. I placed the model on the GPU, but I got a conflicting-device issue.
Could you help me resolve it?
0.7.1
Train:
model = matgl.load_model('M3GNet-MP-2021.2.8-PES')
lit_model = PotentialLightningModule(model=model.model, lr=0.00005, force_weight=1)
trainer = pl.Trainer(max_epochs=10, accelerator='cuda', devices=1, precision='32')
trainer.fit(model=lit_model, train_dataloaders=train_loader, val_dataloaders=val_loader)
model_export_path = "./trained_model/"
model.save(model_export_path)
Prediction:
if __name__ == '__main__':
    DB = Get_db()
    strus, e, f, sr = DB.get_stru_energy_forces_stress("vasprun.xml")
    print(f"{len(strus)} structures found !!! .")
    model = matgl.load_model("../train/trained_model").model
    pre_e = e
    plt.scatter(range(1, len(strus)+1), pre_e, c='r')
    plt.scatter(range(1, len(strus)+1), energy, c='b')
    plt.savefig('res.png')
    plt.show()
Hello,
Thank you for making the code accessible. I want to report that with M3GNet I was able to relax the structure, but when I switch to the MatGL module the structure does not relax and instead displays large forces.
I am testing this potential on HfNbTaTiZr high-entropy alloys to study the core structure of a screw dislocation in the BCC structure. With M3GNet the relaxation finished within 200 steps, but with MatGL it takes 1000 steps, the forces on the structure keep increasing, and then the terminal hangs. Is the module still in a testing stage?
Thanks
I am able to train normally using the CPU. When I use the GPU, it keeps failing. I don't know where the problem is, so I hope someone can help me. I would be very grateful.
My script is as follows:
from __future__ import annotations
import os
import shutil
import numpy as np
import pytorch_lightning as pl
from dgl.data.utils import split_dataset
from pymatgen.core import Structure
from matgl.ext.pymatgen import Structure2Graph, get_element_list
from matgl.graph.data import M3GNetDataset, MEGNetDataset, MGLDataLoader, collate_fn, collate_fn_efs
from matgl.models import M3GNet, MEGNet
from matgl.utils.training import ModelLightningModule, PotentialLightningModule
import torch
if __name__ == '__main__':
    stru0 = Structure.from_file("../../strus/0/POSCAR")
    stru1 = Structure.from_file("../../strus/1/POSCAR")
    structures = [stru0, stru1] * 10
    energies = np.zeros(len(structures))
    forces = [np.zeros((len(s), 3)).tolist() for s in structures]
    stresses = [np.zeros((3, 3)).tolist()] * len(structures)
    element_types = get_element_list([stru0, stru1])
    converter = Structure2Graph(element_types=element_types, cutoff=5.0)
    dataset = M3GNetDataset(
        threebody_cutoff=4.0,
        structures=structures,
        converter=converter,
        energies=energies,
        forces=forces,
        stresses=stresses,
    )
    train_data, val_data, test_data = split_dataset(
        dataset,
        frac_list=[0.8, 0.1, 0.1],
        shuffle=True,
        random_state=42,
    )
    train_loader, val_loader, test_loader = MGLDataLoader(
        train_data=train_data,
        val_data=val_data,
        test_data=test_data,
        collate_fn=collate_fn_efs,
        batch_size=32,
        num_workers=8,
    )
    model = M3GNet(
        element_types=element_types,
        is_intensive=False,
    )
    lit_model = PotentialLightningModule(model=model)
    torch.set_default_device('cuda')
    torch.multiprocessing.set_start_method('spawn', force=True)
    trainer = pl.Trainer(max_epochs=10)
    trainer.fit(model=lit_model, train_dataloaders=train_loader, val_dataloaders=val_loader)
The error message I received is as follows:
Traceback (most recent call last):
File "/home/lycui/test/mgl/test/train.py", line 59, in
trainer.fit(model=lit_model, train_dataloaders=train_loader, val_dataloaders=val_loader)
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 531, in fit
call._call_and_handle_interrupt(
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py", line 42, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 570, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 975, in _run
results = self._run_stage()
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1016, in _run_stage
self._run_sanity_check()
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1045, in _run_sanity_check
val_loop.run()
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/pytorch_lightning/loops/utilities.py", line 177, in _decorator
return loop_run(self, *args, **kwargs)
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/pytorch_lightning/loops/evaluation_loop.py", line 115, in run
self._evaluation_step(batch, batch_idx, dataloader_idx)
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/pytorch_lightning/loops/evaluation_loop.py", line 375, in _evaluation_step
output = call._call_strategy_hook(trainer, hook_name, *step_kwargs.values())
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py", line 287, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/pytorch_lightning/strategies/strategy.py", line 379, in validation_step
return self.model.validation_step(*args, **kwargs)
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/matgl/utils/training.py", line 59, in validation_step
results, batch_size = self.step(batch) # type: ignore
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/matgl/utils/training.py", line 329, in step
e, f, s, _ = self(g=g, state_attr=state_attr, l_g=l_g)
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/matgl/utils/training.py", line 317, in forward
e, f, s, h = self.model(g=g, l_g=l_g, state_attr=state_attr)
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/matgl/apps/pes.py", line 75, in forward
total_energies = self.data_std * self.model(g=g, state_attr=state_attr, l_g=l_g) + self.data_mean
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/matgl/models/_m3gnet.py", line 227, in forward
expanded_dists = self.bond_expansion(g.edata["bond_dist"])
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/matgl/layers/_bond.py", line 65, in forward
bond_basis = self.rbf(bond_dist)
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/matgl/layers/_basis.py", line 104, in call
return self._call_sbf(r)
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/matgl/layers/_basis.py", line 120, in _call_sbf
func(r[:, None] * root[None, :] / self.cutoff) * factor / torch.abs(func_add1(root[None, :]))
File "/home/lycui/anaconda3/envs/mgl/lib/python3.9/site-packages/torch/utils/_device.py", line 62, in torch_function
return func(*args, **kwargs)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
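A more conventional Lightning pattern worth trying (an assumption about the cause, not a confirmed fix): let the Trainer handle device placement instead of calling torch.set_default_device('cuda'), which mixes CPU-built graph tensors with CUDA-default tensors created later:

# replace the torch.set_default_device('cuda') / spawn lines with:
trainer = pl.Trainer(max_epochs=10, accelerator="gpu", devices=1, inference_mode=False)
trainer.fit(model=lit_model, train_dataloaders=train_loader, val_dataloaders=val_loader)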
Could you please add the pretrained MEGNet models for the QM9 dataset under the pretrained_models directory? Only the MP pretrained models are present there.
I'm comparing results between the pretrained m3gnet in this repo and in the original m3gnet repo, and for some of my structures I am finding pretty large discrepancies. Is this expected? For example, for this structure:
Full Formula (Ti2 Nb3)
Reduced Formula: Ti2Nb3
abc : 2.862214 2.862214 11.801210
angles: 85.362850 94.637150 70.528779
pbc : True True True
Sites (5)
# SP a b c
--- ---- --- --- ---
0 Ti 0 0 0
1 Ti 0.6 0.4 0.2
2 Nb 0.2 0.8 0.4
3 Nb 0.8 0.2 0.6
4 Nb 0.4 0.6 0.8
I get a difference of 40 meV/atom in expected energy, and different atomic positions as well.
Hi, I am using the MEGNet model for band gap prediction from crystal structures (MEGNet-MP-2019.4.1-BandGap-mfi), and I am trying to understand the purpose of using two zeros as a placeholder for the global state feature, and the way it is processed. Running the following command
mgl predict -m MEGNet-MP-2019.4.1-BandGap-mfi --infile crys.cif
with default arguments, I get an error at matgl/matgl/layers/_graph_convolution.py, line 62 (commit 4961e1c):
File "/network/scratch/p/prashant.govindarajan/crystal_design_project/crystal-design/crystal_design/matgl/matgl/layers/_graph_convolution.py", line 62, in _edge_udf
inputs = torch.hstack([vi, vj, eij, u])
RuntimeError: Tensors must have same number of dimensions: got 2 and 3
The state feature of size [2,] passes through the embedding layer to give an output of shape [2, 16]. Because of this, the state feature becomes 3-dimensional once broadcast across the nodes in the graph. This causes the dimension-mismatch error while concatenating features in the edge update function of the graph convolution layer. So, does the placeholder zero tensor that represents the state feature need to pass through the embedding layer, i.e. matgl/matgl/layers/_embedding.py, line 78 (commit 4961e1c)?
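A tiny self-contained reproduction of the shape mismatch being described (illustrative shapes only):

import torch

vi = torch.randn(5, 16)                # per-edge node features: 2-D
u = torch.zeros(2, 16)                 # state feature after embedding: [2, 16]
u_b = u.unsqueeze(0).expand(5, 2, 16)  # broadcast across 5 edges: now 3-D
torch.hstack([vi, u_b])                # RuntimeError: ... got 2 and 3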
0.8.5
I'm attempting to run the code in the snippet below to compute the stress on a siliceous CHA unit cell with the ASE M3GNetCalculator (CIF file available here, but I think these issues arise with any ASE Atoms object). I did two things that seemed to resolve the issue:
- In matgl/layers/_three_body.py, changed the indexing to weights = three_cutoff[torch.stack(list(line_graph.edges()), dim=1).long()].view(-1, 2)
- In matgl/layers/_atom_ref.py, changed the one-hot construction to one_hot = torch.eye(num_elements)[g.ndata["node_type"].long()]
Thanks in advance for your help! It looks like there were just a few spots where the tensors needed to be cast to .long().
EDIT: I made different changes to the code than I had in my original bug report; apologies for the mistake.
import matgl
from matgl.ext.ase import M3GNetCalculator
from ase.io import read

atoms = read("CHA.cif")  # the siliceous CHA cell mentioned above (path illustrative)
potential = matgl.load_model("M3GNet-MP-2021.2.8-PES")
calculator = M3GNetCalculator(potential=potential)
atoms.calc = calculator
atoms.get_stress()
# === First Error ===
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[1], line 13
11 calculator = M3GNetCalculator(potential=potential)
12 atoms.calc = calculator
---> 13 atoms.get_stress()
File ~/mambaforge/envs/htvs/lib/python3.9/site-packages/ase/atoms.py:820, in Atoms.get_stress(self, voigt, apply_constraint, include_ideal_gas)
817 if self._calc is None:
818 raise RuntimeError('Atoms object has no calculator.')
--> 820 stress = self._calc.get_stress(self)
821 shape = stress.shape
823 if shape == (3, 3):
824 # Convert to the Voigt form before possibly applying
825 # constraints and adding the dynamic part of the stress
826 # (the "ideal gas contribution").
File ~/mambaforge/envs/htvs/lib/python3.9/site-packages/ase/calculators/abc.py:26, in GetPropertiesMixin.get_stress(self, atoms)
25 def get_stress(self, atoms=None):
---> 26 return self.get_property('stress', atoms)
File ~/mambaforge/envs/htvs/lib/python3.9/site-packages/ase/calculators/calculator.py:737, in Calculator.get_property(self, name, atoms, allow_calculation)
735 if not allow_calculation:
736 return None
--> 737 self.calculate(atoms, [name], system_changes)
739 if name not in self.results:
740 # For some reason the calculator was not able to do what we want,
741 # and that is OK.
742 raise PropertyNotImplementedError('{} not present in this '
743 'calculation'.format(name))
File ~/mambaforge/envs/htvs/lib/python3.9/site-packages/matgl/ext/ase.py:178, in M3GNetCalculator.calculate(self, atoms, properties, system_changes)
176 energies, forces, stresses, hessians = self.potential(graph, self.state_attr)
177 else:
--> 178 energies, forces, stresses, hessians = self.potential(graph, state_attr_default)
179 self.results.update(
180 energy=energies.detach().numpy(),
181 free_energy=energies.detach().numpy(),
182 forces=forces.detach().numpy(),
183 )
184 if self.compute_stress:
File ~/mambaforge/envs/htvs/lib/python3.9/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
1190 # If we don't have any hooks, we want to skip the rest of the logic in
1191 # this function, and just call forward.
1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1193 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194 return forward_call(*input, **kwargs)
1195 # Do not call functions when jit is used
1196 full_backward_hooks, non_full_backward_hooks = [], []
File ~/mambaforge/envs/htvs/lib/python3.9/site-packages/matgl/apps/pes.py:76, in Potential.forward(self, g, state_attr, l_g)
73 if self.calc_forces:
74 g.ndata["pos"].requires_grad_(True)
---> 76 predictions = self.model(g, state_attr, l_g)
77 if isinstance(predictions, tuple) and len(predictions) > 1:
78 total_energies, site_wise = predictions
File ~/mambaforge/envs/htvs/lib/python3.9/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
1190 # If we don't have any hooks, we want to skip the rest of the logic in
1191 # this function, and just call forward.
1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1193 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194 return forward_call(*input, **kwargs)
1195 # Do not call functions when jit is used
1196 full_backward_hooks, non_full_backward_hooks = [], []
File ~/mambaforge/envs/htvs/lib/python3.9/site-packages/matgl/models/_m3gnet.py:252, in M3GNet.forward(self, g, state_attr, l_g)
250 node_feat, edge_feat, state_feat = self.embedding(node_types, g.edata["rbf"], state_attr)
251 for i in range(self.n_blocks):
--> 252 edge_feat = self.three_body_interactions[i](
253 g,
254 l_g,
255 three_body_basis,
256 three_body_cutoff,
257 node_feat,
258 edge_feat,
259 )
260 edge_feat, node_feat, state_feat = self.graph_layers[i](g, edge_feat, node_feat, state_feat)
261 g.ndata["node_feat"] = node_feat
File ~/mambaforge/envs/htvs/lib/python3.9/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
1190 # If we don't have any hooks, we want to skip the rest of the logic in
1191 # this function, and just call forward.
1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1193 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194 return forward_call(*input, **kwargs)
1195 # Do not call functions when jit is used
1196 full_backward_hooks, non_full_backward_hooks = [], []
File ~/mambaforge/envs/htvs/lib/python3.9/site-packages/matgl/layers/_three_body.py:61, in ThreeBodyInteractions.forward(self, graph, line_graph, three_basis, three_cutoff, node_feat, edge_feat)
59 print(three_cutoff)
60 print(torch.stack(list(line_graph.edges()), dim=1).view(-1, 2))
---> 61 weights = three_cutoff[torch.stack(list(line_graph.edges()), dim=1)].view(-1, 2) # type: ignore
62 weights = torch.prod(weights, dim=-1) # type: ignore
63 basis = basis * weights[:, None]
IndexError: tensors used as indices must be long, byte or bool tensors
# === Second Error ===
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[1], line 13
11 calculator = M3GNetCalculator(potential=potential)
12 atoms.calc = calculator
---> 13 atoms.get_stress()
File ~/mambaforge/envs/htvs/lib/python3.9/site-packages/ase/atoms.py:820, in Atoms.get_stress(self, voigt, apply_constraint, include_ideal_gas)
817 if self._calc is None:
818 raise RuntimeError('Atoms object has no calculator.')
--> 820 stress = self._calc.get_stress(self)
821 shape = stress.shape
823 if shape == (3, 3):
824 # Convert to the Voigt form before possibly applying
825 # constraints and adding the dynamic part of the stress
826 # (the "ideal gas contribution").
File ~/mambaforge/envs/htvs/lib/python3.9/site-packages/ase/calculators/abc.py:26, in GetPropertiesMixin.get_stress(self, atoms)
25 def get_stress(self, atoms=None):
---> 26 return self.get_property('stress', atoms)
File ~/mambaforge/envs/htvs/lib/python3.9/site-packages/ase/calculators/calculator.py:737, in Calculator.get_property(self, name, atoms, allow_calculation)
735 if not allow_calculation:
736 return None
--> 737 self.calculate(atoms, [name], system_changes)
739 if name not in self.results:
740 # For some reason the calculator was not able to do what we want,
741 # and that is OK.
742 raise PropertyNotImplementedError('{} not present in this '
743 'calculation'.format(name))
File ~/mambaforge/envs/htvs/lib/python3.9/site-packages/matgl/ext/ase.py:178, in M3GNetCalculator.calculate(self, atoms, properties, system_changes)
176 energies, forces, stresses, hessians = self.potential(graph, self.state_attr)
177 else:
--> 178 energies, forces, stresses, hessians = self.potential(graph, state_attr_default)
179 self.results.update(
180 energy=energies.detach().numpy(),
181 free_energy=energies.detach().numpy(),
182 forces=forces.detach().numpy(),
183 )
184 if self.compute_stress:
File ~/mambaforge/envs/htvs/lib/python3.9/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
1190 # If we don't have any hooks, we want to skip the rest of the logic in
1191 # this function, and just call forward.
1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1193 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194 return forward_call(*input, **kwargs)
1195 # Do not call functions when jit is used
1196 full_backward_hooks, non_full_backward_hooks = [], []
File ~/mambaforge/envs/htvs/lib/python3.9/site-packages/matgl/apps/pes.py:84, in Potential.forward(self, g, state_attr, l_g)
82 total_energies = self.data_std * total_energies + self.data_mean
83 if self.element_refs is not None:
---> 84 property_offset = torch.squeeze(self.element_refs(g))
85 total_energies += property_offset
87 forces = torch.zeros(1)
File ~/mambaforge/envs/htvs/lib/python3.9/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
1190 # If we don't have any hooks, we want to skip the rest of the logic in
1191 # this function, and just call forward.
1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1193 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194 return forward_call(*input, **kwargs)
1195 # Do not call functions when jit is used
1196 full_backward_hooks, non_full_backward_hooks = [], []
File ~/mambaforge/envs/htvs/lib/python3.9/site-packages/matgl/layers/_atom_ref.py:64, in AtomRef.forward(self, g, state_attr)
52 """Get the total property offset for a system.
53
54 Args:
(...)
59 offset_per_graph
60 """
61 num_elements = (
62 self.property_offset.size(dim=1) if self.property_offset.ndim > 1 else self.property_offset.size(dim=0)
63 )
---> 64 one_hot = torch.eye(num_elements)[g.ndata["node_type"]]
65 if self.property_offset.ndim > 1:
66 offset_batched_with_state = []
IndexError: tensors used as indices must be long, byte or bool tensors
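Both tracebacks end with the same complaint about index dtypes, which suggests the graph's index tensors are int32 rather than the int64 (long) that torch indexing requires. A minimal diagnostic sketch, assuming `graph` stands for the DGL graph built by matgl's converter (the variable name is illustrative, not from the report):
print(graph.idtype)                     # DGL graphs may use int32 or int64 ids
print(graph.ndata["node_type"].dtype)   # torch indexing requires torch.int64 (long)
graph = graph.long()                    # converts an int32 graph to int64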
Thanks for providing such a wonderful work.
I'm trying to run "train_mp_eform.py", but I'm getting an error:
trainer.train(
TypeError: train() got an unexpected keyword argument 'n_epochs'
How should I modify it to avoid this error?
An interface to LAMMPS for MatGL is needed. This can be done in two ways:
I am trying to reproduce the results in
Chen, C.; Zuo, Y.; Ye, W.; Li, X.; Ong, S. P. Learning Properties of Ordered and Disordered Materials from Multi-Fidelity Data. Nature Computational Science, 2021, 1, 46–53.
for the extended QM7b energy data set (Extended data Fig 4b of the paper).
Following the instructions and looking at the code for the band gap example in the paper (provided on GitHub), I set the 'state' variable/key in the structures to either 1 for low fidelity or 2 for high fidelity. This information is then passed as a global feature, setting nfeat_global=1 and global_embedding_dim=None.
However, I cannot reproduce the results using the default MEGNet with 3 blocks, 3 message-passing steps, and graph_converter=CrystalGraph with the Gaussian distance method. Very little information was provided in the paper on this example, and there is none on the GitHub page.
Could you please upload the code for that example, or else let me know the precise details of how to implement the multi-fidelity version for it? Much appreciated.
If possible, could you upload the code for generating the results of Fig. 4b?
Hello, I am using the following script to check for smoothness of the M3GNet PES. It's a very simple scenario where two particles are pulled apart inside a large box, i.e. no three-body contributions are involved.
import numpy as np
import matplotlib.pyplot as plt
from ase import Atoms
import torch

import matgl
from matgl.ext.ase import M3GNetCalculator

elements = list(range(1, 4))
r_min = 4.8
r_max = 5.2

# The model and calculator only need to be created once.
model = matgl.load_model('M3GNet-MP-2021.2.8-PES')
model.to(torch.float32)
calc = M3GNetCalculator(potential=model)

for i, element1 in enumerate(elements):
    for element2 in elements[i:]:
        atoms = Atoms([element1, element2],
                      positions=[(0.0, 0.0, 0.0), (r_min, 0.0, 0.0)],
                      cell=np.eye(3, dtype=np.float32) * 1000, pbc=True)
        atoms.calc = calc
        distances = np.linspace(r_min, r_max, 100)
        potential_energies = []
        for distance in distances:
            atoms.set_distance(0, 1, distance)  # set the distance between the two atoms
            potential_energies.append(atoms.get_potential_energy())
        plt.plot(distances, potential_energies)
        plt.xlabel('Distance (Å)')
        plt.ylabel('Potential Energy (eV)')
        plt.title(f'Potential Energy of {atoms.get_chemical_symbols()}')
        plt.grid(True)
        plt.show()
Looking at the plots, it seems that the energy curve is not as smooth around the cutoff as I would expect, but I might be wrong. What are your thoughts on this?
Respected Authors
First of all, thank you for creating this wonderful library.
I am facing the same issue as the previous issue opened, but for MEGNet. I installed matgl via pip.
On the website, it is mentioned that pre-trained MEGNet models are now available for formation energies and band gaps. I am trying to predict the band gaps, but got stuck in the implementation.
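For reference, a sketch of how the pretrained multi-fidelity band-gap model is typically exercised; the model name MEGNet-MP-2019.4.1-BandGap-mfi, the state_feats keyword, and the fidelity labels (0: PBE, 1: GLLB-SC, 2: HSE, 3: SCAN) are recalled from the matgl README and may differ across versions:
import torch
import matgl
from pymatgen.core import Lattice, Structure

struct = Structure(Lattice.cubic(4.1437), ["Cs", "Cl"], [[0, 0, 0], [0.5, 0.5, 0.5]])
model = matgl.load_model("MEGNet-MP-2019.4.1-BandGap-mfi")
# 0 selects the PBE fidelity; 1, 2, 3 select GLLB-SC, HSE and SCAN respectively.
bandgap = model.predict_structure(structure=struct, state_feats=torch.tensor([0]))
print(f"Predicted PBE band gap for CsCl: {float(bandgap):.3f} eV")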
Hello,
When I install via pip, it only shows version 0.7.1, not the latest.
When I try to load the pretrained m3gnet potential, I get the following error:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/kuner/opt/anaconda3/envs/atomate2/lib/python3.9/site-packages/matgl/models/../../pretrained/MP-2021.2.8-EFS/m3gnet.pt'
It seems like all of the necessary files are currently not being distributed with the package (note I installed this via 'pip install matgl'). Any help would be appreciated!
Even when I load the element types, I still get an error indicating that the positional argument 'element_types' is missing. I think there might be a bug with how the model is loaded in utils/io. Do you have any suggestions on how to fix this?
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[3], line 11
9 model_dict = json.load(open(os.path.join(potential_path,"model.json")))
10 m3gnet = M3GNet(tuple(model_dict["kwargs"]["model"]["init_args"]["element_types"]))
---> 11 model, d = m3gnet.load(potential_path, include_json=True)
12 # print(d)
13 potential = Potential(model)
File ~/.conda/envs/matgl/lib/python3.8/site-packages/matgl/utils/io.py:132, in IOMixIn.load(cls, path, include_json, **kwargs)
130 d = {k: v for k, v in d.items() if not k.startswith("@")}
131 print(d)
--> 132 model = cls(**d)
133 model.load_state_dict(state) # type: ignore
135 if include_json:
TypeError: __init__() missing 1 required positional argument: 'element_types'
Hello,
I am working through the tutorial to train an M3GNet model (https://matgl.ai/tutorials%2FTraining%20a%20M3GNet%20Potential%20with%20PyTorch%20Lightning.html), and it appears that the interface for M3GNetDataset no longer takes the inputs energies, forces, and stresses.
My matgl version is 0.8.3. Could I be advised on how to proceed? Thank you in advance.
I created a conda-forge package: https://anaconda.org/conda-forge/matgl
So you can now install matgl using:
conda install -c conda-forge matgl
Hello. I would like to train a new model with the pre-trained m3gnet and my own AIMD trajectories data. Could you share an example of how to do it? Thank you so much!
When trying to obtain the reference energies of single atoms or of non-periodic systems, m3gnet gives an error.
single atom:
>>> import ase
>>> import matgl
>>> import matgl.ext.ase
>>> potential = matgl.load_model("M3GNet-MP-2021.2.8-PES")
>>> calc_new = matgl.ext.ase.M3GNetCalculator_new(potential)  # user-patched calculator
>>> atoms3 = ase.Atoms('H', [[0,0,0]], cell=[100,100,100])
>>> atoms3.calc = calc_new
>>> atoms3.get_potential_energy()
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
[<ipython-input-9-47bee463321d>](https://localhost:8080/#) in <cell line: 3>()
1 atoms3 = Atoms('H', [[0,0,0]], cell = [100,100,100])
2 atoms3.calc = calc_new
----> 3 atoms3.get_potential_energy()
6 frames
[/usr/local/lib/python3.10/dist-packages/numpy/core/shape_base.py](https://localhost:8080/#) in stack(arrays, axis, out)
420 arrays = [asanyarray(arr) for arr in arrays]
421 if not arrays:
--> 422 raise ValueError('need at least one array to stack')
423
424 shapes = {arr.shape for arr in arrays}
ValueError: need at least one array to stack
non-periodic: (not expected to work)
>>> import ase
>>> import matgl
>>> import matgl.ext.ase
>>> potential = matgl.load_model("M3GNet-MP-2021.2.8-PES")
>>> calc_new = matgl.ext.ase.M3GNetCalculator_new(potential)
>>> atoms3 = ase.Atoms('H', [[0,0,0]])
>>> atoms3.calc = calc_new
>>> atoms3.get_potential_energy()
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
[<ipython-input-10-2b31f4d7515d>](https://localhost:8080/#) in <cell line: 3>()
1 atoms3 = Atoms('H', [[0,0,0]])
2 atoms3.calc = calc_new
----> 3 atoms3.get_potential_energy()
5 frames
[/usr/local/lib/python3.10/dist-packages/ase/atoms.py](https://localhost:8080/#) in get_volume(self)
1919 """Get volume of unit cell."""
1920 if self.cell.rank != 3:
-> 1921 raise ValueError(
1922 'You have {0} lattice vectors: volume not defined'
1923 .format(self.cell.rank))
ValueError: You have 0 lattice vectors: volume not defined
A workaround could be
atoms.positions.max(axis=0) - atoms.positions.min(axis=0) + calc.potential.model.cutoff
to obtain a bounding box around your atoms that places periodic images at least one cutoff away.
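A minimal sketch of that workaround, assuming atoms and the calc_new calculator from the snippets above; it pads the cell so periodic images sit at least one model cutoff apart:
import numpy as np

extent = atoms3.positions.max(axis=0) - atoms3.positions.min(axis=0)
atoms3.set_cell(np.diag(extent + calc_new.potential.model.cutoff))
atoms3.center()     # place the atoms in the middle of the new cell
atoms3.pbc = True
energy = atoms3.get_potential_energy()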
For analysis, single atoms or molecules were extracted from a simulation box in order to obtain their energy contribution.
The new way to obtain the pretrained model is very convenient, thank you for the update!
Hi, I tried to repeat the example "Training a M3GNet Formation Energy Model with PyTorch Lightning.ipynb", but I want to train this model to predict spectra as a vector. When I try to train the M3GNet model, I get the error below, even though I set the ntarget parameter.
import pytorch_lightning as pl
from pytorch_lightning.loggers import CSVLogger
from matgl.models import M3GNet
from matgl.utils.training import ModelLightningModule

# Set up the architecture of the M3GNet model.
model = M3GNet(
    element_types=elem_list,
    is_intensive=True,
    readout_type="set2set",
    ntarget=66,
)
# Set up the Lightning module and trainer.
lit_module = ModelLightningModule(model=model)
logger = CSVLogger("logs", name="M3GNet_training")
trainer = pl.Trainer(max_epochs=20, accelerator="gpu", logger=logger)
trainer.fit(model=lit_module, train_dataloaders=train_loader, val_dataloaders=val_loader)
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 54, in fetch
return self.collate_fn(data)
File "/usr/local/lib/python3.10/dist-packages/matgl/graph/data.py", line 31, in collate_fn
labels = torch.tensor([next(iter(d.values())) for d in labels], dtype=matgl.float_th) # type: ignore
ValueError: only one element tensors can be converted to Python scalars
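The last frame calls torch.tensor() on the raw label values; if each label is itself a multi-element tensor (e.g. a 66-point spectrum), that call reproduces exactly this error. A minimal illustration (the labels' type is an assumption, not taken from the report):
import torch

labels = [torch.zeros(66), torch.zeros(66)]  # vector labels stored as tensors
torch.tensor(labels)  # ValueError: only one element tensors can be converted to Python scalars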
https://colab.research.google.com/drive/1L05611HYB6UMb380xYWXp9nBZL51iHYc#scrollTo=6crRrc29Dawl
Ideally, matgl should be materials-code agnostic. The only materials-code-specific stuff should be in the ext and apps packages. Use in tests is OK as well.
I have removed the unnecessary dependency on pymatgen in _megnet.py. The only other problem I see is that AtomRef now contains a get_feature_matrix that optionally takes an input of list(Structures). Based on my reading, this seems unnecessary, given that it is always created from a list of graphs by the internal code itself. @kenko911 should verify and remove the option to pass a list of structures/molecules if it is not used in that manner.
The general idea is that all graph-architecture-based stuff should be completely agnostic to materials codes. So anything in layers, utils, graphs, data, and models should have no reference to any materials code (pymatgen, ASE or otherwise).
Once this is done, the ext and apps packages should do imports such that these two packages are made optional/extras.
munch is imported but not specified as an optional dependency in pyproject.toml/setup.py.
matgl/examples/trainer_beta/qm9_utils.py
Line 13 in d754da8
matgl/examples/trainer_beta/train.py
Line 12 in d754da8
Just running pip install -e ./matgl and executing these scripts raises:
ModuleNotFoundError: No module named 'munch'
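Until the dependency is declared, installing it manually works around the error:
pip install munch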
I see the following code repeated almost verbatim everywhere that an atomic graph is used:
graph, state_attr = converter.get_graph(structure)
bond_vec, bond_dist = compute_pair_vector_and_distance(graph)
graph.edata["bond_vec"] = bond_vec
graph.edata["bond_dist"] = bond_dist
Would it make sense to have those computed in the converter itself, since almost all use cases of an atomic graph will want those edge attributes? A possible wrapper is sketched below.
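One possible shape for that convenience API, written as a standalone helper (get_graph_with_vectors is a hypothetical name, not part of matgl):
from matgl.graph.compute import compute_pair_vector_and_distance

def get_graph_with_vectors(converter, structure):
    # Hypothetical wrapper: builds the graph and attaches the edge
    # attributes that nearly every downstream use of an atomic graph needs.
    graph, state_attr = converter.get_graph(structure)
    bond_vec, bond_dist = compute_pair_vector_and_distance(graph)
    graph.edata["bond_vec"] = bond_vec
    graph.edata["bond_dist"] = bond_dist
    return graph, state_attr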
0.8.5
matgl is holding up the whole MP stack with this error:
ValueError: Bad serialized model or bad model name. It is possible that you have an older model cached. Please clear your cache by running
python -c "import matgl; matgl.clear_cache()"
/opt/hostedtoolcache/Python/3.11.6/x64/lib/python3.11/site-packages/matgl/utils/io.py:213:
It's affecting pymatgen, atomate2, matcalc, emmet, ...
Please add tests for whatever broke so this doesn't happen again!
0.9.1
I retrained the M3GNet-MP-2021.2.8-PES potential following the tutorial at https://matgl.ai/tutorials%2FTraining%20a%20M3GNet%20Potential%20with%20PyTorch%20Lightning.html, using structures, energies, forces and stresses collected from our structure relaxations.
When I attempt to relax any structure using that potential, I get the error AttributeError: 'M3GNet' object has no attribute 'calc_stresses'.
from __future__ import annotations

import warnings

from ase.md.velocitydistribution import MaxwellBoltzmannDistribution
from pymatgen.core import Lattice, Structure
from pymatgen.io.ase import AseAtomsAdaptor

import matgl
from matgl.ext.ase import M3GNetCalculator, MolecularDynamics, Relaxer

warnings.simplefilter("ignore")
pot = matgl.load_model("MyRetrainedPotential") # this was put into ~/.cache/matgl after training
relaxer = Relaxer(potential=pot) # this produces following error:
Cell In[14], line 1
----> 1 relaxer = Relaxer(potential=pot)
File ~/miniconda3/envs/mgl/lib/python3.9/site-packages/matgl/ext/ase.py:211, in Relaxer.__init__(self, potential, state_attr, optimizer, relax_cell, stress_weight)
200 """
201 Args:
202 potential (Potential): a M3GNet potential, a str path to a saved model or a short name for saved model
(...)
208 stress_weight (float): conversion factor from GPa to eV/A^3.
209 """
210 self.optimizer: Optimizer = OPTIMIZERS[optimizer.lower()].value if isinstance(optimizer, str) else optimizer
--> 211 self.calculator = M3GNetCalculator(
212 potential=potential,
213 state_attr=state_attr,
214 stress_weight=stress_weight, # type: ignore
215 )
216 self.relax_cell = relax_cell
217 self.potential = potential
File ~/miniconda3/envs/mgl/lib/python3.9/site-packages/matgl/ext/ase.py:146, in M3GNetCalculator.__init__(self, potential, state_attr, stress_weight, **kwargs)
144 super().__init__(**kwargs)
145 self.potential = potential
--> 146 self.compute_stress = potential.calc_stresses
147 self.compute_hessian = potential.calc_hessian
148 self.stress_weight = stress_weight
File ~/miniconda3/envs/mgl/lib/python3.9/site-packages/torch/nn/modules/module.py:1695, in Module.__getattr__(self, name)
1693 if name in modules:
1694 return modules[name]
-> 1695 raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'M3GNet' object has no attribute 'calc_stresses'
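The traceback indicates the loaded object is a bare M3GNet model, while M3GNetCalculator reads calc_stresses/calc_hessian from a Potential wrapper. A minimal sketch of a possible fix, under the assumption that the retrained checkpoint deserializes as an M3GNet instance:
from matgl.apps.pes import Potential

model = matgl.load_model("MyRetrainedPotential")  # assumed to return a bare M3GNet
pot = Potential(model=model)  # the wrapper that defines calc_stresses etc.
relaxer = Relaxer(potential=pot)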
I would like to request the inclusion of an example demonstrating the M3GNet training for property prediction. Currently, there's no dedicated example available.
I attempted to adapt the MEGNet example (examples/Training a MEGNet Formation Energy Model with PyTorch Lightning.ipynb) for M3GNet, but I encountered errors and unexpected issues in the process.
Please add an example notebook (or concise code snippet) demonstrating how to train M3GNet for property prediction. This would greatly assist users like me who are looking to use M3GNet for materials property prediction tasks.
I am trying to run the script here:
https://github.com/materialsvirtuallab/matgl/blob/main/examples/training/MEGNet/MP-2018.6.1-Eform/train_mp_eform.py
And I am getting the following error.
Traceback (most recent call last):
File "/home/trial/matgl-try/matgl/examples/training/MEGNet/MP-2018.6.1-Eform/train_mp_eform.py", line 130, in
trainer.train(
TypeError: train() got an unexpected keyword argument 'n_epochs'
Here is how to reproduce this issue assuming you have Conda or Miniconda installed:
Create a new directory:
mkdir matgl-try
Create a conda environment with Python 3.9:
conda create -n matgl-trial python=3.9
Activate the Conda environment:
conda activate matgl-trial
Clone the repo:
git clone https://github.com/materialsvirtuallab/matgl.git
cd into the repo:
cd matgl/
Install the package using the editable option:
pip install -e .
Run the script:
python examples/training/MEGNet/MP-2018.6.1-Eform/train_mp_eform.py
In fact, running help(trainer.train) gives the following:
Help on method train in module torch.nn.modules.module:
train(mode: bool = True) -> ~T method of matgl.utils.training.ModelTrainer instance
    Sets the module in training mode. This has any effect only on certain modules. See the documentation of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. :class:`Dropout`, :class:`BatchNorm`, etc.
    Args: mode (bool): whether to set training mode (``True``) or evaluation mode (``False``). Default: ``True``.
    Returns: Module: self
So the train being resolved is torch.nn.Module.train, whose arguments do not match the ones the script provides. I think this error is due to updates that rolled out this month for both torch and pytorch-lightning, but I am not sure how to fix it.
Dear developers,
Hello there! I hope you're doing well.
I have a quick question regarding the current state of GPU training support in your matgl package.
As of now, does it support GPU training, or is it limited to CPU only?
When using the default dgl in matgl, it fails:
DGLError: [09:37:23] /opt/dgl/src/runtime/c_runtime_api.cc:82: Check failed: allow_missing: Device API cuda is not enabled. Please install the cuda version of dgl.
However, when the cu102 build of dgl is used, it fails to load split_dataset from dgl.data.utils:
OSError Traceback (most recent call last)
Cell In[1], line 13
11 import pytorch_lightning as pl
12 import torch
---> 13 from dgl.data.utils import split_dataset
......
OSError: libcudart.so.10.2: cannot open shared object file: No such file or directory
I appreciate any insights you can provide on this matter. Thank you!
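For what it's worth, DGL publishes CUDA-specific wheels on its own index; a build matching the local CUDA runtime can usually be installed like this (the cu118 tag below is only an example; substitute the local CUDA version):
pip install dgl -f https://data.dgl.ai/wheels/cu118/repo.html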
0.0.3
The calc.results["energy"] property from the M3GNetCalculator should be a float but is a numpy array of length 1. This behavior is not observed in CHGNet, for what it's worth.
import matgl
from matgl.ext.ase import M3GNetCalculator
from ase.build import bulk
atoms = bulk("Cu")
potential = matgl.load_model("M3GNet-MP-2021.2.8-DIRECT-PES")
atoms.calc = M3GNetCalculator(potential)
e = atoms.get_potential_energy() # or atoms.calc.results["energy"]
print(e)
You can compare this with:
from ase.build import bulk
from ase.calculators.emt import EMT
atoms = bulk("Cu")
atoms.calc = EMT()
e = atoms.get_potential_energy()
print(e)
The output is:
array(-32.750034, dtype=float32)
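A minimal interim workaround on the caller's side (an assumption, not an official fix) is to unwrap the value explicitly:
e = float(atoms.get_potential_energy())  # cast the length-1 numpy array to a Python float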
Hi, thank you for these amazing codes.
I tried to follow the example notebook to train an M3GNet potential. I added trainer.test(model=lit_module, dataloaders=test_loader) to test the model, but I got: RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn. From searching, I found this can be due to requires_grad=False on a tensor, but the error did not occur during training and validation, so that might not be the cause. Could you help me solve this? Thanks!
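One hedged guess, not a confirmed fix: force/stress predictions need autograd, and Lightning runs test loops in inference mode by default, which disables grad tracking. Turning that off may avoid the error:
import pytorch_lightning as pl

# inference_mode=False keeps autograd enabled during trainer.test(), which
# heads that backpropagate through atomic positions require.
trainer = pl.Trainer(max_epochs=20, accelerator="gpu", logger=logger, inference_mode=False)
trainer.test(model=lit_module, dataloaders=test_loader)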