Comments (13)

shyuep commented on June 1, 2024

Hmm interesting. The MEGNet implementation doesn't suffer from this problem. So it must be in the three-body computations. @kenko911 pls fix this ASAP.

shyuep commented on June 1, 2024

I debugged a little. I can confirm that in the forward method, the node_feat and edge_feat are consistent. But the

    node_vec = self.readout(g)
    vec = torch.hstack([node_vec, state_feat]) if self.inc...

are stochastic. So there must be some issue with the readout layer. MEGNet does not use the same readout.
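
A quick way to bisect this kind of problem is a small helper that calls a layer repeatedly on the same input and checks whether the outputs match (check_deterministic is a hypothetical name, not part of matgl; `model` and the graph `g` are assumed to be in scope):

    import torch

    def check_deterministic(module, *inputs, n_repeats=3):
        """Call `module` repeatedly on identical inputs and report whether
        all outputs match. Useful for bisecting which layer is stochastic."""
        outputs = [module(*inputs) for _ in range(n_repeats)]
        return all(torch.allclose(outputs[0], out) for out in outputs[1:])

    # e.g. check_deterministic(model.readout, g) returning False
    # would point the finger at the readout layer.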

kenko911 commented on June 1, 2024

@BowenD-UCB Thanks a lot for reporting the issue and I will figure out the problem and fix it ASAP.

JiQi535 commented on June 1, 2024

It seems that the M3GNet UP (universal potential) does not have this issue.
[Screenshot, 2023-07-26: repeated M3GNet UP predictions]

shyuep commented on June 1, 2024

Because the UP does not predict an intensive property, it does not call the readout.
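
(For context: an extensive readout is just a parameter-free reduction over the nodes, so there is nothing to randomly initialize. A minimal illustration with dgl, using an arbitrary feature name "h":)

    import dgl
    import torch

    g = dgl.rand_graph(5, 10)
    g.ndata["h"] = torch.ones(5, 4)

    # A sum readout has no learnable weights, so it is trivially deterministic.
    print(dgl.readout_nodes(g, "h", op="sum"))  # always tensor([[5., 5., 5., 5.]])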

shyuep commented on June 1, 2024

I have confirmed that the stochasticity comes from the Set2Set in dgl.nn. For instance, I replaced the set2set call in Set2SetReadout with the following code:

    # Note: Set2Set (dgl.nn.Set2Set) is re-initialized on every iteration.
    for i in range(3):
        set2set = Set2Set(in_feats, n_iters=self.num_steps, n_layers=self.num_layers)
        print(g.ndata["node_feat"])
        out_tensor = set2set(g, g.ndata["node_feat"])
        print(f"out={out_tensor}")

The out tensor is different in all three iterations, despite there being no change in the initialization arguments or the input.
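
A standalone reproduction of this behavior, independent of matgl (the graph and feature sizes are arbitrary):

    import dgl
    import torch
    from dgl.nn import Set2Set

    g = dgl.rand_graph(6, 12)
    feat = torch.randn(6, 16)

    # Each construction of Set2Set draws fresh random LSTM weights,
    # so two instances disagree on identical input...
    out_a = Set2Set(16, n_iters=3, n_layers=3)(g, feat)
    out_b = Set2Set(16, n_iters=3, n_layers=3)(g, feat)
    print(torch.allclose(out_a, out_b))  # False

    # ...while a single instance reused twice is deterministic.
    s2s = Set2Set(16, n_iters=3, n_layers=3)
    print(torch.allclose(s2s(g, feat), s2s(g, feat)))  # True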

shyuep commented on June 1, 2024

I just pushed a fix. The trick is to initialize the Set2Set only once. Every time it is initialized, it introduces stochasticity.
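
In other words, the layer needs to follow the standard nn.Module pattern of being constructed once in __init__ and reused in forward. A simplified sketch of the idea (not the exact matgl code):

    import torch.nn as nn
    from dgl.nn import Set2Set

    class Set2SetReadout(nn.Module):
        """Sketch: the Set2Set is created once, so its weights are fixed
        after construction and are registered with the parent module."""

        def __init__(self, in_feats: int, num_steps: int, num_layers: int):
            super().__init__()
            self.set2set = Set2Set(in_feats, n_iters=num_steps, n_layers=num_layers)

        def forward(self, g):
            # Reuse the same fixed-weight layer on every call.
            return self.set2set(g, g.ndata["node_feat"])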

shyuep commented on June 1, 2024

However, it should be noted that multiple loads of the model will still lead to stochasticity. I am not sure how to solve that. I am reopening this issue for now.

shyuep commented on June 1, 2024

@kenko911 I think you need to look into whether there is some random seed somewhere that needs to be fixed in dgl.nn for the Set2Set initialization. It is rather strange that it is not deterministic.

shyuep commented on June 1, 2024

Scratch what I wrote previously: dgl.seed does not do anything here. Right now, the stochasticity with a single model load is solved, but the stochasticity across multiple model loads is not.

    import random

    import dgl
    import torch
    from pymatgen.core import Lattice, Structure

    import matgl

    # Fix all the random seeds we know about.
    torch.manual_seed(0)
    random.seed(0)
    dgl.seed(0)

    # This is the structure obtained from the Materials Project.
    struct = Structure.from_spacegroup("Pm-3m", Lattice.cubic(4.1437), ["Cs", "Cl"], [[0, 0, 0], [0.5, 0.5, 0.5]])

    # The model is reloaded inside the loop; each load gives a different prediction.
    for i in range(10):
        model = matgl.load_model("M3GNet-MP-2018.6.1-Eform")
        eform = model.predict_structure(struct)
        print(f"The predicted formation energy for CsCl is {float(eform.numpy()):.3f} eV/atom.")

still gives stochastic results. Note that the only difference between @BowenD-UCB's code and this one is that the model loading is now inside the for loop, so a fresh load is carried out on each iteration. If the load is moved outside the loop, the results are deterministic within that session.
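
One way to confirm which weights are responsible is to diff the parameters of two freshly loaded models; any tensor that differs between loads was randomly initialized rather than restored from the checkpoint (sketch, assuming the loaded model is a torch.nn.Module):

    m1 = matgl.load_model("M3GNet-MP-2018.6.1-Eform")
    m2 = matgl.load_model("M3GNet-MP-2018.6.1-Eform")

    # Parameters restored from disk must be identical across loads;
    # anything printed here was randomly re-initialized.
    for (name, p1), (_, p2) in zip(m1.named_parameters(), m2.named_parameters()):
        if not torch.allclose(p1, p2):
            print("differs:", name)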

shyuep commented on June 1, 2024

It seems this is a known issue with LSTM: pytorch/pytorch#35661

shyuep commented on June 1, 2024

Note that if you fix torch.manual_seed, all the stochasticity is gone.

This is the fix that I pushed. I am still not sure why this non-deterministic behavior occurs in M3GNet but not MEGNet; MEGNet seems to work fine without any manual seed set. It may be due to the number of layers in the Set2Set: M3GNet uses 3 layers while MEGNet uses just 1.
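
That is, the earlier loop becomes reproducible once the seed is fixed before each load (a sketch of the workaround, reusing the setup above):

    # Seeding before each load makes the random init of any unsaved
    # weights reproducible, so every iteration prints the same value.
    for i in range(10):
        torch.manual_seed(0)
        model = matgl.load_model("M3GNet-MP-2018.6.1-Eform")
        eform = model.predict_structure(struct)
        print(f"The predicted formation energy for CsCl is {float(eform.numpy()):.3f} eV/atom.")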

shyuep commented on June 1, 2024

@BowenD-UCB Just a note that Kenko pushed a final fix this morning. A new version has been released. The main issue was that the final layer was not created in __init__ and so its weights were not saved. The new code fixes this problem. The only model that was affected was the M3GNet formation energy model. I also added an integration test to make sure it does not recur in the future.
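
For reference, this bug class is easy to reproduce in plain PyTorch: a submodule created inside forward is never registered with its parent, so state_dict() does not include its weights and they are re-randomized on every construction or load (illustrative sketch, not the matgl code):

    import torch.nn as nn

    class Broken(nn.Module):
        def forward(self, x):
            # Created here, the layer is NOT registered as a submodule:
            # its weights never enter state_dict() and differ per call.
            return nn.Linear(4, 1)(x)

    class Fixed(nn.Module):
        def __init__(self):
            super().__init__()
            self.final = nn.Linear(4, 1)  # registered, saved, restored

        def forward(self, x):
            return self.final(x)

    print(list(Broken().state_dict()))  # []
    print(list(Fixed().state_dict()))   # ['final.weight', 'final.bias']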
