microsoft / Graphormer
Graphormer is a general-purpose deep learning backbone for molecular modeling.
License: MIT License
Graphormer/graphormer/model.py
Line 20 in 740e6ff
When the embedding weights are re-initialized, the row at index 0 is also re-initialized from a normal distribution, so the padding vector in the feature input becomes non-zero. This looks like a bug.
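A minimal sketch of the behavior being reported, assuming the encoder is a plain nn.Embedding with padding_idx=0 (the sizes here are illustrative, not the exact Graphormer configuration):

import torch
import torch.nn as nn

# nn.Embedding zeroes the padding row at construction time...
emb = nn.Embedding(512 * 9 + 1, 80, padding_idx=0)
print(emb.weight[0].abs().sum())         # tensor(0.)

# ...but a blanket re-initialization overwrites it:
emb.weight.data.normal_(mean=0.0, std=0.02)
print(emb.weight[0].abs().sum())         # non-zero now

# One possible fix: re-zero the padding row after re-initializing.
with torch.no_grad():
    emb.weight[emb.padding_idx].fill_(0)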
Hello, I am trying to run Graphormer on other commonly used datasets from MoleculeNet (https://moleculenet.org/datasets-1), such as BACE and BBBP, to check its performance. I used the default hyperparameters from the molhiv script, but the results are quite poor...
Hello!
Any chance of uploading pre-trained models for the different experiments?
Thanks a lot!
In is2re.py, should expand_pos_relaxed use pos or pos_relaxed?
Graphormer/graphormer/tasks/is2re.py
Line 114 in 377cf71
I followed https://graphormer.readthedocs.io/en/latest/Datasets.html
and got: 'fairseq-train: error: unrecognized arguments: --user-data-dir'
I am trying to use Graphormer on a brain-intelligence regression problem: we use brain connectivity as the input graph and want to solve a graph regression task.
We set all edge and node features as integers.
We are facing the error below:
expected tensor for argument #1 'indices' to have scalar type long but got torch.FloatTensor instead
So I tried changing the code in model.py (originally line 162) in various ways.
However, after I cast x to torch.cuda.LongTensor, I am still facing another CUDA-related error.
Can you help me with how to solve the problem?
Thank You
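For context, the first error arises because nn.Embedding lookups require integer (long) indices. Below is a hedged sketch of the cast, with illustrative sizes. The follow-up CUDA error is often an out-of-range index (a feature value at or above the embedding size) triggering a device-side assert, though that is an assumption about this particular setup:

import torch
import torch.nn as nn

# Illustrative sizes; not the exact Graphormer configuration.
atom_encoder = nn.Embedding(512 * 9 + 1, 80, padding_idx=0)

x = torch.rand(4, 9) * 10        # float node features -> raises the 'indices' error
x = x.long()                     # embedding lookups need long indices
assert int(x.max()) < atom_encoder.num_embeddings   # guards against CUDA asserts
node_feature = atom_encoder(x).sum(dim=-2)           # [n_nodes, hidden_dim]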
Consider packaging the Graphormer code so that this software can be distributed as an installable package.
model.py, line 114:
in_degree, out_degree = batched_data.in_degree, batched_data.in_degree
Shouldn't the second one be batched_data.out_degree?
Hi! This is Stella from Seoul National University.
I'd like to ask how I can implement a regression task with Graphormer.
I adapted the ogb module for our data and set num_class to -1, like the other regression datasets.
Then I ran into a problem editing the model dimensions in model.py, lines 62-75.
I think 512*9+1 is something like a vocabulary size, calculated as 512 * (number of node feature categories) + 1.
Is my guess right? In issue #32 you said it should be greater than the total number of classes across all categories; how should I set this number for a regression task? Maybe the number of graphs?
Thank you!
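For what it's worth, a quick sanity check of the guess above, assuming OGB-style molecules with 9 categorical node-feature columns and an offset of 512 ids per column:

num_feature_columns = 9    # OGB molecular graphs carry 9 node-feature columns
offset = 512               # per-column id range used by convert_to_single_emb
vocab_size = offset * num_feature_columns + 1   # 4609 == 512 * 9 + 1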
It looks like the only use of edge encodings right now is to change the bias of the attention. So consider a toy example: two identical nodes with a single edge between them, whose edge label (edge feature) is either 0 or 1, and a binary classification task where the desired prediction is the label of this edge. How would the network solve this task? Just using the edge encoding as an attention bias should not be enough here, right? I am asking because I successfully applied Graphormer to a similar task, but now I am not exactly sure how it works.
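A toy sketch of the mechanism in question (simplified and hypothetical, not Graphormer's actual code): an edge-dependent bias added to the attention scores changes the softmax mixing weights, so the per-node outputs depend on the edge label whenever the value vectors differ at all; downstream layers (or a virtual node) can then read that out. If the two value vectors were literally identical, the bias alone indeed could not change the output, which seems to be the crux of the question.

import torch
import torch.nn.functional as F

h = torch.tensor([[1.0, 0.0],       # two nodes with distinguishable features
                  [0.0, 1.0]])

for label, bias in [(0, 0.0), (1, 2.0)]:          # edge label -> made-up learned bias
    scores = h @ h.t()                             # raw attention scores
    scores = scores + bias * (1 - torch.eye(2))    # bias only on the edge (off-diagonal)
    attn = F.softmax(scores, dim=-1)
    out = attn @ h                                 # updated node representations
    print(label, out)                              # per-node outputs differ by label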
Hi, thanks for your research. I only see the attention module; the other modules are not available.
Can you share more code for the model?
Hi, I have a quick question: how can I customize the number of epochs for training?
Hi,
I trained some models on PCQM4M-LSC, ogbg-molhiv, and ZINC following the settings in the paper. The results for PCQM4M-LSC and ogbg-molhiv match the paper, but I have run the ZINC experiment several times and the MAE is always above 0.14 (with or without --intput_dropout_rate 0), whereas it should be about 0.12 according to the paper. Here is my command:
python3 entry.py --dataset_name ZINC --hidden_dim 80 --ffn_dim 80 --num_heads 8 --tot_updates 400000 --batch_size 256 --warmup_updates 40000 --precision 16 --intput_dropout_rate 0 --gradient_clip_val 5 --num_workers 8 --gpus 1 --accelerator ddp --max_epochs 10000
Hi authors, thanks for your great work. While trying to reproduce the No. 1 result on the ogb-pcba leaderboard, I couldn't find the checkpoints mentioned in your paper that were pre-trained for the PCBA task, so I turned to the PCQM checkpoint you provided for the PCQM task. But while loading the checkpoint, an error occurred even though I changed the hidden dimension and FFN dimension from 1024 to 768:
`RuntimeError: Error(s) in loading state_dict for Graphormer:
size mismatch for atom_encoder.weight: copying a param with shape torch.Size([4737, 768]) from checkpoint, the shape in current model is torch.Size([4609, 768]).
size mismatch for edge_encoder.weight: copying a param with shape torch.Size([769, 32]) from checkpoint, the shape in current model is torch.Size([1537, 32]).`
Thus, may I ask two questions about the reproduction process:
Looking forward to your reply.
Thank you!
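Not an official answer, but one workaround sketch for the size mismatch above: copy only the overlapping rows, under the (possibly wrong) assumption that the shared vocabulary rows mean the same tokens in both versions. Here model is an already-constructed Graphormer instance and the checkpoint filename is hypothetical:

import torch

# Hypothetical filename; Lightning checkpoints keep weights under "state_dict".
ckpt = torch.load("pcqm_checkpoint.ckpt", map_location="cpu")["state_dict"]
state = model.state_dict()

with torch.no_grad():
    for name, param in ckpt.items():
        if name not in state:
            continue
        if state[name].shape == param.shape:
            state[name].copy_(param)
        else:
            # e.g. atom_encoder.weight: [4737, 768] vs [4609, 768]
            rows = min(state[name].shape[0], param.shape[0])
            state[name][:rows].copy_(param[:rows])   # overlapping rows only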
What do I have to modify in the code in order to try the model on ogbn-proteins (wrapper, collator, ...)?
Hello,
I got an error when running your code. Here:
"Graphormer/OGB-LSC/graphormer/src/ogb_wrapper.py", line 34, in preprocess_item
all_rel_pos_3d_with_noise = torch.from_numpy(algos.bin_rel_pos_3d_1(item.all_rel_pos_3d, noise=noise)).long()
AttributeError: module 'algos' has no attribute 'bin_rel_pos_3d_1'
Hello,
Thank you for sharing; this is great work!
When I trained on my own data with your code, I found a problem:
at the start of every epoch there is a long wait before GPU utilization rises.
What might be the reason for this?
This issue is to maintain all feature requests on one page.
Note to contributors: if you want to work on a requested feature, re-open the linked issue. Everyone is welcome to work on any of the issues below.
Note to maintainers: all feature requests should be consolidated on this page. When new feature request issues come in, close them and create new entries here, with links to the issues. The one exception is issues marked good first issue;
these should be left open so they are discoverable by new contributors.
We would like to hold a vote here to prioritize these requests.
If you think a feature request is necessary for you, you can vote for it with the following process:
Get the issue (feature request) number.
Search for the number in this issue and check whether a vote for it already exists.
If the vote exists, add a 👍 to it.
If the vote doesn't exist, create a new one by replying to this thread and including the issue number.
I got this error when running pcqv2.sh on a 24 GB RTX 3090.
I'm trying to reproduce the reported results on the OGB and ZINC datasets, but I failed to match the reported performance.
I first directly ran the provided script hiv.sh
to train a Graphormer on the MolHiv dataset without pre-training. The final AUC was 73.10%. Then I followed the instructions and hyperparameter settings in the paper to do pre-training: I pre-trained on PCQM4M for 20 epochs (until the loss converged) and fine-tuned the model on MolHiv for 8 epochs (as specified in the script). The best result turned out to be 76.25%.
Despite some improvement, the final AUC is not as high as reported in the paper. I also tried to reproduce the result on ZINC via the example script, but the best MAE is 0.1576, which is worse than the 0.122 reported in the paper.
I'm wondering what I'm likely missing that leads to this poor performance. Could you share more reproduction details? My Python environment is listed below:
pytorch==1.9.0
pytorch-geometric==1.7.2
pytorch-scatter==2.0.8
pytorch-sparse==0.6.11
pytorch-lightning==1.3.0
ogb==1.3.1
cudatoolkit==11.1
I'd really appreciate it if someone could share their reproduced results and give me some suggestions.
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
Provide a pre-trained 48-layer Graphormer (12 layers * 4 blocks) model on OC20 to help the community better experiment with this large-scale dataset.
Hi! Thanks for your great work! I wonder how Graphormer performs on traditional GNN graph classification benchmarks (such as the ones used in the original GIN paper). I've tried to apply Graphormer to my task, but the result is not ideal without pre-training. Are pre-training and a large dataset necessary for Graphormer's strong performance?
Graphormer/graphormer/model.py
Line 114 in 740e6ff
This line has an obvious error. Thought you may want to know.
self.in_degree_encoder = nn.Embedding(64, hidden_dim, padding_idx=0)
Why is the size of the degree dictionary 64 or 512? Thanks!
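As a rough illustration (the hidden size here is made up): the table size is simply the largest degree the embedding can index, so nodes with higher degree would need clamping or a larger table, which presumably explains why the bound differs between datasets.

import torch
import torch.nn as nn

in_degree_encoder = nn.Embedding(64, 80, padding_idx=0)   # 80 is illustrative

in_degree = torch.tensor([1, 3, 63])      # per-node in-degrees; must be < 64
deg_emb = in_degree_encoder(in_degree)    # [3, 80], added to the node features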
Hi! This is Stella from Seoul National University; I'm getting a lot of help from your code.
I have a question about entry.py, line 87.
Originally it has metric = 'valid_' + get_dataset(dm.dataset_name)['metric']
but when I ran the model, I got this error:
'pytorch_lightning.utilities.exceptions.MisconfigurationException: ModelCheckpoint(monitor='valid_mae') not found in the returned metrics: ['train_loss']. HINT: Did you call self.log('valid_mae', value) in the LightningModule?'
So I changed line 87 to metric = 'train_loss'
and it runs.
I'm afraid I'm doing something wrong; is this the right way to modify the code?
Here is some useful information about my project:
'num_class': 1,
'loss_fn': F.l1_loss,
'metric': 'mae',
'metric_mode': 'min',
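Not an authoritative answer, but monitoring train_loss changes what the checkpoint selects for. Following the HINT in the error message, here is a sketch of logging the validation metric so ModelCheckpoint(monitor='valid_mae') can find it; the model's call signature and batch layout (batch.y) are assumptions:

import torch.nn.functional as F

# Inside the LightningModule (the hook name is standard PyTorch Lightning):
def validation_step(self, batch, batch_idx):
    y_hat = self(batch).squeeze(-1)
    valid_mae = F.l1_loss(y_hat, batch.y.float())
    self.log('valid_mae', valid_mae, sync_dist=True)   # visible to ModelCheckpoint
    return valid_mae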
Dear Graphormer authors,
thanks for this great piece of software!
I have some feature requests.
Can you please add the functionality for evidential deep learning?
See article:
ACS Cent. Sci. 2021, 7, 8, 1356–1367
Please add the 10 smaller datasets from MoleculeNet to the benchmarks. They are ogbg-moltox21, ogbg-molbace, ogbg-molbbbp, ogbg-molclintox, ogbg-molmuv, ogbg-molsider, and ogbg-moltoxcast for (multi-task) binary classification, and ogbg-molesol, ogbg-molfreesolv, and ogbg-mollipo for regression.
See https://ogb.stanford.edu/docs/graphprop/
Please add functionality for molecular representation pre-training via attribute masking
See Strategies for Pre-training Graph Neural Networks
Please add metrics described in the Regression Metrics Guide
As the manual selection of parameters for a graph neural network is difficult, please add support
for some automated machine learning techniques.
See for example techniques described in AutoGL
Many thanks.
Hi, thank you for your exciting work on Graphormer. I am curious to understand the mechanisms of this model. I tried to instantiate the example encoder layer and commented out the data import lines.
It seems the multi-head attention is not imported, and I am not sure whether the MHA module in Graphormer is customized or not. I am mostly curious about the implementation of the spatial encoding part.
Would it be possible for you to provide a toy example? A forward pass over a random 10x10 node matrix would do.
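In case it helps, a hedged toy sketch of Graphormer-style spatial encoding (single head, no edge features, not the repository's actual module): shortest-path distances index a learnable bias table that is added to the attention scores before the softmax.

import torch
import torch.nn as nn
import torch.nn.functional as F

n, d, max_dist = 10, 16, 5
h = torch.randn(n, d)                        # random node features (10 nodes)
spd = torch.randint(0, max_dist, (n, n))     # stand-in shortest-path distances

spatial_bias = nn.Embedding(max_dist, 1)     # one learnable bias per distance
q_proj, k_proj, v_proj = nn.Linear(d, d), nn.Linear(d, d), nn.Linear(d, d)

q, k, v = q_proj(h), k_proj(h), v_proj(h)
scores = q @ k.t() / d ** 0.5
scores = scores + spatial_bias(spd).squeeze(-1)   # spatial encoding as attention bias
out = F.softmax(scores, dim=-1) @ v               # [10, 16] updated node states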
Dear authors, thank you for this exciting work. Do you have the results on ogbg-molhiv and ogbg-molpcba without pretraining?
Thank you for your leaderboard submission. Please provide the exact command to reproduce your leaderboard results.
Hello,
I ran into a problem while using https://github.com/microsoft/Graphormer/blob/main/graphormer/algos.pyx
Here is the error:
Traceback (most recent call last):
File "C:\Users\James\AppData\Local\Temp/ipykernel_1060/863204263.py", line 1, in <module>
shortest_path_result, path = algos.floyd_warshall(adj.numpy())
File "algos.pyx", line 19, in algos.floyd_warshall
cdef numpy.ndarray[long, ndim=2, mode='c'] path = numpy.zeros([n, n], dtype=numpy.int64)
ValueError: Buffer dtype mismatch, expected 'long' but got 'long long'
I'm using 64-bit Windows and Python 3.8.2; the NumPy version is 1.21.2.
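For what it's worth, the mismatch is a Windows quirk: on 64-bit Windows a C long is 32 bits, while numpy.int64 corresponds to long long, so buffers typed long in algos.pyx don't match. A hedged sketch of a width-explicit retyping of the quoted declaration (Cython, inside floyd_warshall, assuming numpy has been cimported):

# numpy.int64_t is fixed-width, unlike the platform-dependent C `long`:
cdef numpy.ndarray[numpy.int64_t, ndim=2, mode='c'] path = \
    numpy.zeros([n, n], dtype=numpy.int64)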
Provide pre-trained 24-layer Graphormer (Graphormer-large with 1024 hidden dims) on PCQM4M and PCQM4M v2 to help the community better utilize Graphormer on downstream tasks.
Calculating shortest paths on the CPU becomes the bottleneck when the batch size is small. A more efficient implementation than the current Cython one is desired.
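One possible direction, sketched under the assumption that a dense adjacency matrix is available: SciPy's C implementation of shortest paths. This is a suggestion, not the project's plan:

import numpy as np
from scipy.sparse.csgraph import shortest_path

adj = (np.random.rand(10, 10) < 0.3).astype(np.int64)    # toy adjacency matrix
dist = shortest_path(adj, method='FW', unweighted=True)  # Floyd-Warshall in C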
in_degree, out_degree = batched_data.in_degree, batched_data.in_degree
--->
in_degree, out_degree = batched_data.in_degree, batched_data.out_degree
How can I add a new dataset for classification with more than two classes?
Hello,
I have read your code and have a question:
where in the code are the encoding methods mentioned in the paper implemented?
Hi,
Thanks for your interesting work. I have a problem regarding the evaluation. I downloaded your checkpoints from here, then ran the following command as given in the README (for the all_fold_seed0
checkpoint):
conda activate graphormer-lsc
export arch="--ffn_dim 768 --hidden_dim 768 --attention_dropout_rate 0.1 --dropout_rate 0.1 --n_layers 12 --peak_lr 2e-4 --edge_type multi_hop --multi_hop_max_dist 20 --weight_decay 0.0 --intput_dropout_rate 0.0"
export ckpt_path="checkpoints"
export ckpt_name="all_fold_seed0.ckpt"
bash inference.sh
The output log is:
Global seed set to 1
> PCQM4M-LSC loaded!
{'num_class': 1, 'loss_fn': <function l1_loss at 0x7fc2381b3950>, 'metric': 'mae', 'metric_mode': 'min', 'evaluator': <ogb.lsc.pcqm4m.PCQM4MEvaluator object at 0x7fc1995d2110>, 'dataset': MyPygPCQM4MDataset2(3803453), 'max_node': 128}
> dataset info ends
total params: 47167841
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
Global seed set to 1
initializing ddp: GLOBAL_RANK: 0, MEMBER: 1/1
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
len(val_dataloader) 1487
Validating: 100%|██████████| 1487/1487 [03:04<00:00, 7.42it/s]
0.027769196778535843
--------------------------------------------------------------------------------
DATALOADER:0 VALIDATE RESULTS
{'valid_mae': 0.027769196778535843}
--------------------------------------------------------------------------------
[{'valid_mae': 0.027769196778535843}]
I assumed I should get results near the "validate MAE" column of Table 1, but the number is quite different. Am I missing something?
Thanks for your help.
def convert_to_single_emb(x, offset=512):
    feature_num = x.size(1) if len(x.size()) > 1 else 1
    feature_offset = 1 + torch.arange(
        0, feature_num * offset, offset, dtype=torch.long
    )
    x = x + feature_offset
    return x
Could you tell me why this function was added? I am quite confused about it.
Thanks!
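For anyone else wondering, my reading (an interpretation, not an official answer): the offset shifts each categorical feature column into its own id range, so one shared embedding table can encode every column without collisions (column 0 maps to ids 1-512, column 1 to 513-1024, and so on, with 0 left for padding):

import torch

x = torch.tensor([[3, 0, 7]])                 # one node, three categorical features
print(convert_to_single_emb(x, offset=512))   # tensor([[   4,  513, 1032]])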
Hi, I am having trouble importing PygPCQM4MDataset. I ran the following line of code: from ogb.lsc.pcqm4m_pyg import PygPCQM4MDataset
and it threw this error: ImportError: cannot import name 'smiles2graph' from 'ogb.utils' (/usr/local/lib/python3.7/dist-packages/ogb/utils/__init__.py)
I am trying to run the code on Colab and fulfilled the code's requirements, but this error popped up.
The architecture design changed slightly between Graphormer v1.0 and v2.0, so new hyper-parameter configurations are required for the pre-trained ogbg-molpcba model.
Hi, what does the function convert_to_single_emb do? And why should an offset be added to the original features? Thank you.
Hi,
How can I print the eval accuracy after each epoch in the new version of Graphormer?