microsoft / Graphormer
Graphormer is a general-purpose deep learning backbone for molecular modeling.
License: MIT License
Graphormer/graphormer/model.py
Line 20 in 740e6ff
When the embedding weights are re-initialized, the row at index 0 is also re-initialized from a normal distribution, so the padding vector in the feature input becomes non-zero. This looks like a bug.
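A minimal sketch of the behavior being reported, assuming the encoder is a plain nn.Embedding with padding_idx=0 (the sizes here are illustrative, not the exact Graphormer configuration):

import torch
import torch.nn as nn

# nn.Embedding zeroes the padding row at construction time...
emb = nn.Embedding(512 * 9 + 1, 80, padding_idx=0)
print(emb.weight[0].abs().sum())         # tensor(0.)

# ...but a blanket re-initialization overwrites it:
emb.weight.data.normal_(mean=0.0, std=0.02)
print(emb.weight[0].abs().sum())         # non-zero now

# One possible fix: re-zero the padding row after re-initializing.
with torch.no_grad():
    emb.weight[emb.padding_idx].fill_(0)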
Hello, I am trying to run Graphormer on other commonly used datasets from MoleculeNet (https://moleculenet.org/datasets-1), such as BACE and BBBP, to check its performance. I used the default hyperparameters from the molhiv script, but the results are quite poor...
Hello!
Any chance of uploading pre-trained models for the different experiments?
Thanks a lot!
In is2re.py, should expand_pos_relaxed use pos or pos_relaxed?
Graphormer/graphormer/tasks/is2re.py
Line 114 in 377cf71
I followed https://graphormer.readthedocs.io/en/latest/Datasets.html
and got: 'fairseq-train: error: unrecognized arguments: --user-data-dir'
I am trying to use Graphormer on a brain-intelligence regression problem: we use brain connectivity as the input graph and want to solve a graph regression task.
We set all edge and node features as integers.
We are facing the error below:
expected tensor for argument #1 'indices' to have scalar type long but got torch.FloatTensor instead
So I tried changing the code in model.py (originally line 162) in various ways.
However, after I cast x to torch.cuda.LongTensor, I am still facing another CUDA-related error.
Can you help me with how to solve the problem?
Thank You
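For context, the first error arises because nn.Embedding lookups require integer (long) indices. Below is a hedged sketch of the cast, with illustrative sizes. The follow-up CUDA error is often an out-of-range index (a feature value at or above the embedding size) triggering a device-side assert, though that is an assumption about this particular setup:

import torch
import torch.nn as nn

# Illustrative sizes; not the exact Graphormer configuration.
atom_encoder = nn.Embedding(512 * 9 + 1, 80, padding_idx=0)

x = torch.rand(4, 9) * 10        # float node features -> raises the 'indices' error
x = x.long()                     # embedding lookups need long indices
assert int(x.max()) < atom_encoder.num_embeddings   # guards against CUDA asserts
node_feature = atom_encoder(x).sum(dim=-2)           # [n_nodes, hidden_dim]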
Consider packaging the Graphormer code so that this software can be distributed as an installable package.
model.py, line 114:
in_degree, out_degree = batched_data.in_degree, batched_data.in_degree
Shouldn't the second one be batched_data.out_degree?
Hi! This is Stella from Seoul National University.
I'd like to ask how I can implement a regression task with Graphormer.
I adapted the ogb module for our data and set num_class to -1, like the other regression datasets.
Then I ran into a problem editing the model dimensions in model.py, lines 62-75.
I think 512*9+1 is something like a vocabulary size, calculated as 512 * (number of node feature categories) + 1.
Is my guess right? In issue #32 you said it should be greater than the total number of classes across all categories; how should I set this number for a regression task? Maybe the number of graphs?
Thank you!
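For what it's worth, a quick sanity check of the guess above, assuming OGB-style molecules with 9 categorical node-feature columns and an offset of 512 ids per column:

num_feature_columns = 9    # OGB molecular graphs carry 9 node-feature columns
offset = 512               # per-column id range used by convert_to_single_emb
vocab_size = offset * num_feature_columns + 1   # 4609 == 512 * 9 + 1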
It looks like the only use of edge encodings right now is to change the bias of the attention. So consider a toy example: two identical nodes with a single edge between them, whose edge label (edge feature) is either 0 or 1, and a binary classification task where the desired prediction is the label of this edge. How would the network solve this task? Just using the edge encoding as an attention bias should not be enough here, right? I am asking because I successfully applied Graphormer to a similar task, but now I am not exactly sure how it works.
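A toy sketch of the mechanism in question (simplified and hypothetical, not Graphormer's actual code): an edge-dependent bias added to the attention scores changes the softmax mixing weights, so the per-node outputs depend on the edge label whenever the value vectors differ at all; downstream layers (or a virtual node) can then read that out. If the two value vectors were literally identical, the bias alone indeed could not change the output, which seems to be the crux of the question.

import torch
import torch.nn.functional as F

h = torch.tensor([[1.0, 0.0],       # two nodes with distinguishable features
                  [0.0, 1.0]])

for label, bias in [(0, 0.0), (1, 2.0)]:          # edge label -> made-up learned bias
    scores = h @ h.t()                             # raw attention scores
    scores = scores + bias * (1 - torch.eye(2))    # bias only on the edge (off-diagonal)
    attn = F.softmax(scores, dim=-1)
    out = attn @ h                                 # updated node representations
    print(label, out)                              # per-node outputs differ by label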
Hi, thanks for your research. I only see the attention module; the other modules are not available.
Can you share more code for the model?
Hi, I have a quick question: how can I customize the number of epochs for training?
Hi,
I trained some models on PCQM4M-LSC, ogbg-molhiv, and ZINC following the settings in the paper. The results for PCQM4M-LSC and ogbg-molhiv match the paper, but I have run the ZINC experiment several times and the MAE is always above 0.14 (with or without --intput_dropout_rate 0), whereas it should be about 0.12 according to the paper. Here is my command:
python3 entry.py --dataset_name ZINC --hidden_dim 80 --ffn_dim 80 --num_heads 8 --tot_updates 400000 --batch_size 256 --warmup_updates 40000 --precision 16 --intput_dropout_rate 0 --gradient_clip_val 5 --num_workers 8 --gpus 1 --accelerator ddp --max_epochs 10000
Hi authors, thanks for your great work. While trying to reproduce the No. 1 result on the ogb-pcba leaderboard, I couldn't find the checkpoints mentioned in your paper that were pre-trained for the PCBA task, so I turned to the PCQM checkpoint you provided for the PCQM task. But while loading the checkpoint, an error occurred even though I changed the hidden dimension and FFN dimension from 1024 to 768:
`RuntimeError: Error(s) in loading state_dict for Graphormer:
size mismatch for atom_encoder.weight: copying a param with shape torch.Size([4737, 768]) from checkpoint, the shape in current model is torch.Size([4609, 768]).
size mismatch for edge_encoder.weight: copying a param with shape torch.Size([769, 32]) from checkpoint, the shape in current model is torch.Size([1537, 32]).`
Thus, may I ask two questions about the reproduction process:
Looking forward to your reply.
Thank you!
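Not an official answer, but one workaround sketch for the size mismatch above: copy only the overlapping rows, under the (possibly wrong) assumption that the shared vocabulary rows mean the same tokens in both versions. Here model is an already-constructed Graphormer instance and the checkpoint filename is hypothetical:

import torch

# Hypothetical filename; Lightning checkpoints keep weights under "state_dict".
ckpt = torch.load("pcqm_checkpoint.ckpt", map_location="cpu")["state_dict"]
state = model.state_dict()

with torch.no_grad():
    for name, param in ckpt.items():
        if name not in state:
            continue
        if state[name].shape == param.shape:
            state[name].copy_(param)
        else:
            # e.g. atom_encoder.weight: [4737, 768] vs [4609, 768]
            rows = min(state[name].shape[0], param.shape[0])
            state[name][:rows].copy_(param[:rows])   # overlapping rows only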
What do I have to modify in the code in order to try the model on ogbn-proteins (wrapper, collator, ...)?
Hello,
I got an error when running your code. Here:
"Graphormer/OGB-LSC/graphormer/src/ogb_wrapper.py", line 34, in preprocess_item
all_rel_pos_3d_with_noise = torch.from_numpy(algos.bin_rel_pos_3d_1(item.all_rel_pos_3d, noise=noise)).long()
AttributeError: module 'algos' has no attribute 'bin_rel_pos_3d_1'
Hello,
Thank you for sharing; this is great work!
When I trained on my own data with your code, I found a problem:
at the start of every epoch there is a long wait before GPU utilization rises.
What might be the reason for this?
This issue is to maintain all feature requests on one page.
Note to contributors: if you want to work on a requested feature, re-open the linked issue. Everyone is welcome to work on any of the issues below.
Note to maintainers: all feature requests should be consolidated on this page. When new feature request issues come in, close them and create new entries here, with links to the issues. The one exception is issues marked good first issue;
these should be left open so they are discoverable by new contributors.
We would like to hold a vote here to prioritize these requests.
If you think a feature request is necessary for you, you can vote for it with the following process:
Get the issue (feature request) number.
Search for the number in this issue and check whether a vote for it already exists.
If the vote exists, add a 👍 to it.
If the vote doesn't exist, create a new one by replying to this thread and including the issue number.
I got this error when running pcqv2.sh on a 24 GB RTX 3090.
I'm trying to reproduce the reported results on the OGB and ZINC datasets, but I failed to match the reported performance.
I first directly ran the provided script hiv.sh
to train a Graphormer on the MolHiv dataset without pre-training. The final AUC was 73.10%. Then I followed the instructions and hyperparameter settings in the paper to do pre-training: I pre-trained on PCQM4M for 20 epochs (until the loss converged) and fine-tuned the model on MolHiv for 8 epochs (as specified in the script). The best result turned out to be 76.25%.
Despite some improvement, the final AUC is not as high as reported in the paper. I also tried to reproduce the result on ZINC via the example script, but the best MAE is 0.1576, which is worse than the 0.122 reported in the paper.
I'm wondering what I'm likely missing that leads to this poor performance. Could you share more reproduction details? My Python environment is listed below:
pytorch==1.9.0
pytorch-geometric==1.7.2
pytorch-scatter==2.0.8
pytorch-sparse==0.6.11
pytorch-lightning==1.3.0
ogb==1.3.1
cudatoolkit==11.1
I'd really appreciate it if someone could share their reproduced results and give me some suggestions.
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
Provide a pre-trained 48-layer Graphormer (12 layers * 4 blocks) model on OC20 to help the community better experiment with this large-scale dataset.
Hi! Thanks for your great work! I wonder how Graphormer performs on traditional GNN graph classification benchmarks (such as the ones used in the original GIN paper). I've tried to apply Graphormer to my task, but the result is not ideal without pre-training. Are pre-training and a large dataset necessary for Graphormer's strong performance?
Graphormer/graphormer/model.py
Line 114 in 740e6ff
This line has an obvious error. Thought you may want to know.
self.in_degree_encoder = nn.Embedding(64, hidden_dim, padding_idx=0)
Why is the size of the degree dictionary 64 or 512? Thanks!
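As a rough illustration (the hidden size here is made up): the table size is simply the largest degree the embedding can index, so nodes with higher degree would need clamping or a larger table, which presumably explains why the bound differs between datasets.

import torch
import torch.nn as nn

in_degree_encoder = nn.Embedding(64, 80, padding_idx=0)   # 80 is illustrative

in_degree = torch.tensor([1, 3, 63])      # per-node in-degrees; must be < 64
deg_emb = in_degree_encoder(in_degree)    # [3, 80], added to the node features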
Hi! This is Stella from Seoul National University; I'm getting a lot of help from your code.
I have a question about entry.py, line 87.
Originally it has metric = 'valid_' + get_dataset(dm.dataset_name)['metric']
but when I ran the model, I got this error:
'pytorch_lightning.utilities.exceptions.MisconfigurationException: ModelCheckpoint(monitor='valid_mae') not found in the returned metrics: ['train_loss']. HINT: Did you call self.log('valid_mae', value) in the LightningModule?'
So I changed line 87 to metric = 'train_loss'
and it runs.
I'm afraid I'm doing something wrong; is this the right way to modify the code?
Here is some useful information about my project:
'num_class': 1,
'loss_fn': F.l1_loss,
'metric': 'mae',
'metric_mode': 'min',
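Not an authoritative answer, but monitoring train_loss changes what the checkpoint selects for. Following the HINT in the error message, here is a sketch of logging the validation metric so ModelCheckpoint(monitor='valid_mae') can find it; the model's call signature and batch layout (batch.y) are assumptions:

import torch.nn.functional as F

# Inside the LightningModule (the hook name is standard PyTorch Lightning):
def validation_step(self, batch, batch_idx):
    y_hat = self(batch).squeeze(-1)
    valid_mae = F.l1_loss(y_hat, batch.y.float())
    self.log('valid_mae', valid_mae, sync_dist=True)   # visible to ModelCheckpoint
    return valid_mae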
Dear Graphormer authors,
thanks for this great piece of software!
I have some feature requests.
Can you please add the functionality for evidential deep learning?
See article:
ACS Cent. Sci. 2021, 7, 8, 1356–1367
Please add the 10 smaller datasets from MoleculeNet to the benchmarks. They are ogbg-moltox21, ogbg-molbace, ogbg-molbbbp, ogbg-molclintox, ogbg-molmuv, ogbg-molsider, and ogbg-moltoxcast for (multi-task) binary classification, and ogbg-molesol, ogbg-molfreesolv, and ogbg-mollipo for regression.
See https://ogb.stanford.edu/docs/graphprop/
Please add functionality for molecular representation pre-training via attribute masking
See Strategies for Pre-training Graph Neural Networks
Please add metrics described in the Regression Metrics Guide
As the manual selection of parameters for a graph neural network is difficult, please add support
for some automated machine learning techniques.
See for example techniques described in AutoGL
Many thanks.
Hi, thank you for your exciting work on Graphormer. I am curious to understand the mechanisms of this model. I tried to instantiate the example encoder layer and commented out the data import lines.
It seems the multi-head attention is not imported, and I am not sure whether the MHA module in Graphormer is customized or not. I am mostly curious about the implementation of the spatial encoding part.
Would it be possible for you to provide a toy example? A forward pass over a random 10x10 node matrix would do.
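In case it helps, a hedged toy sketch of Graphormer-style spatial encoding (single head, no edge features, not the repository's actual module): shortest-path distances index a learnable bias table that is added to the attention scores before the softmax.

import torch
import torch.nn as nn
import torch.nn.functional as F

n, d, max_dist = 10, 16, 5
h = torch.randn(n, d)                        # random node features (10 nodes)
spd = torch.randint(0, max_dist, (n, n))     # stand-in shortest-path distances

spatial_bias = nn.Embedding(max_dist, 1)     # one learnable bias per distance
q_proj, k_proj, v_proj = nn.Linear(d, d), nn.Linear(d, d), nn.Linear(d, d)

q, k, v = q_proj(h), k_proj(h), v_proj(h)
scores = q @ k.t() / d ** 0.5
scores = scores + spatial_bias(spd).squeeze(-1)   # spatial encoding as attention bias
out = F.softmax(scores, dim=-1) @ v               # [10, 16] updated node states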
Dear authors, thank you for this exciting work. Do you have the results on ogbg-molhiv and ogbg-molpcba without pretraining?
Thank you for your leaderboard submission. Please provide the exact command to reproduce your leaderboard results.
Hello,
I ran into a problem while using https://github.com/microsoft/Graphormer/blob/main/graphormer/algos.pyx
Here is the error:
Traceback (most recent call last):
File "C:\Users\James\AppData\Local\Temp/ipykernel_1060/863204263.py", line 1, in <module>
shortest_path_result, path = algos.floyd_warshall(adj.numpy())
File "algos.pyx", line 19, in algos.floyd_warshall
cdef numpy.ndarray[long, ndim=2, mode='c'] path = numpy.zeros([n, n], dtype=numpy.int64)
ValueError: Buffer dtype mismatch, expected 'long' but got 'long long'
I'm using 64-bit Windows and Python 3.8.2; the NumPy version is 1.21.2.
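For what it's worth, the mismatch is a Windows quirk: on 64-bit Windows a C long is 32 bits, while numpy.int64 corresponds to long long, so buffers typed long in algos.pyx don't match. A hedged sketch of a width-explicit retyping of the quoted declaration (Cython, inside floyd_warshall, assuming numpy has been cimported):

# numpy.int64_t is fixed-width, unlike the platform-dependent C `long`:
cdef numpy.ndarray[numpy.int64_t, ndim=2, mode='c'] path = \
    numpy.zeros([n, n], dtype=numpy.int64)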
Provide pre-trained 24-layer Graphormer (Graphormer-large with 1024 hidden dims) on PCQM4M and PCQM4M v2 to help the community better utilize Graphormer on downstream tasks.
Calculating shortest paths on the CPU becomes the bottleneck when the batch size is small. A more efficient implementation than the current Cython one is desired.
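One possible direction, sketched under the assumption that a dense adjacency matrix is available: SciPy's C implementation of shortest paths. This is a suggestion, not the project's plan:

import numpy as np
from scipy.sparse.csgraph import shortest_path

adj = (np.random.rand(10, 10) < 0.3).astype(np.int64)    # toy adjacency matrix
dist = shortest_path(adj, method='FW', unweighted=True)  # Floyd-Warshall in C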
in_degree, out_degree = batched_data.in_degree, batched_data.in_degree
--->
in_degree, out_degree = batched_data.in_degree, batched_data.out_degree
How can I add a new dataset for classification with more than two classes?
Hello,
I have read your code and have a question:
where in the code are the encoding methods mentioned in the paper implemented?
Hi,
Thanks for your interesting work. I have a problem regarding the evaluation. I downloaded your checkpoints from here, then ran the following command as given in the README (for the all_fold_seed0
checkpoint):
conda activate graphormer-lsc
export arch="--ffn_dim 768 --hidden_dim 768 --attention_dropout_rate 0.1 --dropout_rate 0.1 --n_layers 12 --peak_lr 2e-4 --edge_type multi_hop --multi_hop_max_dist 20 --weight_decay 0.0 --intput_dropout_rate 0.0"
export ckpt_path="checkpoints"
export ckpt_name="all_fold_seed0.ckpt"
bash inference.sh
The output log is:
Global seed set to 1
> PCQM4M-LSC loaded!
{'num_class': 1, 'loss_fn': <function l1_loss at 0x7fc2381b3950>, 'metric': 'mae', 'metric_mode': 'min', 'evaluator': <ogb.lsc.pcqm4m.PCQM4MEvaluator object at 0x7fc1995d2110>, 'dataset': MyPygPCQM4MDataset2(3803453), 'max_node': 128}
> dataset info ends
total params: 47167841
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
Global seed set to 1
initializing ddp: GLOBAL_RANK: 0, MEMBER: 1/1
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
len(val_dataloader) 1487
Validating: 100%|██████████| 1487/1487 [03:04<00:00, 7.42it/s]
0.027769196778535843
--------------------------------------------------------------------------------
DATALOADER:0 VALIDATE RESULTS
{'valid_mae': 0.027769196778535843}
--------------------------------------------------------------------------------
[{'valid_mae': 0.027769196778535843}]
I assumed I should get results near the "validate MAE" column of Table 1, but the number is quite different. Am I missing something?
Thanks for your help.
def convert_to_single_emb(x, offset=512):
    feature_num = x.size(1) if len(x.size()) > 1 else 1
    feature_offset = 1 + torch.arange(
        0, feature_num * offset, offset, dtype=torch.long
    )
    x = x + feature_offset
    return x
Could you tell me why this function was added? I am quite confused about it.
Thanks!
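For anyone else wondering, my reading (an interpretation, not an official answer): the offset shifts each categorical feature column into its own id range, so one shared embedding table can encode every column without collisions (column 0 maps to ids 1-512, column 1 to 513-1024, and so on, with 0 left for padding):

import torch

x = torch.tensor([[3, 0, 7]])                 # one node, three categorical features
print(convert_to_single_emb(x, offset=512))   # tensor([[   4,  513, 1032]])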
Hi, I am having trouble importing PygPCQM4MDataset. I ran the following line of code: from ogb.lsc.pcqm4m_pyg import PygPCQM4MDataset
and it threw this error: ImportError: cannot import name 'smiles2graph' from 'ogb.utils' (/usr/local/lib/python3.7/dist-packages/ogb/utils/__init__.py)
I am trying to run the code on Colab and fulfilled the code's requirements, but this error popped up.
The architecture design changed slightly between Graphormer v1.0 and v2.0, so new hyper-parameter configurations are required for the pre-trained ogbg-molpcba model.
Hi, what does the function convert_to_single_emb do? And why should an offset be added to the original features? Thank you.
Hi,
How can I print the eval accuracy after each epoch in the new version of Graphormer?