jimliu96 / deoscirec
Deoscillated Graph Collaborative Filtering
Hi,
I have a dataset with 100,000 users, 40,000 items, and about 4,000,000 interactions,
but I get the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data/logs/xulm1/DeosciRec/utilitys.py", line 336, in create_adj_mat
    cross_adj_mat = band_cross_hop_laplacian(adj_mat, low_pass=low, high_stop=high)
  File "/data/logs/xulm1/DeosciRec/utilitys.py", line 308, in band_cross_hop_laplacian
    cross_adj = adj.dot(adj)
  File "/data/logs/xulm1/myconda/lib/python3.7/site-packages/scipy/sparse/base.py", line 359, in dot
    return self * other
  File "/data/logs/xulm1/myconda/lib/python3.7/site-packages/scipy/sparse/base.py", line 480, in __mul__
    return self._mul_sparse_matrix(other)
  File "/data/logs/xulm1/myconda/lib/python3.7/site-packages/scipy/sparse/base.py", line 539, in _mul_sparse_matrix
    return self.tocsr()._mul_sparse_matrix(other)
  File "/data/logs/xulm1/myconda/lib/python3.7/site-packages/scipy/sparse/compressed.py", line 509, in _mul_sparse_matrix
    np.asarray(other.indices, dtype=idx_dtype))
RuntimeError: nnz of the result is too large
How can I deal with this problem?
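The message `nnz of the result is too large` appears to mean that the product `adj.dot(adj)` would hold more stored entries than SciPy's index type allows. Below is a minimal sketch of one possible workaround, not the authors' fix: first estimate an upper bound on the size of the product, then build it in row chunks and drop small entries chunk by chunk. The names `product_nnz_upper_bound`, `chunked_self_product`, `chunk_rows`, and `keep_threshold` are hypothetical, and the thresholding only stands in for whatever filtering `band_cross_hop_laplacian` actually applies.

```python
import numpy as np
import scipy.sparse as sp


def product_nnz_upper_bound(a, b):
    """Upper bound on nnz(a @ b): the number of scalar multiplications needed."""
    a_col_counts = np.diff(a.tocsc().indptr).astype(np.int64)
    b_row_counts = np.diff(b.tocsr().indptr).astype(np.int64)
    return int(a_col_counts @ b_row_counts)


def chunked_self_product(adj, chunk_rows=10000, keep_threshold=0.0):
    """Compute adj @ adj in row chunks, pruning small entries per chunk.

    keep_threshold is an assumed stand-in for the repository's own filtering;
    with keep_threshold=0.0 nothing is pruned and the result equals adj @ adj.
    """
    adj = adj.tocsr()
    blocks = []
    for start in range(0, adj.shape[0], chunk_rows):
        # Only chunk_rows rows of the product exist in memory at a time.
        block = (adj[start:start + chunk_rows] @ adj).tocsr()
        if keep_threshold > 0.0:
            block.data[np.abs(block.data) < keep_threshold] = 0.0
            block.eliminate_zeros()
        blocks.append(block)
    return sp.vstack(blocks, format="csr")


if __name__ == "__main__":
    demo = sp.random(200, 200, density=0.05, format="csr", random_state=0)
    print("upper bound on nnz:", product_nnz_upper_bound(demo, demo))
    print("actual nnz:", chunked_self_product(demo, chunk_rows=50).nnz)
```

If the pruned chunks are still too large to stack, chunking alone will not help, and the two-hop matrix would need stronger pruning or a different construction.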
Hi,
After cloning the repo, I ran the command below, following your instructions:
$ python DGCF_osci.py --dataset ratings_ml-1m --model_type DGCF_osci --alg_type dgcf --epoch 500 --regs [0.01] --lr 0.001 --batch_size 1024 --low 0.01 --stop_step 5 --embed_size 64 --layer_size [64,64,64,64]
I added the --low parameter, but I get this error:
usage: DGCF_osci.py [-h] [--weights_path [WEIGHTS_PATH]]
                    [--data_path [DATA_PATH]] [--proj_path [PROJ_PATH]]
                    [--dataset [DATASET]] [--pretrain PRETRAIN]
                    [--verbose VERBOSE] [--epoch EPOCH] [--residual RESIDUAL]
                    [--n_fold N_FOLD] [--embed_size EMBED_SIZE]
                    [--layer_size [LAYER_SIZE]] [--batch_size BATCH_SIZE]
                    [--test_epoch TEST_EPOCH] [--test_interval TEST_INTERVAL]
                    [--regs [REGS]] [--local_factor LOCAL_FACTOR] [--lr LR]
                    [--model_type [MODEL_TYPE]] [--adj_type [ADJ_TYPE]]
                    [--alg_type [ALG_TYPE]] [--interaction [INTERACTION]]
                    [--gpu_id GPU_ID] [--node_dropout_flag NODE_DROPOUT_FLAG]
                    [--node_dropout [NODE_DROPOUT]]
                    [--mess_dropout [MESS_DROPOUT]] [--Ks [KS]]
                    [--save_flag SAVE_FLAG] [--test_flag [TEST_FLAG]]
                    [--stop_step STOP_STEP] [--report REPORT]
DGCF_osci.py: error: unrecognized arguments: --low 0.01
Could you please help me?
Thanks.
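The usage text above lists no `--low` option, which is why argparse rejects it. Below is a minimal sketch of how such an option is normally registered. It is not the repository's actual parser: only two existing flags are reproduced, and the `--low` definition (type, default, help text) is an assumption based on the `low_pass=low` argument seen in the `create_adj_mat` traceback.

```python
import argparse


def build_parser():
    """Hypothetical, trimmed-down stand-in for the parser in DGCF_osci.py."""
    parser = argparse.ArgumentParser(prog="DGCF_osci.py")
    # Two flags copied from the usage message, with illustrative defaults.
    parser.add_argument("--dataset", nargs="?", default="ratings_ml-1m")
    parser.add_argument("--lr", type=float, default=0.001)
    # Assumed addition: without a line like this, argparse reports
    # "unrecognized arguments: --low 0.01".
    parser.add_argument("--low", type=float, default=0.01,
                        help="assumed low-pass threshold for the cross-hop matrix")
    return parser


args = build_parser().parse_args(["--dataset", "ratings_ml-1m", "--low", "0.01"])
print(args.low)
```

Whether the value should then be passed on to `create_adj_mat` is something only the authors can confirm.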
Would you please provide the scripts to run DGCF on the four datasets?
Like several others, we are not able to replicate the results!
Hi,
Thank you so much for sharing your code.
However, after tuning the parameters (i.e., regs and lr) for many days,
I still cannot reproduce the results reported in your paper for some datasets.
Could you please share the best parameter settings for the Gowalla and Amazon datasets?
Thank you so much.
Hello,
I'm trying to reimplement your algorithm in PyTorch but I'm having trouble understanding a couple of details.
The paper says, "We also randomly dropout partial edges to prevent oversmoothing."
What's a partial edge? Looking through the code I'm seeing node_dropout. Is that related?
Googling, I found this paper, which seems to be related: https://arxiv.org/abs/1907.10903 (DropEdge: Towards Deep Graph Convolutional Networks on Node Classification).
I didn't see any ablation in the paper about the dropout. Is it significant for performance?
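For reference, here is a minimal sketch of one reading of "randomly dropout partial edges": sample a random subset of the adjacency matrix's stored entries and discard them, in the spirit of DropEdge. This is an assumption about what the paper means, not the authors' confirmed method. The function name `drop_edges` and the optional rescaling by `1 / (1 - drop_prob)` are illustrative choices.

```python
import numpy as np
import scipy.sparse as sp


def drop_edges(adj, drop_prob, rng, rescale=True):
    """Randomly remove a fraction of the stored entries (edges) of adj.

    Each stored entry is dropped independently, so a symmetric matrix does
    not stay symmetric. Requires 0 <= drop_prob < 1.
    """
    coo = adj.tocoo()
    keep = rng.random(coo.nnz) >= drop_prob
    data = coo.data[keep]
    if rescale:
        # Assumed dropout-style rescaling so the expected edge weight is unchanged.
        data = data / (1.0 - drop_prob)
    dropped = sp.coo_matrix((data, (coo.row[keep], coo.col[keep])), shape=coo.shape)
    return dropped.tocsr()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    adj = sp.random(100, 100, density=0.1, format="csr", random_state=1)
    print(adj.nnz, drop_edges(adj, 0.3, rng).nnz)
```

A fresh mask would typically be drawn for every training step or epoch, and the full graph used at evaluation time.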
Also, why do you use Xavier initialization for the LA weights? Isn't that type of initialization usually used for NN layer weights?
I've attempted a reimplementation in PyTorch for the recsys framework RecBole here: RUCAIBox/RecBole#594, so that it's convenient to compare with other algorithms, etc.
I replicated your experiment almost exactly afaict: MovieLens100k, 70-20-10 split, early stopping with Recall@20. The only difference I see is that I didn't remove users with few interactions, as you say in the paper:
"For this dataset, we maintain users with at least 5 interactions."
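The preprocessing described here can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's or RecBole's code: interactions are taken to be rows of `(user_id, item_id)`, the split is a plain random split over all interactions, and the names `filter_users` and `random_split` are hypothetical. The source does not say which of the 20% and 10% parts is used for validation and which for testing, so the sketch just returns three parts.

```python
import numpy as np


def filter_users(interactions, min_interactions=5):
    """Keep only rows whose user has at least min_interactions rows."""
    users, counts = np.unique(interactions[:, 0], return_counts=True)
    keep_users = users[counts >= min_interactions]
    return interactions[np.isin(interactions[:, 0], keep_users)]


def random_split(interactions, ratios=(0.7, 0.2, 0.1), seed=0):
    """Shuffle the rows and cut them into three parts by the given ratios."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(interactions))
    n_first = int(ratios[0] * len(interactions))
    n_second = int(ratios[1] * len(interactions))
    first = interactions[order[:n_first]]
    second = interactions[order[n_first:n_first + n_second]]
    third = interactions[order[n_first + n_second:]]
    return first, second, third


if __name__ == "__main__":
    # User 0 has 6 interactions, user 1 has 2, user 2 has 5.
    rows = [(0, i) for i in range(6)] + [(1, i) for i in range(2)] + [(2, i) for i in range(5)]
    data = filter_users(np.array(rows))
    parts = random_split(data)
    print([len(p) for p in parts])
```

Skipping the filter changes the set of evaluated users, which may shift the metrics, though it would not by itself explain a large gap.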
I used HyperOpt to do a search on the hyperparameter ranges specified in the paper (with an added option for dropout probability between 0.1 and 0.5) limited to 50 trials.
DGCF results:
best params: {'dropout_prob': 0.24266119278104079, 'embedding_size': 128, 'learning_rate': 0.0016153742760160951, 'n_layers': 2, 'reg_weight': 2.031773354290135e-05}
'test_result': {'recall@20': 0.3248, 'mrr@20': 0.5986, 'ndcg@20': 0.3795, 'hit@20': 0.9618, 'precision@20': 0.2608}
I did the same for LightGCN
LightGCN results:
best params: {'embedding_size': 128, 'learning_rate': 0.002856632032475591, 'n_layers': 2, 'reg_weight': 1.43923729841778e-05}
'test_result': {'recall@20': 0.3336, 'mrr@20': 0.6135, 'ndcg@20': 0.3868, 'hit@20': 0.9724, 'precision@20': 0.2629}
These figures are quite different from your paper, the ndcg especially, but in particular LightGCN is winning in every metric.
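Since NDCG is the metric that differs most, it may help to pin down the definition being compared. Below is a minimal sketch of one common per-user formulation of Recall@k and NDCG@k with binary relevance. It is an assumption that either the paper's code or RecBole uses exactly this form; frameworks differ, for example in how the ideal DCG is truncated. The function names are illustrative.

```python
import math


def recall_at_k(ranked_items, relevant, k):
    """Fraction of the user's relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked_items, relevant, k):
    """Binary-relevance NDCG: DCG of the top k divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant
    )
    # Ideal ranking puts all relevant items first, capped at k positions.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    ranked = [1, 2, 3, 4]
    relevant = {1, 3, 9}
    print(recall_at_k(ranked, relevant, 3), ndcg_at_k(ranked, relevant, 3))
```

Running both implementations on the same ranked lists with one shared metric function is a quick way to rule out a definitional mismatch.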
Is there anything not written in the paper that I might be missing in my implementation?
And, btw, are you applying node dropout to LightGCN (even though it wasn't a part of the algorithm originally, afaik)?
Thanks for any help!
I cannot reach the paper's results with the provided code and stated hyperparameters. Would you please provide the commands that produce the best results for ML-1M and ML-100k?