spacelearner / sessionrec-pytorch

Session-based Recommendation

License: MIT License

Python 99.44% Shell 0.56%
recommendation-system session-based-recommendation-system session-based-recommendation

sessionrec-pytorch's Introduction

Session-based Recommendation Library

MSGIFSR

This is the official implementation of Learning Multi-granularity User Intent Unit for Session-based Recommendation (WSDM 2022), along with some other session-based recommendation models. We use the DGL library and mainly follow the implementation of lessr (Tianwen Chen and Raymond Wong, KDD 2020).

Baselines

We also reimplemented several current session-based recommendation baselines, including SR-GNN and NISER+, and tuned them to the best of our ability. They are summarized as follows. Leaderboards are coming soon.

Dataset

Download and extract the following datasets and put the files in a folder named datasets/$DATASETNAME.

Then run the code in src/utils/data/preprocess to process them. When processing yoochoose, please run preprocess_yoochoose.py in datasets first and then run the code in src/utils/data/preprocess.

Run

First, create the environment:

conda env create -f environment.yaml

then run:

bash start.sh $MODEL_NAME $DATASET_NAME

Experiment Results

We find that keeping the original order of the training data improves results. This is due to the way the datasets are split: current public session-based recommendation datasets usually split train/test data by time, so the distribution of samples near the end of the training data is more similar to the test data than that of earlier samples. Without shuffling, the model fits this recent distribution better. This is a common phenomenon in recommender systems, since user interest evolves quickly and samples from too long ago do not help the recommendation. We also provide a version that shuffles the training dataset. In both settings the test dataset is shuffled.
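The two settings above boil down to whether the training loader shuffles or preserves the chronological order. A minimal stand-in sketch (not the repo's actual loader code):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-in for a chronologically ordered training set of sessions.
data = TensorDataset(torch.arange(10))

# shuffle=False preserves the temporal order, so the last batches the model
# sees are the ones whose distribution is closest to the test period.
train_loader = DataLoader(data, batch_size=4, shuffle=False)

first_batch = next(iter(train_loader))[0]
print(first_batch.tolist())  # [0, 1, 2, 3]
```

Passing `shuffle=True` instead gives the shuffled-training variant; the test loader can always shuffle, since evaluation is order-independent.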

Citation

@inproceedings{10.1145/3488560.3498524,
      author    = {Jiayan Guo and Yaming Yang and Xiangchen Song and Yuan Zhang and Yujing Wang and Jing Bai and Yan Zhang},
      title     = {Learning Multi-granularity User Intent Unit for Session-based Recommendation},
      year      = {2022},
      booktitle = {Proceedings of the 15th ACM International Conference on Web Search and Data Mining ({WSDM} '22)},
}

sessionrec-pytorch's People

Contributors

spacelearner

sessionrec-pytorch's Issues

Creating environment

this command line:
conda env create -f environment.yml
should be:
conda env create -f environment.yaml

I cannot reproduce the performance of the model in the paper

First I used environment.yaml to configure the environment and start.sh to test on the diginetica dataset, here are the results for the last four epochs:
20%|██ | 6/30 [14:52<59:37, 149.08s/it] Epoch 5: MRR = 19.008%, Hit = 55.406%
Batch 8500: Loss = 4.0826, Time Elapsed = 21.61s
Batch 8600: Loss = 4.0721, Time Elapsed = 9.60s
Batch 8700: Loss = 4.0687, Time Elapsed = 9.53s
Batch 8800: Loss = 4.0753, Time Elapsed = 9.51s
Batch 8900: Loss = 4.0801, Time Elapsed = 10.03s
Batch 9000: Loss = 4.0732, Time Elapsed = 9.77s
Batch 9100: Loss = 4.0731, Time Elapsed = 9.63s
Batch 9200: Loss = 4.0772, Time Elapsed = 9.47s
Batch 9300: Loss = 4.0917, Time Elapsed = 10.03s
Batch 9400: Loss = 4.0949, Time Elapsed = 9.88s
Batch 9500: Loss = 4.0945, Time Elapsed = 9.58s
Batch 9600: Loss = 4.0916, Time Elapsed = 9.70s
Batch 9700: Loss = 4.0954, Time Elapsed = 9.76s
Batch 9800: Loss = 4.1006, Time Elapsed = 9.96s
23%|██▎ | 7/30 [17:21<57:01, 148.75s/it]Epoch 6: MRR = 19.013%, Hit = 55.383%
Batch 9900: Loss = 4.0622, Time Elapsed = 21.49s
Batch 10000: Loss = 4.0686, Time Elapsed = 9.53s
Batch 10100: Loss = 4.0630, Time Elapsed = 9.81s
Batch 10200: Loss = 4.0753, Time Elapsed = 9.31s
Batch 10300: Loss = 4.0691, Time Elapsed = 9.69s
Batch 10400: Loss = 4.0818, Time Elapsed = 9.66s
Batch 10500: Loss = 4.0660, Time Elapsed = 9.58s
Batch 10600: Loss = 4.0944, Time Elapsed = 9.87s
Batch 10700: Loss = 4.0769, Time Elapsed = 9.77s
Batch 10800: Loss = 4.0792, Time Elapsed = 9.62s
Batch 10900: Loss = 4.0737, Time Elapsed = 9.85s
Batch 11000: Loss = 4.0862, Time Elapsed = 9.36s
Batch 11100: Loss = 4.0784, Time Elapsed = 9.50s
Batch 11200: Loss = 4.0959, Time Elapsed = 10.21s
Epoch 7: MRR = 19.006%, Hit = 55.379%
27%|██▋ | 8/30 [19:48<54:23, 148.35s/it]Batch 11300: Loss = 4.0781, Time Elapsed = 21.73s
Batch 11400: Loss = 4.0772, Time Elapsed = 9.75s
Batch 11500: Loss = 4.0463, Time Elapsed = 9.50s
Batch 11600: Loss = 4.0733, Time Elapsed = 9.85s
Batch 11700: Loss = 4.0664, Time Elapsed = 9.95s
Batch 11800: Loss = 4.0548, Time Elapsed = 9.84s
Batch 11900: Loss = 4.0820, Time Elapsed = 9.80s
Batch 12000: Loss = 4.0775, Time Elapsed = 9.51s
Batch 12100: Loss = 4.0711, Time Elapsed = 9.55s
Batch 12200: Loss = 4.0749, Time Elapsed = 9.63s
Batch 12300: Loss = 4.0626, Time Elapsed = 9.86s
Batch 12400: Loss = 4.0845, Time Elapsed = 9.82s
Batch 12500: Loss = 4.0734, Time Elapsed = 9.45s
Batch 12600: Loss = 4.0856, Time Elapsed = 9.66s
27%|██▋ | 8/30 [22:16<1:01:14, 167.02s/it]
Epoch 8: MRR = 19.006%, Hit = 55.379%
MRR@20 HR@20
19.015% 55.406%

Process finished with exit code 0

Type error??

Traceback (most recent call last):
File "E:/WorkSpace2--pycharm/MHISG-SessionRec-pytorch-main/src/scripts/main_msgifsr.py", line 137, in
mrr, hit = runner.train(args.epochs, args.log_interval)
File "E:\WorkSpace2--pycharm\MHISG-SessionRec-pytorch-main\src\utils\train.py", line 96, in train
mrr, hit = evaluate(self.model, self.test_loader, self.device)
File "E:\WorkSpace2--pycharm\MHISG-SessionRec-pytorch-main\src\utils\train.py", line 49, in evaluate
logits = model(*inputs)
File "E:\requireSoft\anaconda_37\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "E:\WorkSpace2--pycharm\MHISG-SessionRec-pytorch-main\src\models\msgifsr.py", line 246, in forward
feat = self.embeddings(iid)
File "E:\requireSoft\anaconda_37\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "E:\requireSoft\anaconda_37\lib\site-packages\torch\nn\modules\sparse.py", line 126, in forward
self.norm_type, self.scale_grad_by_freq, self.sparse)
File "E:\requireSoft\anaconda_37\lib\site-packages\torch\nn\functional.py", line 1813, in embedding
_no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
File "E:\requireSoft\anaconda_37\lib\site-packages\torch\nn\functional.py", line 1733, in _no_grad_embedding_renorm_
torch.embedding_renorm_(weight, input, max_norm, norm_type)
RuntimeError: Expected tensor for argument #2 'indices' to have scalar type Long; but got torch.IntTensor instead (while checking arguments for embedding_renorm_)
Does anyone know how to fix this?
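The error says the embedding lookup received int32 indices where PyTorch's renorm path expects int64 (Long). A minimal sketch of the usual fix, casting the indices before the lookup (not the repo's actual code; `iid` here is a toy tensor):

```python
import torch
import torch.nn as nn

# max_norm makes the lookup go through embedding_renorm_, which is where the
# Long/Int dtype mismatch surfaces in older PyTorch versions.
emb = nn.Embedding(100, 8, max_norm=1.0)
iid = torch.tensor([1, 2, 3], dtype=torch.int32)

# Casting the indices to int64 (Long) avoids the RuntimeError:
feat = emb(iid.long())
print(feat.shape)  # torch.Size([3, 8])
```

Equivalently, the preprocessing code could store the item IDs as int64 from the start, so no cast is needed at lookup time.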

Type error????

Has anyone else run into this problem?
dgl._ffi.base.DGLError: Expect all graphs to have the same schema on nodes["s2"].data, but graph 1 got
{'iid': Scheme(shape=(2,), dtype=torch.int32), 'last': Scheme(shape=(), dtype=torch.int32)}
which is different from
{'iid': Scheme(shape=(2,), dtype=torch.int64), 'last': Scheme(shape=(), dtype=torch.int32)}.

Is the processing code related to yoochoose missing?(Again)

Can you provide the yoochoose data after your preprocessing? There are too many mysteries in your paper; for example, the statistics about yoochoose in the paper are different from SR-GNN's after preprocessing (the number of items, ...).
Also, did you apply truncate_long_sessions to all sessions with length higher than 20, like you did for diginetica?

Is the Semantic Expander really useful?

hello. I'm really benefiting from your research.

However, I have a question.
I understand that the semantic expander can learn the long-term dependency of nodes through the GRU layer.

In this process, all the nodes that come in through the batch are flattened into a single sequence and fed into the GRU, which seems to leave room for unrelated nodes to be connected to each other.

What are your thoughts on this?
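For reference, one standard way to keep sessions from leaking into each other in a GRU is to pad and pack each session as its own sequence rather than flattening the whole batch. A sketch under that assumption (toy shapes, not the repo's semantic-expander code):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

# Two sessions of different lengths with 4-dim node embeddings.
sessions = [torch.randn(3, 4), torch.randn(2, 4)]
lengths = torch.tensor([3, 2])

# Padding + packing keeps each session as a separate sequence, so the GRU
# never carries hidden state across session boundaries.
padded = pad_sequence(sessions, batch_first=True)            # (2, 3, 4)
packed = pack_padded_sequence(padded, lengths, batch_first=True)

gru = nn.GRU(input_size=4, hidden_size=4, batch_first=True)
_, h = gru(packed)
print(h.shape)  # torch.Size([1, 2, 4]) -- one hidden state per session
```

Whether the repo's implementation actually concatenates sessions into one line, and whether that hurts, is exactly the question raised above.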
