hkuds / dcrec

[WWW'2023] "DCRec: Debiased Contrastive Learning for Sequential Recommendation"

Home Page: https://arxiv.org/abs/2303.11780

License: MIT License

Python 100.00%
contrastive-learning graph-neural-networks self-supervised-learning sequential-recommendation

dcrec's Introduction

Hi there 👋

✨Welcome to the Data Intelligence Lab @ HKU!✨

🚀 Our Lab is Passionately Dedicated to Exploring the Forefront of Data Science & AI 👨‍💻

dcrec's People

Contributors

yuh-yang

dcrec's Issues

Questioning the Experiment Results in Your Paper

I recently read your paper and found it to be an interesting and informative read. However, I have some concerns regarding the experiment results presented in your paper.

Upon reviewing your experimental methodology and data analysis, I noticed some significant inconsistencies and irregularities in the results. Specifically, in Table 2, the performance of S3 and ICLR is exceptionally poor, while DuoRec's performance is unexpectedly good, which is inconsistent with previous experimental results and my own experimental observations.

Based on my observations, I have reason to suspect that the baselines were not given reasonable parameter settings, and I question the validity and accuracy of the experimental results in your paper. In light of this, I kindly request that you provide the parameter settings or tuning ranges used for the baselines, as well as any additional information that may help clarify the issues I have raised.

Doubts about the data partitioning method

Thank you very much for your paper. It has been very inspiring to me. I sent you an email yesterday, but I am concerned that you may not frequently check your inbox. Therefore, I am reaching out again on GitHub to ask my question.
The paper does not provide detailed information about the data partitioning method. For example, the ML-20M dataset originally contains 20 million data points, yet your experiments only utilize a small portion of it. Therefore, I would like to request your assistance.
May I ask whether you could share the raw dataset used in your experiments in CSV format, or describe the data partitioning method, solely for academic exchange purposes?

Problem of Evaluation Protocols

In Section 3.1.2 of your paper, you mentioned, "We follow [11, 20, 30] to adopt the leave-one-out strategy for model evaluation. Specifically, we treat the last interaction of each user as testing data, and designate the previous one as validation data. " However, the evaluation protocol employed in your released code does not correspond to the leave-one-out strategy. The log information for DCRec is as follows:

eval_args = {'mode': 'pop100', 'order': 'TO', 'split': {'RS': [0.8, 0.1, 0.1]}, 'group_by': 'user'}

In RecBole, the 'RS' parameter divides the dataset into train, val, and test sections based on ratios.
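
For reference, RecBole also provides a leave-one-out split via the 'LS' key of eval_args, which is what the quoted passage from the paper describes. A sketch of that configuration, based on the RecBole documentation rather than on anything in the released DCRec code:

# Sketch of a RecBole leave-one-out evaluation setting ('LS'), as opposed to the
# ratio-based 'RS' split shown in the log above. Keys follow the RecBole 1.x API.
eval_args = {
    'mode': 'pop100',                    # rank against 100 popularity-sampled negatives
    'order': 'TO',                       # keep interactions in temporal order
    'split': {'LS': 'valid_and_test'},   # last item -> test, second-to-last -> validation
    'group_by': 'user',
}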

Training BUG

Thank you very much for your wonderful work. The training loss becomes NaN during training; how should I solve this problem?
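
As a general note, not specific to this repository: NaN losses are usually debugged by lowering the learning rate, checking that the loss is finite at every step, and clipping gradients. A minimal PyTorch-style sketch of such checks; the function and variable names here are illustrative and not taken from the DCRec code:

import torch

def training_step(model, interaction, optimizer, max_grad_norm=5.0):
    # Generic training step; RecBole models expose calculate_loss(interaction),
    # assumed here to return a scalar loss tensor.
    optimizer.zero_grad()
    loss = model.calculate_loss(interaction)
    # Fail fast instead of silently diverging.
    if not torch.isfinite(loss):
        raise RuntimeError(f"Non-finite loss encountered: {loss.item()}")
    loss.backward()
    # Gradient clipping often stabilizes contrastive objectives with large logits.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    optimizer.step()
    return loss.item()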

Doubts about the evaluation metrics in the paper

Thank you very much for your work. Here are the results of my reproduction training. The evaluation metrics produced by the code are Recall and NDCG, while the paper reports HR and NDCG, and the Recall values from the code are very close to the HR values in the paper. Is this a problem with the code, or a clerical error in the paper? This is rather confusing to me.

Issue with Model Functionality at Various Sampling Sizes

Hello,
I've encountered an issue with DCRec where it only functions correctly when the sampling size is set to 100. (pop100)
When I try to set the sampling size to either the total number of items or 200 (for example, full, pop200, or pop3537), it results in an error. The model should function correctly regardless of the sampling size, but it throws an error whenever the sampling size is set to anything other than 100.
I appreciate your time looking into this issue and would be grateful for any guidance or fixes you can provide.
Thank you.
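
For context, in RecBole the candidate set used for ranking is controlled by the 'mode' field of eval_args: 'full' ranks against the entire item set, while 'uniN' and 'popN' sample N negatives uniformly or by popularity. A sketch of the kind of override being attempted in this issue; whether DCRec's evaluation code actually supports modes other than pop100 is precisely the question raised here:

# Candidate-set settings reportedly tried (RecBole-style); only 'pop100' works.
eval_args = {
    'mode': 'full',                     # rank against all items; 'pop200' and 'pop3537' also fail
    'order': 'TO',
    'split': {'RS': [0.8, 0.1, 0.1]},
    'group_by': 'user',
}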

Questions about the construction procedure of item_edges_a and item_edges_b in function build_adj_graph of file run_DCRec.py

Hello, thank you for your open source code. I have a detailed question that I hope to get answered:
In the build_adj_graph function of run_DCRec.py, there are two if branches in the loop body that add elements to item_edges_a and item_edges_b, which causes the same item in item_seq to be added to these lists repeatedly.

I don't quite understand the reason for this construction procedure. I would have thought the if branch was just for bounds checking. So I tried changing the second if to elif and found that it could still train successfully, but the metrics became worse
(test result: {'recall@1': 0.2101, 'recall@5': 0.4058, 'recall@10': 0.5149, 'ndcg@1': 0.2101, 'ndcg@5': 0.3128, 'ndcg@10': 0.3478} before change; and test result: {'hit@1': 0.1793, 'hit@5': 0.3911, 'hit@10': 0.5067, 'ndcg@1': 0.1793, 'ndcg@5': 0.2899, 'ndcg@10': 0.327} after change).
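
For context, a neighbour-edge construction of the kind described above could look roughly like the following. This is a hypothetical sketch, not the actual run_DCRec.py code: each item is linked to both its predecessor and its successor, so interior items of a sequence are appended to item_edges_a twice, and turning the second if into elif drops most of the backward edges.

def build_item_edges(item_seqs):
    # item_seqs: list of item-id sequences, one per user (illustrative only).
    item_edges_a, item_edges_b = [], []
    for seq in item_seqs:
        for i, item in enumerate(seq):
            if i < len(seq) - 1:               # forward neighbour
                item_edges_a.append(item)
                item_edges_b.append(seq[i + 1])
            if i > 0:                          # backward neighbour; changing this to elif
                item_edges_a.append(item)      # removes these edges for interior items
                item_edges_b.append(seq[i - 1])
    return item_edges_a, item_edges_b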

I would be very grateful if you could tell me what item_edges_a and item_edges_b are used for and why such a construction procedure is needed❤️

Doubts about dataset statistics and experimental results

I'm so sorry to bother you again. After following the data preprocessing method you provided for the ML-20M dataset, I encountered two issues during the experiment.

The first issue is that, in "Table 1: Detailed statistics of experimental datasets," the term "Interactions" seems to represent the sum of session_id counts from both the ml-20m.train.inter and ml-20m.test.inter files, since these files are concatenated in the DCRec code. This sum does not reflect the actual number of user interactions, which appears to be 1,856,746.

The second issue is that, following the data processing method you provided, I used the same data to generate input formats for both the DCRec and SURGE models. I verified that the number of interactions input to both DCRec and SURGE is 1,856,746. In the case of DCRec, I obtained an ndcg@10 result of 0.3001 on the test set, which matches the result reported in the paper. However, in the case of the SURGE experiment, I obtained an ndcg@10 result of 0.4391 on the test set, which is higher than the result presented in the paper.
Could it be because DCRec has more than 100 negative samples on the validation and test sets?
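
For what it's worth, the interaction count can be sanity-checked directly from the atomic files with a few lines of pandas. This sketch assumes tab-separated RecBole-style .inter files and column names such as 'user_id:token' and 'item_id:token'; the exact schema of the DCRec files is an assumption here, not something verified against the repository:

import pandas as pd

# Load and concatenate the train/test atomic files (tab-separated with a typed header).
frames = [pd.read_csv(p, sep='\t') for p in ['ml-20m.train.inter', 'ml-20m.test.inter']]
df = pd.concat(frames, ignore_index=True)
print('rows across both files   :', len(df))  # what Table 1 seems to count
print('distinct user-item pairs :',
      df.drop_duplicates(subset=['user_id:token', 'item_id:token']).shape[0])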

Thank you for your understanding and assistance. I greatly appreciate your guidance in addressing these issues. If my description is incorrect, please feel free to correct me. If I have offended you in any way, I sincerely apologize. Once again, thank you for your time and support.
