thudm / comirec

Source code and dataset for KDD 2020 paper "Controllable Multi-Interest Framework for Recommendation"

Topics: recommendation, recommender-system, multi-interest, controllable

comirec's Introduction

Controllable Multi-Interest Framework for Recommendation

Original implementation for the paper "Controllable Multi-Interest Framework for Recommendation".

Yukuo Cen, Jianwei Zhang, Xu Zou, Chang Zhou, Hongxia Yang, Jie Tang

Accepted to KDD 2020 ADS Track!

Prerequisites

  • Python 3
  • TensorFlow-GPU >= 1.8 (< 2.0)
  • Faiss-GPU

Getting Started

Installation

Dataset

Training

Training on the existing datasets

You can use python src/train.py --dataset {dataset_name} --model_type {model_name} to train a specific model on a dataset. Other hyperparameters can be found in the code. (If you share the server with others or want to use a specific GPU, you may need to set CUDA_VISIBLE_DEVICES.)

For example, you can use python src/train.py --dataset book --model_type ComiRec-SA to train the ComiRec-SA model on the Book dataset.

When training a ComiRec-DR model, you should set --learning_rate 0.005.
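Putting the above together, a full invocation that pins the job to a single GPU (the GPU index here is illustrative) might look like:

    CUDA_VISIBLE_DEVICES=0 python src/train.py --dataset book --model_type ComiRec-DR --learning_rate 0.005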

Training on your own datasets

If you want to train models on your own dataset, you should prepare the following three (or four) files:

  • train/valid/test file: Each line represents an interaction and contains three numbers <user_id>,<item_id>,<time_stamp> (see the example below).
  • category file (optional): Each line contains two numbers <item_id>,<cate_id>, used for computing diversity.
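For concreteness, a few lines of a hypothetical interaction file (the ids and timestamps are made up for illustration) would look like:

    0,4232,1469577600
    0,1719,1469664000
    1,527,1469577600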

Common Issues

The computation of the NDCG score.
I'm sorry that the computation of the NDCG score in the original version (now in the `paper` branch) is not consistent with the definition in the paper, as mentioned in issue #6. I have updated the computation of the NDCG score in the `master` branch according to the correct definition. To reproduce the NDCG scores reported in the paper, please use the `paper` branch. I personally recommend using only the reported recall and hit rate results.

If you have ANY difficulty getting things working in the above steps, feel free to open an issue. You can expect a reply within 24 hours.

Acknowledgement

The structure of our code is based on MIMN.

Cite

Please cite our paper if you find this code useful for your research:

@inproceedings{cen2020controllable,
  title = {Controllable Multi-Interest Framework for Recommendation},
  author = {Cen, Yukuo and Zhang, Jianwei and Zou, Xu and Zhou, Chang and Yang, Hongxia and Tang, Jie},
  booktitle = {Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
  year = {2020},
  pages = {2942--2951},
  publisher = {ACM},
}


comirec's Issues

puzzle at user_id in training and serving

Thanks for providing the code!
user_id is transformed into an index in train, test, and evaluation, so the model can only predict for users whose user_id appears in the training set. What about users who do not appear in training?
I'm quite puzzled about how to handle the user_id feature in recommendation.

question about capsule softmax weights

capsule_weight = tf.stop_gradient(tf.zeros([get_shape(item_his_emb)[0], self.num_interest, self.seq_len]))

capsule_softmax_weight = tf.nn.softmax(capsule_weight, axis=1)
capsule_softmax_weight = tf.where(tf.equal(atten_mask, 0), paddings, capsule_softmax_weight)
capsule_softmax_weight = tf.expand_dims(capsule_softmax_weight, 2)

Why is softmax performed on axis 1, which represents the different user interests? According to the paper, the item embeddings of the user sequence can be viewed as primary capsules, so the weights of the capsules in the next layer should sum to 1. I therefore think softmax should be performed along the user sequence, i.e. the code should be capsule_softmax_weight = tf.nn.softmax(capsule_weight, axis=2).
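A quick self-contained way to see what each axis choice normalizes (a sketch, not the repository's code):

    import numpy as np

    def softmax(x, axis):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    # capsule_weight has shape [batch, num_interest, seq_len].
    w = np.random.randn(2, 4, 20)
    # axis=1: at each history position, the weights across interests sum to 1.
    assert np.allclose(softmax(w, axis=1).sum(axis=1), 1.0)
    # axis=2: for each interest, the weights across history positions sum to 1.
    assert np.allclose(softmax(w, axis=2).sum(axis=2), 1.0)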

Computing NDCG

In the computation of idcg in lines 95 to 97 of the master branch:

    idcg = 0.0
    for no in range(recall):                # original code
    for no in range(len(true_item_set)):    # I think it should iterate len(true_item_set) times
        idcg += 1.0 / math.log(no+2, 2)

Is my understanding wrong? Otherwise, with your original code, suppose my true_item_set contains 5 items: if my top-N recommendations hit 1 item or 2 items (for simplicity, assume they rank at top 1 and top 2), both cases yield the same final NDCG (both equal to 1). But clearly they should not be the same.
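For reference, a minimal self-contained NDCG@N that follows the paper's definition, where the rank is the position in the recommendation list (function and argument names are illustrative, not the repository's):

    import math

    def ndcg_at_n(recommended, true_item_set, n):
        # DCG: the k-th recommended item (1-based) contributes 1/log2(k+1)
        # if it is a hit; enumerate is 0-based, hence k + 2.
        dcg = sum(1.0 / math.log(k + 2, 2)
                  for k, item in enumerate(recommended[:n])
                  if item in true_item_set)
        # IDCG: the ideal list places all ground-truth items first.
        idcg = sum(1.0 / math.log(k + 2, 2)
                   for k in range(min(len(true_item_set), n)))
        return dcg / idcg if idcg > 0 else 0.0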

How to make predictions

Hello, how can I use this model to make predictions?

Question about the NDCG@N computation in the code

The paper defines NDCG@N as follows:
(equation image from the paper)
The code computes NDCG as follows:
dcg: https://github.com/THUDM/ComiRec/blob/master/src/train.py#L143
idcg: https://github.com/THUDM/ComiRec/blob/master/src/train.py#L148
In the paper, the denominator of each summand in DCG is k, the rank of the k-th item recommended to user u.
Does the paper's k correspond to no in the DCG code?
My understanding is that no indexes the user's actual interactions rather than the k-th recommended item.

How should this be understood? Thanks.

Run on Toys dataset

Hello, have you tried running on the Amazon Toys dataset? The results were not very good when I ran it.

About distributed training

The connection weights (logits) between interests and behavior embeddings are updated by dynamic routing rather than by gradient descent. With distributed training, wouldn't the latest logits on different workers overwrite one another? How do you think about distributed training here?

Book dataset statistics can't align

Hi, I used your provided preprocessing script to process the Book dataset. The data file was downloaded from the website you mentioned. However, I got the following Book statistics:

total items: 367982
total users: 603668
total behaviors: 8898041

While the processed data you provided is as follows:
total items: 313966
total users: 459133
total behaviors: 8898041

All I did was:

  1. Download the dataset from http://jmcauley.ucsd.edu/data/amazon/index.html
  2. Decompress the file to get reviews_Books_5.json
  3. Run script python preprocess/data.py book

This mismatch confuses me. Could you elaborate on it, or publish the latest version of data.py?

Thank you for your feedback!

Unable to reach the paper's performance

Hi,
I trained on my own data; the validation and test sets are identical, both taking each user's last click, with everything else as the training set.
The actual results are shown below; these metrics did not change throughout the entire training run.

doic_ComiRec-SA_b128_lr0.001_d64_len20_doic
iter: 51000, train loss: 1.8256, valid recall: 0.000007, valid ndcg: 0.000007, valid hitrate: 0.000007
time interval: 168.0316 min
doic_ComiRec-SA_b128_lr0.001_d64_len20_doic
iter: 52000, train loss: 1.8214, valid recall: 0.000007, valid ndcg: 0.000007, valid hitrate: 0.000007
model restored from best_model/doic_ComiRec-SA_b128_lr0.001_d64_len20_doic/
valid recall: 0.000007, valid ndcg: 0.000007, valid hitrate: 0.000007, valid diversity: 0.000000
test recall: 0.000007, test ndcg: 0.000007, test hitrate: 0.000007, test diversity: 0.000000

If the problem is that there are too few item categories, i.e. the category granularity is coarse, would finer-grained categories improve the results?

Questions about Performance (unable to reproduce)

I ran MIND 5 times on Amazon, and the final results are not ideal.
My command is python3 -u ./src/train.py --model_type MIND 2>&1 | tee MIND
A summary of the results is as follows:

test recall: 0.061517, test ndcg: 0.048181, test hitrate: 0.130982, test diversity: 0.236232
test recall: 0.061752, test ndcg: 0.048717, test hitrate: 0.131164, test diversity: 0.187387
test recall: 0.063693, test ndcg: 0.049689, test hitrate: 0.133500, test diversity: 0.193425
test recall: 0.061744, test ndcg: 0.049157, test hitrate: 0.131297, test diversity: 0.216244
test recall: 0.060717, test ndcg: 0.047014, test hitrate: 0.128497, test diversity: 0.197372

book_MIND_b128_lr0.001_d64_len20.zip

Confusion about the metric computation

When computing the evaluation metrics, why is the multi-interest item list sorted with reverse=True? Shouldn't the top-N smallest be taken?

Item remap id 0 may cause KeyError

In the preprocessing code, the new remapped item ids start from 1, with 0 reserved as the padding item id.

But in the train.evaluate_full function, after D, I = gpu_index.search(user_embs, topN), I may contain 0; since item_cate_map does not have 0 as a key, an error occurs in the compute_diversity function.

Two example error logs are shown below.

GRU4REC_5.LOG
DNN_5_taobao.LOG
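A minimal guard for this (my own sketch of a workaround, not the repository's fix; names follow the issue):

    def valid_item_list(retrieved_ids, item_cate_map):
        # Drop the padding id 0, and any id missing from item_cate_map,
        # before computing diversity.
        return [i for i in retrieved_ids if i in item_cate_map]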

save model for predicting user embeddings

How can I save the model for prediction?
I do the following:

    def save_user_model(self, sess, path):
        builder = tf.compat.v1.saved_model.Builder(path)
        sig_def = tf.compat.v1.saved_model.predict_signature_def(
            inputs={'mid_his_batch_ph': self.mid_his_batch_ph, 'mask': self.mask},
            outputs={'output': self.user_eb}
        )
        builder.add_meta_graph_and_variables(
            sess, tags=['serve'],
            signature_def_map={
                tf.compat.v1.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY: sig_def
            }
        )
        builder.save()

It outputs different user embeddings for the same input.

训练集采样方式的一些疑问

Hi, while studying the code I made some modifications. I found that if I change line 67 of data_iterator.py from
k = random.choice(range(4, len(item_list)))
to
k = len(item_list)-1,
the first epoch is normal, but in subsequent epochs, although the loss keeps decreasing, the other metrics (NDCG, hit rate, etc.) drop sharply.
Is this caused by overfitting?
My understanding is that with the original code the training data is different on every pass.
Looking forward to your answer!
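For context, a self-contained sketch of what the quoted line plausibly does (the history/target split is my reading of the surrounding code, not verified against the repository):

    import random

    item_list = list(range(100))  # a user's clicked items, in time order
    k = random.choice(range(4, len(item_list)))  # fresh split point per pass
    history, target = item_list[:k], item_list[k]
    # With k = len(item_list) - 1 instead, every epoch trains on the identical
    # (history, target) pair, so later epochs simply refit the same data.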

Questions about MostPopular Recall on Taobao

I implemented the MostPopular method myself and tested it on your datasets.
On Amazon Books, I get the same results as you.
On Taobao, I get the same HitRate as you, but my Recall differs from yours.
My Taobao Recall is 0.366@20 and 0.667@50.
Could you open-source your MostPopular code?

questions about computing ndcg score

The NDCG score computed in L91~L94 seems to contradict the definition in the paper: DCG is computed over the ground-truth items ordered by time (the item with the k-th nearest timestamp) rather than over the k-th recommended item, as defined in Eq. (16) of the paper.
Is that correct?

About MIND's sampled softmax layer

Hello, a question about MIND.
The final sampled softmax layer draws a sample set and computes a softmax with the user_eb output by the attention layer. But the attention layer's query is the positive item_eb, so this user_eb is the user interest conditioned on the positive item. Doesn't that mean the softmax is computed between a user embedding derived from the positive sample and the negative samples?
The correct softmax should instead use the user embedding obtained with the negative sample as the attention query (the user interest related to that negative sample), together with that negative sample's embedding, to compute the probability of the user clicking it.

About the computation of capsule_softmax_weight

Regarding capsule_weight = tf.stop_gradient(tf.truncated_normal([get_shape(item_his_emb)[0], self.num_interest, self.seq_len], stddev=1.0))
at line 153 of model.py: according to the capsule network structure, capsule_softmax_weight = tf.nn.softmax(capsule_weight, axis=1) should apply the softmax along the seq_len dimension,
so that the input weights of each lower-level capsule sum to 1. I don't understand why the softmax is applied along the num_interest dimension.

A question about the loss

Hello, reading your code: in the multi-vector modeling part, during training user_eb first interacts with item_eb (i.e., attention) and then the cross-entropy loss is computed against mid_batch_ph, i.e., the next item. Doesn't this leak information?

In other words, the model has already interacted with the next item before the loss on it is computed: the next item determines the weights over the multiple interest vectors, and then the loss is taken against that same next item. This does not seem entirely reasonable.

Question about masking of capsule softmax in lines 150~155 of model.py

Is there a reason to apply masking after softmax in lines 150~155? This seems to make the sum of the coupling coefficients (c_ij) less than 1.

On the other hand, in Model_ComiRec_SA, masking was done before softmax (lines 233~237).

I think the masking in lines 150~155 should also be applied before softmax, because the paper contains the following passage.

"The coupling coefficients between capsule i and all the capsules in the next layer should sum to 1."

https://arxiv.org/pdf/2005.09347.pdf
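For illustration, masking before softmax typically replaces padded logits with a large negative value so the remaining coefficients still sum to 1 (a sketch with illustrative names, mirroring the pattern the issue cites from Model_ComiRec_SA):

    import tensorflow as tf

    def masked_softmax(logits, mask, axis):
        # mask: 1.0 for valid positions, 0.0 for padding; same shape as logits.
        neg_inf = tf.ones_like(logits) * (-2 ** 32 + 1.0)
        logits = tf.where(tf.equal(mask, 0), neg_inf, logits)
        # Padded entries receive ~0 weight; valid entries still sum to 1.
        return tf.nn.softmax(logits, axis=axis)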

About the sampled softmax loss

Hello. Regarding methods like ComiRec and MIND, I have long had a question: can the model only use id features? In production, if item_emb is generated from many features, say 100 of them, how should the sampled softmax loss be computed? In particular, how is the weights parameter (the item_emb lookup table) defined?
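For reference, the usual TF1-style wiring of tf.nn.sampled_softmax_loss, in which the weights argument is the output-side item embedding table the question refers to (a minimal sketch; shapes and names are illustrative, not the repository's code):

    import tensorflow as tf

    n_items, dim = 100000, 64
    item_output_emb = tf.get_variable('item_output_emb', [n_items, dim])
    user_eb = tf.placeholder(tf.float32, [None, dim])   # user representation
    next_item = tf.placeholder(tf.int64, [None, 1])     # positive item id

    # Negatives are sampled over all n_items classes; the weights/biases pair
    # defines the output layer, i.e. the item_emb lookup table in question.
    loss = tf.reduce_mean(tf.nn.sampled_softmax_loss(
        weights=item_output_emb, biases=tf.zeros([n_items]),
        labels=next_item, inputs=user_eb,
        num_sampled=10, num_classes=n_items))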

Follow-up on stopping gradient updates in the capsule network

The author answered in an earlier issue: "stop the backward propagation gradient from capsule weights to item embedding". I would like to understand this further. 1) This only stops the gradient in the first N-1 routing iterations; the last iteration still allows gradients through, so it cannot completely prevent gradients from reaching the item embedding. Have I misunderstood? 2) Moreover, the code applies the stop-gradient after multiplying the item embedding by the projection matrix W; W certainly has to be updated, so how can W update while the item embedding does not? 3) Even if gradients really were blocked from the item embedding, why block them at all? That only makes sense if the item embeddings are pretrained; if the scenario requires learning the item embeddings end to end, is the stop_gradient in the code simply unnecessary?

Model question

Hi,
What exactly is the difference between ComiRec-DR and MIND? From lines 188-208 of ./src/model.py, is it fair to say the two algorithms differ only in the bilinear_type setting, specifically that the bilinear mapping and the capsule_weight initialization differ, for a total of two differences?

I would appreciate a clarification, thanks.

Questions about capsule_weight

Two things I would like to ask:

  1. capsule_weight here is an ordinary tensor rather than a variable. Does that mean that for every sample of every batch, it is randomly initialized on each forward pass and then refined through 3 routing iterations? Is the iterative process also needed at inference time?
  2. In several of my training runs, the norms of capsule_weight and of the subsequent DNN layers' outputs are very large. Is this normal?
    (screenshot omitted)

A question about class Model_DNN in model.py

In the Model_DNN implementation, should
masks = tf.concat([tf.expand_dims(self.mask, -1) for _ in range(hidden_size)], axis=-1)
be changed to
masks = tf.concat([tf.expand_dims(self.mask, -1) for _ in range(embedding_dim)], axis=-1)?
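Incidentally, an equivalent and arguably clearer way to broadcast the mask over the last dimension (a sketch assuming self.mask has shape [batch, seq_len]; dim stands for whichever of hidden_size or embedding_dim is intended):

    import tensorflow as tf

    mask = tf.placeholder(tf.float32, [None, 20])            # [batch, seq_len]
    dim = 64
    masks = tf.tile(tf.expand_dims(mask, -1), [1, 1, dim])   # [batch, seq_len, dim]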

About the use of sampled_softmax_loss

Hi, as I understand the source code, 0 is used as the stand-in for #PAD. In sampled_softmax_loss, 0 will then be drawn as a negative sample with the highest probability, even though 0 is not a real item. Isn't this somewhat unreasonable? Please correct me if I have misunderstood!

About the item_id length

Hello!

I am trying to reproduce the paper with my company's data. I have more than 20 million items, so after re-encoding, the largest item_id is 8 digits long. In this setting, every dimension of the trained item_embedding comes out as 0. Could you point out what I am doing wrong?

Unable to reproduce the paper's results

I downloaded the code and data directly and ran the command from the README, python src/train.py --dataset book --model_type ComiRec-SA, and obtained:

valid recall: 0.082050, valid ndcg: 0.128702, valid hitrate: 0.163897, valid diversity: 0.220184
test recall: 0.080504, test ndcg: 0.127190, test hitrate: 0.161744, test diversity: 0.221127

This differs considerably from the results in the paper:
(table image from the paper)

What could the problem be?
