graytowne / caser_pytorch

A PyTorch implementation of Convolutional Sequence Embedding Recommendation Model (Caser)

License: GNU Lesser General Public License v3.0

Python 100.00%

caser_pytorch's Introduction

Caser-PyTorch

A PyTorch implementation of Convolutional Sequence Embedding Recommendation Model (Caser) from the paper:

Personalized Top-N Sequential Recommendation via Convolutional Sequence Embedding, Jiaxi Tang and Ke Wang , WSDM '18

Requirements

Python 2 or 3
PyTorch v0.4+
Numpy
SciPy

Usage

Install the required packages.
Run python train_caser.py

Configurations

Data

Datasets are organized into two separate files: train.txt and test.txt.
As in other recommendation data formats, each file contains a collection of triplets:

user item rating

The only difference is that the triplets are sorted in time order.
Because the task is sequential recommendation, the rating values do not matter, so they are all converted to 1.
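
The format above can be read with a few lines of Python. This is a minimal sketch, not the repository's loader: it assumes whitespace-separated fields and that the order of lines in the file is the time order. The function name load_sequences is hypothetical.

```python
from collections import defaultdict


def load_sequences(lines):
    """Group 'user item rating' triplets into per-user item sequences."""
    sequences = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        user, item = parts[0], parts[1]
        # The rating column is ignored: every interaction counts as 1.
        sequences[user].append(item)
    return dict(sequences)


# Illustrative lines in the train.txt format (made-up ids).
sample = ["1 10 1", "1 12 1", "2 10 1", "1 11 1"]
seqs = load_sequences(sample)
print(seqs)
```

Each user's list keeps the order in which the lines appeared, which is what a sequential model needs.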

Model Args (in train_caser.py)

L: length of sequence
T: number of targets
d: number of latent dimensions
nv: number of vertical filters
nh: number of horizontal filters
ac_conv: activation function for convolution layer (i.e., phi_c in paper)
ac_fc: activation function for fully-connected layer (i.e., phi_a in paper)
drop_rate: drop ratio when performing dropout

Citation

If you use this Caser implementation in your paper, please cite the paper:

@inproceedings{tang2018caser,
  title={Personalized Top-N Sequential Recommendation via Convolutional Sequence Embedding},
  author={Tang, Jiaxi and Wang, Ke},
  booktitle={ACM International Conference on Web Search and Data Mining},
  year={2018}
}

Comments

This PyTorch version may achieve better performance than the paper reports.

With d=50, L=5, T=3 and all other arguments left at their defaults, mAP may reach 0.17 on the test set after 20 epochs.

Acknowledgment

This project (utils.py, interactions.py, etc.) is built heavily on Spotlight. Thanks to Maciej Kula for his great work.

caser_pytorch's Issues

Padding issue when trying to run training on the ml1m validation set

I got this error when I tried to run the model on the /ml1m/validation/{train,test}.txt set (a.k.a. the validation set), while it works just fine for the /ml1m/validation/{train,test}.txt set:

Traceback (most recent call last):
  File "train_caser.py", line 335, in <module>
    model.fit(train, test, verbose=True)
  File "train_caser.py", line 181, in fit
    precision, recall, mean_aps = evaluate_ranking(self, test, train, k=[1, 5, 10])
  File "C:\Users\username\JupyterLab\caser_pytorch-master\evaluation.py", line 53, in evaluate_ranking
    test = test.tocsr()
  File "C:\Users\username\JupyterLab\caser_pytorch-master\interactions.py", line 92, in tocsr
    return self.tocoo().tocsr()
  File "C:\Users\username\JupyterLab\caser_pytorch-master\interactions.py", line 84, in tocoo
    return sp.coo_matrix((data, (row, col)),
  File "C:\Users\username\Anaconda3\envs\PT1_6\lib\site-packages\scipy\sparse\coo.py", line 196, in __init__
    self._check()
  File "C:\Users\username\Anaconda3\envs\PT1_6\lib\site-packages\scipy\sparse\coo.py", line 285, in _check
    raise ValueError('column index exceeds matrix dimensions')
ValueError: column index exceeds matrix dimensions

Is there any explanation?

P.S. It works fine when I comment out the padding code (or guard it with a boolean variable set to False), as follows:

def to_sequence(self, sequence_length=5, target_length=1, padding=True):
...
        # change the item index start from 1 as 0 is used for padding in sequences
        if padding:
            for k, v in self.item_map.items():
                self.item_map[k] = v + 1
            self.item_ids = self.item_ids + 1
            self.num_items += 1
...

However, I think skipping the padding is simply wrong!
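
SciPy raises "column index exceeds matrix dimensions" when at least one column index is greater than or equal to the number of columns given for the matrix. A small diagnostic along the following lines can show which ids are the problem. It is only a sketch: find_out_of_range is a hypothetical helper, the values are made up, and it does not claim to identify the actual cause in this repository.

```python
def find_out_of_range(item_ids, num_items):
    """Return the item ids that do not fit in a matrix with num_items columns."""
    return sorted({i for i in item_ids if i < 0 or i >= num_items})


# Hypothetical case: ids were shifted by 1 for padding (0 is reserved),
# but the matrix width was not increased to match.
shifted_ids = [1, 2, 3, 4]
bad = find_out_of_range(shifted_ids, num_items=4)
print(bad)  # [4]
```

If the list is non-empty for the test set, the item ids and num_items used to build the sparse matrix are out of step, for example because the two files were mapped or shifted differently.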

vertical convolution

Any results about vertical convolution? Does it make a difference to the results?

one question about interactions.py

In interactions.py:
line 148: num_subsequences = sum([c - max_sequence_length + 1 if c >= max_sequence_length else 1 for c in counts])

Why is num_subsequences the sum of (c - max_sequence_length + 1)?

For example, when counts = 30 and max_sequence_length = 10, I would expect num_subsequences = 30/10 = 3 rather than 30 - 10 + 1 = 21,
because the length is 10 and num_subsequences is the number of sequences. Or is there something wrong with my understanding?

Looking forward to your reply, thank you!
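
The expression c - max_sequence_length + 1 is the number of windows obtained by sliding a window of that length one step at a time, whereas 30/10 counts non-overlapping chunks. The sketch below contrasts the two; it assumes a stride of 1, which is what the formula implies, and the helper names are hypothetical.

```python
def sliding_windows(items, window):
    """All stride-1 windows; a sequence shorter than the window yields one."""
    if len(items) < window:
        return [items]
    return [items[i:i + window] for i in range(len(items) - window + 1)]


def chunks(items, window):
    """Non-overlapping consecutive chunks of the given length."""
    return [items[i:i + window] for i in range(0, len(items), window)]


items = list(range(30))  # one user with 30 interactions
n_sliding = len(sliding_windows(items, 10))
n_chunks = len(chunks(items, 10))
print(n_sliding, n_chunks)  # 21 3
```

With overlapping windows, every position in the sequence that has enough history becomes a training instance, which is why the count is 21 rather than 3.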

Gowalla dataset preprocess

Hi graytowne,

I downloaded the Gowalla dataset from https://snap.stanford.edu/data/loc-gowalla.html and preprocessed it as described in your paper.

After preprocessing I got usernum=63089 and itemnum=68906, which is very different from the statistics in your paper.
Is there any detail I missed?

Looking forward to your reply.

Some problem about test data.

    def fit(self, train, test, verbose=False):
        """
        The general training loop to fit the model
        Parameters
        ----------
        train: :class:`spotlight.interactions.Interactions`
            training instances, also contains test sequences
        test: :class:`spotlight.interactions.Interactions`
            only contains targets for test sequences

Here you say "test: only contains targets for test sequences". But I have checked test.txt in 'dataset/valid/test.txt' and I don't understand the content of this file. Shouldn't there be only three elements in each session? I see many more than that.

Also, in evaluate_ranking() in evaluation.py, test = test.tocsr() is called, where test is an Interactions object. I have read the source, but I still don't understand what it does.

I'd appreciate it if I could get your help. Thank you so much!
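
One possible reading, not confirmed by the author: test.tocsr() turns the test triplets into a user-by-item sparse matrix (the traceback in the padding issue shows it going through sp.coo_matrix), so the row for a user holds every held-out item of that user rather than exactly T targets. The pure-Python sketch below mimics that grouping with a dictionary; the function name and the sample triplets are made up for illustration.

```python
from collections import defaultdict


def targets_by_user(triplets):
    """Collect, for each user, the set of items that appear in the test triplets."""
    rows = defaultdict(set)
    for user, item, _rating in triplets:
        rows[user].add(item)
    # Sorted lists play the role of the non-zero column indices of a sparse row.
    return {user: sorted(items) for user, items in rows.items()}


# Illustrative test triplets (user, item, rating).
test_triplets = [(0, 5, 1), (0, 7, 1), (0, 9, 1), (0, 2, 1), (1, 3, 1)]
rows = targets_by_user(test_triplets)
print(rows)
```

Under this reading, a user with many held-out interactions has many entries in test.txt, and ranking metrics are computed against all of them.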

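Question about the code's results

Hello,
I reproduced the code you provided. On ml1m the results are, as you said, better than those in your original paper. For gowalla, however, the reproduced results are noticeably worse (the gap is not small). What I did was download the dataset from your rank_distill and then modify the args in caser, changing only the dataset; other parameters such as L and T were left unchanged. Could you tell me how you did it? Thank you.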

Question about Sequential Intensity factor

Hello!
Just curious: which sequential association rule mining algorithm did you use to calculate the Sequential Intensity factor?
If you don't mind, it would be nice if you could share the code with me.

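How to handle the foursquare and gowalla datasets?

Hello, the foursquare and gowalla datasets contain duplicate check-in records, that is, a user checks in at the same location multiple times. How did you handle these duplicates? Did you keep the earliest check-in record or the latest one? Different choices affect the sequences considerably, so I hope you can help. Thank you very much!

Hello, could the other three pre-split experimental datasets also be released? Thank you~

Hello, I would like to reproduce your results on the other datasets as well, so could you make the other pre-split datasets public? Thank you!

Hello, alternatively, could you say whether you changed the code when processing the Tmall dataset? My training currently takes a very long time. Thank you~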

Baseline Code

Hi @graytowne,
thanks a lot for the amazing code! I was wondering if you could also share the code for the baselines (particularly GRU4Rec); that would help me a lot.
Best,
Anastasia

Bug report

I get a strange error after 40+ epochs, as shown below:

Traceback (most recent call last):
  File "train_caser.py", line 350, in <module>
    model.fit(train, test, verbose=False)
  File "train_caser.py", line 169, in fit
    negative_loss = -torch.mean(torch.log(1 - F.sigmoid(negative_prediction)))
RuntimeError: value cannot be converted to type double without overflow: -inf
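
A likely source of the -inf is that the sigmoid saturates to exactly 1 for large scores, so log(1 - sigmoid(x)) becomes log(0). The identity log(1 - sigmoid(x)) = -softplus(x) = log(sigmoid(-x)) avoids this; in PyTorch, torch.nn.functional.logsigmoid(-x) computes the same quantity. The sketch below shows the effect in plain Python with float64 (float32 saturates sooner). It is an illustration of the numerical issue, not a patch taken from the repository.

```python
import math


def naive_log_one_minus_sigmoid(x):
    """log(1 - sigmoid(x)) computed literally, as in the failing line."""
    s = 1.0 / (1.0 + math.exp(-x))
    one_minus = 1.0 - s
    # math.log(0.0) raises; return -inf to mirror what a tensor log produces.
    return math.log(one_minus) if one_minus > 0.0 else float("-inf")


def stable_log_one_minus_sigmoid(x):
    """Same quantity via -softplus(x), which never takes the log of zero."""
    return -(max(x, 0.0) + math.log1p(math.exp(-abs(x))))


score = 40.0  # a large score for a negative sample
naive = naive_log_one_minus_sigmoid(score)
stable = stable_log_one_minus_sigmoid(score)
print(naive, stable)
```

For moderate scores the two functions agree; they differ only where the literal form has already lost all precision.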

migrate to PyTorch v0.4.0

The current repo is compatible with PyTorch v0.4.x, but for better compatibility (using Tensor in place of Variable, which is deprecated), some code needs to be updated.

MaxPooling

Hello!
I noticed that max pooling is not used in the code for horizontal convolution, which is inconsistent with the model introduced in the paper.
Can you tell me why?
Looking forward to your reply, thank you!

How to do embedding-based recall?

Hi,
I couldn't follow the code. In a practical application, can it be changed to do vector-based (embedding) recall?

Does this implementation split off a validation dataset?

I think it doesn't run validation.

Can I see which part of the implementation creates the validation dataset and where validation is performed?

question about data construction

Hello, I want to know whether user-item ratings below 3 were removed (keeping only ratings above 3) from the MovieLens-1M dataset when constructing the dataset. Thank you!

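Is the CNN modeling used in Caser item-level or feature-level?

Hello author,
Some recent papers that cite your paper say that Caser performs item-level modeling. Personally, I think the CNN is used to extract the items' latent features, so it should count as feature-level. I'm not sure whether my understanding is correct; I would appreciate your guidance. Thank you!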

How to split the dataset

Following [17, 33, 35], we hold the first 70% of actions in each user's sequence as the training set and use the next 10% of actions as the validation set to search the optimal hyperparameter settings for all models. The remaining 20% of actions in each user's sequence are used as the test set for evaluating a model's performance.

Hello, I am currently splitting the dataset. Could you release the dataset-splitting code, or give a hint on how to split the dataset by user sequence? Thank you, I'd really appreciate it!
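
The 70/10/20 split quoted above can be sketched per user as follows. This is not the author's preprocessing code: it assumes each user's actions are already in time order, uses integer percentages to avoid floating-point rounding, and the names split_sequence and split_all are hypothetical.

```python
def split_sequence(actions, train_pct=70, valid_pct=10):
    """Split one user's time-ordered actions into train / validation / test."""
    n = len(actions)
    n_train = n * train_pct // 100
    n_valid = n * valid_pct // 100
    train = actions[:n_train]
    valid = actions[n_train:n_train + n_valid]
    test = actions[n_train + n_valid:]  # the remainder, roughly 20%
    return train, valid, test


def split_all(user_sequences):
    """Apply the same split to every user's sequence."""
    return {user: split_sequence(seq) for user, seq in user_sequences.items()}


# Illustrative data: one user with ten time-ordered item ids.
splits = split_all({"u1": list(range(10))})
print(splits["u1"])
```

Very short sequences can end up with an empty training or validation part under this rule, so a minimum sequence length is usually worth enforcing before splitting.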