kang205 / SASRec
SASRec: Self-Attentive Sequential Recommendation
License: Apache License 2.0
I modified the code based on https://github.com/graytowne/caser_pytorch and ran it on the ML-1M dataset, but the results fall short of the paper's.
The summary of my runs is as follows:
Hit@10 = 0.622, NDCG@10 = 0.481
These metrics are much lower than those reported in the paper:
Hit@10 = 0.7886, NDCG@10 = 0.5538
I wonder whether the gap is due to some oversight on my part. Could you please share your code? Looking forward to your reply. Thanks!
I find that SASRec performs worse than BPR-MF on some sparse datasets such as Amazon Beauty, Amazon Games, and Steam. Is that normal? I used the BPR-MF baseline from https://github.com/duxy-me/ConvNCF.
In the evaluate function, an item_index list is built and test[u][0] is put into it.
As I understand it, test[u][0] is the item we want to predict, but this way the model knows it only has to rank among these candidates, which include the ground-truth item.
Is this a kind of data leakage, or did I misunderstand something?
Selecting the last time step of the sequence makes no sense when the sequence doesn't reach maxlen. Is this a bug that could hurt test performance?
line 199 of modules.py : outputs *= query_masks # broadcasting. (N, T_q, C)
I wonder: should it be outputs = query_masks # broadcasting. (N, T_q, C) instead?
Line 185 in 641c378
Hi Guys,
I'm reading the code while porting the implementation to PyTorch for personal use. It looks well written and documented, thanks for the great work :)
However, since the self-attention module is borrowed from another project, some details may not be 100% right by my reading (besides magic numbers like -2^32+1 for forcing softmax to output 0 for an entry, which hurt readability). For example, the query and key mask generation uses a tf.sign + tf.abs + tf.reduce_sum combination, but the order looks slightly wrong. Since we want to mask queries/keys that are all zero along the channel/embedding dimension, the right order would be: first apply abs, then reduce_sum, and finally sign. The current implementation instead applies reduce_sum first, then abs, then sign. The two approaches give the same result in most cases, since a high-dimensional fp32 vector summing to exactly zero is unlikely, but it is still wrong and can produce incorrect masks in corner cases.
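To illustrate, here is a quick numpy check of the two orderings (a toy example I made up, not code from the repo):

```python
import numpy as np

# Toy keys: batch of 2 sequences, T_k = 2 timesteps, C = 2 channels.
keys = np.array([
    [[0.5, -0.5], [0.0, 0.0]],   # second step is all-zero padding
    [[1.0, -1.0], [2.0,  3.0]],  # first step sums to zero but is NOT padding
], dtype=np.float32)

# current order in modules.py: sign(abs(reduce_sum(keys)))
mask_current = np.sign(np.abs(keys.sum(axis=-1)))
# proposed order: sign(reduce_sum(abs(keys)))
mask_proposed = np.sign(np.abs(keys).sum(axis=-1))

print(mask_current)   # [[0. 0.] [0. 1.]]  -- non-padding steps wrongly masked
print(mask_proposed)  # [[1. 0.] [1. 1.]]  -- only the true padding is masked
```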
I just want to check the assumption stated above; please respond if you happen to have time @kang205 @JiachengLi1995, thanks!
Regards,
Zan
Hi Team, @kang205
I have prepared a PyTorch version of SASRec based on your TF implementation, and it behaves almost the same:
https://github.com/pmixer/SASRec.pytorch
Could you please consider adding a "Third-party Re-implementation" section, as in https://github.com/wy1iu/LargeMargin_Softmax_Loss, to let people know about it?
I hope the PyTorch implementation proves useful to a wider audience, and I would appreciate help figuring out why it converges a bit more slowly than the TF implementation during training 🤣
Regards,
Zan
"key_masks = tf.sign(tf.abs(tf.reduce_sum(keys, axis=-1))) # (N, T_k)" at line 185 of modules.py makes key_masks depend on the key embeddings. Why not use the usual "key_masks = tf.sequence_mask(keys_length, tf.shape(keys)[1]) # (N, T_k)" instead?
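For what it's worth, a length-based mask (what tf.sequence_mask would produce, sketched here in numpy; shapes and lengths are made up) agrees with the sign-based mask whenever the padding positions hold exactly-zero embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
T_k = 4
keys_length = np.array([2, 3])                      # valid length per sequence
emb = rng.standard_normal((2, T_k, 8)).astype(np.float32)
for n, L in enumerate(keys_length):                 # zero out padding positions
    emb[n, L:] = 0.0

# length-based mask, like tf.sequence_mask(keys_length, T_k)
mask_len = (np.arange(T_k)[None, :] < keys_length[:, None]).astype(np.float32)
# embedding-based mask, like sign(abs(reduce_sum(keys, -1)))
mask_emb = np.sign(np.abs(emb.sum(axis=-1)))

print(np.array_equal(mask_len, mask_emb))           # True for this example
```

The two only diverge when a valid position's embedding sums to exactly zero, or when padding embeddings are nonzero.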
Can you save the model weights (a checkpoint) after some number of epochs?
Hello!
I am currently working on CTR and sequential recommendation tasks. Many recent papers, such as TALLRec, use AUC to compare their results with SASRec. I am really curious how I can calculate AUC for SASRec; as I understand it, SASRec is evaluated with metrics like NDCG@k and HitRate@k. It would be really helpful if you could shed some light on this.
Thanks and Regards
Millennium Bismay
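As a follow-up to the question above: if one sticks to the sampled-candidate protocol SASRec uses, a per-user AUC can be read off the same scores as the fraction of sampled negatives that the ground-truth item outscores. A minimal sketch (my own helper, not part of the repo):

```python
import numpy as np

def sampled_auc(pos_score, neg_scores):
    """AUC over one positive and sampled negatives: the fraction of
    negatives the positive outscores (ties count as 0.5)."""
    neg_scores = np.asarray(neg_scores, dtype=np.float64)
    wins = np.sum(pos_score > neg_scores)
    ties = np.sum(pos_score == neg_scores)
    return (wins + 0.5 * ties) / len(neg_scores)

# Example: positive scored 0.9 against 4 sampled negatives
print(sampled_auc(0.9, [0.1, 0.5, 0.95, 0.9]))  # (2 + 0.5) / 4 = 0.625
```

Averaging this over users gives a sampled approximation of AUC; the exact AUC would require scoring all items.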
Hi, my name is Mauro, and I am a computer science student. I am trying to run your SASRec code, but I am having trouble with the TensorFlow versions. Would it be possible for you to run it on a lighter version of the Amazon sports reviews dataset (which I can provide; it is only 60 MB)? Your help would be really precious to me. If you are willing to help, you can e-mail me at [email protected]
Could you tell me the details of GRU4Rec?
For example, is the GRU4Rec model trained with BPTT or normal BP?
Additionally, could you share the code of the baselines?
As the title says.
Thanks for your outstanding work, but I still have some questions about the data.
When I open the Amazon dataset website, I find the data was updated in 2018. If I run your preprocessing code on the new 5-core Beauty dataset, only a few thousand records are kept, far fewer than the preprocessed data in your repository.
I need to re-preprocess the original data because my model needs timestamp data, not only the interaction order.
If you could provide the original Amazon data and the preprocessing code (for Amazon Games), I would really appreciate it!
# Self-attention
self.seq = multihead_attention(queries=normalize(self.seq),
keys=self.seq,
num_units=args.hidden_units,
num_heads=args.num_heads,
dropout_rate=args.dropout_rate,
is_training=self.is_training,
causality=True,
scope="self_attention")
https://github.com/kang205/SASRec/blob/e3738967fddab206d6eeb4fda433e7a7034dd8b1/model.py#L54
Thank you!
Hi, thanks for your excellent work.
When I run the code, the following error occurred:
E tensorflow/stream_executor/cuda/cuda_blas.cc:652] failed to run cuBLAS routine cublasGemmBatchedEx: CUBLAS_STATUS_NOT_SUPPORTED
2021-07-22 23:05:32.120816: E tensorflow/stream_executor/cuda/cuda_blas.cc:2574] Internal: failed BLAS call, see log for details
tensorflow version = 1.12.0
python version=2.7.18
Looking forward to your reply!
Hello.
Your data preprocessing code seems to cover only the Beauty reviews.
Could you also provide the code for preprocessing the ML-1M data?
Thank you.
In your code, num_batch is calculated as:
num_batch = len(user_train) / args.batch_size
But on Python 3 this causes an error in
for step in tqdm(range(num_batch), total=num_batch, ncols=70, leave=False, unit='b'):
because / always returns a float:
Traceback (most recent call last):
File "main.py", line 61, in <module>
for step in tqdm(range(num_batch), total=num_batch, ncols=70, leave=False, unit='b'):
TypeError: 'float' object cannot be interpreted as an integer
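A one-line fix is floor division, which keeps num_batch an integer (sketched with stand-in names, since user_train and args are not defined here; use math.ceil instead if the last partial batch should be kept):

```python
users = list(range(6040))             # stand-in for user_train (ML-1M size)
batch_size = 128                      # stand-in for args.batch_size

num_batch = len(users) // batch_size  # floor division yields an int
for step in range(num_batch):         # no TypeError on Python 3
    pass
print(num_batch)                      # 47
```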
In the 'steam_reviews' dataset, the 'recommend' field of all samples is 'True'.
Could that be caused by a crawling bug during data collection?
Quote the paper:
For each user u, we randomly sample 100 negative items, and rank these items
with the ground-truth item. Based on the rankings of these 101 items, Hit@10 and NDCG@10 can be evaluated.
How can I set the number of negative items in the code?
It seems the (pos, neg) pairs are generated independently.
Line 28 in e373896
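For reference, a minimal sketch of the paper's protocol (my own code with a toy scorer; the real evaluation scores candidates with the trained model): sample 100 unseen negatives per user, rank the ground-truth item among the 101 candidates, and derive Hit@10 and NDCG@10 from its rank.

```python
import numpy as np

def eval_one_user(score, pos_item, n_items, rated, n_neg=100, k=10, seed=0):
    """Rank the ground-truth item against n_neg sampled unseen negatives."""
    rng = np.random.default_rng(seed)
    negs = []
    while len(negs) < n_neg:                  # rejection-sample negatives
        c = int(rng.integers(n_items))
        if c != pos_item and c not in rated:
            negs.append(c)
    candidates = [pos_item] + negs
    scores = np.array([score(i) for i in candidates])
    rank = int((scores > scores[0]).sum())    # rank of the positive, 0 = best
    hit = rank < k
    ndcg = 1.0 / np.log2(rank + 2) if hit else 0.0
    return hit, ndcg

# toy scorer that always prefers the ground-truth item 3
hit, ndcg = eval_one_user(lambda i: 1.0 if i == 3 else 0.0,
                          pos_item=3, n_items=1000, rated={3})
print(hit, ndcg)   # True 1.0
```

Changing n_neg is then just a parameter, rather than something tied to the (pos, neg) training pairs.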
Will it work on Windows?
Hi, Caser uses a softmax to get the interaction probability of each item, whereas in this paper's experiments you sample 100 negative items for testing. I would like to know how you handle this difference. Thank you very much.
For the multi-head attention module, why do you set num_heads=1 in the default args in main.py? Then it is not using the multi-head structure of the attention block, is it?
Thanks,
When I run the program I get the following error:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
To fix this issue I added
if __name=="__main__": main()
but because of the close function in the sampler, the program exits without training.
I am using Python 3.6.
Thanks a lot beforehand !
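For reference, a minimal multiprocessing pattern that avoids the bootstrapping error (the names here are illustrative, not from the repo). Note the guard needs double underscores on both sides: `__name` is a different, undefined name, so that guard raises a NameError instead of starting training.

```python
import multiprocessing as mp

def sampler(q):
    q.put('batch')                 # stand-in for the batch-sampler worker

def main():
    q = mp.Queue()
    p = mp.Process(target=sampler, args=(q,))
    p.start()
    print(q.get())                 # prints 'batch'
    p.join()

# Everything that launches worker processes must sit under this guard
# when the spawn start method is used (Windows, or newer Python on macOS).
if __name__ == '__main__':
    main()
```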