kang205 / SASRec
SASRec: Self-Attentive Sequential Recommendation
License: Apache License 2.0
I modified the code based on https://github.com/graytowne/caser_pytorch and ran it on the ML-1M dataset, but the results fall short of the paper's.
The summary of my runs is as follows:
Hit@10 = 0.622, NDCG@10 = 0.481
These metrics are much lower than those reported in the paper:
Hit@10 = 0.7886, NDCG@10 = 0.5538
I wonder whether the gap is due to some oversight on my part. Could you please share your code? Looking forward to your reply. Thanks!
I find that SASRec performs worse than BPR-MF on some sparse datasets such as Amazon Beauty, Amazon Games, and Steam. Is that normal? I used the BPR-MF baseline from https://github.com/duxy-me/ConvNCF.
In the evaluate function, an item_index list is built and test[u][0] is put into it.
As I understand it, test[u][0] is the item we want to predict, but this way the model knows it only has to rank among these candidates, which include the ground-truth item.
Is this a kind of data leakage, or did I misunderstand something?
Selecting the last time step of the sequence makes no sense when the sequence doesn't reach maxlen. Is this a bug that could hurt test performance?
line 199 of modules.py : outputs *= query_masks # broadcasting. (N, T_q, C)
I wonder: should it be outputs = query_masks # broadcasting. (N, T_q, C) instead?
Line 185 in 641c378
Hi Guys,
I'm reading the code while porting the implementation to PyTorch for personal use. It looks well written and documented, thanks for the great work :)
However, since the self-attention module is borrowed from another project, some details may not be 100% right by my reading (besides magic numbers like -2^32+1 for forcing softmax to output 0 for an entry, which hurt readability). For example, the query and key mask generation uses a tf.sign + tf.abs + tf.reduce_sum combination, but the order looks slightly wrong. Since we want to mask queries/keys that are all zero along the channel/embedding dimension, the right order would be: first apply abs, then reduce_sum, and finally sign. The current implementation instead applies reduce_sum first, then abs, then sign. The two approaches give the same result in most cases, since a high-dimensional fp32 vector summing to exactly zero is unlikely, but it is still wrong and can produce incorrect masks in corner cases.
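To illustrate, here is a quick numpy check of the two orderings (a toy example I made up, not code from the repo):

```python
import numpy as np

# Toy keys: batch of 2 sequences, T_k = 2 timesteps, C = 2 channels.
keys = np.array([
    [[0.5, -0.5], [0.0, 0.0]],   # second step is all-zero padding
    [[1.0, -1.0], [2.0,  3.0]],  # first step sums to zero but is NOT padding
], dtype=np.float32)

# current order in modules.py: sign(abs(reduce_sum(keys)))
mask_current = np.sign(np.abs(keys.sum(axis=-1)))
# proposed order: sign(reduce_sum(abs(keys)))
mask_proposed = np.sign(np.abs(keys).sum(axis=-1))

print(mask_current)   # [[0. 0.] [0. 1.]]  -- non-padding steps wrongly masked
print(mask_proposed)  # [[1. 0.] [1. 1.]]  -- only the true padding is masked
```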
I just want to check the assumption stated above; please respond if you happen to have time @kang205 @JiachengLi1995, thanks!
Regards,
Zan
Hi Team, @kang205
I have prepared a PyTorch version of SASRec based on your TF implementation, and it behaves almost the same:
https://github.com/pmixer/SASRec.pytorch
Could you please consider adding a "Third-party Re-implementation" section, as in https://github.com/wy1iu/LargeMargin_Softmax_Loss, to let people know about it?
I hope the PyTorch implementation proves useful to a wider audience, and I would appreciate help figuring out why it converges a bit more slowly than the TF implementation during training 🤣
Regards,
Zan
"key_masks = tf.sign(tf.abs(tf.reduce_sum(keys, axis=-1))) # (N, T_k)" at line 185 of modules.py makes key_masks depend on the key embeddings. Why not use the usual "key_masks = tf.sequence_mask(keys_length, tf.shape(keys)[1]) # (N, T_k)" instead?
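For what it's worth, a length-based mask (what tf.sequence_mask would produce, sketched here in numpy; shapes and lengths are made up) agrees with the sign-based mask whenever the padding positions hold exactly-zero embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
T_k = 4
keys_length = np.array([2, 3])                      # valid length per sequence
emb = rng.standard_normal((2, T_k, 8)).astype(np.float32)
for n, L in enumerate(keys_length):                 # zero out padding positions
    emb[n, L:] = 0.0

# length-based mask, like tf.sequence_mask(keys_length, T_k)
mask_len = (np.arange(T_k)[None, :] < keys_length[:, None]).astype(np.float32)
# embedding-based mask, like sign(abs(reduce_sum(keys, -1)))
mask_emb = np.sign(np.abs(emb.sum(axis=-1)))

print(np.array_equal(mask_len, mask_emb))           # True for this example
```

The two only diverge when a valid position's embedding sums to exactly zero, or when padding embeddings are nonzero.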
Can you save the model weights (a checkpoint) after some number of epochs?
Hello!
I am currently working on CTR and sequential recommendation tasks. Many recent papers, such as TALLRec, use AUC to compare their results with SASRec. I am really curious how I can calculate AUC for SASRec; as I understand it, SASRec is evaluated with metrics like NDCG@k and HitRate@k. It would be really helpful if you could shed some light on this.
Thanks and Regards
Millennium Bismay
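As a follow-up to the question above: if one sticks to the sampled-candidate protocol SASRec uses, a per-user AUC can be read off the same scores as the fraction of sampled negatives that the ground-truth item outscores. A minimal sketch (my own helper, not part of the repo):

```python
import numpy as np

def sampled_auc(pos_score, neg_scores):
    """AUC over one positive and sampled negatives: the fraction of
    negatives the positive outscores (ties count as 0.5)."""
    neg_scores = np.asarray(neg_scores, dtype=np.float64)
    wins = np.sum(pos_score > neg_scores)
    ties = np.sum(pos_score == neg_scores)
    return (wins + 0.5 * ties) / len(neg_scores)

# Example: positive scored 0.9 against 4 sampled negatives
print(sampled_auc(0.9, [0.1, 0.5, 0.95, 0.9]))  # (2 + 0.5) / 4 = 0.625
```

Averaging this over users gives a sampled approximation of AUC; the exact AUC would require scoring all items.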
Hi, my name is Mauro, and I am a computer science student. I am trying to run your SASRec code, but I am having trouble with the TensorFlow versions. Would it be possible for you to run it on a lighter version of the Amazon sports reviews dataset (which I can provide; it is only 60 MB)? Your help would be really precious to me. If you are willing to help, you can e-mail me at [email protected]
Could you tell me the details of GRU4Rec?
For example, is the GRU4Rec model trained with BPTT or normal BP?
Additionally, could you share the code of the baselines?
As the title says.
Thanks for your outstanding work, but I still have some questions about the data.
When I open the Amazon dataset website, I find the data was updated in 2018. If I run your preprocessing code on the new 5-core Beauty dataset, only a few thousand records are kept, far fewer than the preprocessed data in your repository.
I need to re-preprocess the original data because my model needs timestamp data, not only the interaction order.
If you could provide the original Amazon data and the preprocessing code (for Amazon Games), I would really appreciate it!
# Self-attention
self.seq = multihead_attention(queries=normalize(self.seq),
keys=self.seq,
num_units=args.hidden_units,
num_heads=args.num_heads,
dropout_rate=args.dropout_rate,
is_training=self.is_training,
causality=True,
scope="self_attention")
https://github.com/kang205/SASRec/blob/e3738967fddab206d6eeb4fda433e7a7034dd8b1/model.py#L54
Thank you!
Hi, thanks for your excellent work.
When I run the code, the following error occurred:
E tensorflow/stream_executor/cuda/cuda_blas.cc:652] failed to run cuBLAS routine cublasGemmBatchedEx: CUBLAS_STATUS_NOT_SUPPORTED
2021-07-22 23:05:32.120816: E tensorflow/stream_executor/cuda/cuda_blas.cc:2574] Internal: failed BLAS call, see log for details
tensorflow version = 1.12.0
python version=2.7.18
Looking forward to your reply!
Hello.
Your data preprocessing code seems to cover only the Beauty reviews.
Could you also provide the code for preprocessing the ML-1M data?
Thank you.
In your code, num_batch is calculated as:
num_batch = len(user_train) / args.batch_size
But on Python 3 this causes an error in
for step in tqdm(range(num_batch), total=num_batch, ncols=70, leave=False, unit='b'):
because / always returns a float:
Traceback (most recent call last):
File "main.py", line 61, in <module>
for step in tqdm(range(num_batch), total=num_batch, ncols=70, leave=False, unit='b'):
TypeError: 'float' object cannot be interpreted as an integer
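A one-line fix is floor division, which keeps num_batch an integer (sketched with stand-in names, since user_train and args are not defined here; use math.ceil instead if the last partial batch should be kept):

```python
users = list(range(6040))             # stand-in for user_train (ML-1M size)
batch_size = 128                      # stand-in for args.batch_size

num_batch = len(users) // batch_size  # floor division yields an int
for step in range(num_batch):         # no TypeError on Python 3
    pass
print(num_batch)                      # 47
```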
In the 'steam_reviews' dataset, the 'recommend' field of all samples is 'True'.
Could that be caused by a crawling bug during data collection?
Quote the paper:
For each user u, we randomly sample 100 negative items, and rank these items
with the ground-truth item. Based on the rankings of these 101 items, Hit@10 and NDCG@10 can be evaluated.
How can I set the number of negative items in the code?
It seems the (pos, neg) pairs are generated independently.
Line 28 in e373896
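For reference, a minimal sketch of the paper's protocol (my own code with a toy scorer; the real evaluation scores candidates with the trained model): sample 100 unseen negatives per user, rank the ground-truth item among the 101 candidates, and derive Hit@10 and NDCG@10 from its rank.

```python
import numpy as np

def eval_one_user(score, pos_item, n_items, rated, n_neg=100, k=10, seed=0):
    """Rank the ground-truth item against n_neg sampled unseen negatives."""
    rng = np.random.default_rng(seed)
    negs = []
    while len(negs) < n_neg:                  # rejection-sample negatives
        c = int(rng.integers(n_items))
        if c != pos_item and c not in rated:
            negs.append(c)
    candidates = [pos_item] + negs
    scores = np.array([score(i) for i in candidates])
    rank = int((scores > scores[0]).sum())    # rank of the positive, 0 = best
    hit = rank < k
    ndcg = 1.0 / np.log2(rank + 2) if hit else 0.0
    return hit, ndcg

# toy scorer that always prefers the ground-truth item 3
hit, ndcg = eval_one_user(lambda i: 1.0 if i == 3 else 0.0,
                          pos_item=3, n_items=1000, rated={3})
print(hit, ndcg)   # True 1.0
```

Changing n_neg is then just a parameter, rather than something tied to the (pos, neg) training pairs.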
Will it work on Windows?
Hi, Caser uses a softmax to get the interaction probability of each item, whereas in this paper's experiments you sample 100 negative items for testing. I would like to know how you handle this difference. Thank you very much.
For the multi-head attention module, why do you set num_heads=1 in the default args in main.py? Then it is not using the multi-head structure of the attention block, is it?
Thanks,
When I run the program I get the following error:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
To fix this issue I added
if __name=="__main__": main()
but because of the close function in the sampler, the program exits without training.
I am using Python 3.6.
Thanks a lot beforehand !
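For reference, a minimal multiprocessing pattern that avoids the bootstrapping error (the names here are illustrative, not from the repo). Note the guard needs double underscores on both sides: `__name` is a different, undefined name, so that guard raises a NameError instead of starting training.

```python
import multiprocessing as mp

def sampler(q):
    q.put('batch')                 # stand-in for the batch-sampler worker

def main():
    q = mp.Queue()
    p = mp.Process(target=sampler, args=(q,))
    p.start()
    print(q.get())                 # prints 'batch'
    p.join()

# Everything that launches worker processes must sit under this guard
# when the spawn start method is used (Windows, or newer Python on macOS).
if __name__ == '__main__':
    main()
```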