paxcema / kerasgru4rec

Keras implementation of GRU4Rec session-based recommender system

kerasgru4rec's Introduction

GRU4Rec in Keras

🚩 Important 🚩

There is now an official implementation of GRU4Rec in both TensorFlow and PyTorch. We recommend that users opt for these updated official implementations instead.

As noted in this paper, our implementation currently has some important drawbacks:

No support for BPR-max loss
No support for item embeddings (either separate or shared)
No support for negative sampling, which leads to poor scaling for large datasets

Ultimately, it has been shown that this implementation does not achieve the same performance as the original model (see Table 6 of the paper above for more details).

We intend to fix these in the near future, but we urge users to be cautious and to consider the official implementations linked above instead.

Summary

This repository offers an implementation of the "Session-based Recommendations With Recurrent Neural Networks" paper (https://arxiv.org/abs/1511.06939) using the Keras framework, tested with the TensorFlow backend.

A script is included that interprets the MovieLens 20M dataset as if each user's history were one anonymous session (spanning anywhere from months to years). Our implementation achieves results comparable to those obtained by the original Theano implementation offered by the GRU4Rec authors, both on this new domain and on one of the original datasets used in the paper: the 2015 RecSys Challenge dataset (RSC15).
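As a rough illustration of treating each user's history as a single session, the sketch below groups rating rows by user and orders them by time. The column order (`userId`, `movieId`, `timestamp`) mirrors the MovieLens ratings file; the function name `ratings_to_sessions` and the in-memory layout are hypothetical and are not taken from the repository's script.

```python
from collections import defaultdict


def ratings_to_sessions(rows):
    """Group (userId, movieId, timestamp) rows into one time-ordered session per user.

    Hypothetical helper for illustration; the repository's script may differ.
    """
    by_user = defaultdict(list)
    for user_id, movie_id, timestamp in rows:
        by_user[user_id].append((timestamp, movie_id))

    sessions = {}
    for user_id, events in by_user.items():
        events.sort()  # chronological order within the session
        sessions[user_id] = [movie_id for _, movie_id in events]
    return sessions


rows = [
    (1, 50, 1000),
    (2, 10, 900),
    (1, 20, 500),
    (1, 30, 1500),
    (2, 50, 950),
]
sessions = ratings_to_sessions(rows)
print(sessions)
```

With real data, such sessions can be very long, which is why the text above notes that they span months to years.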

Additionally, a script is provided for determining a dataset's Dwell Time information, as described in "Incorporating Dwell Time in Session-Based Recommendations with Recurrent Neural Networks" (http://ceur-ws.org/Vol-1922/paper11.pdf). Used with the RSC15 dataset, the augmentation results can be reproduced, although we have not been able to replicate the final reported performance metrics.
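The idea of dwell-time augmentation can be sketched as follows. This is a simplified illustration, not the repository's script: it assumes dwell time is the gap between consecutive event timestamps in a session, and that an item is repeated `dwell // threshold + 1` times. The function names, the threshold value, and the handling of the last event are all assumptions made for the example.

```python
def dwell_times(session):
    """Return the dwell time of each event in a time-sorted list of (item, timestamp).

    The last event has no successor, so its dwell time is unknown (None).
    """
    times = [next_ts - ts for (_, ts), (_, next_ts) in zip(session, session[1:])]
    if session:
        times.append(None)
    return times


def augment(session, threshold):
    """Repeat each item dwell // threshold + 1 times (assumed augmentation rule)."""
    augmented = []
    for (item, _), dwell in zip(session, dwell_times(session)):
        repeats = 1 if dwell is None else dwell // threshold + 1
        augmented.extend([item] * repeats)
    return augmented


session = [("a", 0), ("b", 30), ("c", 200), ("d", 210)]
print(dwell_times(session))
print(augment(session, threshold=75))
```

Items the user lingered on appear several times in the augmented session, which gives them more weight during training.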

Credit goes to yhs-968 for his parallel-batch data loader, as shown in his pyGRU4REC repository (https://github.com/yhs-968/pyGRU4REC).

Instructions

To train the RNN model from scratch:

python model/gru4rec.py --train-path path/to/train.csv --dev-path path/to/validation.csv --test-path path/to/test.csv --epoch n_epochs

To resume training from a checkpoint, add --resume path/to/model_weights.h5 to the previous command.

To run the Dwell Time augmentation process:

python preprocess/extractDwellTime.py --train-path path/to/train.csv --output-path path/to/augmented_train.csv

Future work includes incorporating dwell time into the model in an online manner, so that this information is leveraged during learning rather than in a separate preprocessing stage.

Changelog

[23/08/2020] Updated backend to TensorFlow 2.3.0

[04/09/2021] Updated backend to TensorFlow 2.6.0

[24/03/2023] Updated backend to TensorFlow 2.11.1

Requirements

The code has been tested with Python 3.6.8, using the following versions of the required dependencies:

numpy == 1.18.5
pandas == 1.0.5
tqdm == 4.41.1
tensorflow == 2.6.0
keras == 2.4.3
matplotlib == 3.3.1

kerasgru4rec's Issues

KerasGRU4Rec does not process on all possible events

Thanks for the Keras implementation of GRU4Rec.
I noticed some issues in your code:

The get_metric function does not use the mask variable to reset the state after each session, so the evaluation results may be wrong.
Issue with SessionDataLoader: since sessions vary in length, the remaining events in the last batch_size sessions are not processed once one of those sessions finishes. The while loop stops when maxiter >= len(click_offsets) - 1, but maxiter starts from max(batch_size), so not all events are processed. You can confirm this by comparing the total number of generated feat entries with (the total number of events - the total number of unique sessions).
This can affect the evaluation results when batch_size is large and the number of unique sessions is small. One potential fix is to use zero masking.
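The check suggested above can be reproduced with a simplified model of a session-parallel loop. The sketch below is not the repository's SessionDataLoader: it only counts how many (input, target) pairs such a loop would emit if it stops as soon as a slot frees up and no unseen session is left, and it assumes batch_size does not exceed the number of sessions. The function name and the example session lengths are made up for illustration.

```python
def count_pairs_session_parallel(session_lengths, batch_size):
    """Count (input, target) pairs emitted by a simplified session-parallel loop.

    The loop ends as soon as a free slot cannot be refilled with a new session,
    even if other slots still hold unprocessed events.
    """
    # click_offsets[i] is the index of the first event of session i
    click_offsets = [0]
    for length in session_lengths:
        click_offsets.append(click_offsets[-1] + length)

    n_sessions = len(session_lengths)
    maxiter = batch_size - 1  # index of the last session already assigned to a slot
    start = [click_offsets[s] for s in range(batch_size)]
    end = [click_offsets[s + 1] for s in range(batch_size)]

    emitted = 0
    finished = False
    while not finished:
        minlen = min(e - s for s, e in zip(start, end))
        emitted += (minlen - 1) * batch_size  # every slot yields minlen - 1 pairs
        start = [s + minlen - 1 for s in start]
        for slot in range(batch_size):
            if end[slot] - start[slot] <= 1:  # this slot's session is exhausted
                maxiter += 1
                if maxiter >= n_sessions:  # no session left to refill the slot
                    finished = True
                    break
                start[slot] = click_offsets[maxiter]
                end[slot] = click_offsets[maxiter + 1]
    return emitted


lengths = [2, 5, 3, 6]
emitted = count_pairs_session_parallel(lengths, batch_size=2)
expected = sum(lengths) - len(lengths)
print(emitted, expected)
```

In this toy case the loop emits 8 pairs while 12 are available, because the last session is cut short when the other slot runs out of sessions.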

RNN cell reset done one step too late ...

Hi there,
It seems your training generator produces the data so that, for each batch, you receive the reset mask at the beginning of the batch. To be more precise, here are some examples:

Mask: []
Rest: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
Feat: [ 0  4  9 12 14 16 18 19 20 29]
Trgt: [ 1  4 10 13 15 17 18 19 20 29]

Mask: [3 4 5 6 9]
Rest: [0. 0. 0. 1. 1. 1. 1. 0. 0. 1.]
Feat: [ 1  4 10 30 32 36 34 19 20 32]
Trgt: [ 2  5 11 31 33 36 35 19 21 32]

As you can see from the generated data, the model has to reset the RNN cell states before the forward/backward passes done in train_on_batch; otherwise the reset happens after the last step of the current session has already been "connected" to the first step of the next session. However, your code resets the RNN cells after train_on_batch.
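The effect of the reset order can be shown with a toy recurrence standing in for the GRU layer. Everything in this sketch is hypothetical (the update rule `h <- 0.5 * h + x`, the function names, and the two-slot batches); it only demonstrates why the masked slots need to be cleared before the step that consumes the first item of a new session.

```python
def step(hidden, inputs):
    """Toy recurrent update standing in for a GRU layer: h <- 0.5 * h + x."""
    return [0.5 * h + x for h, x in zip(hidden, inputs)]


def run(batches, reset_before):
    """Process (mask, inputs) batches; mask lists slots whose session starts in that batch."""
    hidden = [0.0] * len(batches[0][1])
    outputs = []
    for mask, inputs in batches:
        if reset_before:
            for slot in mask:
                hidden[slot] = 0.0  # clear state before the new session's first step
        hidden = step(hidden, inputs)
        outputs.append(list(hidden))
        if not reset_before:
            for slot in mask:
                hidden[slot] = 0.0  # too late: the old state was already mixed in
    return outputs


batches = [
    ([], [4.0, 2.0]),
    ([1], [1.0, 8.0]),  # slot 1 starts a new session in this batch
]
print(run(batches, reset_before=True)[1])
print(run(batches, reset_before=False)[1])
```

With the late reset, slot 1's output in the second batch still carries the previous session's state, and the reset then discards the state of the new session's first step as well.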

RECALL and MRR very low

I've tested the gru4rec implementation on 150K fashion click sequences.
I got very low MRR and Recall (0.001698 Recall and 0.000509 MRR).
batch size = 10 (because my sequence lengths are small)

Any advice would help.
Thanks
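When numbers look this low, it can help to recompute the metrics independently on a handful of predictions. The sketch below is one common way to define Recall@K and MRR@K from raw scores; it is an illustration written for this thread, not the repository's get_metric function, and the function name and example scores are made up. For reference, a uniformly random ranking has an expected Recall@K of K divided by the number of items.

```python
def recall_and_mrr_at_k(score_rows, targets, k=20):
    """Compute Recall@K and MRR@K from per-example item scores and target item indices."""
    hits = 0
    reciprocal_rank_sum = 0.0
    for scores, target in zip(score_rows, targets):
        # rank = 1 + number of items scored strictly higher than the target
        rank = 1 + sum(1 for s in scores if s > scores[target])
        if rank <= k:
            hits += 1
            reciprocal_rank_sum += 1.0 / rank
    n = len(targets)
    return hits / n, reciprocal_rank_sum / n


score_rows = [
    [0.1, 0.7, 0.2, 0.0],  # target 1 is ranked first
    [0.4, 0.3, 0.2, 0.1],  # target 1 is ranked second
    [0.4, 0.3, 0.2, 0.1],  # target 3 is ranked fourth, outside the top 2
]
targets = [1, 1, 3]
recall, mrr = recall_and_mrr_at_k(score_rows, targets, k=2)
print(recall, mrr)
```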

gru4rec Real-Time prediction

Thanks for the gru4rec implementation with Keras.
What is the best approach to real-time prediction?

Once the model is ready for real-time prediction and the sequence lengths differ, how can we feed the whole sequence as one input, as opposed to feeding the items one by one?

Any answer will help
Thanks
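One way to think about this question: for a recurrent model, feeding items one at a time while carrying the hidden state gives the same final state as processing the whole sequence at once. The sketch below shows this with a toy recurrence rather than the actual Keras model; the `step` rule and the `SessionStates` class are hypothetical and only illustrate keeping one hidden state per live session.

```python
def step(hidden, item):
    """Toy recurrent update standing in for a GRU cell (not the real model)."""
    return 0.5 * hidden + item


def encode_sequence(items):
    """Process a whole session at once, starting from a zero state."""
    hidden = 0.0
    for item in items:
        hidden = step(hidden, item)
    return hidden


class SessionStates:
    """Keeps one hidden state per live session, so each new click costs one step."""

    def __init__(self):
        self._hidden = {}

    def update(self, session_id, item):
        hidden = step(self._hidden.get(session_id, 0.0), item)
        self._hidden[session_id] = hidden
        return hidden


states = SessionStates()
clicks = [3.0, 1.0, 4.0]
for item in clicks:
    online = states.update("s1", item)
offline = encode_sequence(clicks)
print(online, offline)
```

Under this view, sessions of different lengths need no padding at serving time, since each session simply advances its own state by one step per click.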

License

Hey,
I saw that there is no license in the repo.
Could you please explain the terms for using your code? What is permitted?
Thanks in advance

Set different weights according to a sequence type

Thanks for implementing GRU4Rec in Keras.

I have a more theoretical question. In the collaborative filtering approach, we can compute an overall rank for an item based on different interactions such as view, add to cart, and purchase.
Is it possible to set different weights for different sequence types in this approach as well?

Thanks in advance