aaronheee / recformer

Replication of the paper "Text Is All You Need: Learning Language Representations for Sequential Recommendation" at KDD'23.

Python 99.04% Shell 0.96%


recformer's Issues

Regarding the issue with reproduction

Hello. I have encountered an issue with reproducing the results. I replicated two methods, Recformer and UniSRec, using default parameters. The results for Recformer seem reasonable, but the performance of the reproduced UniSRec is significantly better than reported in the paper. What could be the reason for this?

the process of dataset

(text) D:\download\RecFormer-main\RecFormer-main\pretrain_data>python meta_data_process.py
Check meta asins: 25%|███████████▊ | 2/8 [01:18<03:55, 39.31s/it]
Traceback (most recent call last):
File "D:\download\RecFormer-main\RecFormer-main\pretrain_data\meta_data_process.py", line 53, in
if line['asin'] is not None and line['title'] is not None:
KeyError: 'title'

Running meta_data_process.py fails with the error above. May I ask why this happens?
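
One possible cause, offered as an assumption rather than a confirmed diagnosis: some records in the metadata file may have no 'title' key at all, so indexing with line['title'] raises KeyError before the "is not None" check can run. Below is a minimal sketch of a tolerant check using dict.get. The helper name has_asin_and_title and the sample records are hypothetical and not taken from the repository.

```python
def has_asin_and_title(line: dict) -> bool:
    """Return True only if both fields exist and are not None."""
    # dict.get returns None for a missing key instead of raising KeyError.
    return line.get('asin') is not None and line.get('title') is not None


# Hypothetical records: the second has no 'title' key, the third has a None title.
records = [
    {'asin': 'A1', 'title': 'Example item'},
    {'asin': 'A2'},
    {'asin': 'A3', 'title': None},
]
kept = [r['asin'] for r in records if has_asin_and_title(r)]
print(kept)
```

With this check, records that lack a title are skipped rather than stopping the whole run.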

Difference between RecformerModel and RecformerForSeqRec

Firstly, congratulations on the publication! The idea presented is interesting, and the paper is very well written.

I'm a bit confused about the distinction between RecformerModel and RecformerForSeqRec.

I understand that you used RecformerForSeqRec to derive the results in the paper for sequential recommendations. However, I'm curious about the purpose of RecformerModel. Is it essentially the model allenai/longformer-base-4096, but with variations in token type embedding and item position embedding?

Thanks!

Candidates for ranking

The paper says, "We rank the ground-truth item of each sequence among all items for evaluation."
Do the candidates include items in all train, val, and test sequences?

AttributeError: 'NoneType' object has no attribute 'encode_item'

Overview

When I followed the steps described in the Training section on the README, I stumbled upon an issue and couldn't move forward.

Note

I downloaded the processed data, stored it in the pretrain_data folder, and then ran python save_longformer_ckpt.py.
The next command, bash lightning_run.sh, gives me the following error.

I am using the default settings, including the training strategy deepspeed_stage_2, so not running python zero_to_fp32.py . pytorch_model.bin could be the culprit (although I don't see it created automatically in the checkpoint folder).
Thanks for your guidance.

Error Detail

zero_to_fp32.py

Traceback (most recent call last):
  File "./anaconda3/lib/python3.11/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "./RecFormer-main/lightning_pretrain.py", line 46, in _par_tokenize_doc
    input_ids, token_type_ids = tokenizer_glb.encode_item(item_attr)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'encode_item'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./RecFormer-main/lightning_pretrain.py", line 156, in <module>
    main()
  File "./RecFormer-main/lightning_pretrain.py", line 102, in main
    doc_tuples = list(tqdm(pool_func, total=len(item_attrs), ncols=100, desc=f'[Tokenize] {path_corpus}'))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "./anaconda3/lib/python3.11/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "./anaconda3/lib/python3.11/multiprocessing/pool.py", line 873, in next
    raise value
AttributeError: 'NoneType' object has no attribute 'encode_item'

Question about calculating rank (line 95 in utils.py)

The rank is calculated as "rank = (predicts < scores).sum(-1).float()" in line 95 of utils.py.
Should "<" be replaced with "<=" to avoid the rank being 0 when all scores are the same value?
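
To make the tie case concrete, here is a small pure-Python sketch of the two comparisons. It assumes that predicts holds the score of the ground-truth item and scores holds the scores of all candidates. That reading of the variables, and the function names, are assumptions.

```python
def rank_strict(target_score, scores):
    # Counts candidates scoring strictly higher than the target (0-based rank).
    return sum(target_score < s for s in scores)


def rank_inclusive(target_score, scores):
    # Counts candidates scoring at least as high, including the target itself.
    return sum(target_score <= s for s in scores)


tied = [0.5, 0.5, 0.5]
print(rank_strict(0.5, tied), rank_inclusive(0.5, tied))

distinct = [0.1, 0.9, 0.4]
print(rank_strict(0.4, distinct), rank_inclusive(0.4, distinct))
```

Under these assumptions, "<" resolves ties optimistically (every tied item gets the best rank, 0), while "<=" resolves them pessimistically and also shifts the rank to start at 1, so any metric code that expects a 0-based rank would need adjusting as well.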
