wxl1999 / UniCRS
[KDD22] Official PyTorch implementation for "Towards Unified Conversational Recommender Systems via Knowledge-Enhanced Prompt Learning".
License: MIT License
Hi, thank you for sharing the code.
I think the DBpedia link in the README has changed.
Could you update it to https://databus.dbpedia.org/dbpedia/mappings/mappingbased-objects/2021.09.01/mappingbased-objects_lang=en.ttl.bzip2?
I ran your code successfully. While the recommendation results are almost the same, the conversational results are much better than those reported in your paper, even though I did not change the source code. Could you tell me why?
The following are the results on my machine:
'test/dist@2': 0.5503033723719486, 'test/dist@3': 0.9362212501763792, 'test/dist@4': 1.211090729504727
The following is the result in your paper.
When trying to reproduce the method, I ran into two main challenges:
Hoping for an update~
Hi Xiaolei, thank you for your work. I'm interested in various works involving knowledge graphs.
Anyway, I have a question: what is the meaning of the pooling in the pre-training step?
$$ h_{u} = \mathrm{Pooling}\big[ f(\tilde{C}_{pre} \mid \Theta_{plm}; \Theta_{fuse}) \big] $$ I cannot understand the meaning of Pooling in this equation, and I cannot find the corresponding code on GitHub.
I would be very grateful for any help.
Best regards.
Yongtaek
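As an aside (not confirmed against the UniCRS code, which should be checked directly): "pooling" in equations like this usually means collapsing the per-token outputs of the fused encoder into a single user vector, e.g. mean pooling over non-padded tokens or taking one token's hidden state. A mean-pooling sketch, purely as an assumed illustration:

```python
import numpy as np

def mean_pool(hidden_states, attention_mask):
    """Average per-token encoder outputs into one vector per example,
    ignoring padded positions (one common meaning of 'Pooling').
    hidden_states: (batch, seq_len, dim); attention_mask: (batch, seq_len) of 0/1."""
    mask = attention_mask[..., None].astype(float)   # (batch, seq_len, 1)
    summed = (hidden_states * mask).sum(axis=1)      # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)   # (batch, 1)
    return summed / counts

hidden = np.ones((2, 5, 8))                          # toy encoder outputs
mask = np.array([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])
print(mean_pool(hidden, mask).shape)  # (2, 8)
```

Whether UniCRS uses mean pooling, a special-token representation, or something else would have to be verified in the repository.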
Hi, thanks for sharing the code.
While running the code, I found that in src/utils.py the output shape is modified when fp16 mixed-precision training is enabled:
https://github.com/wxl1999/UniCRS/blob/main/src/utils.py#L44
Because of this, the code throws an error when fp16 training is enabled. What is the line t = t // 8 * 8 for?
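For context (an assumption, not an answer from the author): t // 8 * 8 rounds a length down to the nearest multiple of 8, and sizing dimensions to multiples of 8 is a common trick to enable fast Tensor Core kernels under fp16. Rounding down truncates rather than pads, which may be what causes the shape mismatch. A tiny illustration:

```python
def round_down_to_multiple(t, k=8):
    # What t = t // 8 * 8 does: floor t to the nearest multiple of k.
    return t // k * k

def round_up_to_multiple(t, k=8):
    # Rounding up instead would pad rather than truncate.
    return -(-t // k) * k

print(round_down_to_multiple(203))  # 200
print(round_up_to_multiple(203))    # 208
```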
Hey Xiaolei. First of all, thanks for your work! I have successfully run your code from GitHub, but I have a few questions about the preprocessing code.
I got very high recall@k scores with the pre-trained prompt model:
{'test/recall@1': 0.5659283956497578, 'test/recall@10': 0.9083115027387473, 'test/recall@50': 0.9323648487735176, 'test/ndcg@1': 0.5659283956497578, 'test/ndcg@10': 0.757117269984917, 'test/ndcg@50': 0.7626062380924029, 'test/mrr@1': 0.5659283956497578, 'test/mrr@10': 0.7063158638961657, 'test/mrr@50': 0.7075750767791935, 'test/loss': 2.209772330341917, 'epoch': 4}
Is this because the preprocessed file from process.py (test_data_processed.jsonl) contains both user and system responses, so the recall@k is not accurate?
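For reference, recall@k with a single ground-truth item per example is typically just whether the target appears in the model's top-k ranked items, averaged over the test set; if the ground-truth response (which mentions the item) leaks into the context, this metric would indeed be inflated. A minimal sketch of the usual definition:

```python
def recall_at_k(ranked_lists, targets, k):
    """Fraction of examples whose target item appears in the top-k
    of the ranked item list (single ground truth per example)."""
    hits = sum(1 for ranked, t in zip(ranked_lists, targets) if t in ranked[:k])
    return hits / len(targets)

ranked = [[3, 1, 7], [5, 2, 9], [4, 6, 8]]
targets = [1, 9, 8]
print(recall_at_k(ranked, targets, 2))  # only the first example hits: 1/3
```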
I wonder if my understanding of the three preprocessing scripts is correct:
(1) process.py extracts user and system responses and their contexts for semantic fusion.
(2) process_mask.py extracts system responses for the conversation prompt.
(3) merge.py merges the templates generated by the conversation prompt model with the items for the recommendation prompt.
I would be very grateful for any help.
Best regards.
siqi
Dear Author,
I am trying to reproduce the recommendation performance on the INSPIRED dataset.
I used the hyperparameters you recommend and the "best" checkpoint as the prompt encoder. Unfortunately, I was not able to reproduce the performance reported in the paper.
---- Here I attach the loss and recall@1 on the test set for the prompt pre-training, conversational training, and recommendation training steps:
prompt pre-training
recommendation training (as you can see, the best recall@1 I got is around 0.04, far from the 0.09 in the paper)
---- and here are the configurations I used for the prompt pre-training, conversational training, and recommendation training steps:
python3 train_pre.py \
--dataset inspired \
--tokenizer microsoft/DialoGPT-small \
--model microsoft/DialoGPT-small \
--text_tokenizer roberta-base \
--text_encoder roberta-base \
--num_train_epochs 5 \
--gradient_accumulation_steps 1 \
--per_device_train_batch_size 64 \
--per_device_eval_batch_size 128 \
--num_warmup_steps 168 \
--max_length 200 \
--prompt_max_length 200 \
--entity_max_length 32 \
--learning_rate 6e-4 \
--output_dir UniCRS/src/result_promptpretraining_inspired \
--use_wandb \
--project crs-prompt-pre-inspired \
--name exp1 \
--gpu 0
prompt pre-training
python3 train_conv.py \
--dataset inspired \
--tokenizer microsoft/DialoGPT-small \
--model microsoft/DialoGPT-small \
--text_tokenizer roberta-base \
--text_encoder roberta-base \
--n_prefix_conv 20 \
--prompt_encoder UniCRS/src/result_promptpretraining_inspired/best/ \
--num_train_epochs 10 \
--gradient_accumulation_steps 1 \
--ignore_pad_token_for_loss \
--per_device_train_batch_size 8 \
--per_device_eval_batch_size 16 \
--num_warmup_steps 976 \
--context_max_length 200 \
--resp_max_length 183 \
--prompt_max_length 200 \
--entity_max_length 32 \
--learning_rate 1e-4 \
--output_dir UniCRS/src/result_convprompt_inspired \
--use_wandb \
--project crs-prompt-conv-inspired \
--name exp1 \
--gpu 0
conv training
python3 infer_conv.py \
--dataset inspired \
--split test \
--tokenizer microsoft/DialoGPT-small \
--model microsoft/DialoGPT-small \
--text_tokenizer roberta-base \
--text_encoder roberta-base \
--n_prefix_conv 20 \
--prompt_encoder UniCRS/src/result_convprompt_inspired/best \
--per_device_eval_batch_size 64 \
--context_max_length 200 \
--resp_max_length 183 \
--prompt_max_length 200 \
--entity_max_length 32 \
--gpu 1
conv infer
python3 train_rec.py \
--dataset inspired_gen \
--tokenizer microsoft/DialoGPT-small \
--model microsoft/DialoGPT-small \
--text_tokenizer roberta-base \
--text_encoder roberta-base \
--n_prefix_rec 10 \
--prompt_encoder UniCRS/src/result_promptpretraining_inspired/best \
--num_train_epochs 5 \
--per_device_train_batch_size 64 \
--per_device_eval_batch_size 64 \
--gradient_accumulation_steps 1 \
--num_warmup_steps 33 \
--context_max_length 200 \
--prompt_max_length 200 \
--entity_max_length 32 \
--learning_rate 1e-4 \
--output_dir UniCRS/src/result_rec_inspired \
--use_wandb \
--project crs-prompt-rec-inspired \
--name exp1 \
--gpu 0
rec training
Thank you!
I trained according to the code provided on GitHub, but since the dataset link you provided cannot be opened, I used the mappingbased-objects_lang=en_202112.ttl dump instead. My final training results are as follows:
conv:
'test/dist@2': 0.310709750246931, 'test/dist@3': 0.49851841399746016, 'test/dist@4': 0.6383519119514605
rec:
'test/recall@1': 0.029324894514767934, 'test/recall@10': 0.16729957805907172, 'test/recall@50': 0.37953586497890296
(1) These results differ greatly from those presented in the paper. Could you give me some guidance? I hope to reproduce results similar to yours. Thank you very much.
(2) According to your paper, do I need to set --n_prefix_conv 50 in train_conv.py and --use_resp in train_rec.py?
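As an aside on the dist@n numbers quoted in this thread (an observation, not from the repo): one common corpus-level definition is unique n-grams divided by total generated n-grams, which is bounded by 1. Since some values above exceed 1 (e.g. dist@4 around 1.21), the repo's evaluation presumably normalizes differently (e.g. by the number of responses), so the exact evaluation code should be checked before comparing numbers. A sketch of the ratio variant:

```python
def dist_n(responses, n):
    """Corpus-level distinct-n: unique n-grams / total n-grams over all
    generated responses (one common definition; values > 1 in this thread
    suggest UniCRS normalizes by the number of responses instead)."""
    total, unique = 0, set()
    for resp in responses:
        tokens = resp.split()
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0

print(dist_n(["i like this movie", "i like that movie a lot"], 2))  # 7 unique / 8 total = 0.875
```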