pochih / rl-chatbot

417 stars · 23 watchers · 140 forks · 52.68 MB

🤖 Deep Reinforcement Learning Chatbot

License: MIT License

Shell 1.05% Python 98.95%
deep-learning reinforcement-learning seq2seq-model chatbot tensorflow nlp

rl-chatbot's People

Contributors

pochih

rl-chatbot's Issues

Chatbot is learning poorly

I tried training the seq2seq model from scratch using Python 3, but when I tested it after running the first 10 epochs, all the responses were just ".". Will this improve if I train it more?

=== Use model ./model/Seq2Seq/model-1 ===

question => Have you heard about 'machine learning and having it deep and structured'?
generated_sentence => .
question => Machine learning.
generated_sentence => .
question => I don't know. Maybe we should watch the tape to be sure.
generated_sentence => .
question => Listen man, I don't need this shit.
generated_sentence => .
question => Will you stand up for me?
generated_sentence => .
question => How do you trun this on?
generated_sentence => .
question => Thank God it's Friday!
generated_sentence => .
question => I'm sure a lot of people down in L.A. are worried sick about you.
generated_sentence => .
question => I forgot to get the Coca-Cola.
generated_sentence => .
question => How about you graduation thesis?
generated_sentence => .

Unable to load pretrained model

Use default model

Traceback (most recent call last):
  File "python/test.py", line 138, in <module>
    test()
  File "python/test.py", line 72, in test
    saver.restore(sess, default_model_path)
  File "/home/kamalraj/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1428, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/home/kamalraj/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 767, in run
    run_metadata_ptr)
  File "/home/kamalraj/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 965, in _run
    feed_dict_string, options, run_metadata)
  File "/home/kamalraj/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1015, in _do_run
    target_list, options, run_metadata)
  File "/home/kamalraj/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1035, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [7634,1000] rhs shape= [6851,1000]
  [[Node: save/Assign_4 = Assign[T=DT_FLOAT, _class=["loc:@wemb"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](Wemb, save/RestoreV2_4)]]
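
For what it's worth, a shape mismatch on Wemb ([7634, 1000] in the graph vs [6851, 1000] in the checkpoint) usually means the locally built vocabulary has a different size from the one the pretrained checkpoint was saved with. A minimal diagnostic sketch, assuming TensorFlow 1.x and that the model graph has already been built (the checkpoint path is just an example):

    import tensorflow as tf

    # Hypothetical sketch: compare graph variable shapes with the checkpoint
    # before calling saver.restore(); a mismatch on Wemb points at a different
    # vocabulary size than the one used to train the checkpoint.
    ckpt_path = "./model/Seq2Seq/model-1"  # example path
    reader = tf.train.NewCheckpointReader(ckpt_path)
    ckpt_shapes = reader.get_variable_to_shape_map()

    for var in tf.global_variables():
        name = var.name.split(":")[0]
        if name in ckpt_shapes and ckpt_shapes[name] != var.get_shape().as_list():
            print("mismatch:", name, var.get_shape().as_list(), "vs", ckpt_shapes[name])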

wrong output

Hey, thanks for the code. I used your pre-trained model, but I'm getting absurd responses. Should I train it for more epochs?
A: Have you heard about 'machine learning and having it deep and structured'?
B: Woods path woods bee sir takes nowhere thompson.
A: Reminds 'a name's working rebels secrets guts procedure fairy missus pain warned procedure ignorant troops wrap famous pain dna wheel troops.
B: Tongue hooked ignorant tongue tongue manner ignorant positively pain warned break real travel sir.
A: Pain las tricks putting ears shack warm ignorant positively arrives brand woods knew jersey domino wrap.
B: Reminds whores pain warned river rio brand.
A: Mob longer. Procedure woody luther wasted tricks specialty assumed window pain scientist.
B: Wigand much shift mud traitor woody stab brand submit submit touches tanks department woody feed pain middle wrap ignorant unlikely.
A: House real policeman term.
B: Mob slaves knee. Meters bat assumed martin woods tongue pit.
A: Jesse electrical reports domino real.
B: Redi whores real. Redi whores real. Desmond real.

pip install -r requirements.txt is failing - Could not find a version that satisfies the requirement tensorflow==1.0.1

I am getting an error when I try to run
pip install -r requirements.txt
It says:
Could not find a version that satisfies the requirement tensorflow==1.0.1 (from -r requirements.txt (line 1)) (from versions: 1.13.0rc1, 1.13.0rc2, 1.13.1, 1.13.2, 1.14.0rc0, 1.14.0rc1, 1.14.0, 1.15.0rc0, 2.0.0a0, 2.0.0b0, 2.0.0b1, 2.0.0rc0, 2.0.0rc1)
No matching distribution found for tensorflow==1.0.1 (from -r requirements.txt (line 1))
I have cloned the recent version of tensorflow. Since I have Windows 10, here is also my Python version:
Python 3.7.3 (default, Mar 27 2019, 17:13:21) [MSC v.1915 64 bit (AMD64)] :: Anaconda, Inc. on win32
Please let me know how to run pip install -r requirements.txt.
(Attachment: Issue.txt)

Retrained the seq2seq model, but test.py gives the following results

Hello, I retrained the seq2seq model and then ran test.py, but got the following results:
Concentration concentration concentration concentration planning priest.
Concentration concentration concentration concentration planning concentration planning priest.
Concentration concentration concentration concentration planning priest.
Concentration concentration concentration concentration planning concentration planning priest.
Concentration concentration concentration concentration planning priest.
Concentration concentration concentration concentration planning concentration planning priest.
Do you know what might be causing this?

Where is the RL environment?

Hi,

Thanks for this code repo.
I have one question: which environment are you using for RL?

Thanks
Mahesh

Procedure

Can you upload a file that explains the step-by-step procedure to run this?

A question about ease of answering

Thanks for your sharing.

For the ease-of-answering reward in RL, is the reward calculated by the RL model itself rather than by another model?

Why not feed the action into another pretrained model to obtain the response, and measure the likelihood it assigns to a dull response?
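
For context, in the "Deep Reinforcement Learning for Dialogue Generation" paper the ease-of-answering reward is the negative, length-normalized log-likelihood that a pretrained seq2seq model assigns to a fixed list of dull responses given the generated action. A rough sketch of that formula, where seq2seq_log_prob is a hypothetical function returning log P(target | source) under a pretrained model:

    # Hypothetical sketch of the ease-of-answering reward from Li et al. (2016).
    # seq2seq_log_prob(source, target) is assumed to return log P(target | source)
    # under a pretrained seq2seq model, not under the RL policy being trained.
    DULL_RESPONSES = ["I don't know what you're talking about.", "I don't know."]

    def ease_of_answering_reward(action, seq2seq_log_prob):
        total = 0.0
        for s in DULL_RESPONSES:
            total += seq2seq_log_prob(action, s) / len(s.split())  # length-normalized
        return -total / len(DULL_RESPONSES)  # higher when dull replies are unlikely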

How to use the saved_model API to store the model?

Hi, I wonder whether you have tried saving the chatbot with the saved_model API? Do you have any ideas?
Hello, I'd like to know whether you have tried using saved_model to save the model. I want to load the chatbot model in TensorFlow Serving, but it can only load model files saved in the saved_model format.
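
For reference, a TF 1.x checkpoint can usually be re-exported in the SavedModel format that TensorFlow Serving expects. A minimal sketch, assuming a TF 1.x version that provides tf.saved_model.simple_save; the tensor names below are placeholders, not the chatbot graph's actual ones:

    import tensorflow as tf

    # Hypothetical sketch: restore a TF 1.x checkpoint and export it as a
    # SavedModel for TensorFlow Serving. Input/output tensor names are assumed.
    with tf.Session() as sess:
        saver = tf.train.import_meta_graph("./model/Seq2Seq/model-1.meta")  # example path
        saver.restore(sess, "./model/Seq2Seq/model-1")
        graph = tf.get_default_graph()
        word_ids = graph.get_tensor_by_name("word_ids:0")    # assumed input tensor
        generated = graph.get_tensor_by_name("generated:0")  # assumed output tensor
        tf.saved_model.simple_save(sess, "./export/1",
                                   inputs={"word_ids": word_ids},
                                   outputs={"generated": generated})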

Have you encountered exploding rewards?

I only used ease of answering as the reward, but as training progresses this term keeps decreasing from about -2.x toward negative infinity. I didn't apply a sigmoid, but it's still strange, because the original author didn't add a sigmoid either.
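
Not this repo's approach, but a common stabilization trick when a reward term drifts like this is to normalize the rewards within each batch before the policy-gradient update, so the gradient scale stays bounded:

    import numpy as np

    # Hypothetical sketch of per-batch reward normalization (a generic trick,
    # not something count_rewards() in this repo does).
    def normalize_rewards(rewards, eps=1e-8):
        rewards = np.asarray(rewards, dtype=np.float32)
        return (rewards - rewards.mean()) / (rewards.std() + eps)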

How to train the reversed model for the RL model

You mentioned: "When training with policy gradient (pg), you may need a reversed model; the reversed model is also trained by cornell movie-dialogs dataset, but with source and target reversed."
Apart from downloading the pre-trained reversed model, could you please explain how to train it?

Thank you very much.
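
Going by the quoted description, training the reversed model should amount to running the same seq2seq training on the same corpus with every (source, target) pair swapped. A rough data-preparation sketch (the pair format is illustrative, not the repo's actual preprocessing):

    # Hypothetical sketch: build training pairs for the reversed model by
    # swapping source and target in each Cornell movie-dialogs exchange, so the
    # reversed model learns P(source | target).
    def reversed_pairs(pairs):
        """pairs: iterable of (source_utterance, target_utterance) tuples."""
        for source, target in pairs:
            yield target, source

    forward = [("How are you?", "Fine, thanks."), ("Fine, thanks.", "Good to hear.")]
    backward = list(reversed_pairs(forward))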

Why the sigmoid in count_rewards()

Hi,

In python/RL/train.py, after the ease-of-answering and semantic-coherence rewards are added up, the sigmoid of the total reward is scaled by 1.1:

    total_loss = sigmoid(total_loss) * 1.1

What is the purpose of the sigmoid and of the 1.1 scaling on line 261?

Also, I noticed you don't weight each reward by a lambda as in the "Deep Reinforcement Learning for Dialogue Generation" paper. Was this intentional?

Thanks!
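
For comparison, the paper combines its three rewards as a fixed weighted sum rather than a sigmoid; if I recall the reported weights correctly, they are 0.25, 0.25 and 0.5. A small sketch of that weighting:

    # Sketch of the lambda-weighted reward from Li et al. (2016): ease of
    # answering (r1), information flow (r2) and semantic coherence (r3).
    # The weights are the ones reported in the paper, to the best of my reading;
    # this repo instead computes sigmoid(total_loss) * 1.1.
    LAMBDA1, LAMBDA2, LAMBDA3 = 0.25, 0.25, 0.5

    def combined_reward(r1, r2, r3):
        return LAMBDA1 * r1 + LAMBDA2 * r2 + LAMBDA3 * r3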

RL training

Hi,
Thanks for having shared your implementation of the RL chatbot.
I may ask stupid questions since I am an expert neither in RL nor in NLP, so apologies in advance!
1- In python/RL/train.py
At line 307, saver.restore(sess, os.path.join(model_path, model_name)) seems to initialize the model weights with pretrained parameters, correct? Are these the weights of the seq2seq model trained in the usual supervised way? I can't find the 'model-55' you are using for this anywhere... Am I missing something?

2- In python/RL/rl_model.py
Why are there both build_model and build_generator? They seem to have the same setup but not the same output. Is this RL-specific?

3- In the paper
Also, the paper specifies that the reward is computed with a seq2seq model and not with the RL model. Is this taken into account in your code?

Thanks a lot for your answers!

Did you share the parameters of the LSTM between the encoder and decoder?

In the python/model.py script, in the decoding stage:

    with tf.variable_scope("LSTM1"):
        output1, state1 = self.lstm1(padding, state1)

    with tf.variable_scope("LSTM2"):
        output2, state2 = self.lstm2(tf.concat([current_embed, output1], 1), state2)

which is the same as in the encoding stage. Do you share the same parameters between the encoder and the decoder?
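
As background, in TF 1.x graph code like this, parameter sharing is determined by the cell objects and the variable-scope names: calling the same cell object under the same scope name (with reuse enabled on later calls) reuses one set of weights, while different scope names create independent variables. A tiny sketch of the mechanism, assuming a TF 1.x version where tf.nn.rnn_cell is available (not the repo's code):

    import tensorflow as tf

    # Hypothetical sketch: the same BasicLSTMCell called twice inside the same
    # variable scope ends up with a single set of kernel/bias variables.
    cell = tf.nn.rnn_cell.BasicLSTMCell(1000)
    x = tf.zeros([1, 1000])
    state = cell.zero_state(1, tf.float32)

    with tf.variable_scope("LSTM1"):
        out1, state = cell(x, state)              # creates LSTM1/... variables
    with tf.variable_scope("LSTM1", reuse=True):
        out2, state = cell(x, state)              # reuses the same variables

    print([v.name for v in tf.trainable_variables()])  # one LSTM1 kernel and bias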

TensorFlow 2.x

I would like to ask whether you have tried porting this to TensorFlow 2.x.
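
Not a real port, but TF 1.x graph code can often be run on a TensorFlow 2.x install through the v1 compatibility layer, provided nothing from tf.contrib is used:

    # Minimal sketch: run TF 1.x-style code under TensorFlow 2.x via the
    # compatibility shim (a stopgap, not an actual TF2 rewrite).
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()

    # From here on, tf.placeholder, tf.Session, tf.train.Saver, etc. behave as in TF 1.x.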

Dialogue History

How is the dialogue history encoded here? In the paper they say "The previous two dialogue turns are transformed to a vector representation by feeding the concatenation of them into an LSTM encoder model".

I'm not sure how to interpret this and I'm interested in how it's realized here.

Thanks
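
One common reading of that sentence, sketched below, is that the two previous utterances are simply concatenated into a single token sequence, run through the LSTM encoder, and the encoder's final state is used as the dialogue-state vector. This is an interpretation of the paper, not necessarily what this repo does:

    # Hypothetical sketch: concatenate the previous two turns into one token
    # sequence; the resulting ids would then be fed to the LSTM encoder, whose
    # final state conditions the decoder.
    def build_encoder_input(prev_turn, last_turn, word2idx, unk="<unk>"):
        tokens = (prev_turn + " " + last_turn).split()
        return [word2idx.get(t, word2idx[unk]) for t in tokens]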

unable to run ./scripts/train_RL.sh

Traceback (most recent call last):
  File "python/RL/train.py", line 470, in <module>
    train()
  File "python/RL/train.py", line 297, in train
    train_op, loss, input_tensors, inter_value = model.build_model()
  File "/home/ubuntu/AI_studio/Lakshmi/RL-Chatbot/python/RL/rl_model.py", line 92, in build_model
    train_op = tf.train.AdamOptimizer(self.lr).minimize(pg_loss)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 413, in minimize
    name=name)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 597, in apply_gradients
    self._create_slots(var_list)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/tensorflow/python/training/adam.py", line 131, in _create_slots
    self._zeros_slot(v, "m", self._name)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 1155, in _zeros_slot
    new_slot_variable = slot_creator.create_zeros_slot(var, op_name)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/tensorflow/python/training/slot_creator.py", line 190, in create_zeros_slot
    colocate_with_primary=colocate_with_primary)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/tensorflow/python/training/slot_creator.py", line 164, in create_slot_with_initializer
    dtype)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/tensorflow/python/training/slot_creator.py", line 74, in _create_slot_var
    validate_shape=validate_shape)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1496, in get_variable
    aggregation=aggregation)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1239, in get_variable
    aggregation=aggregation)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 562, in get_variable
    aggregation=aggregation)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 514, in _true_getter
    aggregation=aggregation)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 882, in _get_single_variable
    "reuse=tf.AUTO_REUSE in VarScope?" % name)
ValueError: Variable Wemb/Adam/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=tf.AUTO_REUSE in VarScope?

Please provide info on how to solve this error.
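
For what it's worth, the error message itself points at the usual cause: the variable scope is already in reuse mode when Adam tries to create its slot variables (Wemb/Adam). A hedged workaround sketch for python/RL/rl_model.py, using the tf.AUTO_REUSE flag the message mentions so get_variable may create variables that do not exist yet:

    # Hypothetical patch sketch around line 92 of rl_model.py: build the train
    # op in a scope that allows creating the optimizer's slot variables.
    with tf.variable_scope(tf.get_variable_scope(), reuse=tf.AUTO_REUSE):
        train_op = tf.train.AdamOptimizer(self.lr).minimize(pg_loss)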
