lukalabs / cakechat

CakeChat: Emotional Generative Dialog System

License: Apache License 2.0

Python 100.00%

conversational-ai conversational-agents conversational-bots dialogue-agents dialogue-systems dialog-systems nlp deep-learning seq2seq seq2seq-chatbot


cakechat's Issues

Unable to download or generate new model

I'm experiencing difficulties with downloading the ready model via tools/download_model.py. It would seem that the content from Amazon Web Services cannot be found; see the following from the stack trace:

[22.03.2018 20:40:47.562][WARNING][1108][cakechat.utils.s3.resolver.S3FileResolver][43] File can not be downloaded from AWS S3 because: An error occurred (404) when calling the HeadObject operation: Not Found

It then attempts to create a model which seems to complete successfully.

[22.03.2018 20:40:47.565][INFO][1108][cakechat.dialog_model.model][619] Can't find previously calculated model, so will use a fresh one
[22.03.2018 20:40:47.565][INFO][1108][cakechat.dialog_model.model][621] Model is built

However, no model is placed in the file path specified by the stack trace.
Here is the full output of running python tools/download_model.py.

Any insight into how to resolve this issue would be much appreciated, thank you!

(base) path: >python tools/download_model.py

PATH\Anaconda2\lib\site-packages\gensim\utils.py:855: UserWarning: detected Windows; aliasing chunkize to chunkize_serial
  warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")
[22.03.2018 20:37:32.302][INFO][1108][cakechat.tools/download_model.py][21] Fetching and pre-compiling pre-trained model...
[22.03.2018 20:37:32.331][INFO][1108][cakechat.dialog_model.model][598] Initializing NN model with the following params:
[22.03.2018 20:37:32.332][INFO][1108][cakechat.dialog_model.model][600] NN input dimension: 256 (token vector size)
[22.03.2018 20:37:32.332][INFO][1108][cakechat.dialog_model.model][601] NN hidden dimension: 512
[22.03.2018 20:37:32.332][INFO][1108][cakechat.dialog_model.model][602] NN output dimension: 39 (dict size)
[22.03.2018 20:37:35.569][INFO][1108][cakechat.dialog_model.model][407] Compiling predict function (log_prob=False)...
[22.03.2018 20:38:18.410][INFO][1108][cakechat.dialog_model.model][434] Compiling one-step predict function (log_prob=False)...
[22.03.2018 20:38:43.671][INFO][1108][cakechat.dialog_model.model][407] Compiling predict function (log_prob=True)...
[22.03.2018 20:39:12.292][INFO][1108][cakechat.dialog_model.model][434] Compiling one-step predict function (log_prob=True)...
[22.03.2018 20:39:36.957][INFO][1108][cakechat.dialog_model.model][483] Compiling sequence scoring function...
[22.03.2018 20:40:05.703][INFO][1108][cakechat.dialog_model.model][506] Compiling sequence scoring function (with thought vectors as arguments)...
[22.03.2018 20:40:45.812][INFO][1108][cakechat.utils.s3.bucket][21] Getting file nn_models\processed_dialogs_gru_hd512_drop0.2_encd2_decd2_il30_cs3_ansl32_lr1.0_gc_5.0_learnemb_cdim128_window10_voc39_vec128_sgTrue from AWS S3 and saving it as PATH\cakechat-master\data/nn_models\processed_dialogs_gru_hd512_drop0.2_encd2_decd2_il30_cs3_ansl32_lr1.0_gc_5.0_learnemb_cdim128_window10_voc39_vec128_sgTrue
[22.03.2018 20:40:47.562][WARNING][1108][cakechat.utils.s3.resolver.S3FileResolver][43] File can not be downloaded from AWS S3 because: An error occurred (404) when calling the HeadObject operation: Not Found
[22.03.2018 20:40:47.565][INFO][1108][cakechat.dialog_model.model][619] Can't find previously calculated model, so will use a fresh one
[22.03.2018 20:40:47.565][INFO][1108][cakechat.dialog_model.model][621] Model is built
[22.03.2018 20:40:47.625][INFO][1108][cakechat.dialog_model.model][625] Model path is PATH\cakechat-master\data/nn_models\processed_dialogs_gru_hd512_drop0.2_encd2_decd2_il30_cs3_ansl32_lr1.0_gc_5.0_learnemb_cdim128_window10_voc39_vec128_sgTrue

Traceback (most recent call last):
  File "tools/download_model.py", line 22, in <module>
    get_trained_model(fetch_from_s3=True)
  File "PATH\cakechat-master\cakechat\dialog_model\factory.py", line 53, in get_trained_model
    raise Exception('Can\'t get the model. '
Exception: Can't get the model. Run tools/download_model.py first to get all required files or train it by yourself.

After a custom model is trained successfully, the server fails to find the model at start-up.

Training a custom model succeeded, but an error occurred when running the python bin/cakechat_server.py command to start the service.

The operating system is CentOS 7.

Amount of training data and training time

Hi, first I would like to say thank you for this amazing repository.

I have 2 questions related to training the model:

What is the amount of data that you used to train your model?

What are the specifications of the hardware that you used to train your model and how much time did it take to completely train?

Thank you very much, and I apologize for my bad English.

Training own model

Hi,

I've loaded my own training and validation corpus, ran prepare_index_files.py, and trained it with no issue. Afterwards, when I ran python bin/cakechat_server.py, it continually threw this error:

Traceback (most recent call last):
  File "bin/cakechat_server.py", line 10, in <module>
    from cakechat.api.v1.server import app
  File "C:\...\cakechat\cakechat\api\v1\server.py", line 3, in <module>
    from cakechat.api.response import get_response
  File "C:\...\cakechat\cakechat\api\response.py", line 14, in <module>
    _cakechat_model = get_trained_model(fetch_from_s3=False)
  File "C:\...\cakechat\cakechat\dialog_model\factory.py", line 53, in get_trained_model
    raise Exception('Can\'t get the model. '
Exception: Can't get the model. Run tools/download_model.py first to get all required files or train it by yourself.

I messed around with get_nn_model() in dialog_model/model.py a bit and realized it was looking for a file named:
processed_dialogs_gru_hd512_drop0.2_encd2_decd2_il30_cs3_ansl32_lr1.0_gc_5.0_learnemb_cdim128_window10_voc11786_vec128_sgTrue

My file in data/nn_models was called:
processed_dialogs_gru_hd7_drop0.2_encd2_decd2_il7_cs3_ansl9_lr1.0_gc_5.0_learnemb_cdim128_window10_voc11786_vec15_sgTrue_pp_free2926.32_sensitive3066.80

I made a copy and renamed it. I then tried to both run the server and train it again, and as soon as it tried to load the model, in both instances I got:
ValueError: mismatch: parameter has shape (11786L, 128L) but value to set has shape (11786L, 15L)

Not really sure where to go from here; thanks in advance. Running Windows / Anaconda with py2.7. All dependencies installed and everything else is running fine thus far. I had it working with the pre-trained model. Tried running the server through both Git Bash and cmd, if that makes a difference. Trained through Bash.
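
The two file names quoted above differ in more than the trailing suffix, which is presumably why renaming the file was not enough: the name changed, but the saved weights kept their original shapes. A minimal sketch that lists the differing tokens follows. `diff_model_names` is a hypothetical helper written for this illustration and is not part of CakeChat.

```python
def diff_model_names(expected, actual):
    """Return (expected_token, actual_token) pairs that differ, position by position."""
    expected_tokens = expected.split("_")
    actual_tokens = actual.split("_")
    # zip stops at the shorter name, so trailing extras on the longer one are ignored.
    return [(e, a) for e, a in zip(expected_tokens, actual_tokens) if e != a]


# File name the loader was looking for (copied from the post above).
expected = ("processed_dialogs_gru_hd512_drop0.2_encd2_decd2_il30_cs3_ansl32_lr1.0"
            "_gc_5.0_learnemb_cdim128_window10_voc11786_vec128_sgTrue")
# File name that training actually produced (copied from the post above).
actual = ("processed_dialogs_gru_hd7_drop0.2_encd2_decd2_il7_cs3_ansl9_lr1.0"
          "_gc_5.0_learnemb_cdim128_window10_voc11786_vec15_sgTrue"
          "_pp_free2926.32_sensitive3066.80")

for e, a in diff_model_names(expected, actual):
    print("expected %s, found %s" % (e, a))
```

The `vec128` versus `vec15` difference lines up with the (11786, 128) versus (11786, 15) shapes in the ValueError. Which configuration settings produce each token is not shown in this thread, so that mapping would need to be checked in the code.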

Training process gets killed.

Hi!
When running train.py the process always gets killed for some reason.
Could you please help with pointing out if I am doing anything wrong?
I have attached my terminal output.
trainkilled.txt

Emotion Condition vs. Emotion detection

From what I can make out in the code (get_response in cakechat.api.response), you are using the input emotion category (that the user can set - {joy, anger, sadness etc.}) to condition the response. So, do I understand correctly that you are actually not detecting any emotion from the user text input but rather hardwire the emotion in the response to the emotion input category, no matter what the user's emotion in the input text is?

Looks like you are multiplying the emotion condition (from the input) with the condition ids that you gathered from the tokenized user text. What do these condition ids actually relate to?

condition_ids = transform_conditions_to_ids([emotion] * condition_ids_num, _cakechat_model.condition_to_index,
                                                condition_ids_num)

Thanks in advance for the clarifications 👍
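
On the "multiplying" part: in Python, `[emotion] * condition_ids_num` is list repetition, not arithmetic. A small sketch of what that expression evaluates to is below. The value of `condition_ids_num`, the contents of `condition_to_index`, and the plain dictionary lookup are all made up for illustration; the lookup stands in for `transform_conditions_to_ids`, whose actual behavior is not shown in this thread.

```python
emotion = "joy"
condition_ids_num = 3  # hypothetical value, for illustration only
condition_to_index = {"neutral": 0, "joy": 1, "anger": 2}  # hypothetical mapping

# List repetition: the same emotion label is copied condition_ids_num times.
repeated = [emotion] * condition_ids_num
print(repeated)

# Stand-in for the real lookup: map each label to its integer id.
condition_ids = [condition_to_index[label] for label in repeated]
print(condition_ids)
```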

Dataset Format

When preparing for training, I was looking through the sample dataset;

[{"text": "Hello", "condition": "neutral"}, {"text": "Oh, hi! :) How are you, my friend?", "condition": "joy"}, {"text": "Doing good", "condition": "neutral"}]

Which phrase is being said by the model and which is manually typed by the user?

To me it looks of the form [{USER STATEMENT}, {MODEL RESPONSE}, {USER AGAIN}]

But this doesn’t make sense to me. I would think the data should be formatted more like [{MODEL STATEMENT}, {USER RESPONSE}, {MODEL AGAIN}]?

Could someone help me clarify which party is intended to be saying which statement in the example and why it is in that order? Thanks.
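
One possible reading of the sample, sketched below, is that the file records a conversation between two alternating speakers and that no turn is tied to "user" or "model" in advance. Both the alternating-speaker labelling and the way pairs are built here are assumptions made for illustration; the thread does not confirm how CakeChat's training code splits dialogs.

```python
import json

# The sample dialog quoted in the post above.
sample = ('[{"text": "Hello", "condition": "neutral"}, '
          '{"text": "Oh, hi! :) How are you, my friend?", "condition": "joy"}, '
          '{"text": "Doing good", "condition": "neutral"}]')

dialog = json.loads(sample)

# Assumption: speakers simply alternate; the labels "A" and "B" are made up here.
turns = [("A" if i % 2 == 0 else "B", turn["text"], turn["condition"])
         for i, turn in enumerate(dialog)]

# Assumption: every utterance after the first is a reply to everything before it.
pairs = [([t["text"] for t in dialog[:i]], dialog[i]["text"])
         for i in range(1, len(dialog))]

for context, reply in pairs:
    print(context, "->", reply)
```

Under this reading, either speaker's turn can serve as the "model" side of a training pair, so the order in the file would not by itself say who is the user.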

no such option: --process-dependency-links

--process-dependency-links was deprecated and has now been removed from pip, so docker build fails.

Tensorflow GPU issue

CLIENT:

root@c6bf55d8c25f:~/cakechat# python tools/test_api.py -f localhost -p 8080 -c "hi!" -c "hi, how are you?" -c "good!" -e "joy"

Output:

Using TensorFlow backend.
{'message': 'Can\'t process request: OOM when allocating tensor with shape[10,39,50000] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc\n\t [[{{node decoder_model/softmax_with_temperature/Softmax}} = Softmax[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](decoder_model/softmax_with_temperature/sub)]]\nHint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.\n\n\t [[{{node decoder_model/softmax_with_temperature/Softmax/_203}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_715_decoder_model/softmax_with_temperature/Softmax", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]\nHint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.\n'}

SERVER output:

2019-08-16 12:03:08.905231: W tensorflow/core/common_runtime/bfc_allocator.cc:271] *************************************************************************************************xxx
2019-08-16 12:03:08.905317: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at softmax_op_gpu.cu.cc:158 : Resource exhausted: OOM when allocating tensor with shape[10,39,50000] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[16.08.2019 12:03:08.906][ERROR][1][cakechat.api.v1.server][5] Can't process request: OOM when allocating tensor with shape[10,39,50000] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[{{node decoder_model/softmax_with_temperature/Softmax}} = Softmax[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](decoder_model/softmax_with_temperature/sub)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[{{node decoder_model/softmax_with_temperature/Softmax/_203}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_715_decoder_model/softmax_with_temperature/Softmax", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

127.0.0.1 - - [16/Aug/2019 12:03:08] "POST /cakechat_api/v1/actions/get_response HTTP/1.1" 500 -
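
For scale, the tensor named in the OOM message can be sized by hand. The sketch below assumes 4 bytes per element, since TensorFlow's DT_FLOAT is a 32-bit float. What each of the three dimensions represents is not stated in the log and is not assumed here.

```python
shape = (10, 39, 50000)  # shape reported in the OOM message
bytes_per_float32 = 4    # DT_FLOAT is a 32-bit float

num_elements = 1
for dim in shape:
    num_elements *= dim

size_bytes = num_elements * bytes_per_float32
size_mib = size_bytes / float(2 ** 20)

print("%d elements, %d bytes, %.1f MiB" % (num_elements, size_bytes, size_mib))
```

At roughly 74 MiB, this single tensor is not large by GPU standards, which may mean the device memory was already almost fully allocated when the request arrived. That is a guess from the numbers, not something the log states.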

Setup Issues

I first tried the docker setup but it errored out and I just assumed that it was my fault since I haven't used docker before. Then I moved to the manual setup and it threw the same error.

While installing, the bdist_wheel step for scipy errors out, which gives me:

Failed building wheel for scipy
Failed cleaning build dir for scipy
Failed building wheel for scikit-learn

which then breaks the install with:
Command "/usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-Je8Y7m/scipy/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-VPk4Mt-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-Je8Y7m/scipy/

Syntax Error:Invalid Syntax

Compiled on MacOSX 10.14.5

executed "pip install -r requirements.txt -r requirements-local.txt"
then "open cakechat/cakechat/api/config.py" to remove emoji character because it was complaining about non-ascii character encoding
then "python cakechat_server.py -b 127.0.0.1:8080" from the bin directory

Should I update Python? I'm on 2.7.15.

Using TensorFlow backend.
[2019-06-01 11:08:31 -0700] [52444] [ERROR] Exception in worker process
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
    worker.init_process()
  File "/usr/local/lib/python2.7/site-packages/gunicorn/workers/base.py", line 129, in init_process
    self.load_wsgi()
  File "/usr/local/lib/python2.7/site-packages/gunicorn/workers/base.py", line 138, in load_wsgi
    self.wsgi = self.app.wsgi()
  File "/usr/local/lib/python2.7/site-packages/gunicorn/app/base.py", line 67, in wsgi
    self.callable = self.load()
  File "/usr/local/lib/python2.7/site-packages/gunicorn/app/wsgiapp.py", line 52, in load
    return self.load_wsgiapp()
  File "/usr/local/lib/python2.7/site-packages/gunicorn/app/wsgiapp.py", line 41, in load_wsgiapp
    return util.import_app(self.app_uri)
  File "/usr/local/lib/python2.7/site-packages/gunicorn/util.py", line 350, in import_app
    __import__(module)
  File "/Users/rob/Desktop/cakechat_v2./bin/cakechat_server.py", line 11, in <module>
    from cakechat.api.v1.server import app
  File "/Users/rob/Desktop/cakechat_v2./cakechat/api/v1/server.py", line 3, in <module>
    from cakechat.api.response import get_response
  File "/Users/rob/Desktop/cakechat_v2./cakechat/api/response.py", line 6, in <module>
    from cakechat.dialog_model.factory import get_trained_model, get_reverse_model
  File "/Users/rob/Desktop/cakechat_v2./cakechat/dialog_model/factory.py", line 8, in <module>
    from cakechat.dialog_model.inference_model import InferenceCakeChatModel
  File "/Users/rob/Desktop/cakechat_v2./cakechat/dialog_model/inference_model.py", line 1, in <module>
    from cakechat.dialog_model.keras_model import KerasTFModelIsolator
  File "/Users/rob/Desktop/cakechat_v2./cakechat/dialog_model/keras_model.py", line 114
    class AbstractKerasModel(AbstractModel, metaclass=abc.ABCMeta):

SyntaxError: invalid syntax
[2019-06-01 11:08:31 -0700] [52444] [INFO] Worker exiting (pid: 52444)
[2019-06-01 11:08:31 -0700] [52441] [INFO] Shutting down: Master
[2019-06-01 11:08:31 -0700] [52441] [INFO] Reason: Worker failed to boot.
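
The traceback stops at a class header that passes `metaclass=abc.ABCMeta` as a keyword argument. That form is Python 3 syntax, so a Python 2.7 interpreter rejects it at parse time, which suggests this version of the code expects Python 3. A minimal sketch of the construct, runnable under Python 3 only, is below. The `AbstractModel` stand-in and the `build` method name are hypothetical and exist only to make the example self-contained.

```python
import abc


class AbstractModel(object):
    """Stand-in base class for illustration; not the real CakeChat class."""


# Python 3 only: the metaclass is given as a keyword in the class header.
# Python 2.7 fails on this line with "SyntaxError: invalid syntax".
class AbstractKerasModel(AbstractModel, metaclass=abc.ABCMeta):
    @abc.abstractmethod
    def build(self):  # hypothetical method name, for illustration only
        """Subclasses must implement this."""


try:
    AbstractKerasModel()
    instantiated = True
except TypeError:
    # Abstract classes with unimplemented abstract methods cannot be instantiated.
    instantiated = False

print(type(AbstractKerasModel), instantiated)
```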

Responses are not context-oriented

Hello, I came across your repository and it's a great project! Thank you for sharing!
I tried training a "chit-chat" model with it, and it generates sentences that look "correct" but are unfortunately quite "irrelevant" to the user's input.
Do you have any suggestions on how to improve the relevance of the responses to the user's input (e.g., which decoding algorithm to choose, which parameters to tune, or how to affect the sampling process)?
Thanks!

Proper Dataset Formatting

I'm working on some tools to automate the creation of my own training datasets. I've noticed that there are discrepancies between the supplied dummy data and the datasets downloaded from AWS.

The dummy datasets include punctuation, which gets tokenized as separate tokens when generating indices. But the AWS dataset does not include punctuation in the token index. Is the right plan of action to strip punctuation from the dataset?

Also, placeholders like "_unk_" and "_pad_" get added as well. These are not present in the AWS token index either, but will be added to a generated index, presumably because of the defaults set in config.py.
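
To make the two options concrete, here is a rough sketch of tokenizing with punctuation kept as separate tokens versus dropping it. The regular expressions are a simple stand-in chosen for illustration and are not CakeChat's tokenizer; which behavior matches the AWS token index is exactly the open question above.

```python
import re


def tokenize_keep_punctuation(text):
    """Split into word tokens and single punctuation-character tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def tokenize_strip_punctuation(text):
    """Keep word tokens only; punctuation is dropped."""
    return re.findall(r"\w+", text)


line = "Oh, hi!"
print(tokenize_keep_punctuation(line))
print(tokenize_strip_punctuation(line))
```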

cannot get trained model

please close this.

ValueError: numpy.ufunc has the wrong size when launching Docker container

When starting the Docker container generated with
sudo docker build -t cakechat:latest -f dockerfiles/Dockerfile.cpu dockerfiles/

Then running "python tools/download_model.py" inside the container I get the error ValueError: numpy.ufunc has the wrong size.

Complete error below:

Traceback (most recent call last):
  File "bin/cakechat_server.py", line 10, in <module>
    from cakechat.api.v1.server import app
  File "/root/cakechat/cakechat/api/v1/server.py", line 3, in <module>
    from cakechat.api.response import get_response
  File "/root/cakechat/cakechat/api/response.py", line 9, in <module>
    from cakechat.dialog_model.inference import get_nn_responses, warmup_predictor
  File "/root/cakechat/cakechat/dialog_model/inference/__init__.py", line 1, in <module>
    from cakechat.dialog_model.inference.utils import get_sequence_log_probs, get_sequence_score_by_thought_vector, \
  File "/root/cakechat/cakechat/dialog_model/inference/utils.py", line 5, in <module>
    from cakechat.dialog_model.model_utils import get_training_batch
  File "/root/cakechat/cakechat/dialog_model/model_utils.py", line 15, in <module>
    from cakechat.utils.w2v import get_w2v_model
  File "/root/cakechat/cakechat/utils/w2v/__init__.py", line 1, in <module>
    from cakechat.utils.w2v.model import get_w2v_model
  File "/root/cakechat/cakechat/utils/w2v/model.py", line 4, in <module>
    from gensim.models import Word2Vec
  File "/usr/local/lib/python2.7/dist-packages/gensim/__init__.py", line 6, in <module>
    from gensim import parsing, matutils, interfaces, corpora, models, similarities, summarization
  File "/usr/local/lib/python2.7/dist-packages/gensim/models/__init__.py", line 7, in <module>
    from .coherencemodel import CoherenceModel
  File "/usr/local/lib/python2.7/dist-packages/gensim/models/coherencemodel.py", line 30, in <module>
    from gensim.models.wrappers import LdaVowpalWabbit, LdaMallet
  File "/usr/local/lib/python2.7/dist-packages/gensim/models/wrappers/__init__.py", line 8, in <module>
    from .fasttext import FastText
  File "/usr/local/lib/python2.7/dist-packages/gensim/models/wrappers/fasttext.py", line 38, in <module>
    from gensim.models.word2vec import Word2Vec
  File "/usr/local/lib/python2.7/dist-packages/gensim/models/word2vec.py", line 135, in <module>
    from gensim.models.word2vec_inner import train_batch_sg, train_batch_cbow
  File "__init__.pxd", line 861, in init gensim.models.word2vec_inner (./gensim/models/word2vec_inner.c:10917)
ValueError: numpy.ufunc has the wrong size, try recompiling. Expected 192, got 216

Cakechat "can't find token"

Hello guys, I recently started working with cakechat and I'm facing some issues. I ran prepare_index_files.py (with a French dataset of ~10 000 dialogs) with no issues.

Afterwards, when I ran python tools/train.py, it kept throwing this kind of error:

[25.04.2018 15:18:33.495][INFO][15][cakechat.utils.s3.bucket][21] Got file w2v_models/train_processed_dialogs_window10_voc50000_vec128_sgTrue.bin from S3
[25.04.2018 15:18:33.510][INFO][15][cakechat.utils.w2v.model][51] Loading model from /root/cakechat/data/w2v_models/train_processed_dialogs_window10_voc50000_vec128_sgTrue.bin
[25.04.2018 15:18:33.794][INFO][15][cakechat.utils.w2v.model][53] Model "train_processed_dialogs_window10_voc50000_vec128_sgTrue.bin" has been loaded.
[25.04.2018 15:18:33.794][INFO][15][cakechat.utils.w2v.model][80] Successfully got w2v model
[25.04.2018 15:18:33.794][INFO][15][cakechat.dialog_model.model_utils][205] Preparing embedding matrix based on w2v_model and index_to_token dict

[25.04.2018 15:18:33.806][WARNING][15][cakechat.dialog_model.model_utils][195] Can't find token [ça] in w2v dict
[25.04.2018 15:18:33.806][WARNING][15][cakechat.dialog_model.model_utils][195] Can't find token [avec] in w2v dict
[25.04.2018 15:18:33.806][WARNING][15][cakechat.dialog_model.model_utils][195] Can't find token [même] in w2v dict
[25.04.2018 15:18:33.806][WARNING][15][cakechat.dialog_model.model_utils][195] Can't find token [ils] in w2v dict
[25.04.2018 15:18:33.806][WARNING][15][cakechat.dialog_model.model_utils][195] Can't find token [être] in w2v dict
[25.04.2018 15:18:33.807][WARNING][15][cakechat.dialog_model.model_utils][195] Can't find token [suis] in w2v dict
[25.04.2018 15:18:33.807][WARNING][15][cakechat.dialog_model.model_utils][195] Can't find token [quand] in w2v dict
...

I get about 38 521 warnings like that.
I've checked: all these tokens are in token_index/t_idx_processed_dialogs.json (it's weird, because there are 50 000 words in there and some of them are found).

Here is what my data/ folder looks like:
data/condition_index/c_idx_processed_dialogs.json

corpora_processed/train_processed_dialogs.txt
corpora_processed/train_processed_dialogs.txt

quality/context_free_questions.txt
quality/context_free_test_set.txt
quality/context_free_validation_set.txt

tensorboard/steps

token_index/t_idx_processed_dialogs.json

w2v_models/train_processed_dialogs_window10_voc50000_vec128_sgTrue.bin

Finally, when the step below is reached, the train.py process is killed:

...
[25.04.2018 15:20:26.123][INFO][15][cakechat.dialog_model.model][348] Computing train updates...
[25.04.2018 15:22:27.128][INFO][15][cakechat.dialog_model.model][351] Compiling train function...
Killed

(IS_DEV flag has been set to 0)

Thanks a lot for your help and for all your work!
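
The first log line in this post says the w2v model file was fetched from S3 rather than trained on the local corpus, so one thing worth checking is how much of the token index that model actually covers. A minimal sketch of that check follows. The token lists are toy data, and `missing_tokens` is a hypothetical helper; with a real model the vocabulary would come from the loaded word2vec object, whose attribute names depend on the gensim version.

```python
def missing_tokens(index_tokens, w2v_vocab):
    """Return the tokens from the index that the w2v vocabulary does not contain."""
    vocab = set(w2v_vocab)
    return [token for token in index_tokens if token not in vocab]


# Toy data for illustration only.
index_tokens = ["ça", "avec", "même", "hello", "_unk_"]
w2v_vocab = ["hello", "you"]

missing = missing_tokens(index_tokens, w2v_vocab)
print("%d of %d tokens missing: %s" % (len(missing), len(index_tokens), missing))
```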

How to start the model in continuous Q&A flow?

Once we have tested the model using

python tools/test_api.py -f 127.0.0.1 -p 8080
-c "Hi, Eddie, what's up?"
-c "Not much, what about you?"
-c "Fine, thanks. Are you going to the movies tomorrow?"

how do we start a session that works like an actual chatbot, with immediate questions and answers?
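
One way to think about a continuous session is a small loop that keeps the most recent utterances and sends them as context on every turn. The sketch below shows only that bookkeeping. `ChatSession` is a hypothetical class, the default of three context messages merely mirrors the three `-c` arguments in the command above, and `echo_backend` is a placeholder for whatever function actually calls the running server.

```python
from collections import deque


class ChatSession(object):
    """Keeps a rolling context window and asks a backend for each reply."""

    def __init__(self, get_response, max_context=3):
        # get_response: callable taking a list of utterances and returning a reply.
        self._get_response = get_response
        # deque with maxlen silently drops the oldest utterance when full.
        self._context = deque(maxlen=max_context)

    def say(self, user_message):
        self._context.append(user_message)
        reply = self._get_response(list(self._context))
        self._context.append(reply)
        return reply


def echo_backend(context):
    """Placeholder backend; a real one would send the context to the server."""
    return "you said: " + context[-1]


session = ChatSession(echo_backend)
first = session.say("Hi, Eddie, what's up?")
second = session.say("Fine, thanks.")
print(first)
print(second)
```

Wrapping `session.say` in a loop that reads from the keyboard would give an interactive prompt. How the backend talks to the server over HTTP is left out here because the request format is not shown in this thread.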

How to get only 1 sentence of answer?

When I input a question (a single sentence), I receive a response with multiple sentences from my custom model. How can I get an answer with only one sentence?
Thank you for helping!
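
If changing the model's output is not an option, a simple workaround is to trim the reply after it is generated. The sketch below cuts at the first run of sentence-ending punctuation that is followed by whitespace or the end of the text. `first_sentence` is a hypothetical post-processing helper and not a CakeChat setting, and the rule is deliberately naive.

```python
import re


def first_sentence(text):
    """Return the text up to and including the first sentence-ending punctuation."""
    # Match ., ! or ? (possibly repeated) only when followed by whitespace or end of text,
    # so that something like "1.0" is not treated as a sentence boundary.
    match = re.search(r"[.!?]+(?=\s|$)", text)
    return text if match is None else text[:match.end()]


print(first_sentence("I'm fine. How are you?"))
```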

parameter mismatch error

When I was training the model, I got this parameter mismatch error. I use Windows and Anaconda with Python 2.7. The training corpus is the dummy corpus provided. I did not use Docker since Docker-gpu is not supported on Windows. Thanks a lot!

APK?

I'd rather not install this through the Google Play Store, but I don't trust these various APK pages either. I guess technically the issue I'm reporting is the lack of an APK on the releases page.

Thanks for your patience.

Server Not Found

Hello!
Just getting started with cakechat, installing it without Docker on macOS.
It worked perfectly up until I ran "python bin/cakechat_server.py".
It loaded the server and said it was running, but when I visited the site it said
"Not Found
The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again."

Honestly, have no idea how to go about this -- would really appreciate your help :)

Can't wait to get started messing around and building awesome stuff with CakeChat! I'll be sure to keep you updated with my project, hopefully, if I can get cakechat running properly.

Propose Logo

Hi. I'm a graphic designer. Would you be interested in me making a logo for your project? If you allow me, I'll make a logo for your project, and it's free.

Would I need to make a lot of changes to the algorithms to introduce two conditions in each dataset sentence?

Hey guys, I am trying to put two conditions on each line so that the bot can reply on somewhat more specific topics than just the single user condition behind them. Would this require a massive change in the files, or can I just feed more conditions in each dataset and change the condition values in config.py etc.?
I changed EMOTIONS_TYPES = create_namedtuple_instance() in config.py and MAX_CONDITIONS_NUM = $ in prepare_index_files. What else would I have to change to have more than one condition?
This is more of a technical discussion than an issue. Thanks!

Grammatically-correct response

How can I get answers with grammatically correct words?
I have trained a new model on Russian (ru) data, and the responses generated by the neural network consist of a bunch of words in different forms (morphology, case, plural, etc.), not normalized the way https://pymorphy2.readthedocs.io/en/latest/user/guide.html does it. So it is hard to tell whether a response is correct or not.

I cannot find any related implementation.
Also, is the word2vec model generated from train_processed_dialogs.txt? If so, I think that's bad. Why don't you use a general w2v model like piskvorky/gensim-data#3?

tools/fetch.py ImportError: DLL load failed: The specified module could not be found. error

I am running Python 3.6.7 on Windows 10 Pro, and when I try to run the command to download the pre-trained model I get this error related to scipy:

Traceback (most recent call last):
  File "tools/fetch.py", line 15, in <module>
    from cakechat.dialog_model.factory import get_trained_model
  File "D:\Python\salvation\cakechat\cakechat\dialog_model\factory.py", line 8, in <module>
    from cakechat.dialog_model.inference_model import InferenceCakeChatModel
  File "D:\Python\salvation\cakechat\cakechat\dialog_model\inference_model.py", line 1, in <module>
    from cakechat.dialog_model.keras_model import KerasTFModelIsolator
  File "D:\Python\salvation\cakechat\cakechat\dialog_model\keras_model.py", line 11, in <module>
    from cakechat.dialog_model.abstract_model import AbstractModel
  File "D:\Python\salvation\cakechat\cakechat\dialog_model\abstract_model.py", line 6, in <module>
    from cakechat.dialog_model.quality.metrics.utils import MetricsSerializer
  File "D:\Python\salvation\cakechat\cakechat\dialog_model\quality\__init__.py", line 2, in <module>
    from cakechat.dialog_model.quality.metrics.lexical_simlarity import calculate_lexical_similarity, get_tfidf_vectorizer
  File "D:\Python\salvation\cakechat\cakechat\dialog_model\quality\metrics\lexical_simlarity.py", line 3, in <module>
    from sklearn.feature_extraction.text import TfidfVectorizer
  File "C:\Python36\lib\site-packages\sklearn\__init__.py", line 76, in <module>
    from .base import clone
  File "C:\Python36\lib\site-packages\sklearn\base.py", line 16, in <module>
    from .utils import _IS_32BIT
  File "C:\Python36\lib\site-packages\sklearn\utils\__init__.py", line 20, in <module>
    from .validation import (as_float_array,
  File "C:\Python36\lib\site-packages\sklearn\utils\validation.py", line 21, in <module>
    from .fixes import _object_dtype_isnan
  File "C:\Python36\lib\site-packages\sklearn\utils\fixes.py", line 18, in <module>
    from scipy.sparse.linalg import lsqr as sparse_lsqr  # noqa
  File "C:\Python36\lib\site-packages\scipy\sparse\linalg\__init__.py", line 113, in <module>
    from .isolve import *
  File "C:\Python36\lib\site-packages\scipy\sparse\linalg\isolve\__init__.py", line 6, in <module>
    from .iterative import *
  File "C:\Python36\lib\site-packages\scipy\sparse\linalg\isolve\iterative.py", line 10, in <module>
    from . import _iterative
ImportError: DLL load failed: The specified module could not be found.

How to improve the responses of the model?

AttributeError: 'module' object has no attribute '_get_ndarray_c_version'

Outputs this right here... Not quite sure how to fix...

Tried deleting the stuff in w2v_models to no avail - help!

[17.05.2019 14:44:06.637][INFO][11844][cakechat.utils.files_utils][87] Loading /root/cakechat/data/tensorboard/steps
[17.05.2019 14:44:06.637][INFO][11844][cakechat.tools/train.py][102] THEANO_FLAGS: floatX=float32,device=cpu
[17.05.2019 14:44:06.639][INFO][11844][cakechat.tools/train.py][42] Getting train iterator for w2v...
[17.05.2019 14:44:06.639][INFO][11844][cakechat.tools/train.py][48] Getting text-filtered train iterator...
[17.05.2019 14:44:06.639][INFO][11844][cakechat.tools/train.py][51] Getting tokenized train iterator...
[17.05.2019 14:44:06.640][INFO][11844][cakechat.utils.w2v.model][64] Getting w2v model
[17.05.2019 14:44:06.789][INFO][11844][cakechat.utils.s3.bucket][19] Getting file w2v_models/train_LUCI_window10_voc891_vec128_sgTrue.bin from AWS S3 and saving it as /root/cakechat/data/w2v_models/train_LUCI_window10_voc891_vec128_sgTrue.bin
[17.05.2019 14:44:07.206][WARNING][11844][cakechat.utils.s3.resolver.S3FileResolver][42] File can not be downloaded from AWS S3 because: An error occurred (404) when calling the HeadObject operation: Not Found
[17.05.2019 14:44:07.206][INFO][11844][cakechat.utils.w2v.model][18] Word2Vec model will be trained now. It can take long, so relax and have fun.
[17.05.2019 14:44:07.206][INFO][11844][cakechat.utils.w2v.model][21] Parameters for training: window10_voc891_vec128_sgTrue
[17.05.2019 14:44:07.255][INFO][11844][cakechat.utils.w2v.model][44] Saving model to /root/cakechat/data/w2v_models/train_LUCI_window10_voc891_vec128_sgTrue.bin
[17.05.2019 14:44:07.263][INFO][11844][cakechat.utils.w2v.model][47] Model has been saved
[17.05.2019 14:44:07.263][INFO][11844][cakechat.utils.w2v.model][80] Successfully got w2v model

[17.05.2019 14:44:07.263][INFO][11844][cakechat.dialog_model.model_utils][202] Preparing embedding matrix based on w2v_model and index_to_token dict
[17.05.2019 14:44:07.263][WARNING][11844][cakechat.dialog_model.model_utils][192] Can't find token [_unk_] in w2v dict
Traceback (most recent call last):
  File "tools/train.py", line 107, in <module>
    train(init_path=args.init_weights, is_reverse_model=args.reverse)
  File "tools/train.py", line 79, in train
    resolver_factory=nn_model_resolver_factory, is_reverse_model=is_reverse_model)
  File "/root/cakechat/cakechat/dialog_model/model.py", line 714, in get_nn_model
    is_reverse_model=is_reverse_model)
  File "/root/cakechat/cakechat/dialog_model/model.py", line 88, in __init__
    self._compile_theano_functions_for_prediction()
  File "/root/cakechat/cakechat/dialog_model/model.py", line 144, in _compile_theano_functions_for_prediction
    self.predict_prob = self._get_predict_fn(logarithm_output_probs=False)
  File "/root/cakechat/cakechat/dialog_model/model.py", line 464, in _get_predict_fn
    output_probs = self._get_nn_output()
  File "/root/cakechat/cakechat/dialog_model/model.py", line 443, in _get_nn_output
    output_probs = get_output(self._net['dist'], deterministic=True)
  File "/usr/local/lib/python2.7/dist-packages/lasagne/layers/helper.py", line 197, in get_output
    all_outputs[layer] = layer.get_output_for(layer_inputs, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/lasagne/layers/recurrent.py", line 1489, in get_output_for
    strict=True)[0]
  File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan.py", line 1048, in scan
    local_op = scan_op.Scan(inner_inputs, new_outs, info)
  File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 216, in __init__
    [])
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/cc.py", line 1300, in cmodule_key_variables
    c_compiler)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/cc.py", line 1350, in cmodule_key_
    np.core.multiarray._get_ndarray_c_version())
AttributeError: 'module' object has no attribute '_get_ndarray_c_version'

Python test.api cannot establish new connection.

Installing on Google Cloud Platform

I was able to successfully install cakechat on my laptop (Mac OSX), but I ran into error messages when I tried installing it on Google Cloud Platform. Could you help with this or update your README to include instructions for running it there?

Using the trained model

I have retrained the default model on different data, using the downloaded model weights as the initial weights (in reverse mode).
So now the question is how to use the model that I trained rather than the one that was downloaded through the provided script download_model.py.
Also, I want to use the Colab environment and I am not sure how to use the GPU: I successfully configured libgpuarray, but I don't know where to put the GPU ID or how to find it in Colab. The most I could figure out is that the GPU device name is "device:GPU:0", following steps similar to those given in (https://www.kdnuggets.com/2018/02/google-colab-free-gpu-tutorial-tensorflow-keras-pytorch.html/2).
I got as far as the utils/env.py file in cakechat, where I found a 'GPU-ID' setting, but it is not clear what value to supply.
It would be great if you could help me with that.

Thanks

Training on group chat instead of one on one conversation

Would it be possible to train the existing architecture on group chat data instead of one-on-one conversations?
If so:
what's the best way to do this?
If not:
what changes could be made to make this possible?

DialogFlow

If you were to do this app today, would you have used DialogFlow?

Recommended server specs?

I've got cakechat running now on a Debian box that I set up on Google Cloud Platform, and I'm using it in a test app that receives messages and sends responses back to a Facebook Messenger app. It's nice!

I'm wondering if you have any suggestions regarding server requirements: recommended RAM, disk size, number of CPUs, etc.?

Configuring to work with google colab

How can support be added for training on Google Colab, to take advantage of cloud computing power and train better models?

Adding "memory"

Is there a way to add a sort of memory to the AI, so that it remembers semantics such as information about the people it talks to? Similar to how Replika.ai remembers details.

If there is no built-in feature for this, how could I go about creating it myself?

Setup documentation

I've downloaded and trained the model, but when I launch the server and try to call it in my browser I get a "This site can't be reached" error. I've tried http://localhost:8080 and http://127.0.0.1:8080/.

I looked around and saw that I could use docker inspect <CONTAINER-NAME>, which yielded an IP of 172.17.0.2, which could not be reached either.

Also, when I try the test conversation I get

requests.exceptions.ConnectionError: HTTPConnectionPool(host='127.0.0.1', port=8080): Max retries exceeded with url: /cakechat_api/v1/actions/get_response (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x103607eb8>: Failed to establish a new connection: [Errno 61] Connection refused',))

I found this resource helpful, although I still don't know how I can access the Docker port mapping.

I'm super psyched to get involved and help out with this project, even as a total bot beginner, so perhaps I can help build out a fuller "Absolute Beginner" doc to support such simple client-server debugging?
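For this kind of client-server debugging, a first step is to check whether anything is accepting TCP connections on the address the client uses. The snippet below is a minimal sketch using only the Python standard library. The helper name port_is_open is made up for illustration, and the host and port are taken from the error message above.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening or the host is unreachable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Host and port from the ConnectionError reported above.
print(port_is_open("127.0.0.1", 8080))
```

If this prints False while the container is running, the port is probably not published to the host. The quick start command elsewhere in this thread publishes it with -p 127.0.0.1:8080:8080.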

Quick start for GPU version:

I got an error after using the command:

docker pull lukalabs/cakechat-gpu:latest && \
nvidia-docker run --name cakechat-gpu-server -p 127.0.0.1:8080:8080 -it lukalabs/cakechat-gpu:latest bash -c "CUDA_VISIBLE_DEVICES=0 python bin/cakechat_server.py"

Output:

2019-08-18 14:09:11.496399: W tensorflow/core/common_runtime/bfc_allocator.cc:271] ***************************************************************************************xxxxxxxxxxxxx
2019-08-18 14:09:11.496449: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at random_op.cc:202 : Resource exhausted: OOM when allocating tensor with shape[50000,128] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[768,2304] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[{{node decoder_scope/decoder_1/random_uniform/RandomUniform}} = RandomUniform[T=DT_INT32, dtype=DT_FLOAT, seed=87654321, seed2=5561963, _device="/job:localhost/replica:0/task:0/device:GPU:0"](decoder_scope/decoder_1/random_uniform/shape)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "bin/cakechat_server.py", line 11, in <module>
    from cakechat.api.v1.server import app
  File "/root/cakechat/cakechat/api/v1/server.py", line 3, in <module>
    from cakechat.api.response import get_response
  File "/root/cakechat/cakechat/api/response.py", line 14, in <module>
    _cakechat_model = get_trained_model(reverse_model=get_reverse_model(PREDICTION_MODE))
  File "/usr/local/lib/python3.5/dist-packages/cachetools/__init__.py", line 46, in wrapper
    v = func(*args, **kwargs)
  File "/root/cakechat/cakechat/dialog_model/factory.py", line 76, in get_trained_model
    model.init_model()
  File "/root/cakechat/cakechat/dialog_model/keras_model.py", line 30, in wrapper
    return func(*args, **kwargs)
  File "/root/cakechat/cakechat/dialog_model/keras_model.py", line 279, in init_model
    self.print_weights_summary()
  File "/root/cakechat/cakechat/dialog_model/keras_model.py", line 30, in wrapper
    return func(*args, **kwargs)
  File "/root/cakechat/cakechat/dialog_model/keras_model.py", line 263, in print_weights_summary
    weights = self._model.get_weights()
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/network.py", line 492, in get_weights
    return K.batch_get_value(weights)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2420, in batch_get_value
    return get_session().run(ops)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 206, in get_session
    session.run(tf.variables_initializer(uninitialized_vars))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[768,2304] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[node decoder_scope/decoder_1/random_uniform/RandomUniform (defined at /usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py:4139)  = RandomUniform[T=DT_INT32, dtype=DT_FLOAT, seed=87654321, seed2=5561963, _device="/job:localhost/replica:0/task:0/device:GPU:0"](decoder_scope/decoder_1/random_uniform/shape)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


Caused by op 'decoder_scope/decoder_1/random_uniform/RandomUniform', defined at:
  File "bin/cakechat_server.py", line 11, in <module>
    from cakechat.api.v1.server import app
  File "<frozen importlib._bootstrap>", line 969, in _find_and_load
  File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 673, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 665, in exec_module
  File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
  File "/root/cakechat/cakechat/api/v1/server.py", line 3, in <module>
    from cakechat.api.response import get_response
  File "<frozen importlib._bootstrap>", line 969, in _find_and_load
  File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 673, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 665, in exec_module
  File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
  File "/root/cakechat/cakechat/api/response.py", line 14, in <module>
    _cakechat_model = get_trained_model(reverse_model=get_reverse_model(PREDICTION_MODE))
  File "/usr/local/lib/python3.5/dist-packages/cachetools/__init__.py", line 46, in wrapper
    v = func(*args, **kwargs)
  File "/root/cakechat/cakechat/dialog_model/factory.py", line 76, in get_trained_model
    model.init_model()
  File "/root/cakechat/cakechat/dialog_model/keras_model.py", line 30, in wrapper
    return func(*args, **kwargs)
  File "/root/cakechat/cakechat/dialog_model/keras_model.py", line 277, in init_model
    self._model = self._build_model()
  File "/root/cakechat/cakechat/dialog_model/model.py", line 253, in _build_model
    decoder_training_model, decoder_model = self._decoder(y_tokens_emb_model, condition_emb_model)
  File "/root/cakechat/cakechat/dialog_model/model.py", line 412, in _decoder
    (outputs_seq_0, initial_state=dec_hs_1)
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/recurrent.py", line 570, in __call__
    output = super(RNN, self).__call__(full_input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/base_layer.py", line 431, in __call__
    self.build(unpack_singleton(input_shapes))
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/cudnn_recurrent.py", line 237, in build
    constraint=self.kernel_constraint)
  File "/usr/local/lib/python3.5/dist-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/base_layer.py", line 249, in add_weight
    weight = K.variable(initializer(shape),
  File "/usr/local/lib/python3.5/dist-packages/keras/initializers.py", line 218, in __call__
    dtype=dtype, seed=self.seed)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 4139, in random_uniform
    dtype=dtype, seed=seed)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/random_ops.py", line 243, in random_uniform
    rnd = gen_random_ops.random_uniform(shape, dtype, seed=seed1, seed2=seed2)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_random_ops.py", line 733, in random_uniform
    name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[768,2304] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[node decoder_scope/decoder_1/random_uniform/RandomUniform (defined at /usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py:4139)  = RandomUniform[T=DT_INT32, dtype=DT_FLOAT, seed=87654321, seed2=5561963, _device="/job:localhost/replica:0/task:0/device:GPU:0"](decoder_scope/decoder_1/random_uniform/shape)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

But the CPU version works fine.
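As a rough sanity check on the log above, the sizes of the tensors named in the OOM messages can be computed directly. This is a sketch that assumes 4 bytes per element, since "type float" (DT_FLOAT) is a 32-bit float. The helper name tensor_bytes is made up for illustration.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Memory needed for a dense tensor of the given shape (float32 by default)."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


# Shapes copied from the two OOM messages in the log.
embedding_bytes = tensor_bytes((50000, 128))
decoder_kernel_bytes = tensor_bytes((768, 2304))

print(embedding_bytes)       # 25600000 bytes, about 25.6 MB
print(decoder_kernel_bytes)  # 7077888 bytes, about 7.1 MB
```

Both allocations are only a few megabytes. That suggests, as an inference rather than something the log states, that most of the GPU memory was already in use when the model was being initialized, for example by another process on the same device.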

Adding more than 5 emotions as conditions.

Hey guys, I'm using cakechat to learn emotions as in the original model, but I'm adding more than 5 emotions as conditions in the training data. I also changed config.py to look for all my emotions and save them as the proper variable. However, when I train on my data, the command line just prints 'killed' and my models are not created. Any idea why?
Here are my changes to the code:

Example of data:
[{"text" : "Hi there!", "condition" : "Anticipation"}, {"text" : "Hey you. Long time no see!", "condition" : "Surprise"}, {"text" : "Sorry, Ive been busy", "condition" : "Expectation"}, {"text" : "No problem. Being busy is a part of life", "condition" : "Acceptance"}]

Changes in config.py:
EMOTIONS_TYPES = create_namedtuple_instance(
    'EMOTIONS_TYPES', neutral='Neutral', anger='Anger', joy='Joy', sadness='Sadness', sadnessLonely='SadnessLonely', anticipation='Anticipation', trust='Trust', acceptance='Acceptance', surprise='Surprise')
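One thing that can be checked mechanically is whether every condition used in the data also appears in the configured list. The sketch below compares the sample line above against the values passed to EMOTIONS_TYPES. The function name unknown_conditions is made up for illustration. Whether CakeChat requires every condition to be listed there is not stated in this thread, and a mismatch would not by itself explain the 'killed' message.

```python
import json

# Values copied from the EMOTIONS_TYPES definition posted above.
configured_conditions = {
    "Neutral", "Anger", "Joy", "Sadness", "SadnessLonely",
    "Anticipation", "Trust", "Acceptance", "Surprise",
}

# The example corpus line posted above (one JSON list of messages).
sample_line = (
    '[{"text": "Hi there!", "condition": "Anticipation"}, '
    '{"text": "Hey you. Long time no see!", "condition": "Surprise"}, '
    '{"text": "Sorry, Ive been busy", "condition": "Expectation"}, '
    '{"text": "No problem. Being busy is a part of life", "condition": "Acceptance"}]'
)


def unknown_conditions(corpus_lines, known):
    """Collect conditions that occur in the corpus but not in the known set."""
    unknown = set()
    for line in corpus_lines:
        line = line.strip()
        if not line:
            continue
        for message in json.loads(line):
            condition = message.get("condition")
            if condition not in known:
                unknown.add(condition)
    return unknown


print(unknown_conditions([sample_line], configured_conditions))  # {'Expectation'}
```

In the posted sample, "Expectation" is used in the data but is not among the configured values.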

How to deploy model in production?

Hello there! Thanks for the valuable and well-rounded project.

I see that you're using a Flask app to serve model predictions through a simple REST API. Could you please write a guide on how to deploy a trained model in production so that it can handle multiple requests at once? And maybe even scale (in number of machines) with the number of requests, including setting up the VMs, environment, etc.?

The community is really lacking guides like that, so this would be very helpful.

iOS Support

How would I be able to use this model on iOS? Is there an API to call, or can I convert the model to CoreML?

tools/fetch.py fails. File can not be downloaded.

Hi there.

When I run fetch.py, it fails with the following errors:

[13.06.2019 03:43:43.186][INFO][16608][cakechat.dialog_model.inference_model.InferenceCakeChatModel][130] Looking for the previously trained model
[13.06.2019 03:43:43.186][INFO][16608][cakechat.dialog_model.inference_model.InferenceCakeChatModel][131] Model params str: {"corpus_name": "processed_dialogs", "dense_dropout_ratio": 0.2, "epochs_num": 2, "hidden_layer_dim": 768, "input_context_size": 3, "input_seq_len": 30, "is_reverse_model": true, "optimizer": {"clipvalue": 5.0, "decay": 0.0, "epsilon": 1e-07, "lr": 6.0, "rho": 0.95}, "output_seq_len": 32, "token_embedding_dim": 128, "train_batch_size": 196, "training_callbacks": {"CakeChatEvaluatorCallback": {"eval_state_per_batches": 500}}, "training_data": "train_processed_dialogs", "validation_data": "context_free_validation_set,val_processed_dialogs", "voc_size": 101, "w2v_model": "train_processed_dialogs_window10_voc50000_vec128_sgTrue"}
[13.06.2019 03:43:43.260][INFO][16608][cakechat.utils.s3.bucket][19] Getting file nn_models/reverse_cakechat_v2.0_keras_tf_617dfa4a1691.tar.gz from AWS S3 and saving it as /mnt/amadeus/chatbot/results/nn_models/reverse_cakechat_v2.0_keras_tf_617dfa4a1691.tar.gz
[13.06.2019 03:43:44.084][WARNING][16608][cakechat.utils.s3.resolver.S3FileResolver][45] File can not be downloaded from AWS S3 because: An error occurred (404) when calling the HeadObject operation: Not Found
[13.06.2019 03:43:44.085][ERROR][16608][cakechat.dialog_model.inference_model.InferenceCakeChatModel][136] Can't find previously trained model in /mnt/amadeus/chatbot/results/nn_models/reverse_cakechat_v2.0_keras_tf_617dfa4a1691

Thanks for all your work on this project!

Is continuous training supported?

I was just wondering if cakechat works off of precompiled static models, or if it can "learn" from people talking to it. It's not really a big deal either way as I could just periodically retrain it using an updated dataset.

Docker container does not start on docker run

Now in the context of using your Docker container in container platforms like OpenShift or in an orchestration platform like Kubernetes, this is a bit of a no-no.

I'm proposing to add another CMD layer that container platforms would use to tell the container to start the server instead of dropping to the shell, which would cause a back-off in OpenShift/Kubernetes.

This is doable for the CPU image, but the GPU image would need some special configuration, which makes it a bit unachievable at the moment.

AMD ROCm support

Since TensorFlow just introduced ROCm support, I think CakeChat should follow the trend as well, since it would allow AMD users to run CakeChat without any dependency on NVIDIA and CUDA.

Quickstart

Hi there!

I've followed the Quickstart instructions, yet am hitting an error I'm wondering if you have advice on:

simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

And on the server side:
127.0.0.1 - - [18/Jan/2019 20:38:11] code 501, message Unsupported method ('POST')
127.0.0.1 - - [18/Jan/2019 20:38:11] "POST /cakechat_api/v1/actions/get_response HTTP/1.1" 501 -

I'm running Docker on Mac, which is working fine (help, ps, etc.). I'll keep exploring, but wanted to post this in case there's something obvious I can try.

Thanks!

Sio
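The server log lines above have the same format that Python's built-in http.server emits when it receives a method it has no handler for. One possible reading, which is an assumption and not confirmed in this thread, is that something other than the CakeChat Flask app was answering on port 8080. The sketch below reproduces that behaviour with the standard library only: a plain static file server answers a POST with status 501 and an HTML error page, which a JSON client then fails to decode. The function name post_to_static_server is made up for illustration.

```python
import http.client
import threading
from http.server import HTTPServer, SimpleHTTPRequestHandler


class QuietHandler(SimpleHTTPRequestHandler):
    """Static file handler (GET/HEAD only) with request logging silenced."""

    def log_message(self, format, *args):
        pass


def post_to_static_server():
    """Start a throwaway static server, POST to it, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), QuietHandler)  # port 0: pick a free port
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        # Same path as in the log above; sent without a body for simplicity.
        connection.request("POST", "/cakechat_api/v1/actions/get_response")
        response = connection.getresponse()
        body = response.read().decode("utf-8", errors="replace")
        connection.close()
        return response.status, body
    finally:
        server.shutdown()
        server.server_close()


status, body = post_to_static_server()
print(status)                              # 501
print(body.lstrip().startswith("{"))       # False: the body is HTML, not JSON
```

If this matches what is happening, checking which process is bound to port 8080 on the host would be the next step.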

How to prepare Corpus?

I have a corpus file, but I don't know how to easily convert it so that it satisfies "Each line of the corpus file should be a JSON object containing a list of dialog messages sorted in chronological order."

Is there a tool that can take a downloaded corpus and translate it to the cakechat format?
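As a starting point, the conversion can be sketched in a few lines of Python. The input layout here is hypothetical: each dialog is assumed to be a list of dicts with "timestamp" and "text" keys and an optional "condition". The output keys "text" and "condition" follow the example data posted in another issue in this list, and "neutral" is used as the default condition. The function name dialogs_to_corpus_lines is made up for illustration.

```python
import json


def dialogs_to_corpus_lines(dialogs):
    """Yield one JSON string per dialog, with messages in chronological order."""
    for dialog in dialogs:
        # Assumed input field: "timestamp" gives the message order.
        ordered = sorted(dialog, key=lambda message: message["timestamp"])
        yield json.dumps(
            [
                {"text": m["text"], "condition": m.get("condition", "neutral")}
                for m in ordered
            ],
            ensure_ascii=False,
        )


# Hypothetical raw data: one dialog whose messages are out of order.
raw_dialogs = [
    [
        {"timestamp": 2, "text": "Hey you. Long time no see!"},
        {"timestamp": 1, "text": "Hi there!"},
    ],
]

corpus_lines = list(dialogs_to_corpus_lines(raw_dialogs))
print(corpus_lines[0])
```

Writing "\n".join(corpus_lines) to a text file then gives one dialog per line.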

TypeError: unsupported format string passed to tuple.format

I'm trying to train the model using my own data. I used Dockerfile3.cpu as a step-by-step installation guide, so I'm running the latest master version with Python 3. Then I replaced

data/corpora_processed/train_processed_dialogs.txt
data/corpora_processed/val_processed_dialogs.txt
data/quality/context_free_validation_set.txt
data/quality/context_free_questions.txt
data/quality/context_free_test_set.txt
with my own data.
All my messages in all chats have the neutral condition.
After running python tools/prepare_index_files.py I got a c_idx_processed_dialogs.json file with the following content:

{"0": "neutral"}

Then I ran python tools/train.py and it failed with the following error:

Log output:

[23.01.2019 17:40:37.133][INFO][11346][cakechat.utils.files_utils][91] Creating /home/unnamed/Projects/ml_chatbot/cakechat/data/tensorboard/steps
[23.01.2019 17:40:37.134][INFO][11346][cakechat.tools/train.py][102] THEANO_FLAGS: floatX=float32,device=cpu
[23.01.2019 17:40:37.141][INFO][11346][cakechat.tools/train.py][42] Getting train iterator for w2v...
[23.01.2019 17:40:37.142][INFO][11346][cakechat.tools/train.py][48] Getting text-filtered train iterator...
[23.01.2019 17:40:37.142][INFO][11346][cakechat.tools/train.py][51] Getting tokenized train iterator...
[23.01.2019 17:40:37.142][INFO][11346][cakechat.utils.w2v.model][64] Getting w2v model
[23.01.2019 17:40:37.182][INFO][11346][cakechat.utils.s3.bucket][19] Getting file w2v_models/train_processed_dialogs_window10_voc12477_vec128_sgTrue.bin from AWS S3 and saving it as /home/unnamed/Projects/ml_chatbot/cakechat/data/w2v_models/train_processed_dialogs_window10_voc12477_vec128_sgTrue.bin
[23.01.2019 17:40:38.899][WARNING][11346][cakechat.utils.s3.resolver.S3FileResolver][42] File can not be downloaded from AWS S3 because: An error occurred (404) when calling the HeadObject operation: Not Found
[23.01.2019 17:40:38.899][INFO][11346][cakechat.utils.w2v.model][18] Word2Vec model will be trained now. It can take long, so relax and have fun.
[23.01.2019 17:40:38.899][INFO][11346][cakechat.utils.w2v.model][21] Parameters for training: window10_voc12477_vec128_sgTrue
[23.01.2019 17:40:39.382][INFO][11346][cakechat.utils.w2v.model][44] Saving model to /home/unnamed/Projects/ml_chatbot/cakechat/data/w2v_models/train_processed_dialogs_window10_voc12477_vec128_sgTrue.bin
[23.01.2019 17:40:39.495][INFO][11346][cakechat.utils.w2v.model][47] Model has been saved
[23.01.2019 17:40:39.495][INFO][11346][cakechat.utils.w2v.model][80] Successfully got w2v model

[23.01.2019 17:40:39.495][INFO][11346][cakechat.dialog_model.model_utils][202] Preparing embedding matrix based on w2v_model and index_to_token dict
[23.01.2019 17:40:39.497][WARNING][11346][cakechat.dialog_model.model_utils][192] Can't find token [_unk_] in w2v dict
[23.01.2019 17:40:40.214][INFO][11346][cakechat.dialog_model.model][466] Compiling predict function (log_prob=False)...
[23.01.2019 17:40:44.393][INFO][11346][cakechat.dialog_model.model][493] Compiling one-step predict function (log_prob=False)...
[23.01.2019 17:40:47.393][INFO][11346][cakechat.dialog_model.model][466] Compiling predict function (log_prob=True)...
[23.01.2019 17:40:51.105][INFO][11346][cakechat.dialog_model.model][493] Compiling one-step predict function (log_prob=True)...
[23.01.2019 17:40:54.295][INFO][11346][cakechat.dialog_model.model][542] Compiling sequence scoring function...
[23.01.2019 17:40:57.781][INFO][11346][cakechat.dialog_model.model][565] Compiling sequence scoring function (with thought vectors as arguments)...
Net shapes:
	input_y              	(None, None)
	emb_y                	(None, None, 128)
	thought_vector       	(None, 512)
	input_x              	(None, None, None)
	None                 	(None, None)
	emb_x                	(None, None, 128)
	mask_x               	(None, None)
	encoder_forward      	(None, None, 512)
	encoder_backward     	(None, None, 512)
	encoder_bidirectional_concat 	(None, None, 1024)
	encoder_1            	(None, 512)
	None                 	(None, None, 512)
	context_encoder      	(None, 512)
	None                 	(None, 512)
	repeat_layer         	(None, None, 512)
	input_condition_id   	(None,)
	embedding_condition_id 	(None, 128)
	embedding_condition_id_repeated 	(None, None, 128)
	decoder_concated_input 	(None, None, 768)
	mask_y               	(None, None)
	hid_states_decoder   	(None, 2, None)
	None                 	(None, None)
	decoder_1            	(None, None, 512)
	None                 	(None, None)
	decoder_2            	(None, None, 512)
	None                 	(None, 512)
	decoder_dropout_layer 	(None, 512)
	dense_output_probs   	(None, 12477)
[23.01.2019 17:41:03.248][INFO][11346][cakechat.utils.s3.bucket][19] Getting file nn_models/cakechat_v1.3_processed_dialogs_gru_hd512_cdim128_drop0.2_encd2_decd2_il30_cs3_ansl32_lr1.0_gc5.0_learnemb from AWS S3 and saving it as /home/unnamed/Projects/ml_chatbot/cakechat/data/nn_models/cakechat_v1.3_processed_dialogs_gru_hd512_cdim128_drop0.2_encd2_decd2_il30_cs3_ansl32_lr1.0_gc5.0_learnemb
[23.01.2019 17:41:32.900][INFO][11346][cakechat.utils.s3.bucket][21] Got file nn_models/cakechat_v1.3_processed_dialogs_gru_hd512_cdim128_drop0.2_encd2_decd2_il30_cs3_ansl32_lr1.0_gc5.0_learnemb from S3
[23.01.2019 17:41:32.904][INFO][11346][cakechat.dialog_model.model][626] 
Loading saved weights from file:
/home/unnamed/Projects/ml_chatbot/cakechat/data/nn_models/cakechat_v1.3_processed_dialogs_gru_hd512_cdim128_drop0.2_encd2_decd2_il30_cs3_ansl32_lr1.0_gc5.0_learnemb


Restored saved params:
	encoder_forward.W_in_to_updategate
	encoder_forward.W_hid_to_updategate
	encoder_forward.b_updategate
	encoder_forward.W_in_to_resetgate
	encoder_forward.W_hid_to_resetgate
	encoder_forward.b_resetgate
	encoder_forward.W_in_to_hidden_update
	encoder_forward.W_hid_to_hidden_update
	encoder_forward.b_hidden_update
	encoder_forward.hid_init
	encoder_backward.W_in_to_updategate
	encoder_backward.W_hid_to_updategate
	encoder_backward.b_updategate
	encoder_backward.W_in_to_resetgate
	encoder_backward.W_hid_to_resetgate
	encoder_backward.b_resetgate
	encoder_backward.W_in_to_hidden_update
	encoder_backward.W_hid_to_hidden_update
	encoder_backward.b_hidden_update
	encoder_backward.hid_init
	encoder_1.W_in_to_updategate
	encoder_1.W_hid_to_updategate
	encoder_1.b_updategate
	encoder_1.W_in_to_resetgate
	encoder_1.W_hid_to_resetgate
	encoder_1.b_resetgate
	encoder_1.W_in_to_hidden_update
	encoder_1.W_hid_to_hidden_update
	encoder_1.b_hidden_update
	encoder_1.hid_init
	context_encoder.W_in_to_updategate
	context_encoder.W_hid_to_updategate
	context_encoder.b_updategate
	context_encoder.W_in_to_resetgate
	context_encoder.W_hid_to_resetgate
	context_encoder.b_resetgate
	context_encoder.W_in_to_hidden_update
	context_encoder.W_hid_to_hidden_update
	context_encoder.b_hidden_update
	context_encoder.hid_init
	decoder_1.W_in_to_updategate
	decoder_1.W_hid_to_updategate
	decoder_1.b_updategate
	decoder_1.W_in_to_resetgate
	decoder_1.W_hid_to_resetgate
	decoder_1.b_resetgate
	decoder_1.W_in_to_hidden_update
	decoder_1.W_hid_to_hidden_update
	decoder_1.b_hidden_update
	decoder_2.W_in_to_updategate
	decoder_2.W_hid_to_updategate
	decoder_2.b_updategate
	decoder_2.W_in_to_resetgate
	decoder_2.W_hid_to_resetgate
	decoder_2.b_resetgate
	decoder_2.W_in_to_hidden_update
	decoder_2.W_hid_to_hidden_update
	decoder_2.b_hidden_update

Missing saved params:

Shapes-mismatched params (saved -> current):
Traceback (most recent call last):
  File "tools/train.py", line 107, in <module>
    train(init_path=args.init_weights, is_reverse_model=args.reverse)
  File "tools/train.py", line 79, in train
    resolver_factory=nn_model_resolver_factory, is_reverse_model=is_reverse_model)
  File "/home/unnamed/Projects/ml_chatbot/cakechat/cakechat/dialog_model/model.py", line 723, in get_nn_model
    model.load_weights()
  File "/home/unnamed/Projects/ml_chatbot/cakechat/cakechat/dialog_model/model.py", line 659, in load_weights
    laconic_logger.warning('\t{0:<40} {1:<12} -> {2:<12}'.format(var_name, saved_shape, default_shape))
TypeError: unsupported format string passed to tuple.__format__

What am I doing wrong?
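The last frame of the traceback can be reproduced in isolation: in Python 3, a tuple does not accept a format spec such as <12, so passing a shape tuple to that template raises exactly this TypeError. The sketch below shows the failure and one possible workaround, converting the shapes to strings first. The variable name and shape values are made up for illustration. Only the template string is copied from the traceback.

```python
# Template copied from the failing line in model.py shown in the traceback.
template = "\t{0:<40} {1:<12} -> {2:<12}"

# Hypothetical values, for illustration only.
var_name = "dense_output_probs.W"
saved_shape = (512, 50000)
default_shape = (512, 12477)

try:
    template.format(var_name, saved_shape, default_shape)
    error_message = None
except TypeError as error:
    error_message = str(error)

print(error_message)  # unsupported format string passed to tuple.__format__

# Workaround: format the string form of each shape instead of the tuple itself.
fixed_line = template.format(var_name, str(saved_shape), str(default_shape))
print(fixed_line)
```

Note that the crash happens while the "Shapes-mismatched params" section is being printed, so with this change the log should go on to list which saved shapes differ from the current model.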

Can't adjust input size

Trying to increase the input size to anything more than 3 results in an error saying I need to download a new model.

Problem running the chat using Docker on Windows 10

Hi there,

I have attempted to run CakeChat on my Windows 10 machine.

I've just followed the Quick Start guide and executed the following command:
docker run --name cakechat-dev -p 127.0.0.1:8080:8080 -it lukalabs/cakechat:latest
bash -c "python bin/cakechat_server.py"

The image is downloaded just fine, but at the end Docker throws this error:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused "exec: "\\": executable file not found in $PATH": unknown.

What could be the issue?

Thank you in advance.

Newbie Training our responses

Hello, I would like to ask how we should train the cakechat model. We are using it for a group project at school and have very limited programming knowledge. We asked one of our professors for help and were able to run it, but we have a big problem training it, since he won't help us anymore. We have already read your steps for training our own model and some of the issues, but we couldn't understand them, or maybe we are just overthinking the process.
Our questions are:

Where do we edit our responses?
How do we make cakechat respond the way we want it to?
We've read that the corpus file is a JSON file. Can we edit it in Notepad? And does this file contain all our response data, or only one response in 5 different emotions?

This is embarrassing: we know you have posted the answers, but we can't understand them. We are asking specific questions because this is so new to us. We hope you can help us :(
