ardm's Issues

What's necessary in order to train on dialogues longer than 1023 tokens?

I know the context supported by GPT-2 is 1024 tokens, but I assume there's some technique they used to train on and generate dialogues longer than that in their results. I've also seen many GPT-2-based repos training on text longer than 1024 tokens. Can you please explain what's necessary to train on longer dialogues? And would you consider implementing that?
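I don't know what was done here specifically, but a common workaround in GPT-2-based dialogue code is a sliding window over the history: keep only the most recent turns that fit in the 1024-token context and drop the oldest ones. A minimal sketch (the function name and data shapes are my own assumptions, not from this repo):

```python
def truncate_history(turns, max_tokens=1024):
    """Keep the most recent turns whose combined token count fits the
    context window. `turns` is a list of token-id lists, oldest first."""
    kept, total = [], 0
    for turn in reversed(turns):
        if total + len(turn) > max_tokens:
            break
        kept.append(turn)
        total += len(turn)
    return list(reversed(kept))
```

Other repos instead split long sequences into overlapping chunks during training; either way, no single forward pass sees more than 1024 positions.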

Inference Colab file of ARDM

I am facing a few issues in the ARDM inference Colab notebook. I get an error when calling:

model_A.load_state_dict(model_A_states)
model_B.load_state_dict(model_B_states)

It is unable to match all the keys. I tried adding strict=False:

model_A.load_state_dict(model_A_states, strict=False)
model_B.load_state_dict(model_B_states, strict=False)

The code runs, but a few keys still don't match. Proceeding this way, I then hit an error at:

logits, past = model_A(prev_input, past=past)

The error says the past argument is not supported. I tried removing the past argument, but then the next line fails:

logits = logits[:, -1, :] / temperature

Can you please help me fix this code?
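For what it's worth, both symptoms look like version skew with Hugging Face transformers: in 4.x releases GPT-2's `past` keyword was renamed to `past_key_values`, the forward pass returns an output object instead of a `(logits, past)` tuple, and the set of registered buffers changed across versions (which can explain the `load_state_dict` key mismatches). Assuming that is the cause, the generation step would become:

```python
# Assumption: transformers >= 4.x, where GPT-2's `past` keyword is now
# `past_key_values` and the model returns an output object by default.
outputs = model_A(prev_input, past_key_values=past)
logits = outputs.logits          # replaces the old tuple unpacking
past = outputs.past_key_values   # cache to feed back on the next step

logits = logits[:, -1, :] / temperature
```

Alternatively, installing the transformers release the notebook was written against should restore the original tuple-returning API without code changes.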

How to get the processed MultiWOZ dataset

in your "multiwoz/MultiWOZ Multi-Turn Train.ipynb" have read some processed multiwoz data, as
`with open("../yichi_data/clean_train_data.json") as f:
train_data = json.load(f)

with open("../yichi_data/val_data.json") as f:
val_data = json.load(f)

with open("../yichi_data/test_data.json") as f:
test_data = json.load(f)`
what do you do to get this data and you used multiwoz2.1 or multiwoz2.0 ? thanks

When will it be ready for testing?

Hello! Just read your paper. Very interesting. When will it be ready for testing? I would like to try your model on open-domain chit-chat data. Do you think it would work well on it?

Keep up the good work.

Questions regarding the code

  1. What does this line check? (My guess is sketched after this list.)

if sum([len(item) for item in batch[0][1]]) > 1024:

  2. What is the maximum number of turns a dialogue can have? Or is it set by the maximum length a dialogue can have? If so, where is that specified? I saw a number of constants that could be contenders:

train_data = [data[idx] for idx in indices[100:]]
val_data = [data[idx] for idx in indices[:100]]
self.tokenizer.max_len = 1500
        # tokenizer weird behavior
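On the first question, my reading (an assumption based on the snippet alone; only `batch` comes from the notebook) is that it computes the total token count over all turns of a dialogue and skips dialogues that would overflow GPT-2's 1024-token context, i.e. the cap is on total token length, not on the number of turns:

```python
# Hypothetical expansion of the check; assume batch[0][1] holds one
# dialogue as a list of per-turn token-id lists, e.g.:
batch = [("dialogue-id", [[50256, 318, 257], [1212, 318, 1194, 1210]])]

turn_token_counts = [len(turn) for turn in batch[0][1]]
total_tokens = sum(turn_token_counts)  # 3 + 4 = 7 here

if total_tokens > 1024:
    # Dialogues longer than GPT-2's context appear to be skipped outright;
    # there is no separate cap on the number of turns.
    pass
```

Under that reading, `indices[:100]` / `indices[100:]` is just a 100-dialogue validation split, and `self.tokenizer.max_len = 1500` likely only raises the tokenizer's warning threshold rather than capping dialogue length.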
