lancopku / dpgan

Diversity-Promoting Generative Adversarial Network for Generating Informative and Diversified Text (EMNLP2018)

Python 99.90% Shell 0.10%

dpgan's People

Contributors

jingjingxupku

dpgan's Issues

Q-value computing problem

Hi, I am wondering whether computing the Q-value involves a simulation process such as MC search.
I don't understand the code very clearly.
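
For readers unfamiliar with the term in the question: Monte Carlo (MC) search estimates the Q-value of a partial sequence by completing it several times with a rollout policy and averaging the reward of the completed sequences. The sketch below is a generic illustration only. It is not taken from this repository and says nothing about whether DPGAN does this; `mc_q_value`, `rollout_policy` and `reward_fn` are hypothetical names.

```python
import random


def mc_q_value(prefix, rollout_policy, reward_fn, max_len, n_rollouts=16, seed=0):
    """Estimate Q(prefix) by averaging rewards over sampled completions.

    Generic illustration, not code from the DPGAN repository.
    rollout_policy(seq, rng) returns the next token for a partial sequence.
    reward_fn(seq) returns a float reward for a completed sequence.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rollouts):
        seq = list(prefix)
        # Complete the partial sequence up to max_len tokens.
        while len(seq) < max_len:
            seq.append(rollout_policy(seq, rng))
        total += reward_fn(seq)
    return total / n_rollouts
```

A model that scores every token directly with its discriminator would not need the inner `while` loop at all, which is the distinction the question is asking about.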

How to use other data? How to create vocab.txt file?

How to use other data? How to create vocab.txt file?

The program crashed / stalled my PC after about 8 hours of building the training data. However, it was running on the CPU, so I tried to create a smaller data set.

I assumed that https://github.com/lancopku/DPGAN/blob/master/review_generation_dataset/generate_review.py is what formats the data.

I've been trying to read this program, hoping it formats the data in some way, but there aren't any comments for a "non coder" to follow. I assume I have to change the path? I'm on Linux.

generate_review.py
L52 : file_path = "F:\dataset\yelp_dataset\sorted_data"
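
The path quoted above is a Windows drive path, so it will not resolve on Linux. A minimal sketch of one way to make it configurable, assuming the only change needed is where `file_path` points (the thread does not confirm this). The `DPGAN_DATA_DIR` environment variable, the `resolve_data_dir` helper and the default location are all hypothetical.

```python
import os
from pathlib import Path


def resolve_data_dir(default="~/dataset/yelp_dataset/sorted_data"):
    """Return the dataset directory as a Path.

    DPGAN_DATA_DIR is a hypothetical environment variable, and the default
    is only an example location; point it at wherever the data was unpacked.
    """
    raw = os.environ.get("DPGAN_DATA_DIR", default)
    return Path(raw).expanduser()


# Drop-in value for the hard-coded assignment quoted above.
file_path = str(resolve_data_dir())
```

Running `DPGAN_DATA_DIR=/path/to/sorted_data python generate_review.py` would then select the directory without editing the script again.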

Vocab.txt file issue

I am unable to understand how vocab.txt is generated, and why many words are assigned the same integer value. Why not a real value?
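
If vocab.txt holds one `word count` pair per line (an assumption; nothing in this thread confirms the format), then the integer would be a word frequency rather than an ID, which would explain why many words share the same value. A minimal sketch of building such a file from whitespace-tokenized text follows. `build_vocab_lines` and `write_vocab` are hypothetical helpers, and the 50000 cap mirrors the `max_size` reported in the training log further down this page.

```python
from collections import Counter


def build_vocab_lines(sentences, max_size=50000):
    """Return 'word count' lines, most frequent first.

    Assumes whitespace tokenization and a 'word count' file layout;
    both are assumptions, not confirmed by the repository.
    """
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.lower().split())
    return [f"{word} {count}" for word, count in counts.most_common(max_size)]


def write_vocab(sentences, path="vocab.txt", max_size=50000):
    """Write the vocabulary lines to a UTF-8 text file."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(build_vocab_lines(sentences, max_size)) + "\n")
```

With the inputs `"the food is good"` and `"the food"`, both `is` and `good` get the count 1, which is the kind of repeated integer described in the question.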

OpenSubtitle dataset

Could you please provide the "OpenSubtitle dataset" or the code for pre-processing the dataset? Thanks a lot.

using for Japanese

Hi,
Thank you for your code. However, when I used it with a Japanese dataset, I could not get any generated text. The generation file only showed "".
Could you please tell me how to fix it?
Thank you so much.

Problem of the scale of the reward

Many thanks for sharing your code!
It is not clear to me why you scale the reward in the following way:
if reward['y_pred_auc'][i][j][k] > 12: reward['y_pred_auc'][i][j][k] = 12/ 10000.0
else: reward['y_pred_auc'][i][j][k] = reward['y_pred_auc'][i][j][k] / 10000.0

Could you please help?
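
To make the quoted snippet easier to read: it caps each value at 12 and then divides by 10000. The sketch below restates that arithmetic only; it does not explain why those constants were chosen. `scale_reward` and `scale_rewards` are hypothetical names, and the three-level nesting simply mirrors the `[i][j][k]` indexing in the quote.

```python
def scale_reward(value, cap=12.0, divisor=10000.0):
    """Cap a single reward at `cap`, then divide by `divisor`."""
    return min(value, cap) / divisor


def scale_rewards(rewards, cap=12.0, divisor=10000.0):
    """Apply scale_reward to every element of a three-level nested list."""
    return [
        [[scale_reward(v, cap, divisor) for v in row] for row in block]
        for block in rewards
    ]
```

For example, a raw value of 3.0 becomes 0.0003, while any value above 12 becomes 0.0012.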

Failed to load checkpoint

Hi!
When I run python main.py with default settings, the console prints:

(/home/ymzhu/anaconda2/envs/tf3) ymzhu@yuncao-All-Series:~/Desktop/code/DPGAN-master$ python main.py 
INFO:tensorflow:Starting running in train mode...
max_size of vocab was specified as 50000; we now have 50000 words. Stopping reading.
Finished constructing vocabulary of 50000 total words. Last word added: westbrook
Start pre-training......
INFO:tensorflow:Building generator graph...
INFO:tensorflow:Tensor("seq2seq/embedding/concat:0", shape=(64, ?, 512), dtype=float32, device=/device:GPU:0)
INFO:tensorflow:Time to build graph: 22 seconds
2018-02-26 10:48:03.203934: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-02-26 10:48:03.203959: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-02-26 10:48:03.203965: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-02-26 10:48:03.203970: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-02-26 10:48:03.203976: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
INFO:tensorflow:Failed to load checkpoint from myexperiment/train-generator. Sleeping for 10 secs...
INFO:tensorflow:Failed to load checkpoint from myexperiment/myexperiment/train-generator. Sleeping for 10 secs...
INFO:tensorflow:Failed to load checkpoint from myexperiment/myexperiment/myexperiment/train-generator. Sleeping for 10 secs...
INFO:tensorflow:Failed to load checkpoint from myexperiment/myexperiment/myexperiment/myexperiment/train-generator. Sleeping for 10 secs...
INFO:tensorflow:Failed to load checkpoint from myexperiment/myexperiment/myexperiment/myexperiment/myexperiment/train-generator. Sleeping for 10 secs...
INFO:tensorflow:Failed to load checkpoint from myexperiment/myexperiment/myexperiment/myexperiment/myexperiment/myexperiment/train-generator. Sleeping for 10 secs...
INFO:tensorflow:Failed to load checkpoint from myexperiment/myexperiment/myexperiment/myexperiment/myexperiment/myexperiment/myexperiment/train-generator. Sleeping for 10 secs...
INFO:tensorflow:Failed to load checkpoint from myexperiment/myexperiment/myexperiment/myexperiment/myexperiment/myexperiment/myexperiment/myexperiment/train-generator. Sleeping for 10 secs...
INFO:tensorflow:Failed to load checkpoint from myexperiment/myexperiment/myexperiment/myexperiment/myexperiment/myexperiment/myexperiment/myexperiment/myexperiment/train-generator. Sleeping for 10 secs...

Seems it won't stop. How could I solve it? Thank you!
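
The log shows one more `myexperiment/` prefix on every retry. One pattern that would produce this symptom is joining the root directory onto the path again inside the retry loop. This is a guess about the cause, not a reading of the repository's code, and it does not explain why the first load fails. `buggy_retry_paths` and `fixed_retry_paths` are hypothetical functions written only to reproduce and avoid the pattern.

```python
import posixpath


def buggy_retry_paths(log_root, attempts):
    """Reproduce the growing path: the root is re-joined on every retry."""
    train_dir = "train-generator"
    paths = []
    for _ in range(attempts):
        train_dir = posixpath.join(log_root, train_dir)
        paths.append(train_dir)
    return paths


def fixed_retry_paths(log_root, attempts):
    """Join the root once, outside the loop, so every retry uses one path."""
    train_dir = posixpath.join(log_root, "train-generator")
    return [train_dir for _ in range(attempts)]
```

If the real code has this shape, computing the directory once before the loop keeps every retry pointed at `myexperiment/train-generator`.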

Generated Review problem

I want to ask how the data is split into positive and negative reviews. When I checked manually, many reviews whose score is 1 or 2 had been assigned to the positive folder (discriminator_train/positive). Logically, these are negative reviews.
After adversarial training, the generated review in this file (train_sample_generated/7epoch_step2_temp_positive/000012.txt) is just a copy of the original input review.
Original input review: {"review": "i wasnt thrilled with the taste of the food compared to how pricy it is . i would have enjoyed a juicy steak somewhere else . however i do like the romance of fondue", "score": "3"}
Generated review in file 000012.txt: {"label": "1", "example": "i would have enjoyed a juicy steak somewhere else . however i do like the romance of fondue"}

It has just dropped one sentence. Is that expected?
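
To check how often this happens across a whole output folder, a small diagnostic can flag generated reviews that are a contiguous token span of their input. This is only a sketch for inspecting outputs, assuming whitespace-tokenized text as in the examples above; `is_copied_span` is a hypothetical helper, not part of the repository.

```python
def is_copied_span(original, generated):
    """Return True if `generated` is a contiguous token span of `original`."""
    src = original.split()
    gen = generated.split()
    if not gen or len(gen) > len(src):
        return False
    # Slide a window of len(gen) tokens over the source review.
    return any(
        src[i:i + len(gen)] == gen
        for i in range(len(src) - len(gen) + 1)
    )
```

Applied to the pair quoted above, the check returns True, because the generated text is the last two sentences of the input.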

Question for label processing

Question 1: Why is the label for positive samples -0.0001 while the label for negative samples is 1?
Code in batcher_discriminator.py:267
if int(label) == 1: label = -0.0001
elif int(label) == 0: label = 1

Question 2: Why is the reward for the generator processed as follows?
Code in main.py:277
if reward['y_pred_auc'][i][j][k] > 12: reward['y_pred_auc'][i][j][k] = 12/ 10000.0
else: reward['y_pred_auc'][i][j][k] = reward['y_pred_auc'][i][j][k] / 10000.0
This was also asked in #4.

@jingjingxupku
