openai / generating-reviews-discovering-sentiment

Code for "Learning to Generate Reviews and Discovering Sentiment"

Home Page: https://arxiv.org/abs/1704.01444

License: MIT License

Python 100.00%

generating-reviews-discovering-sentiment's Introduction

Status: Archive (code is provided as-is, no updates expected)

Generating Reviews and Discovering Sentiment

Code for Learning to Generate Reviews and Discovering Sentiment (Alec Radford, Rafal Jozefowicz, Ilya Sutskever).

Right now the code supports using the language model as a feature extractor.

from encoder import Model

model = Model()
text = ['demo!']
text_features = model.transform(text)

A demo of using the features for sentiment classification as reported in the paper for the binary version of the Stanford Sentiment Treebank (SST) is included as sst_binary_demo.py. Additionally this demo visualizes the distribution of the sentiment unit like Figure 3 in the paper.
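
As a rough sketch of how the extracted features feed a linear classifier (a toy illustration with made-up strings, not the demo's SST data loading):

from sklearn.linear_model import LogisticRegression
from encoder import Model

model = Model()
train_texts = ['a moving, beautiful film', 'a dull, lifeless mess']  # toy data
train_labels = [1, 0]
train_features = model.transform(train_texts)  # shape (n_texts, 4096)

clf = LogisticRegression(C=0.25)
clf.fit(train_features, train_labels)
print(clf.predict(model.transform(['surprisingly good'])))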

Sentiment Unit Visualization

Additionally there is a PyTorch port made by @guillitte which demonstrates how to train a model from scratch.

This repo also contains the parameters of the multiplicative LSTM model with 4,096 units we trained on the Amazon product review dataset introduced in McAuley et al. (2015) [1]. The dataset in de-duplicated form contains over 82 million product reviews from May 1996 to July 2014 amounting to over 38 billion training bytes. Training took one month across four NVIDIA Pascal GPUs, with our model processing 12,500 characters per second.

[1] McAuley, Julian, Pandey, Rahul, and Leskovec, Jure. Inferring networks of substitutable and complementary products. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794. ACM, 2015.

generating-reviews-discovering-sentiment's People

Contributors

christopherhesse, embreinhardt, newmu, openai-sys-okta-integration


generating-reviews-discovering-sentiment's Issues

Docker build for CPU and GPU

I have added a Dockerfile here for both CPU and GPU builds.

How To Build the Docker image

To build the Docker image for CPU only

git clone https://github.com/loretoparisi/generating-reviews-discovering-sentiment.git
cd generating-reviews-discovering-sentiment
docker build -t sentiment-neuron -f Dockerfile .

or execute ./build.sh

while to build the Docker image for GPU

cd generating-reviews-discovering-sentiment
docker build -t sentiment-neuron -f Dockerfile.gpu .

or execute ./build.sh GPU

How To Run the Docker image

To run for CPU

cd generating-reviews-discovering-sentiment
docker run --rm -it sentiment-neuron bash

or execute ./run.sh

while to run for GPU you have to attach the nvidia-docker driver and devices (here we attach device 0, i.e. the first GPU, by default):

docker run -it --device=/dev/nvidiactl --device=/dev/nvidia-uvm --device=/dev/nvidia0 --volume-driver nvidia-docker -v nvidia_driver_367.57:/usr/local/nvidia:ro $IMAGE $CMD

or execute ./run.sh GPU

How To Use it

As soon as you run the image you will be in the /sentiment folder.
Then you can run the provided example test_sentiment.py:

root@718644c454d5:/sentiment# python test_sentiment.py 
7.592 seconds to transform 8 examples
it was a nice day 0.012658
it was a great day 0.371533
it was a bad day -0.499269
It was a wonderful day 0.503395
It was an excellent day 0.44557
It was a super excellent day 0.623401
It was such a bad bad day  -0.858701
It was such a bad bad bad day -1.04497

and the test_generative.py example, adapted from this fork.

root@e713b094abb6:/sentiment# python test_generative.py 
'I couldn't figure out'... --> (argmax sampling):
Positive sentiment (1 sentence): 
 I couldn't figure out how to use the stand and the stand but I love it and it is so easy to use.

Negative sentiment (+100 chars):
 I couldn't figure out how to get the product to work and the company would not even try to resolve the problem.  I would ...


'I couldn't figure out'... --> (weighted samples after each word):
Positive sentiment (3 examples, 2 sentences each):
(0) I couldn't figure out what was going on with the characters from page one. I was so engrossed in the story that I read all day.
(1) I couldn't figure out how to install the installation video that came with it but I am so glad I did. My son was so excited to put this together for me.
(2) I couldn't figure out what it was until finding this book by accident.  Every time I encounter a book from this trilogy I enjoy it as much now as I did when I was a child.

Negative sentiment (3 examples, 2 sentences each):
(0) I couldn't figure out how to get the stupid thing to play youtube videos.  I should have never bought this product.
...

Notes

  • I had to merge the PR here to support the generative test that adds the generate_sequence method.
  • To enable the Nvidia GPU on the host machine, you need to have nvidia-docker installed. To check the Nvidia toolkit installation, run the nvidia-smi command to list the available connected GPUs.
  • To address some python language compatibility issues, I'm using the tensorflow latest python3 docker image -
    tensorflow:latest-py3 and tensorflow:latest-gpu-py3 for the gpu.
  • I'm adding the tqdm module via pip in the Dockerfile.

If further information on how to train this model becomes available, I will add it to the Docker image.

Invalid literal

When I try to run the example, I face this issue at this line:

xmb[i, -l:] = list(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for long() with base 10: '\n'
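
A possible workaround sketch, given that the traceback shows a '\n' reaching an integer cast: collapse whitespace (including newlines) before calling transform. This is a guess at a mitigation, not a confirmed fix.

from encoder import Model

model = Model()
raw = ['first line\nsecond line', 'another review\n']
cleaned = [' '.join(s.split()) for s in raw]  # removes '\n', '\t' and repeated spaces
features = model.transform(cleaned)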

Weight Initialization

Hello,

Thank you very much for sharing this code. I am attempting to re-train a model like this from scratch and was wondering which weight initialization method was used for training the model?

Thanks,
Jonny

hyperparameters

Hi,
I'm a little confused with the hyper-parameters of the model.
In the paper, it is stated that:

The model was trained for a single epoch on mini-batches of 128 subsequences of length 256 for a total of 1 million weight updates.

But, if I understand correctly, the hyperparameter 'nsteps' represents the length of each subsequence, and it is set to 64, not 256. Why is that? Am I understanding the meaning of 'nsteps' correctly?
Also, I couldn't figure out what the 'nembd' and 'nstates' hyperparameters stand for. If someone can clear things up for me, it would be great.

Thanks!

Issue about running sst_binary_demo

The model couldn't transform trX.
Error message: ValueError: invalid literal for long() with base 10: '\n'

Is it because the hyperparameter nsteps (256) of Model is too small? Most of the data is longer than that.
Can the model not handle long data?
I don't know how to deal with this. Hoping for help.

What is the process of extracting features from text?

Hey, this is an awesome project. I just went through the paper but I could not understand the process of extracting features from an arbitrary text.

For example, if we have some text with 150 characters, how would we get the final feature vector of 4096 values and how would we process the 150 characters?

Regards
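
Conceptually, the features come from running the character-level mLSTM over the text and keeping its final hidden state; a 150-character input just means 150 recurrent updates. Below is a purely illustrative sketch (mlstm_step is a hypothetical one-character update function, not part of this repo):

import numpy as np

def extract_features(text, mlstm_step, nhidden=4096):
    """Illustrative only: mlstm_step(byte, c, h) -> (c, h) is hypothetical."""
    c = np.zeros(nhidden)  # cell state
    h = np.zeros(nhidden)  # hidden state
    for byte in text.encode('utf-8'):  # the model reads text character by character
        c, h = mlstm_step(byte, c, h)  # one recurrent update per character
    return h  # final hidden state = the 4096-dim feature vector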

Transform using multiple GPUs

I'm looking for a way to parallelize the transform function. I know that TensorFlow supports that, but I couldn't manage to implement it properly in the code.

Help anyone?

Initial weights and learning rate decay

Hi,

I am looking for some help with understanding the model details.

First of all, I couldn't find any mention of the model's initial weights (including the initial embedding).
Also, it is stated that:

an initial 5e-4 learning rate that was decayed linearly to zero over the course of training.

But I couldn't find what was the decay function exactly.

Any help would be highly appreciated.

Thanks!

Only last 64 chars?

Is it true that using the current transform() function, we only get features for the last 64 chars of the review, rather than for all of the review?
smb[:, offset+start:offset+end, :] = batch_smb seems to overwrite previous features.

Is neutral prediction possible?

We are trying to build sentiment analysis for email conversations. The problem we are facing is most of the emails were neutral conversations.

This model predicts neutral sentiment as positive or negative.

My question: is there any specific range for negative, positive, and neutral sentiments?

Is it possible to predict sentiments as neutral?
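
One possible approach, sketched below with arbitrary thresholds (the repo does not define neutral cut-offs, so these values would need tuning on your own labeled emails):

from encoder import Model

model = Model()

def label_sentiment(texts, low=-0.2, high=0.2):
    # low/high are illustrative cut-offs, not values from the paper or repo
    scores = model.transform(texts)[:, 2388]  # sentiment unit activation
    return ['negative' if s < low else 'positive' if s > high else 'neutral'
            for s in scores]

print(label_sentiment(['Please see the attached report.', 'This is terrible.']))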

How to 'use' this model in our own project ?

I can't find a way to incorporate this model into my code. I just need to get sentiment scores on some feedback. As this model is pre-trained, it would be of much help. How do I do this? Thanks!

Does not work with TensorFlow 1.0

I tried to run the example, but it does not work due to changes in TensorFlow 1.0.
Why don't you use the newest version?

https://www.tensorflow.org/install/migration

$ python -c 'import tensorflow as tf; print(tf.__version__)' 
1.0.1
$ python encoder.py 
Traceback (most recent call last):
  File "encoder.py", line 209, in <module>
    mdl = Model()
  File "encoder.py", line 143, in __init__
    cells, states, logits = model(X, S, M, reuse=False)
  File "encoder.py", line 90, in model
    cstart, hstart = tf.unpack(S, num=hps.nstates)
AttributeError: module 'tensorflow' has no attribute 'unpack'
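
For reference, a hedged sketch of the two API changes involved (helper names are illustrative, not from the repo): tf.unpack became tf.unstack, and tf.split swapped its argument order in TF 1.0.

import tensorflow as tf

def unpack_state(S, nstates):
    # TF 0.x: tf.unpack(S, num=nstates); renamed to tf.unstack in TF 1.0
    return tf.unstack(S, num=nstates)

def split_steps(words, nsteps):
    # TF 0.x: tf.split(split_dim, num_split, value)
    # TF 1.x: tf.split(value, num_or_size_splits, axis=...)
    return [tf.squeeze(v, [1]) for v in tf.split(words, nsteps, axis=1)]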

Tips for training on GPU?

Hello,

I am trying to train this model in TensorFlow using the values for batch size and sequence length given in the paper (batches of 128 and sequence length of 256), though I am struggling to implement the model with these hyperparameters. I am able to train the model with the same hidden size as reported in the paper (hidden size of 4096), but only with smaller batch and sequence length settings. As I increase these hyperparameters I encounter OOM errors. Debugging the causes of these errors is tricky. I am currently looking into using tfdbg and also tfprof. My model crashes during the session.run() call to my optimizer op.

Could you share any details of how you implemented this model, or give any recommendations for creating efficient implementations (e.g. efficient input pipelines, device placement in the TF graph, common pitfalls, debugging tips)?

I am using a Google Cloud Platform Compute Instance for my implementation.

Any tips or tricks to help with implementing this would be greatly appreciated!

Thanks,
Jonny

How to train on GPU a new dataset

@embreinhardt I would like to train the sentiment neuron on a brand new dataset.
I have built and posted here the Dockerfile to call the inference and the generation on both CPU and GPU. Is it possible to have info on the training part? I would then add this to the Dockerfile as well.

I would like to test performance on both the Nvidia GRID K520 GPU

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57                 Driver Version: 367.57                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID K520           Off  | 0000:00:03.0     Off |                  N/A |
| N/A   38C    P8    17W / 125W |      0MiB /  4036MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

and the Tesla K80:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66                 Driver Version: 375.66                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000:00:1E.0     Off |                    0 |
| N/A   41C    P8    30W / 149W |      0MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Thank you.

Few questions on training data

  • Is the dataset used for training available in raw format (human-readable text)? Or can we get a sampled version of the dataset? This would help us create our own custom dataset.
  • How should a custom dataset be structured for this Model? I believe many people have asked the same question before. It would be incredibly helpful if someone could offer guidance.
  • This is tied to the previous questions. How can new training data be added on top of a trained model? I want to utilise the existing capabilities as well as add new training data.

There are many cases in which we are getting only 40-50% accuracy for our own testing data. My understanding is that we can improve it beyond 90% if good training data is added.

Please let me know if you need any clarifications.

local sentiment

I want to generate an online estimation of the local sentiments (character-level sentiment) in sentences, something like Figure 4 in the paper "Generating Reviews and Discovering Sentiment".
Any help on how to proceed? Thanks.
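
A naive (and slow) sketch using only the public transform call: feed growing prefixes of the sentence and read the sentiment unit after each character. This is a guess at a workable approach, not code from the repo.

from encoder import Model

model = Model()
sentence = 'The food was great but the service was painfully slow.'
prefixes = [sentence[:i + 1] for i in range(len(sentence))]  # one prefix per character
char_scores = model.transform(prefixes)[:, 2388]  # sentiment unit per position
for ch, score in zip(sentence, char_scores):
    print(repr(ch), round(float(score), 3))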

Is the sentiment neuron bi-modal on corpora other than IMDB?

I've computed the mLSTM features on each of the 6920 sentences of the Stanford Sentiment Treebank with

text_features = model.transform(text)

and extracted the final 4096-dimensional feature vector for each sentence, and plotted the distribution of neuron 2387 (see below) and 2388. But it looks unimodal. I don't have IMDB at hand to try on it, but did I do something wrong, or is the bi-modal distribution plotted in the paper only for IMDB?

Thanks !

[histogram of feature 2387]

(sorry, the previous histogram was the histogram of all features together, I've now uploaded the histogram of feature 2387)
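
For anyone reproducing this, a minimal plotting sketch (the sentences file name here is hypothetical; any list of strings works):

import matplotlib.pyplot as plt
from encoder import Model

model = Model()
sentences = open('sst_sentences.txt').read().splitlines()  # hypothetical input file
activations = model.transform(sentences)[:, 2388]  # sentiment unit per sentence
plt.hist(activations, bins=50)
plt.xlabel('sentiment unit (2388) activation')
plt.ylabel('count')
plt.show()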

Train over new data

I'm sorry if it is just me that couldn't find it, but can you also provide the code to (re)train the model on new data?

How to generate text?

I see that your model is designed for sentiment classification. But in the paper it was mentioned that it was able to generate text (specifically, I quote: "Examples of synthetic text generated by the trained model"). Is there a way to do this? If so, can somebody give a clear explanation?

Weight Regularization/hidden state clipping parameters

Are there any plans to release the hyperparameters that were used for regularizing the training process?

I've been trying to retrain these weights on amazon reviews and a different dataset using guillitte's implementation as suggested on this repo's README; however, because of the multiplicative nature of the mlstm, the weights tend to overfit and have very high norms. The input->hidden weights tend to be fine and have constant values throughout, but the hidden->hidden weights seem to continually grow in norm throughout the training process as it unearths the patterns of the training corpus.

This is problematic for scenarios where I have a rare character or sequence of characters, such as a Finnish name with UTF-8 accents/diaereses (e.g. Väinämö), that comes up frequently in otherwise English text. If several of these names appear in a batch, it causes massive gradient spikes and can lead to gradient explosion in the network, and even if the gradients recover, the net is incapable of getting back to previous performance levels if the gradient spike pushes the weights too far from their local optimum.

Obviously I could make an effort to preprocess this data/drop it and clip activation outputs/their associated gradients (and I have), but it is inconvenient to have to rely on data processing and hope that I thought of all possible data transformations or have to extensively tune clipping hyperparameters.

This explosion doesn't happen with an LSTM model (since it's additive), even after extensive testing, though the LSTM doesn't do as well without preprocessed data.

TL;DR
Please release the hyperparameters; the network is so prone to overfitting and training instability that I can't even guarantee a stable training run on Amazon reviews (even with your saved weights as initialization; it fails about 1 run in 5). An LSTM model does worse, but doesn't have these training instabilities.

UPDATE:
Never mind, I saw that weight normalization was mentioned in the paper, despite not being used in the PyTorch implementation.

Some hints on loading the .npy files and their correlated parameters

From studying OpenAI's model, here is some useful information for developers who wrote their own version of mlstm and want to import OpenAI's model parameters. The mlstm function in encoder.py defines the tensor names; this is the baseline.

  1. Computation graph and tensors
    Under the name scope model, there are three sub name scopes:

    • embedding
      • tensors: w
    • out
      • tensors: w, b
    • rnn
      • tensors: b, gh, gmh, gmx, gx, wh, wmh, wmx, wx

    The full tensor names are:
    1. model/embedding/w
    2. model/out/b
    3. model/out/w
    4. model/rnn/b
    5. model/rnn/gh
    6. model/rnn/gmh
    7. model/rnn/gmx
    8. model/rnn/gx
    9. model/rnn/wh
    10. model/rnn/wmh
    11. model/rnn/wmx
    12. model/rnn/wx
  2. Table of the correlation between tensors and .npy files
    For detailed information about each tensor and which .npy index it corresponds to, please check the table below.

| Name       | Correlated tensor | Array shape   | .npy file index | Line of code   |
|------------|-------------------|---------------|-----------------|----------------|
| params[0]  | embedding/w       | (256, 64)     | 0               | embd, line 23  |
| params[1]  | rnn/wx            | (64, 16384)   | 1               | mlstm, line 47 |
| params[2]  | rnn/wh            | (4096, 16384) | hstack of 2-5   | mlstm, line 48 |
| params[3]  | rnn/wmx           | (64, 4096)    | 6               | mlstm, line 49 |
| params[4]  | rnn/wmh           | (4096, 4096)  | 7               | mlstm, line 50 |
| params[5]  | rnn/b             | (16384,)      | 8               | mlstm, line 51 |
| params[6]  | rnn/gx            | (16384,)      | 9               | mlstm, line 53 |
| params[7]  | rnn/gh            | (16384,)      | 10              | mlstm, line 54 |
| params[8]  | rnn/gmx           | (4096,)       | 11              | mlstm, line 55 |
| params[9]  | rnn/gmh           | (4096,)       | 12              | mlstm, line 56 |
| params[10] | out/w             | (4096, 256)   | 13              | fc, line 31    |
| params[11] | out/b             | (256,)        | 14              | fc, line 38    |

Hopefully this helps.

Add a requirements.txt

Getting a weird module version issue with TensorFlow: the method 'unpack' is not an attribute. It would be helpful if a requirements.txt file were included for use with virtualenv.

training speed confirmation

The README says that the model trained at 12,500 characters per second across 4 Nvidia Pascal GPUs.

Is this 12,500 per GPU, or in total across the 4 GPUs?

I'm trying to benchmark the performance of my setup and estimate how long I need to wait for the equivalent of the 1 month of training time mentioned in the paper.

speed up suggestions?

Hello,
I'm thinking about using the sentiment neuron to create sentiment-colored blocks of text like they did in Figure 4 of the paper. The code I'm using to calculate sentiment is simple, something like:

from encoder import Model

model = Model()

text = ['This is a terrible product. Very bad!']
text_features = model.transform(text)
text_features[:, 2388]

I've got a quad-core MacBook Pro and no GPU. Any suggestions on how I could speed up the calculations? Is there a way to specify that all available cores should be used?
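
A hedged sketch of the relevant TensorFlow knobs: thread-count settings can be passed via tf.ConfigProto, but since encoder.py creates its own tf.Session internally, you would have to edit that call yourself; this is not exposed by the Model class.

import tensorflow as tf

# Session configuration one could pass to tf.Session(...) inside encoder.py
config = tf.ConfigProto(
    intra_op_parallelism_threads=4,  # threads used within a single op
    inter_op_parallelism_threads=2,  # ops that may run concurrently
)
sess = tf.Session(config=config)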

Question: How long does it take you to run sst_binary_demo.py?

I'm having trouble running the sst_binary_demo.py script on my laptop, which has fairly typical specs (i5, 8GB RAM, 500GB SSD, no dedicated GPU).

I tested the code on a small dataset, where each call of the transform function takes approximately 2 seconds.

How long did it take you to run this example?

python2.7 IndexError: list index out of range

Traceback (most recent call last):
  File "demo.py", line 4, in <module>
    model = Model()
  File "/Users/zhaoyingjun/Downloads/generating-reviews-discovering-sentiment-master/encoder.py", line 143, in __init__
    cells, states, logits = model(X, S, M, reuse=False)
  File "/Users/zhaoyingjun/Downloads/generating-reviews-discovering-sentiment-master/encoder.py", line 93, in model
    inputs = [tf.squeeze(v, [1]) for v in tf.split(1, nsteps, words)]
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 1203, in split
    num = size_splits_shape.dims[0]
IndexError: list index out of range

Who can tell me why, please?

python encoder.py causes IndexError: list index out of range

I'm having the following error when running: python encoder.py

Traceback (most recent call last):
  File "encoder.py", line 211, in <module>
    mdl = Model()
  File "encoder.py", line 144, in __init__
    cells, states, logits = model(X, S, M, reuse=False)
  File "encoder.py", line 93, in model
    inputs = [tf.squeeze(v, [1]) for v in tf.split(1, nsteps, words)]
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 1208, in split
    num = size_splits_shape.dims[0]
IndexError: list index out of range

Note that 'size_splits_shape' is empty and 'num_or_size_splits' = 64.
I tested with Python 2 and Python 3 (same error); TensorFlow version = 1.1.0.

Am I using the script incorrectly? Could you please help me resolve this issue?

model.transform(text)

Is text a list of words?
It seems the max length of an element in the text list is 60.

from encoder import Model
model = Model()
text = ['demo!']
text_features = model.transform(text)

Remove dependency on html? (It won't install on python3)

See similar issue here: j-towns/iprofiler#2

Long story short, the version of the html library on pip is from 2011 and no longer installs correctly:

pip3 install html
Collecting html
  Using cached html-1.16.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-szl57hlm/html/setup.py", line 12, in <module>
        long_description = __doc__.decode('utf8'),
    AttributeError: 'str' object has no attribute 'decode'
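
A small version-agnostic sketch that avoids the obsolete PyPI html package entirely (Python 3.4+ has unescape in the standard library; the Python 2 fallback uses HTMLParser):

import sys

if sys.version_info[0] >= 3:
    from html import unescape  # standard library in Python 3.4+
else:
    from HTMLParser import HTMLParser
    unescape = HTMLParser().unescape  # Python 2 fallback

print(unescape('it&#39;s &amp; great'))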

Training Regime and Backprop

Love your project, and I hope very much it's not impolite to ask whether you intend to publish the training regime as well, so that I might try my luck with my own text data (training on the text corpus and later the classifier on top). Even "dirty" code would do, just to get a starting point.

Would be very happy to hear from you

Opinion2vec?

I want to use this model to convert an opinion to a vector somehow (or at least arrive at an approximation of that).

E.g. "I'm an atheist" and "I believe in god" should be polar opposites. "I hate potatoes" and "I love potatoes" should be polar opposites.

Let's take "I hate potatoes" and "I love potatoes" for now since that seems easier.

If I do:

from encoder import Model

model = Model()


text = ['I hate potatoes', 'I love potaotes']
transformed = model.transform(text)

print('Vectors:', transformed)

print('Sentiment:', transformed[:, 2388])

I get:

Vectors: [[-0.21318194  0.2849596   0.00824564 ...  0.6167125   0.06999006
   0.00869564]
 [-0.10017553 -0.01693083 -0.00667866 ...  0.657396   -0.00200943
   0.00810469]]

Sentiment: [-0.16220786  0.305778  ]

I make the assumption that the above is a vector and the next one is the sentiment.

I assume this model somehow encodes some meaning about the text because when I use my numba cosine distance function on the 2 vectors:

from numba import jit
import numpy as np
import csv
import pandas as pd

@jit(nopython=True)
def cosine_similarity_numba(u:np.ndarray, v:np.ndarray):
    assert(u.shape[0] == v.shape[0])
    uv = 0
    uu = 0
    vv = 0
    for i in range(u.shape[0]):
        uv += u[i]*v[i]
        uu += u[i]*u[i]
        vv += v[i]*v[i]
    cos_theta = 1
    if uu!=0 and vv!=0:
        cos_theta = uv/np.sqrt(uu*vv)
    return cos_theta
  
numba_distance = cosine_similarity_numba(transformed[0], transformed[1])

print(numba_distance)

I get:

0.8250403321263183

And the function gives me 1 when I set both sentences to "I love potatoes"

However, is it possible to somehow incorporate the sentiment and the semantic similarity into one?

Meaning given "I love potatoes" and "I hate potatoes" somehow measure the distance of that with its sentiment distance?

So that I end up with -1? or at least something like that?

Perhaps it could be something like "the greater the semantic similarity is (cosine similarity), the greater the effect the sentiment measurement has on the distance between those 2 points?
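
A purely illustrative sketch of that last idea: weight the sentiment agreement by the semantic (cosine) similarity, so two texts about the same topic with opposite sentiment come out negative. The scaling here is arbitrary.

import numpy as np

def opinion_similarity(u, v, s_u, s_v):
    # u, v: 4096-dim feature vectors; s_u, s_v: sentiment unit values
    cos = float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
    sentiment_agreement = np.tanh(s_u) * np.tanh(s_v)  # in [-1, 1], negative if opposed
    return cos * sentiment_agreement  # topic-weighted opinion agreement

With the numbers above this would give roughly 0.83 * (tanh(-0.16) * tanh(0.31)) ≈ -0.04, i.e. small but negative.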

Error with html.unescape()

From what I have read, this is solved for Python 3.5. It would be great if someone could suggest a solution for this error in Python 2.7. Also, please post the requirements in a text file to help understand which versions are needed.

Code compatibility with Tensorflow 1.7

I recently updated my TensorFlow to 1.7; when loading the model, I got a warning:

model = Model()
WARNING:tensorflow: From xxxxxxx\encoder.py: 59: calling l2_normalize (from tensorflow.python.ops.nn_impl) with dim is deprecated and will be removed in a future version.

I changed the code in encoder.py, lines 58-62, turning the dim parameter into axis:

    if wn:
        wx = tf.nn.l2_normalize(wx, axis=0) * gx
        wh = tf.nn.l2_normalize(wh, axis=0) * gh
        wmx = tf.nn.l2_normalize(wmx, axis=0) * gmx
        wmh = tf.nn.l2_normalize(wmh, axis=0) * gmh

The warning is gone.

range of values for the sentiment neuron

Here is my sample code:

from encoder import Model
model=Model()
text = ['it was a nice day','it was a great day','it was a bad day','It was a wonderful day','It was an excellent day','It was a super excellent day','It was such a bad bad day ','It was such a bad bad bad day']
text_features = model.transform(text)
#17.660 seconds to transform 8 examples
for i in range(len(text)):
    sentiment = text_features[i, 2388]
    print(text[i], sentiment)

Here is the result:

it was a nice day 0.012658
it was a great day 0.371533
it was a bad day -0.499269
It was a wonderful day 0.503395
It was an excellent day 0.44557
It was a super excellent day 0.623401
It was such a bad bad day -0.858701
It was such a bad bad bad day -1.04497

My questions are:
1. What is the range of sentiment values?
2. Is there any specific range for negative, positive, and neutral sentiments?

Performance with CPU TF & zero vector output

Hello,

For those of you who had an issue with mdl.transform returning a zero vector, here is how I managed to solve it:
First, it seems that the input text must be at least 64 characters long (see the nsteps variable in encoder.py).
Also, the html.unescape function returns bytearray-like data. If you replace it with HTMLParser().unescape, you will manipulate strings, and this will probably cause an exception in the batch_pad function.

And when I finally managed to figure this out, I realized that the transformation was extremely slow on tiny inputs (5 minutes for 10 IMDB reviews).

So, my question is: is it caused by my fix or by the CPU implementation?

Thank you
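
Based on the 64-character observation above, here is a heavily hedged workaround sketch that pads short inputs before transforming them; whether space-padding is appropriate for your data is a judgment call.

from encoder import Model

model = Model()

def transform_padded(texts, min_len=64):
    # pad very short inputs with leading spaces so they reach min_len characters
    padded = [t if len(t) >= min_len else t.rjust(min_len) for t in texts]
    return model.transform(padded)

print(transform_padded(['great!'])[:, 2388])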

Compatibility with TF >= 1.0.0

tf.unpack (used on line 90 of encoder.py) was deprecated in favor of tf.unstack in release 1.0.0. However, a simple replacement of tf.unpack with tf.unstack does not work because tf.split has changed too.

A version fully compatible with TF 1.0.1 can be found here. Thanks to @blester125.

Visualization ambiguity

How did you map the sentiment neuron activation to colors?

Were the max and min values generated globally? If so, what are they?
Do you just scale by the min and max activation of a single sample?
Do you assume the min is -1 and the max is 1?

I assumed the min and max and generated the following:
[my visualization]
vs, from the blog:
[blog visualization]
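
In case it helps others, a sketch of one plausible mapping, assuming a fixed [-1, 1] range and a diverging colormap (the paper does not state the exact scaling, so this is a guess):

import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

def activation_to_rgba(values, vmin=-1.0, vmax=1.0):
    # map per-character sentiment values onto a red (negative) to green (positive) scale
    norm = Normalize(vmin=vmin, vmax=vmax)
    cmap = plt.get_cmap('RdYlGn')
    return [cmap(norm(v)) for v in values]

print(activation_to_rgba([-0.9, 0.0, 0.8]))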

Transforming on one VM takes 20 sec VS 120 sec on a different VM

I have this really weird issue I just noticed: transforming 5000 sentences takes 20 seconds on my local PC (GPU: GTX 1080) and 120 seconds on a remote machine (GPU: K80).

At first I thought it might be because of outdated packages, so I updated them all, but the issue continues.

Any ideas on what might be the reason? In most cases of training a NN both machines have roughly the same timings, so this is something I've never really seen before.

Which feature out of 4096 is sentiment?

It is not obvious from simple visualizations:

from encoder import Model

model = Model()
text = ['demo!', 'great!', 'fuck shit']
text_features = model.transform(text)

import matplotlib.pyplot as plt

for i in range(len(text)):
    plt.plot(range(text_features[i].size), text_features[i], label=text[i])
plt.legend()
plt.show()

plt.scatter(text_features[1], text_features[2])
plt.xlabel(text[1])
plt.ylabel(text[2])
plt.show()
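
A hedged sketch of one simple heuristic for locating a candidate sentiment unit: compare mean activations on clearly positive vs clearly negative texts and look at the features with the largest gap (a toy sample; the paper identifies the unit via a trained logistic regression instead).

import numpy as np
from encoder import Model

model = Model()
pos = model.transform(['great!', 'I loved it', 'wonderful experience'])
neg = model.transform(['awful.', 'I hated it', 'terrible experience'])
gap = np.abs(pos.mean(axis=0) - neg.mean(axis=0))  # per-feature separation
print(np.argsort(-gap)[:5])  # indices of the most separating features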

Generation

Can we expect generation to be implemented? If not, any pointers to implementing it yourself?

model is runnable but output vector is zeros

I modified some lines of code and can run it. However, the output vector is all zeros.

v1 = model.transform(["hello"])
In [12]: np.allclose(v1[0],np.zeros(4096))
Out[12]: True

It seems that the model cannot load the pretrained weights properly.

AttributeError: module 'tensorflow' has no attribute 'unpack'

Hi,
I just downloaded the code and tried to run the demo code:

model = Model()
text = ['demo!']
text_features = model.transform(text)

But I got an error:

AttributeError                            Traceback (most recent call last)
<ipython-input> in <module>()
----> 1 model = Model()
      2 text = ['demo!']
      3 text_features = model.transform(text)
      4

C:\Users\teng.fu\OneDrive - TeleWare Plc\Desktop\Development\Data\May_2017_Sentiment_anlaysis_pre\sentiment_neuro\generating-reviews-discovering-sentiment-master\encoder.py in __init__(self, nbatch, nsteps)
    141         M = tf.placeholder(tf.float32, [None, hps.nsteps, 1])
    142         S = tf.placeholder(tf.float32, [hps.nstates, None, hps.nhidden])
--> 143         cells, states, logits = model(X, S, M, reuse=False)
    144
    145         sess = tf.Session()

C:\Users\teng.fu\OneDrive - TeleWare Plc\Desktop\Development\Data\May_2017_Sentiment_anlaysis_pre\sentiment_neuro\generating-reviews-discovering-sentiment-master\encoder.py in model(X, S, M, reuse)
     88 def model(X, S, M=None, reuse=False):
     89     nsteps = X.get_shape()[1]
---> 90     cstart, hstart = tf.unpack(S, num=hps.nstates)
     91     with tf.variable_scope('model', reuse=reuse):
     92         words = embd(X, hps.nembd)

AttributeError: module 'tensorflow' has no attribute 'unpack'

Is it because TensorFlow has renamed the unpack function to unstack, or changed other functions? Any update on the code?

Is it right? sum(text_features)

from encoder import Model
import matplotlib.pyplot as plt
import numpy as np
model=Model()
text = ['horrendous','good','fuck','nice','well','bad','face','me']
text_features = model.transform(text)
for i in range(len(text)):
    t = np.sum(text_features[i])
    print(text[i], t)

but the result:
12.180 seconds to transform 8 examples
horrendous -18.1347
good -35.834
fuck -15.2731
nice -26.8903
well -7.1143
bad -17.0147
face -2.13978
me 4.94617
