ikostrikov / tensorflow-pointer-networks

TensorFlow implementation of Pointer Networks

License: MIT License

Python 59.31% Jupyter Notebook 40.69%

tensorflow-pointer-networks's Issues

I want to use this code for Machine Reading, but I don't know how to feed decoder_inputs

Can I feed the shifted decoder outputs, or should I just feed zeros?
Thank you @ikostrikov @j-min @GalDude33 @chenghuige
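
For readers with the same question, here is a minimal sketch of one common teacher-forcing convention for pointer-style decoders: the first decoder input is an all-zero start vector, and each later input is the encoder input that the previous target index points to. Whether this repository expects exactly this layout is an assumption, and `build_decoder_inputs`, `encoder_inputs`, and `target_indices` are hypothetical names used only for illustration. For a two-prediction setup such as an answer span (start, end), the resulting list has two elements.

```python
def build_decoder_inputs(encoder_inputs, target_indices):
    """Shift the pointed-to inputs right by one step (teacher forcing).

    encoder_inputs: list of input vectors for one example.
    target_indices: ground-truth pointer for each decoder step.
    """
    size = len(encoder_inputs[0])
    start = [0.0] * size  # assumed all-zero start symbol
    # Step t receives the input selected by the target of step t - 1.
    previous = [list(encoder_inputs[i]) for i in target_indices[:-1]]
    return [start] + previous


# Hypothetical toy passage of four 1-d token vectors, answer span 1..2.
encoder_inputs = [[0.1], [0.2], [0.3], [0.4]]
target_indices = [1, 2]

decoder_inputs = build_decoder_inputs(encoder_inputs, target_indices)
print(decoder_inputs)  # [[0.0], [0.2]]
```

Feeding only zeros at every step would also run, but the decoder would then receive no information about what it pointed to previously.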

Dataset explanation

Could you please explain the dataset and the DataGenerator function you used?
I couldn't fully understand the problem you want to solve with this code.
Is it one of the three evaluation tasks described in the original paper?

cell(x, states[-1]) shape mismatch

I downloaded the code and ran it with TensorFlow 1.3.0.

I got this error:

ImportError: cannot import name core_rnn_cell_impl
>>> from tensorflow.contrib.rnn.python.ops import rnn_cell_impl

I changed the line

from tensorflow.contrib.rnn.python.ops import rnn_cell_impl

to

from tensorflow.python.ops import rnn_cell_impl

and then hit a different error:

ValueError: Trying to share variable rnn/gru_cell/gates/kernel, but specified shape (64, 64) and found shape (33, 64).

This happens because x = rnn_cell_impl._linear([inp, attns], cell.output_size, True) outputs a tensor of shape (32, 64), which is then combined with m_state of shape (32, 64).

I don't know how to resolve this.
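
One way to read the two shapes in the ValueError: in TF 1.x, a GRUCell's gates kernel has shape (input_depth + num_units, 2 * num_units), so the same variable cannot serve two call sites whose inputs differ in depth. The sketch below only reproduces that arithmetic. `gru_gates_kernel_shape` is a hypothetical helper, `num_units = 32` is inferred from the 64 columns in the error, and the input depths of 1 and 32 are assumptions chosen because they reproduce the reported shapes.

```python
def gru_gates_kernel_shape(input_depth, num_units):
    """Shape of the gates kernel in a TF 1.x GRUCell.

    The cell concatenates its input with the previous state and computes the
    reset and update gates in a single matmul, hence 2 * num_units columns.
    """
    return (input_depth + num_units, 2 * num_units)


num_units = 32  # inferred: the kernel in the error has 64 = 2 * 32 columns

# Assumed first use of the cell: inputs with a single feature per step.
first_use = gru_gates_kernel_shape(input_depth=1, num_units=num_units)

# Assumed second use: x was already projected to cell.output_size by _linear.
second_use = gru_gates_kernel_shape(input_depth=num_units, num_units=num_units)

print(first_use)   # (33, 64), the shape that was "found"
print(second_use)  # (64, 64), the shape that was "specified"
```

If that reading is right, possible ways out are to give the two call sites separate cells or variable scopes, or to make both inputs the same depth. Neither is confirmed as the author's intended fix.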

tf.nn.rnn is missing in the current TensorFlow version. I tried tf.contrib.rnn.static_rnn but get an error in the tf.concat call (line 83 of main.py)

Why compute prediction over decoder_inputs instead of encoder_inputs?

Hi,

Thanks for sharing the code. I have a question about pointer.py. In the code fragment below:

            if feed_prev and i > 0:
                inp = tf.pack(decoder_inputs)
                inp = tf.transpose(inp, perm=[1, 0, 2])
                inp = tf.reshape(inp, [-1, attn_length, input_size])
                inp = tf.reduce_sum(inp * tf.reshape(tf.nn.softmax(output), [-1, attn_length, 1]), 1)
                inp = tf.stop_gradient(inp)
                inps.append(inp)

Here you are computing inp from decoder_inputs, but at test time you wouldn't really have decoder_inputs. Shouldn't it be computed from encoder_inputs instead, by indexing into them?
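
To make the question concrete, the sketch below reproduces the arithmetic of the fragment for a single batch element in plain Python: the next input is a softmax-weighted sum over a stack of candidate vectors. It also shows the "indexing" alternative, which copies the single highest-scoring candidate. All names and values are hypothetical. The blending itself does not care whether the candidates come from decoder_inputs or encoder_inputs, as long as the stack has shape [attn_length, input_size].

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_next_input(candidates, scores):
    """Attention-weighted sum of candidate vectors, as in the fragment."""
    weights = softmax(scores)
    size = len(candidates[0])
    return [sum(w * vec[d] for w, vec in zip(weights, candidates))
            for d in range(size)]


def hard_next_input(candidates, scores):
    """Alternative: copy the single highest-scoring candidate."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return list(candidates[best])


# Hypothetical toy values: three candidate inputs of size 2.
candidates = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
scores = [0.1, 9.0, 0.2]  # pointer logits from the previous step

soft = soft_next_input(candidates, scores)
hard = hard_next_input(candidates, scores)
```

When the pointer distribution is sharply peaked, as here, the two results are nearly identical. They diverge when the distribution is spread over several positions.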

Weight Sharing

Is this model supposed to share weights between encoder and decoder?

When I run the code, it raises a ValueError saying that I attempted to reuse an RNNCell.

I can either reuse it or not, but I am not sure what approach I should take.

Thanks in advance.

Why use zero attns?

        # Merge input and previous attentions into one vector of the right size.
        x = core_rnn_cell_impl._linear([inp, attns], cell.output_size, True)

where attns = array_ops.zeros(batch_attn_size, dtype=dtype)

Why is attns all zeros?

What does attns do?

Hello.

I'm trying to understand the code.

In pointer.py, there is a variable named attns.

It is initialized to zeros and passed as an input to the _linear function.

Why do you need it?

Thanks in advance.
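
As background for this question and the previous one, here is a small sketch of what a linear layer over `[inp, attns]` computes: the two vectors are concatenated, multiplied by a weight matrix, and a bias is added. When attns is all zeros, the weight rows that belong to it contribute nothing, so the result equals a linear layer over inp alone. `linear` is a hypothetical plain-Python stand-in, not TensorFlow's `_linear`, and the numbers are made up. Whether attns is ever updated elsewhere in pointer.py is not shown in the snippets quoted here.

```python
def linear(args, weight_columns, bias):
    """Concatenate the argument vectors, then apply weights and bias.

    weight_columns holds one list of weights per output unit.
    """
    joined = [value for vec in args for value in vec]
    return [sum(x * w for x, w in zip(joined, column)) + b
            for column, b in zip(weight_columns, bias)]


inp = [1.0, 2.0]
attns = [0.0, 0.0, 0.0]  # initialized to zeros, as in the issue

# Two output units; the first 2 weights act on inp, the last 3 on attns.
weight_columns = [
    [0.5, -1.0, 7.0, 8.0, 9.0],
    [2.0, 0.25, -3.0, 4.0, 6.0],
]
bias = [0.1, -0.2]

with_zero_attns = linear([inp, attns], weight_columns, bias)

# Same layer restricted to the weights that act on inp.
inp_only = linear([inp], [column[:2] for column in weight_columns], bias)
```

Both calls give the same output, which suggests that a zero attns acts as a placeholder that keeps the layer's input size fixed rather than carrying information.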

How to use this code to output two predictions?

Should I pass a decoder_inputs list of two elements? Is that right?
Thank you @ikostrikov @j-min @GalDude33 @chenghuige

Which TF version is this project implemented for?

If I run with TF 1.0.0, an exception is raised saying that tf.nn.ops has no attribute rnn.

When I run the notebook with TF 0.12, I get the error below:
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    in
    ----> 1 pointer_network = PointerNetwork(FLAGS.max_steps, 1, FLAGS.rnn_size, 1, 5, FLAGS.batch_size, 1e-2, 0.95)
          2 dataset = DataGenerator()
          3 pointer_network.step()

    in __init__(self, max_len, input_size, size, num_layers, max_gradient_norm, batch_size, learning_rate, learning_rate_decay_factor)
         59 with tf.variable_scope("decoder"):
         60 outputs, states, _ = pointer_decoder(
    ---> 61 self.decoder_inputs, final_state, attention_states, cell)
         62
         63 with tf.variable_scope("decoder", reuse=True):

    ~/tutorial/TensorFlow-Pointer-Networks/pointer.py in pointer_decoder(decoder_inputs, initial_state, attention_states, cell, feed_prev, dtype, scope)
        121
        122 # Merge input and previous attentions into one vector of the right size.
    --> 123 x = core_rnn_cell_impl._linear([inp, attns], cell.output_size, True)
        124 # Run the RNN.
        125 cell_output, new_state = cell(x, states[-1])

    /Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py in _linear(args, output_size, bias, bias_start, scope)
        749 res = math_ops.matmul(args[0], weights)
        750 else:
    --> 751 res = math_ops.matmul(array_ops.concat(args, 1), weights)
        752 if not bias:
        753 return res

    /Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py in concat(concat_dim, values, name)
       1073 ops.convert_to_tensor(concat_dim,
       1074 name="concat_dim",
    -> 1075 dtype=dtypes.int32).get_shape(
       1076 ).assert_is_compatible_with(tensor_shape.scalar())
       1077 return identity(values[0], name=scope)

    /Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/python/framework/ops.py in convert_to_tensor(value, dtype, name, as_ref, preferred_dtype)
        667
        668 if ret is None:
    --> 669 ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
        670
        671 if ret is NotImplemented:

    /Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/python/framework/constant_op.py in _constant_tensor_conversion_function(v, dtype, name, as_ref)
        174 as_ref=False):
        175 _ = as_ref
    --> 176 return constant(v, dtype=dtype, name=name)
        177
        178

    /Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/python/framework/constant_op.py in constant(value, dtype, shape, name, verify_shape)
        163 tensor_value = attr_value_pb2.AttrValue()
        164 tensor_value.tensor.CopyFrom(
    --> 165 tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
        166 dtype_value = attr_value_pb2.AttrValue(type=tensor_value.tensor.dtype)
        167 const_tensor = g.create_op(

    /Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py in make_tensor_proto(values, dtype, shape, verify_shape)
        365 nparray = np.empty(shape, dtype=np_dt)
        366 else:
    --> 367 _AssertCompatible(values, dtype)
        368 nparray = np.array(values, dtype=np_dt)
        369 # check to them.

    /Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py in _AssertCompatible(values, dtype)
        300 else:
        301 raise TypeError("Expected %s, got %s of type '%s' instead." %
    --> 302 (dtype.name, repr(mismatch), type(mismatch).__name__))
        303
        304

    TypeError: Expected int32, got list containing Tensors of type '_Message' instead.
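
Reading the traceback, `_linear` calls `array_ops.concat(args, 1)` with the list first, while the `concat` it reaches is declared as `concat(concat_dim, values, name)` with the dimension first. The list of tensors therefore lands in the `concat_dim` slot, which would explain the "Expected int32, got list" message. The sketch below imitates that mismatch in plain Python. `concat_dim_first` is a hypothetical stand-in, not the TensorFlow function, and it only joins nested lists along axis 1.

```python
def concat_dim_first(concat_dim, values):
    """Stand-in for a concat declared as concat(concat_dim, values)."""
    if not isinstance(concat_dim, int):
        raise TypeError("Expected int, got %s instead."
                        % type(concat_dim).__name__)
    if concat_dim != 1:
        raise NotImplementedError("this sketch only joins along axis 1")
    # Join the matching rows of every matrix in values.
    return [sum(rows, []) for rows in zip(*values)]


args = [[[1, 2], [3, 4]], [[5], [6]]]  # two toy "tensors" with 2 rows each

# Values-first call, as on line 751 of the traceback: the list is taken
# for the dimension argument and the call fails.
try:
    concat_dim_first(args, 1)
    error = None
except TypeError as exc:
    error = str(exc)

# Dimension-first call, matching the declared signature.
joined = concat_dim_first(1, args)
print(error)   # Expected int, got list instead.
print(joined)  # [[1, 2, 5], [3, 4, 6]]
```

The two argument orders correspond to different TensorFlow releases, so running the code against the release it was written for, or adapting the call order, are the usual remedies. Which release the repository targets is not stated in this thread.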
