carpedm20 / ntm-tensorflow

"Neural Turing Machine" in Tensorflow

License: MIT License

ntm-tensorflow's Introduction

Neural Turing Machine in Tensorflow

Tensorflow implementation of Neural Turing Machine. This implementation uses an LSTM controller. NTM models with multiple read/write heads are supported.

The referenced torch code can be found here.

**1. Loss sometimes goes to NaN even with gradient clipping (#2).**

**2. The code is very poorly designed to support NTM inputs with variable lengths. Just use this code as a reference.**

Prerequisites

- Python 2.7 or 3.x
- TensorFlow (1.0 or later; the code imports tensorflow.contrib.legacy_seq2seq — see the issues below)
- NumPy

Usage

To train a copy task:

$ python main.py --task copy --is_train True

To test a quick copy task:

$ python main.py --task copy --test_max_length 10

Results

More detailed results can be found [here](ipynb/NTM\ Test.ipynb).

Copy task:

Recall task:

(in progress)

Author

Taehoon Kim / @carpedm20

ntm-tensorflow's People

Contributors

alexbw, carpedm20, youknowone

ntm-tensorflow's Issues

Python3?

According to the readme, it supports both Python 2 and 3. But running the first example (train) from the readme with Python 3 causes an error about xrange/range. So how does it support Python 3? Has using 2to3 been tested to work? TIA!
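
The xrange/range error is the usual Python 2/3 split: Python 3 removed xrange. Besides running 2to3, a minimal compatibility shim (my sketch, not code from this repo) near the top of the affected modules also works:

    # Python 3 removed xrange; alias it to range so Python 2 idioms run.
    try:
        xrange
    except NameError:
        xrange = range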

TypeError: softmax_loss_function() got an unexpected keyword argument 'logits'

After running 2to3 -w on all the .py files, I got an error using python3.4 running the first example in the readme (train):

[*] Building a NTM model
Percent: [####################] 100.00% Finished.
[*] Build a NTM model finished
[*] Reading checkpoints...
Traceback (most recent call last):
File "main.py", line 72, in
tf.app.run()
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "main.py", line 62, in main
task.run(ntm, int(FLAGS.test_max_length * 1 / 3), sess)
File "/home/can/NTM-tensorflow/tasks/copy.py", line 35, in run
[ntm.get_loss(seq_length)],
File "/home/can/NTM-tensorflow/ntm.py", line 206, in get_loss
softmax_loss_function=softmax_loss_function)
File "/usr/local/lib/python3.4/dist-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", line 1134, in sequence_loss
softmax_loss_function=softmax_loss_function))
File "/usr/local/lib/python3.4/dist-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", line 1089, in sequence_loss_by_example
crossent = softmax_loss_function(labels=target, logits=logit)
TypeError: softmax_loss_function() got an unexpected keyword argument 'logits'
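
This TypeError comes from a TensorFlow API change: newer releases of legacy_seq2seq invoke the custom loss as softmax_loss_function(labels=..., logits=...) with keyword arguments, while the loss the repo passes in takes (logits, targets) positionally. A minimal adapter sketch (the wrapper is mine, not repo code; it assumes ops.py's binary_cross_entropy_with_logits takes logits first):

    # sequence_loss_by_example now calls softmax_loss_function(labels=...,
    # logits=...), so wrap the positional-argument loss to accept keywords.
    def softmax_loss_function(labels, logits):
        return binary_cross_entropy_with_logits(logits, labels)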

Is that ... the expected output?

Thank you! I ran the code.

true output : 
   #  #  #
       #  
    ##    
 predicted output :
   #  #  #
       #  
    ##    
 Loss : -18.150389

 true output : 
      # # 
  ### ##  
  # ## ## 
   ## ### 
     ###  
      #   
 predicted output :
      # # 
  ### ##  
  # ## ## 
   ## ### 
     ###  
      #   
 Loss : -100.484863

 true output : 
  ###  ## 
  # #  #  
       ## 
     #   #
     #### 
  # #    #
  # ###   
   ##  # #
  # ### ##
   ##     
 predicted output :
  ###  ## 
  # #  #  
       ## 
     #   #
     #### 
  # #    #
  # ###   
   ##  # #
  # ### ##
   ##     
 Loss : -171.060638

Do you think I am doing something wrong? Frankly, all I did was run the code as-is, copied from the repository.

torch referenced code

You mention that torch code was used as a reference. But when I run the torch implementation, its ability to generalize on the copy task is quite poor (after 15,000 iterations), while the tensorflow one shows incredible outputs.
What did you change in your implementation to make it work, compared to the torch one?
I tried to compare both codebases and found the basic classes and functions highly similar.

it gets stuck at ... cannot finish

envy@ub1404:~/os_pri/github/NTM-tensorflow$ python main.py --task copy --is_train True
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
{'checkpoint_dir': 'checkpoint',
'controller_layer_size': 1,
'epoch': 100000,
'input_dim': 10,
'is_train': True,
'max_length': 10,
'min_length': 1,
'output_dim': 10,
'read_head_size': 1,
'task': 'copy',
'test_max_length': 120,
'write_head_size': 1}
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:900] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX 950M
major: 5 minor: 0 memoryClockRate (GHz) 1.124
pciBusID 0000:01:00.0
Total memory: 4.00GiB
Free memory: 3.56GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:755] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 950M, pci bus id: 0000:01:00.0)
[*] Building a NTM model
Percent: [####################] 100.00% Finished.
[*] Building a loss model for seq_length 1
[*] Building a loss model for seq_length 2
[*] Building a loss model for seq_length 3
[*] Building a loss model for seq_length 4
[*] Building a loss model for seq_length 5
[*] Building a loss model for seq_length 6
[*] Building a loss model for seq_length 7
[*] Building a loss model for seq_length 8
[*] Building a loss model for seq_length 9
[*] Building a loss model for seq_length 10
[*] Build a NTM model finished
[*] Initialize all variables
[*] Initialization finished
[ 0] 4: 53.60 (618.3s)
[ 5] 1: 13.75 (711.4s)
[ 10] 4: 52.69 (711.8s)
[ 15] 6: 79.43 (742.8s)
[ 20] 1: 14.03 (757.4s)

.....................................................

[ 4775] 8: 0.00 (28215.3s)
[ 4780] 4: 0.00 (28216.3s)
[ 4785] 2: 0.00 (28216.9s)
[ 4790] 9: 0.00 (28218.1s)
[ 4795] 5: 0.00 (28218.9s)

checkpoint_dir not found when test_max_length is greater than max_length

For example, assume max_length is 10 and test_max_length is 20:

Percent: [####################] 100.00% Finished.
 [*] Build a NTM model finished
 [*] Reading checkpoints...
Traceback (most recent call last):
  File "main.py", line 49, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "main.py", line 38, in main
    ntm.load(FLAGS.checkpoint_dir, 'copy')
  File "/root/NTM-tensorflow/ntm.py", line 257, in load
    raise Exception(" [!] Testing, but %s not found" % checkpoint_dir)
Exception:  [!] Testing, but checkpoint/copy_20 not found

Making a symbolic link from checkpoint/copy_10 to checkpoint/copy_20 (e.g. ln -s copy_10 checkpoint/copy_20) works around the problem.

Nice job, thank you!

Runtime Error when it is tested for quick copy task

Hello,

I am having the following weird exception when I run the script for a quick test (python main.py --task copy --test_max_length 10)

Traceback (most recent call last):
  File "/Users/iliTheFallen/Documents/universityOfHouston/thesis/libs/NTM-tensorflow/main.py", line 72, in <module>
    tf.app.run()
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "/Users/iliTheFallen/Documents/universityOfHouston/thesis/libs/NTM-tensorflow/main.py", line 62, in main
    task.run(ntm, FLAGS.test_max_length * 1 / 3, sess)
  File "/Users/iliTheFallen/Documents/universityOfHouston/thesis/libs/NTM-tensorflow/tasks/copy.py", line 18, in run
    seq = generate_copy_sequence(seq_length, ntm.cell.input_dim - 2)
  File "/Users/iliTheFallen/Documents/universityOfHouston/thesis/libs/NTM-tensorflow/tasks/copy.py", line 107, in generate_copy_sequence
    seq = np.zeros([length, bits + 2], dtype=np.float32)
TypeError: 'float' object cannot be interpreted as an integer
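
Under Python 3, / is true division, so FLAGS.test_max_length * 1 / 3 produces a float, which np.zeros rejects as a dimension. A one-line sketch of the fix in main.py, matching the int(...) cast visible in another traceback above:

    # Cast the computed sequence length to int before it reaches np.zeros:
    task.run(ntm, int(FLAGS.test_max_length * 1 / 3), sess)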

Error in tensorflow 0.11.0rc1

$ python main.py --task copy --is_train True
{'checkpoint_dir': 'checkpoint',
 'controller_layer_size': 1,
 'epoch': 100000,
 'input_dim': 10,
 'is_train': True,
 'max_length': 10,
 'min_length': 1,
 'output_dim': 10,
 'read_head_size': 1,
 'task': 'copy',
 'test_max_length': 120,
 'write_head_size': 1}
 [*] Building a NTM model
Traceback (most recent call last):
  File "main.py", line 48, in <module>
    tf.app.run()
  File "/Volumes/UserSpace/Projects/NTM-tensorflow/.env/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "main.py", line 27, in main
    cell, ntm = copy_train(FLAGS, sess)
  File "/Volumes/UserSpace/Projects/NTM-tensorflow/tasks/copy.py", line 72, in copy_train
    ntm = NTM(cell, sess, config.min_length, config.max_length)
  File "/Volumes/UserSpace/Projects/NTM-tensorflow/ntm.py", line 82, in __init__
    self.build_model(forward_only)
  File "/Volumes/UserSpace/Projects/NTM-tensorflow/ntm.py", line 90, in build_model
    _, prev_state = self.cell(self.start_symbol, state=None)
  File "/Volumes/UserSpace/Projects/NTM-tensorflow/ntm_cell.py", line 62, in __call__
    _, state = self.initial_state()
  File "/Volumes/UserSpace/Projects/NTM-tensorflow/ntm_cell.py", line 288, in initial_state
    squeeze=True, name='read_w_%d' % idx)
  File "/Volumes/UserSpace/Projects/NTM-tensorflow/ops.py", line 109, in Linear
    identity_initializer(tf.cast(range_, tf.float32)))
  File "/Volumes/UserSpace/Projects/NTM-tensorflow/.env/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1022, in get_variable
    custom_getter=custom_getter)
  File "/Volumes/UserSpace/Projects/NTM-tensorflow/.env/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 849, in get_variable
    custom_getter=custom_getter)
  File "/Volumes/UserSpace/Projects/NTM-tensorflow/.env/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 345, in get_variable
    validate_shape=validate_shape)
  File "/Volumes/UserSpace/Projects/NTM-tensorflow/.env/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 330, in _true_getter
    caching_device=caching_device, validate_shape=validate_shape)
  File "/Volumes/UserSpace/Projects/NTM-tensorflow/.env/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 676, in _get_single_variable
    validate_shape=validate_shape)
  File "/Volumes/UserSpace/Projects/NTM-tensorflow/.env/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 215, in __init__
    dtype=dtype)
  File "/Volumes/UserSpace/Projects/NTM-tensorflow/.env/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 288, in _init_from_args
    initial_value(), name="initial_value", dtype=dtype)
  File "/Volumes/UserSpace/Projects/NTM-tensorflow/.env/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 666, in <lambda>
    shape.as_list(), dtype=dtype, partition_info=partition_info)
TypeError: _initializer() got an unexpected keyword argument 'partition_info'
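
From TensorFlow 0.11 onward, tf.get_variable calls custom initializers as init(shape, dtype, partition_info), so an initializer closure defined without that keyword raises exactly this TypeError. A sketch of the signature fix (the body here is an illustrative 2-D identity initializer, not the repo's actual ops.py code):

    import numpy as np
    import tensorflow as tf

    def identity_initializer(scale=1.0):
        # TF 0.11+ invokes initializers as init(shape, dtype, partition_info);
        # accept (and ignore) the extra keyword so the closure keeps working.
        def _initializer(shape, dtype=tf.float32, partition_info=None):
            return tf.constant(scale * np.eye(shape[0], shape[1]), dtype=dtype)
        return _initializer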

report a bug with fix solution

envy@ub1404:~/os_pri/github/NTM-tensorflow$ python main.py --task copy --is_train True
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
{'checkpoint_dir': 'checkpoint',
'controller_layer_size': 1,
'epoch': 100000,
'input_dim': 10,
'is_train': True,
'max_length': 10,
'min_length': 1,
'output_dim': 10,
'read_head_size': 1,
'task': 'copy',
'test_max_length': 120,
'write_head_size': 1}
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:900] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX 950M
major: 5 minor: 0 memoryClockRate (GHz) 1.124
pciBusID 0000:01:00.0
Total memory: 4.00GiB
Free memory: 3.55GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y

I use the diff like:

envy@ub1404:~/os_pri/github/NTM-tensorflow$ git diff
diff --git a/ntm.py b/ntm.py
index 595345d..18d512b 100644
--- a/ntm.py
+++ b/ntm.py
@@ -147,7 +147,7 @@ class NTM(object):

                 grads = []
                 for grad in tf.gradients(loss, self.params):
-                    if grad:
+                    if grad is not None:
                         grads.append(tf.clip_by_value(grad,
                                                       self.min_grad,
                                                       self.max_grad))

otherwise it reports:

I tensorflow/core/common_runtime/gpu/gpu_device.cc:755] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 950M, pci bus id: 0000:01:00.0)
[*] Building a NTM model
Percent: [####################] 100.00% Finished.
[*] Building a loss model for seq_length 1
Traceback (most recent call last):
File "main.py", line 48, in
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv))
File "main.py", line 27, in main
cell, ntm = copy_train(FLAGS, sess)
File "/home/envy/os_pri/github/NTM-tensorflow/tasks/copy.py", line 72, in copy_train
ntm = NTM(cell, sess, config.min_length, config.max_length)
File "/home/envy/os_pri/github/NTM-tensorflow/ntm.py", line 82, in init
self.build_model(forward_only)
File "/home/envy/os_pri/github/NTM-tensorflow/ntm.py", line 150, in build_model
if grad:
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 475, in nonzero
raise TypeError("Using a tf.Tensor as a Python bool is not allowed. "
TypeError: Using a tf.Tensor as a Python bool is not allowed. Use if t is not None: instead of if t: to test if a tensor is defined, and use the logical TensorFlow ops to test the value of a tensor.
envy@ub1404:~/os_pri/github/NTM-tensorflow$

Potential error for the binary_cross_entropy_with_logits

In the script 'ops.py', binary_cross_entropy_with_logits is documented with the following equations:

For brevity, let x = logits, z = targets.
The logistic loss is loss(x, z) = - sum_i (x[i] * log(z[i]) + (1 - x[i]) * log(1 - z[i])).

But I think the meanings of x and z are wrong: x should be the targets and z should be the logits. Here is the reference:
http://deeplearning.net/software/theano/library/tensor/nnet/nnet.html#theano.tensor.nnet.nnet.binary_crossentropy
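
For reference, the conventional binary cross-entropy puts the targets outside the logs and the predictions inside them; a small NumPy sketch (function and argument names are mine):

    import numpy as np

    # Conventional binary cross-entropy: targets weight the logs of the
    # predictions, matching the Theano reference above.
    def binary_cross_entropy(preds, targets, eps=1e-12):
        preds = np.clip(preds, eps, 1.0 - eps)  # guard against log(0)
        return -np.sum(targets * np.log(preds)
                       + (1.0 - targets) * np.log(1.0 - preds))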

Is __call__ in ntm_cell correct?

Hi,
I am somewhat confused by the implementation of ntm_cell, especially the __call__ part.
According to the code, it seems that new_output has nothing to do with the memory part.
Wouldn't it be more plausible if new_output adapted itself to the memory part?

    output_list, hidden_list = self.build_controller(input_, read_list_prev,
                                                     output_list_prev,
                                                     hidden_list_prev)

    # last output layer from LSTM controller
    last_output = output_list[-1]

    # build a memory
    M, read_w_list, write_w_list, read_list = self.build_memory(M_prev,
                                                                read_w_list_prev,
                                                                write_w_list_prev,
                                                                last_output)

    # get a new output
    new_output, new_output_logit = self.new_output(last_output)
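
The observation matches the snippet: new_output is computed from last_output alone. In the NTM paper, the emitted output also depends on the vectors just read from memory, so one plausible variant (hypothetical, not repo code; self.new_output would need its input width adjusted, and tf.concat here uses the TF 1.x (values, axis) argument order):

    # Feed the controller output together with this step's read vectors
    # into the output layer, so the output can reflect what was just read.
    output_input = tf.concat([last_output] + read_list, axis=1)
    new_output, new_output_logit = self.new_output(output_input)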

Restore called with invalid save path

When I run the test mode

$ python main.py --task copy --test_max_length 120 --max_length 130

it shows the error "Restore called with invalid save path" but the path exists! Anyone else encountered the same problem?

I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
{'checkpoint_dir': 'checkpoint',
'continue_train': None,
'controller_dim': 100,
'controller_layer_size': 1,
'epoch': 100000,
'input_dim': 10,
'is_train': False,
'max_length': 130,
'min_length': 1,
'output_dim': 10,
'read_head_size': 1,
'task': 'copy',
'test_max_length': 120,
'write_head_size': 1}
changepoint/copy_130
ln: failed to create symbolic link ‘copy_130/copy_10’: File exists
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 0 with properties:
name: GeForce GTX 960
major: 5 minor: 2 memoryClockRate (GHz) 1.253
pciBusID 0000:01:00.0
Total memory: 3.94GiB
Free memory: 229.88MiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:972] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 960, pci bus id: 0000:01:00.0)
[*] Building a NTM model
Percent: [####################] 100.00% Finished.
[*] Build a NTM model finished
[*] Reading checkpoints...
Traceback (most recent call last):
File "/home/user/Desktop/NTM-tensorflow-master/main.py", line 83, in
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/home/user/Desktop/NTM-tensorflow-master/main.py", line 70, in main
ntm.load(FLAGS.checkpoint_dir, FLAGS.task)
File "/home/user/Desktop/NTM-tensorflow-master/ntm.py", line 266, in load
self.saver.restore(self.sess, os.path.join(checkpoint_dir, ckpt_name))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1342, in restore
"File path is: %r" % (save_path, file_path))
ValueError: Restore called with invalid save path: u'checkpoint/copy_130/NTM-copy_copy.model-3402'. File path is: u'checkpoint/copy_130/NTM-copy_copy.model-3402'
[Finished in 286.5s with exit code 1]
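
One likely cause (not confirmed in this thread): from TensorFlow 0.12 onward, checkpoints are written in the V2 format, where the save path is only a prefix (NTM-copy_copy.model-3402.index, .data-00000-of-00001, ...) and no file with the bare prefix name exists, so a naive existence check on the restore path fails even though the checkpoint is there. A loader sketch that resolves the prefix through tf.train.latest_checkpoint, which understands both formats (assumes handles named saver, sess, and checkpoint_dir):

    import tensorflow as tf

    # latest_checkpoint returns the newest checkpoint *prefix* in the
    # directory, handling the V2 .index/.data-* layout; saver.restore
    # accepts that prefix directly.
    ckpt_prefix = tf.train.latest_checkpoint(checkpoint_dir)
    if ckpt_prefix is None:
        raise Exception(" [!] No checkpoint found in %s" % checkpoint_dir)
    saver.restore(sess, ckpt_prefix)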

A more efficient approach

Hi @carpedm20,

your code helped me a lot when I was just starting to understand NTMs, and I am therefore very grateful for your contribution.

Recently, I found out about the Differentiable Neural Computer implementation from Mostafa-Samir. He approaches the problem of variable sequence lengths in a different way, using some of the dynamic functions provided by TensorFlow. That code turns out to be clearly more efficient than the reference code you have provided.

I decided to replicate the NTM implementation following the same approach and ended up with this. I have made the proper references to your code and approach; I hope you don't mind.

Best,

Cell's call: shouldn't the input's first dimension be of `batch` size?

I see this description of the input tensor in the __call__ function: inputs: input Tensor, 2D, 1 x input_size.

Shouldn't it rather be: inputs: input Tensor, 2D, batch x input_size?

It returns something 2D of batch size. Training is so fast on my laptop compared to a normal LSTM that I am starting to doubt whether it processes the full batch I am feeding to the cell. I assume it accepts an input of shape batch x input_size, because the output of __call__ contains 2D tensors of batch size.

def __call__(self, input_, state=None, scope=None):
    """Run one step of NTM.
    Args:
        inputs: input Tensor, 2D, 1 x input_size.
        state: state Dictionary which contains M, read_w, write_w, read,
            output, hidden.
        scope: VariableScope for the created subgraph; defaults to class name.
    Returns:
        A tuple containing:
        - A 2D, batch x output_dim, Tensor representing the output of the LSTM
            after reading "input_" when previous state was "state".
            Here output_dim is:
                 num_proj if num_proj was set,
                 num_units otherwise.
        - A 2D, batch x state_size, Tensor representing the new state of LSTM
            after reading "input_" when previous state was "state".
    """

Found in:
https://github.com/carpedm20/NTM-tensorflow/blob/master/ntm_cell.py

'NTM' object has no attribute '_max_length'

Trying to train the model, it begins well but after a while it shows the error
'NTM' object has no attribute '_max_length'

[*] Building a NTM model
Percent: [####################] 100.00% Finished.
[*] Building a loss model for seq_length 1
[*] Building a loss model for seq_length 2
[*] Building a loss model for seq_length 3
[*] Building a loss model for seq_length 4
[*] Building a loss model for seq_length 5
[*] Building a loss model for seq_length 6
[*] Building a loss model for seq_length 7
[*] Building a loss model for seq_length 8
[*] Building a loss model for seq_length 9
[*] Building a loss model for seq_length 10
[*] Build a NTM model finished
[*] Initialize all variables
[*] Initialization finished
[*] Reading checkpoints...
Traceback (most recent call last):
File "main.py", line 72, in
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "main.py", line 55, in main
task.train(ntm, FLAGS, sess)
File "/home/lina/deep_learning_frameworks/NTM-tensorflow/tasks/copy.py", line 73, in train
ntm.load(config.checkpoint_dir, config.task, strict=config.continue_train is True)
File "/home/lina/deep_learning_frameworks/NTM-tensorflow/ntm.py", line 260, in load
task_dir = "%s_%s" % (task_name, self._max_length)
AttributeError: 'NTM' object has no attribute '_max_length'

TypeError: sequence_loss() ...

When trying to run
$ python main.py --task copy --is_train True
or
$ python main.py --task copy --test_max_length 10

I am getting this TypeError in the 'Building a loss model for seq_length' part:

File "/.../NTM-tensorflow/ntm.py", line 139, in build_model
binary_cross_entropy_with_logits)
TypeError: sequence_loss() got an unexpected keyword argument 'num_decoder_symbols'

I am using Python 2.7 in a virtual environment.
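
The num_decoder_symbols keyword was removed from sequence_loss in later TensorFlow releases, so the call in ntm.py has to drop it. A sketch of the TF 1.x-style call (the tensor names are placeholders, not the repo's exact variables):

    from tensorflow.contrib.legacy_seq2seq import sequence_loss

    # TF 1.x signature: sequence_loss(logits, targets, weights, ...);
    # the old num_decoder_symbols keyword no longer exists.
    loss = sequence_loss(logits=outputs,
                         targets=true_outputs,
                         weights=[1.0] * seq_length,
                         average_across_timesteps=False,
                         average_across_batch=False,
                         softmax_loss_function=binary_cross_entropy_with_logits)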

tensorflow 0.12.1 failed to run NTM

Below I show that I am on tensorflow 0.12.1, along with the error from trying to run the copy task.

VSs-MacBook-Pro-5:NTM-tensorflow dfreelan$ python3 -c 'import tensorflow as tf; print(tf.__version__)'
0.12.1
VSs-MacBook-Pro-5:NTM-tensorflow dfreelan$ python3 main.py --task copy --is_train True
Traceback (most recent call last):
  File "main.py", line 6, in <module>
    from ntm import NTM
  File "/Users/dfreelan/dev/NTM-tensorflow/ntm.py", line 8, in <module>
    from tensorflow.contrib.legacy_seq2seq import sequence_loss
ImportError: No module named 'tensorflow.contrib.legacy_seq2seq'

Any help trying to resolve this issue would be appreciated!
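
tensorflow.contrib.legacy_seq2seq first appeared in TF 1.0; on 0.12.x the same function still lives in the old seq2seq module (compare the "memory usage" issue below). A guarded import sketch (the fallback module path is my recollection and worth verifying):

    # legacy_seq2seq exists only from TF 1.0; fall back to the 0.12 location.
    try:
        from tensorflow.contrib.legacy_seq2seq import sequence_loss
    except ImportError:
        from tensorflow.python.ops.seq2seq import sequence_loss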

Error with TensorFlow 1.1.0

Hi, when I was running the copy task with the command you provided in the README, I got a variable-scope-related error. My environment is:
++++
Python: 2.7.12
TensorFlow: 1.1.0
++++
The error message is :
++++
[*] Building a NTM model
Percent: [####################] 100.00% Finished.
[*] Building a loss model for seq_length 1
Traceback (most recent call last):
File "main.py", line 72, in
tf.app.run()
File "/net/mlfs01/export/users/byang/env-tensorflow/local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "main.py", line 54, in main
cell, ntm = create_ntm(FLAGS, sess)
File "main.py", line 39, in create_ntm
test_max_length=config.test_max_length, scope=scope, **ntm_args)
File "/net/mlfs01/export/users/byang/working/NTM-tensorflow/ntm.py", line 83, in init
self.build_model(forward_only)
File "/net/mlfs01/export/users/byang/working/NTM-tensorflow/ntm.py", line 169, in build_model
global_step=self.global_step)
File "/net/mlfs01/export/users/byang/env-tensorflow/local/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 446, in apply_gradients
self._create_slots([_get_variable_for(v) for v in var_list])
File "/net/mlfs01/export/users/byang/env-tensorflow/local/lib/python2.7/site-packages/tensorflow/python/training/rmsprop.py", line 99, in _create_slots
self._get_or_make_slot(v, val_rms, "rms", self._name)
File "/net/mlfs01/export/users/byang/env-tensorflow/local/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 727, in _get_or_make_slot
named_slots[_var_key(var)] = slot_creator.create_slot(var, val, op_name)
File "/net/mlfs01/export/users/byang/env-tensorflow/local/lib/python2.7/site-packages/tensorflow/python/training/slot_creator.py", line 113, in create_slot
return _create_slot_var(primary, val, "", validate_shape, None, None)
File "/net/mlfs01/export/users/byang/env-tensorflow/local/lib/python2.7/site-packages/tensorflow/python/training/slot_creator.py", line 66, in _create_slot_var
validate_shape=validate_shape)
File "/net/mlfs01/export/users/byang/env-tensorflow/local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1049, in get_variable
use_resource=use_resource, custom_getter=custom_getter)
File "/net/mlfs01/export/users/byang/env-tensorflow/local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 948, in get_variable
use_resource=use_resource, custom_getter=custom_getter)
File "/net/mlfs01/export/users/byang/env-tensorflow/local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 356, in get_variable
validate_shape=validate_shape, use_resource=use_resource)
File "/net/mlfs01/export/users/byang/env-tensorflow/local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 341, in _true_getter
use_resource=use_resource)
File "/net/mlfs01/export/users/byang/env-tensorflow/local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 671, in _get_single_variable
"VarScope?" % name)
ValueError: Variable NTM-copy/NTM-copy_1/init_cell/Variable/RMSProp/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?
(env-tensorflow) byang@snake10:/net/mlfs01/export/users/byang/working/NTM-tensorflow$ python
Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.__version__
'1.1.0'

++++++
I failed to resolve this issue, can you take a look and see what happened? Thanks.
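
The failing name (.../init_cell/Variable/RMSProp) is an optimizer slot variable, and TF 1.x refuses to create new variables inside a variable scope that has reuse enabled. A self-contained sketch of that mechanism (my reconstruction of the failure, not the repo's exact graph):

    import tensorflow as tf

    with tf.variable_scope("NTM-copy"):
        w = tf.get_variable("w", [2, 2])
    loss = tf.reduce_sum(w * w)

    # Fails with "Variable .../RMSProp does not exist": RMSProp tries to
    # *create* its slot variables while the surrounding scope is reusing.
    # with tf.variable_scope("NTM-copy", reuse=True):
    #     train_op = tf.train.RMSPropOptimizer(1e-3).minimize(loss)

    # Works: build the training op outside any reuse=True scope.
    train_op = tf.train.RMSPropOptimizer(1e-3).minimize(loss)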

Some syntax errors

Hello;

There are some errors in the code:

In ntm.py on line 260, task_dir = "%s_%s" % (task_name, self._max_length): the Python interpreter says there is no attribute called _max_length. It should have been max_length, I suppose.

Again in ntm.py, on lines 184 and 187, it complains about a non-existent variable called output_logits.

Would you mind fixing them, sir?

Kind Regards,
Ilker GURCAN

Print memory contents

In the script 'ntm_cell.py' there is a function named 'build_memory' that returns a memory matrix M. When I try to print the memory matrix M, it shows:

print (M)
Tensor("NTM_1/memory/Print:0", shape=(128, 20), dtype=float32, device=/device:CPU:0)

When I try to print the contents of this tensor using

print (M.eval())

an error occurs.

How can I see what the memory contains?

Traceback (most recent call last):
File "my_main.py", line 72, in
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "my_main.py", line 52, in main
cell, ntm = create_ntm(FLAGS, sess)
File "my_main.py", line 36, in create_ntm
test_max_length=FLAGS.test_max_length, **ntm_args) # call NTM class constructor in ntm.py
File "/home/user/Documents/deep_learning_frameworks/NTM-tensorflow-master/ntm.py", line 84, in init
self.build_model(forward_only)
File "/home/user/Documents/deep_learning_frameworks/NTM-tensorflow-master/ntm.py", line 92, in build_model
_, prev_state = self.cell(self.start_symbol, state=None)
File "/home/user/Documents/deep_learning_frameworks/NTM-tensorflow-master/ntm_cell.py", line 83, in call
last_output)
File "/home/user/Documents/deep_learning_frameworks/NTM-tensorflow-master/ntm_cell.py", line 295, in build_memory
print (M.eval())
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 559, in eval
return _eval_using_default_session(self, feed_dict, self.graph, session)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3761, in _eval_using_default_session
return session.run(tensors, feed_dict)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 717, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 915, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 965, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 985, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: You must feed a value for placeholder tensor 'start_symbol' with dtype float and shape [10]
[[Node: start_symbol = Placeholder[dtype=DT_FLOAT, shape=[10], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Caused by op u'start_symbol', defined at:
File "my_main.py", line 72, in
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "my_main.py", line 52, in main
cell, ntm = create_ntm(FLAGS, sess)
File "my_main.py", line 36, in create_ntm
test_max_length=FLAGS.test_max_length, **ntm_args) # call NTM class constructor in ntm.py
File "/home/user/Documents/deep_learning_frameworks/NTM-tensorflow-master/ntm.py", line 67, in init
name='start_symbol')
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 1332, in placeholder
name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 1748, in _placeholder
name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 749, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2380, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1298, in init
self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'start_symbol' with dtype float and shape [10]
[[Node: start_symbol = Placeholder[dtype=DT_FLOAT, shape=[10], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
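
M.eval() fails because evaluating a tensor runs the graph, which requires every placeholder it depends on to be fed; here eval() is also being called during graph construction, before a feed is even possible. A sketch of inspecting the memory after the model is built, assuming you keep Python handles to the M tensor, the session sess, and the ntm.start_symbol placeholder (the feed value is illustrative only):

    import numpy as np

    # Run the memory tensor through the session with its placeholder fed;
    # eval() with an empty feed_dict cannot supply start_symbol.
    feed = {ntm.start_symbol: np.zeros(10, dtype=np.float32)}
    memory_contents = sess.run(M, feed_dict=feed)
    print(memory_contents)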

memory usage

Is this model particularly memory-intensive? The model hangs on building the loss model for seq_length 10, having used 16GB of RAM.

This might be caused by the fact that I am using tensorflow 1.0.0-rc1 (because of #26; legacy_seq2seq is present in 1.0.0-rc1 and not 0.12.1). I've modified your code to run on this version, and it trains correctly with shorter sequences, but it still eats memory.

Is this normal, or some sort of memory leak caused by the newer version of tensorflow?
