Comments (22)
Thanks for taking a look at this! I no longer get the above error, but now get:
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.076
pciBusID 0000:03:00.0
Total memory: 11.92GiB
Free memory: 11.81GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0)
Restoring model from model.ckpt-1950
Traceback (most recent call last):
File "generate.py", line 86, in <module>
main()
File "generate.py", line 66, in main
feed_dict={samples: window})
File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 710, in run
run_metadata_ptr)
File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 908, in _run
feed_dict_string, options, run_metadata)
File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 958, in _do_run
target_list, options, run_metadata)
File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 978, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: Expected begin[2] == 0 (got 0) and size[2] == 0 (got -2) when input.dim_size(2) == 0
[[Node: wavenet/dilated_stack/layer2/Slice = Slice[Index=DT_INT32, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](wavenet/dilated_stack/layer2/Reshape_1, wavenet/dilated_stack/layer2/Slice/begin, wavenet/dilated_stack/layer2/Slice/size)]]
[[Node: wavenet/Reshape_1/_131 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_2304_wavenet/Reshape_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Caused by op u'wavenet/dilated_stack/layer2/Slice', defined at:
File "generate.py", line 86, in <module>
main()
File "generate.py", line 51, in main
next_sample = net.predict_proba(samples)
File "/home/ubuntu/jupyter_base/project/tensorflow-wavenet/wavenet.py", line 171, in predict_proba
raw_output = self._create_network(encoded)
File "/home/ubuntu/jupyter_base/project/tensorflow-wavenet/wavenet.py", line 129, in _create_network
self.dilation_channels)
File "/home/ubuntu/jupyter_base/project/tensorflow-wavenet/wavenet.py", line 75, in _create_dilation_layer
conv_filter = self._causal_dilated_conv(input_batch, weights_filter, dilation)
File "/home/ubuntu/jupyter_base/project/tensorflow-wavenet/wavenet.py", line 48, in _causal_dilated_conv
out = tf.slice(restored, 4 * [0], [-1, -1, tf.shape(restored)[2] - pad_elements, -1])
File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 328, in slice
return gen_array_ops._slice(input_, begin, size, name=name)
File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2009, in _slice
name=name)
File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 703, in apply_op
op_def=op_def)
File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2317, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1239, in __init__
self._traceback = _extract_stack()
from tensorflow-wavenet.
Even after pulling these patches and retraining (although not for long), I still get this (OSX, no GPU):
22:43 $ python generate.py --samples 16000 model.ckpt-150
Restoring model from model.ckpt-150
E tensorflow/core/client/tensor_c_api.cc:485] Expected begin[2] == 0 (got 0) and size[2] == 0 (got -2) when input.dim_size(2) == 0
[[Node: wavenet/dilated_stack/layer2/causal_conv/Slice_6 = Slice[Index=DT_INT32, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](wavenet/dilated_stack/layer2/causal_conv/Reshape_1, wavenet/dilated_stack/layer2/causal_conv/Slice_6/begin, wavenet/dilated_stack/layer2/causal_conv/Slice_6/size)]]
Traceback (most recent call last):
File "generate.py", line 86, in <module>
main()
File "generate.py", line 66, in main
feed_dict={samples: window})
File "/Users/danbri/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 382, in run
run_metadata_ptr)
File "/Users/danbri/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 655, in _run
feed_dict_string, options, run_metadata)
File "/Users/danbri/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 723, in _do_run
target_list, options, run_metadata)
File "/Users/danbri/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 743, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: Expected begin[2] == 0 (got 0) and size[2] == 0 (got -2) when input.dim_size(2) == 0
[[Node: wavenet/dilated_stack/layer2/causal_conv/Slice_6 = Slice[Index=DT_INT32, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](wavenet/dilated_stack/layer2/causal_conv/Reshape_1, wavenet/dilated_stack/layer2/causal_conv/Slice_6/begin, wavenet/dilated_stack/layer2/causal_conv/Slice_6/size)]]
Caused by op u'wavenet/dilated_stack/layer2/causal_conv/Slice_6', defined at:
File "generate.py", line 86, in <module>
main()
File "generate.py", line 51, in main
next_sample = net.predict_proba(samples)
File "/Users/danbri/tensorflow-wavenet/wavenet.py", line 174, in predict_proba
raw_output = self._create_network(encoded)
File "/Users/danbri/tensorflow-wavenet/wavenet.py", line 132, in _create_network
self.dilation_channels)
File "/Users/danbri/tensorflow-wavenet/wavenet.py", line 78, in _create_dilation_layer
conv_filter = self._causal_dilated_conv(input_batch, weights_filter, dilation)
File "/Users/danbri/tensorflow-wavenet/wavenet.py", line 49, in _causal_dilated_conv
out = tf.slice(restored, 4 * [0], [-1, -1, tf.shape(restored)[2] - pad_elements, -1])
File "/Users/danbri/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 388, in slice
return gen_array_ops._slice(input_, begin, size, name=name)
File "/Users/danbri/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2001, in _slice
name=name)
File "/Users/danbri/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 703, in apply_op
op_def=op_def)
File "/Users/danbri/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2310, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/Users/danbri/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1232, in __init__
self._traceback = _extract_stack()
from tensorflow-wavenet.
This bug seems to be fixed now, so I'll close the issue.
If there are new problems with the generation script we should open a new issue, as this one is starting to get long.
from tensorflow-wavenet.
I'm also seeing this issue. I suspect that this might be a bug in TensorFlow.
I'll have a look at what's causing this.
If you want to have a go at generating sound, an easy way to avoid this bug is to set the dilations parameter in the wavenet_params.json file to a list of 1s.
It will then use regular convolution, which doesn't have the problem.
(You will have to train a new network with this configuration).
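As a rough sketch of that workaround: the snippet below replaces every entry of a dilations list with 1 while leaving everything else alone. It assumes the parameter file is a JSON object with a "dilations" key holding a list; the helper name use_regular_convolutions and the "other_setting" key are made up for illustration and are not part of the project.

```python
import copy
import json


def use_regular_convolutions(params):
    """Return a copy of params with every dilation factor set to 1."""
    patched = copy.deepcopy(params)
    patched["dilations"] = [1] * len(params["dilations"])
    return patched


# Hypothetical file contents; the real wavenet_params.json may hold other keys.
example = json.loads('{"dilations": [1, 2, 4, 8], "other_setting": 2}')
patched = use_regular_convolutions(example)
print(json.dumps(patched, sort_keys=True))
```

The number of layers stays the same; only the dilation factors change, so a network trained with the original list cannot be reused.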
from tensorflow-wavenet.
Thanks for the workaround! It's working up until the 255th step, where it seems to be trying to address something out of range. I get the error:
... previous steps here ...
Sample 255/8000: 151
E tensorflow/stream_executor/cuda/cuda_driver.cc:1140] could not synchronize on CUDA context: CUDA_ERROR_ILLEGAL_ADDRESS :: No stack trace available
F tensorflow/core/common_runtime/gpu/gpu_util.cc:370] GPU sync failed
from tensorflow-wavenet.
I've now fixed the first issue by no longer using tf.space_to_batch in the model, so it should be possible to load models with dilation factors > 1.
I'm also seeing the CUDA_ERROR_ILLEGAL_ADDRESS error, and will try to fix this one next.
from tensorflow-wavenet.
I got this error:
Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
Restoring model from ./logdir/train/2016-09-16T23:00:56.017589/model.ckpt-1800
Traceback (most recent call last):
File "generate.py", line 86, in <module>
main()
File "generate.py", line 66, in main
feed_dict={samples: window})
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 710, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 908, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 958, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 978, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: Expected begin[2] == 0 (got 0) and size[2] == 0 (got -2) when input.dim_size(2) == 0
[[Node: wavenet/dilated_stack/layer2/Slice = Slice[Index=DT_INT32, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](wavenet/dilated_stack/layer2/Reshape_1, wavenet/dilated_stack/layer2/Slice/begin, wavenet/dilated_stack/layer2/Slice/size)]]
[[Node: wavenet/Reshape_1/_131 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_2835_wavenet/Reshape_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Caused by op u'wavenet/dilated_stack/layer2/Slice', defined at:
File "generate.py", line 86, in <module>
main()
File "generate.py", line 51, in main
next_sample = net.predict_proba(samples)
File "/mnt/sdb/tensorflow-wavenet/wavenet.py", line 171, in predict_proba
raw_output = self._create_network(encoded)
File "/mnt/sdb/tensorflow-wavenet/wavenet.py", line 129, in _create_network
self.dilation_channels)
File "/mnt/sdb/tensorflow-wavenet/wavenet.py", line 75, in _create_dilation_layer
conv_filter = self._causal_dilated_conv(input_batch, weights_filter, dilation)
File "/mnt/sdb/tensorflow-wavenet/wavenet.py", line 48, in _causal_dilated_conv
out = tf.slice(restored, 4 * [0], [-1, -1, tf.shape(restored)[2] - pad_elements, -1])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 328, in slice
return gen_array_ops._slice(input_, begin, size, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 2009, in _slice
name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 703, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2317, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1239, in __init__
self._traceback = _extract_stack()
from tensorflow-wavenet.
It seems that, at the very least, samples in generate.py needs to be initialised as float32 instead of int32. I'm still trying to figure it out.
from tensorflow-wavenet.
@adroit91: Are you sure? The int32 array gets one-hot encoded, which casts the output to float32.
That should be a valid input to the network.
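To illustrate the point about one-hot encoding, here is a minimal pure-Python sketch. It is not the project's code: the function name one_hot and the channel count of 4 are assumptions chosen to keep the example small. Integer sample values go in, and rows of floats come out.

```python
def one_hot(samples, channels):
    """Encode each integer sample as a row of floats with a single 1.0."""
    encoded = []
    for value in samples:
        if not 0 <= value < channels:
            raise ValueError("sample %r is outside [0, %d)" % (value, channels))
        row = [0.0] * channels
        row[value] = 1.0
        encoded.append(row)
    return encoded


# Integer inputs, floating-point outputs.
rows = one_hot([0, 2], channels=4)
print(rows)
```

The integers are only used as indices, so their dtype does not carry over to the encoded output.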
from tensorflow-wavenet.
@ibab, I could be wrong. But on an interactive shell, there seems to be an error (sorry, am away from my computer right now). Also, the predict samples function seems to be missing a preprocessing step. I'm not sure but still trying to understand the code.
from tensorflow-wavenet.
I've reduced the problem to the following code:
import numpy as np
import tensorflow as tf
x = np.ones((256, 256), dtype=np.float32)
s = tf.Session()
X = tf.placeholder(tf.float32)
y = tf.nn.softmax(X)
s.run(y, feed_dict={X: x})
$ python minimal_example.py
[...]
E tensorflow/stream_executor/cuda/cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_ADDRESS
F tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:198] Unexpected Event status: 1
zsh: abort python run.py
So I suspect that this is indeed a bug in TensorFlow, unless someone can spot a mistake?
I'll try to run this with the latest master of TensorFlow.
from tensorflow-wavenet.
I just noticed that a call to tf.initialize_all_variables() is missing. Doesn't that have an effect?
from tensorflow-wavenet.
@mecab: The Saver automatically initializes the variables that it restores, so this should be fine.
from tensorflow-wavenet.
ah, I see, thanks.
from tensorflow-wavenet.
I also confirmed #13 (comment), and found that this does not happen with dtype=tf.float64.
I successfully generated the result by casting out as follows in predict_proba():
proba = tf.nn.softmax(tf.cast(out, tf.float64))
It's dirty and could cause a slowdown, but it works 👹
(I haven't tested the result with a well-trained model, though.)
One strange thing I see is that generating samples 0 to 255 is somehow very slow. I also suspect that this is a bug in TensorFlow or CUDA.
from tensorflow-wavenet.
@mecab Yeah, casting it is a good workaround.
Do you want to make a PR with this fix?
from tensorflow-wavenet.
OK, I'm writing that
from tensorflow-wavenet.
Made it!
from tensorflow-wavenet.
The implementation of causal convolution was buggy for some inputs. I've rewritten the part that caused this error in 5136dbf.
Everything seems to be working now.
It would be great if someone could confirm that it's working for them as well.
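For anyone reading the earlier Slice tracebacks, the arithmetic behind the message can be sketched in plain Python. The error text reports input.dim_size(2) == 0 and a requested size[2] of -2, and the failing line computes tf.shape(restored)[2] - pad_elements, which suggests pad_elements was 2 in that run (an inference from the log, not something stated in the thread). The helper names below are hypothetical, and the clamped variant is only an illustrative guard, not a description of what 5136dbf does.

```python
def requested_slice_width(width, pad_elements):
    """Size passed for dimension 2 in the failing tf.slice call."""
    return width - pad_elements


def clamped_slice_width(width, pad_elements):
    """Illustrative guard: never request a negative size."""
    return max(width - pad_elements, 0)


# Values inferred from the error message: an empty dimension and a padding of 2.
bad = requested_slice_width(0, 2)
ok = clamped_slice_width(0, 2)
print(bad, ok)
```

In tf.slice, a size of -1 means "everything remaining in this dimension", but any other negative size is rejected, which matches the InvalidArgumentError above.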
from tensorflow-wavenet.
The latest commit can generate a wave file now, but it only generates noise. Is there something missing?
from tensorflow-wavenet.
The default hyperparameters are just guesses at the moment, so I wouldn't
expect it to work well unless you've changed them and achieved a low loss
value.
I suspect that we will need a much larger number of layers to reproduce the
DeepMind results.
Another thing would be to increase the window size from the default value
to something like 8000.
That way the network will take into account a larger number of past samples
when generating.
Also, there was a bug in the generation script that reduced the number of
values in the window to a single one, which would definitely lead to noise.
You should make sure that you have that fix in your local version. It's in
commit 6487cd1.
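To make the windowing idea concrete, here is a small sketch of how a generation loop might pick the samples it feeds to the network at each step. This is an assumption about the general approach, not a copy of generate.py; the function name generation_window is hypothetical.

```python
def generation_window(waveform, window_size):
    """Return the most recent window_size samples, or all of them if fewer exist."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    return waveform[-window_size:]


# With ten samples generated so far and a window of four, the last four are used.
recent = generation_window(list(range(10)), 4)
short = generation_window([1, 2], 4)
print(recent, short)
```

A window that collapses to a single value gives the network almost no context, which is consistent with the noise described above.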
from tensorflow-wavenet.
I solved the problem by upgrading cuDNN to v5.1. You can give it a try.
cuDNN download: https://developer.nvidia.com/rdp/cudnn-download
from tensorflow-wavenet.