ibab / tensorflow-wavenet
A TensorFlow implementation of DeepMind's WaveNet paper
License: MIT License
When I train the network with the default hyper-parameters, I get:
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/jesse/tensorflow_workspace/tensorflow-wavenet/audio_reader.py", line 87, in thread_main
    audio = trim_silence(audio[:, 0])
  File "/home/jesse/tensorflow_workspace/tensorflow-wavenet/audio_reader.py", line 47, in trim_silence
    return audio[indices[0]:indices[-1]]
IndexError: index 0 is out of bounds for axis 0 with size 0
The error occurs on the last line of the trim_silence method:
return audio[indices[0]:indices[-1]]
It seems that some audio consists only of silence and becomes empty after trimming.
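A guarded version of the trimming step might look like the sketch below (pure NumPy rather than the librosa RMSE call the repo actually uses; the threshold and frame length are illustrative). It returns an empty array for all-silent clips so the reader thread can skip the file instead of crashing:

```python
import numpy as np

def trim_silence(audio, threshold=0.01, frame_length=2048):
    """Trim leading/trailing silence, returning an empty array for
    all-silent clips instead of raising an IndexError."""
    if audio.size == 0:
        return audio
    frame_length = min(frame_length, audio.size)
    # Frame-wise RMS energy (a pure-NumPy stand-in for librosa's RMSE).
    n_frames = audio.size // frame_length
    frames = audio[:n_frames * frame_length].reshape(n_frames, frame_length)
    energy = np.sqrt(np.mean(frames ** 2, axis=1))
    loud = np.nonzero(energy > threshold)[0]
    # Guard: a clip that never rises above the threshold has no loud
    # frames, so indexing loud[0] would be exactly the reported IndexError.
    if loud.size == 0:
        return audio[0:0]
    return audio[loud[0] * frame_length:(loud[-1] + 1) * frame_length]
```

The caller in audio_reader.py would then need to check for an empty result and skip that file.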
Let's discuss strategies for producing audio samples.
When running over the entire dataset, I've so far only managed to reproduce recording noise and clicks.
Some ideas I've had to improve on this:
librosa.
Hi,
I generated the linguistic features as mentioned in the WaveNet paper for p280 speaker. If anyone is interested to use them for conditioning in WaveNet, please download via https://users.aalto.fi/~bollepb1/binary_labels_p280.zip. Each frame or row corresponds to 5ms of speech.
We've discussed the fact that one-hot encoding the input to the network is kind of weird, and that it would be more natural to use the waveform as a single-channel floating point tensor instead.
Does anyone have experience with running our implementation in this way?
Should we switch to this method?
The paper indicates that WaveNet encodes the waveform with µ-law encoding. :)
https://github.com/ritheshkumar95/WaveNet/blob/master/dataset.py#L177
@ibab I noticed you saw the efficient wavenet generation implementation I wrote with my friends:
https://github.com/tomlepaine/fast-wavenet
Can we help you add it to tensorflow-wavenet?
Hi Igor, I'm getting OOM on a GTX 1080. I reduced the sample size to one directory (175 files) and still get this error. Do you have any ideas how to fit the tensor in memory on 8 GB cards?
I tensorflow/core/common_runtime/bfc_allocator.cc:698] Stats:
Limit: 7690878976
InUse: 7325787648
MaxInUse: 7465885184
NumAllocs: 9793
MaxAllocSize: 2779725056
W tensorflow/core/common_runtime/bfc_allocator.cc:270] ****_********************_*****************************************************************xxxxxxxxx
W tensorflow/core/common_runtime/bfc_allocator.cc:271] Ran out of memory trying to allocate 928.75MiB. See logs for memory state.
W tensorflow/core/framework/op_kernel.cc:940] Resource exhausted: OOM when allocating tensor with shape[256,256,1,3715]
W tensorflow/core/common_runtime/bfc_allocator.cc:213] Ran out of memory trying to allocate 1.33GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
Traceback (most recent call last):
File "train.py", line 151, in <module>
main()
File "train.py", line 136, in main
summary, loss_value, _ = sess.run([summaries, loss, optim])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 710, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 908, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 958, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 978, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.ResourceExhaustedError: OOM when allocating tensor with shape[256,256,1,3715]
[[Node: gradients/dilated_stack/layer4/conv_f_grad/Conv2DBackpropInput = Conv2DBackpropInput[T=DT_FLOAT, data_format="NHWC", padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](gradients/dilated_stack/layer4/conv_f_grad/Shape, dilated_stack/layer4/Variable/read, gradients/dilated_stack/layer4/conv_f/BatchToSpace_grad/SpaceToBatch)]]
Caused by op u'gradients/dilated_stack/layer4/conv_f_grad/Conv2DBackpropInput', defined at:
Thank you in advance
Currently, we start the generation with a randomly picked waveform sample.
I wonder what kind of effect that has, considering that the dilated convolutions won't be able to reach backwards beyond the beginning of the generated sample.
Maybe we should start off with one of the audio recordings.
In my case, all the generated audio samples seem to be positive integer values.
I used the following architecture for the network. (wavenet_params.json)
"dilations": [1, 2, 4, 8, 16, 32, 64, 128, 256, 512,
1, 2, 4, 8, 16, 32, 64, 128, 256, 512,
1, 2, 4, 8, 16, 32, 64, 128, 256, 512],
I wonder if this only happens to me, and I'd like to know how to fix it. :)
In each step, the loss is calculated as the cross entropy between the input value and the predicted value. Each data point in the prediction is predicted from its receptive field in the input. During prediction, the input is padded on the left with many zeros to implement the causal convolution. Thus, the first predicted data points are actually predicted from those padding zeros, and may not match the input values even when the model has been trained for many steps. Is this the reason the loss drops to around 2 and cannot go lower? Maybe it would be more reasonable to calculate the loss not from the first data point, but from the N-th point, where N is the size of the receptive field.
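The proposal above can be sketched as follows (a pure-NumPy illustration with a hypothetical helper name; the real loss uses TensorFlow ops). Positions whose receptive field still overlaps the zero padding contribute nothing to the loss:

```python
import numpy as np

def masked_cross_entropy(logits, targets, receptive_field):
    """Mean cross entropy over positions whose receptive field is
    entirely real audio, skipping the first `receptive_field` steps
    that were predicted mostly from padding zeros.
    logits: [time, channels]; targets: integer class indices."""
    # Softmax over the channel axis.
    z = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    per_step = -np.log(probs[np.arange(len(targets)), targets])
    # Drop the steps predicted mostly from padding.
    return per_step[receptive_field:].mean()
```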
Getting the following error when I try to train the network - any idea what this is?
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:924] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.076
pciBusID 0000:01:00.0
Total memory: 12.00GiB
Free memory: 11.53GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:01:00.0)
Traceback (most recent call last):
File "main.py", line 129, in <module>
main()
File "main.py", line 83, in main
loss = net.loss(audio_batch)
File "/home/seth/Development/tensorflow-wavenet/wavenet.py", line 97, in loss
raw_output = self._create_network(encoded)
File "/home/seth/Development/tensorflow-wavenet/wavenet.py", line 67, in _create_network
dilation=dilation)
File "/home/seth/Development/tensorflow-wavenet/wavenet.py", line 23, in _create_dilation_layer
name="conv_f")
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 168, in atrous_conv2d
in_height = int(value_shape[1])
TypeError: __int__ returned non-int (type NoneType)
It could be really interesting to train with solo singing tracks and Lilypond. But it would probably be too hard to collect a dataset while handling copyright issues.
Training on a truncated sample of the VCTK corpus causes it to hang with no error if it hits the end of the corpus before the number of steps is complete. It looks like this is due to iterate_through_vctk not resetting once it hits the end. It looks like this is also happening in #65 on the full corpus, since it's stalling at 44256 steps, which is close to the number of samples in the VCTK corpus (109 speakers x ~400 samples/speaker).
I've tested this with an expanded file list (to iterate through multiple epochs) and it no longer stops at the end of the corpus.
I suggest we implement a tf.train.string_input_producer to iterate through epochs. That way we can also shuffle the input, as has been mentioned in the discussion in #47.
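The behaviour tf.train.string_input_producer would give us can be sketched in plain Python (the generator name here is hypothetical): the reader never runs dry at the end of the corpus, and each epoch sees the files in a fresh random order.

```python
import random

def cycle_files(file_list, shuffle=True, seed=None):
    """Yield file paths forever, reshuffling at each epoch boundary,
    so the reader never stalls when it reaches the end of the corpus."""
    rng = random.Random(seed)
    while True:
        epoch = list(file_list)
        if shuffle:
            rng.shuffle(epoch)
        for path in epoch:
            yield path
```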
The network currently runs into out-of-memory issues at a low number of layers.
This seems to be a problem with TensorFlow's atrous_conv2d operation.
If I set the dilation factor to 1, which means atrous_conv2d simply calls conv2d, I can easily run with 10s of layers.
It could just be the additional batch_to_space and space_to_batch operations, in which case I can write a single C++ op for atrous_conv2d.
There are two issues with this method:
It is missing the self argument; it should be: def stop_threads(self):
Seems to me we should remove this method altogether...
Please, I have a simple question.
I'm training WaveNet with the VCTK corpus dataset on a CPU-only machine (Intel i5), running at 20 sec/step. At this rate, I can reach about 4,000 steps per day, 30,000 per week. The question is: how many steps do we need to get a loss of about 2? And another question: can anybody tell me what seconds-per-step rate a good GPU can reach? By the way, what rate are each of you working at, guys?
Regards,
Samu.
Note: I'm using the default params:
{
"filter_width": 2,
"sample_rate": 16000,
"dilations": [1, 2, 4, 8, 16, 32, 64, 128, 256,
1, 2, 4, 8, 16, 32, 64, 128, 256],
"residual_channels": 32,
"dilation_channels":16,
"quantization_channels": 256,
"skip_channels": 256,
"use_biases": false
}
From my model trained for 1999 steps (which might be too few steps to sound normal),
the output sounds just like noise.
It would be better to provide a well-trained example output to illustrate the desired result.
I am trying to run generate.py using python generate.py --samples 16000 ./model.ckpt-3999
However, the loading process fails with the following error:
Caused by op u'save/restore_slice', defined at:
File "generate.py", line 174, in <module>
main()
File "generate.py", line 113, in main
saver = tf.train.Saver(variables_to_restore)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 986, in __init__
self.build()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1015, in build
restore_sequentially=self._restore_sequentially)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 620, in build
restore_sequentially, reshape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 357, in _AddRestoreOps
tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 270, in restore_op
preferred_shard=preferred_shard))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/io_ops.py", line 204, in _restore_slice
preferred_shard, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_io_ops.py", line 359, in _restore_slice
preferred_shard=preferred_shard, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 749, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2380, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1298, in __init__
self._traceback = _extract_stack()
NotFoundError (see above for traceback): Tensor name "wavenet/causal_layer/Variable" not found in checkpoint files ./model.ckpt-3999
[[Node: save/restore_slice = RestoreSlice[dt=DT_FLOAT, preferred_shard=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/restore_slice/tensor_name, save/restore_slice/shape_and_slice)]]
[[Node: save/restore_slice_48/_117 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_36_save/restore_slice_48", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
The tensors in my checkpoint file have names that look deliberate, for example, wavenet/causal_layer/filter/Adam
However, some println debugging shows that the saver in generate.py is trying to load very generic variable names like this:
wavenet/causal_layer/Variable
wavenet/dilated_stack/layer0/Variable
wavenet/dilated_stack/layer0/Variable_1
wavenet/dilated_stack/layer0/Variable_2
wavenet/dilated_stack/layer0/skip
wavenet/dilated_stack/layer1/Variable
I'm using tensorflow master compiled against Cuda 8.0 RC.
Not sure if this is an issue with my setup or a bug. Any help is greatly appreciated.
Would it be helpful to add the ability to manually change the learning rate during training? I have a naive implementation of this feature here.
My computer is not very powerful and I don't have resources available at my university. Could someone, or the README.md, give us a link where we can download a pre-trained model?
I've started to play around with the MagnaTagATune dataset.
There's a small change that needs to be made to the code when training on this dataset:
Because it uses mp3 instead of wav, the pattern in wavenet/audio_reader.py needs to be adjusted.
It would be nice to write a MagnaReader class that inherits from the AudioReader (or contains one), and that's able to filter the content by genre using the provided metadata.
Some of the variables don't have a name, which makes it harder to debug. PR's welcome.
Unfortunately, there's currently no easy way to disable the verbose logging output from ffmpeg that results from loading the .wav files into TensorFlow.
(I've recompiled TensorFlow to disable the output while developing the network.)
I can use a different library to decode the .wav files, but that would add an extra dependency (and some extra code).
One of the authors mentioned that the skip connections are connected to a separate 1x1 convolution from the one whose output goes into the add block.
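If that reading is correct, the layer structure would look roughly like this (a NumPy sketch with random placeholder weights; a 1x1 convolution over time is just a per-timestep matrix multiply, and the channel sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
time_steps, residual_channels, skip_channels = 100, 32, 64

layer_input = rng.standard_normal((time_steps, residual_channels))
# Gated activation unit: tanh(x) * sigmoid(x) as a stand-in for the
# separately-parameterized filter and gate convolutions.
gated = np.tanh(layer_input) * (1.0 / (1.0 + np.exp(-layer_input)))

# Two INDEPENDENT 1x1 convolutions on the gated activation:
w_dense = rng.standard_normal((residual_channels, residual_channels))
w_skip = rng.standard_normal((residual_channels, skip_channels))
transformed = gated @ w_dense       # goes into the residual add block
skip_contribution = gated @ w_skip  # accumulated on the skip path
layer_output = layer_input + transformed
```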
I tested _causal_dilated_conv in wavenet.py with the following toy example:
batch_size: 1
height: 1
width: 20
in_channel: 1
out_channel: 1
dilation: 4
The "value" tensor is set with shape [1, 1, 20, 1], with values from 1 to 20. The "filter" is set with shape [1, 2, 1, 1], and both values are 1.
The "out" tensor is:
[[[[ 3.], [ 5.], [ 7.], [ 9.], [ 13.], [ 15.], [ 17.], [ 19.], [ 23.], [ 25.], [ 27.], [ 29.], [ 33.], [ 35.], [ 37.], [ 39.]]]]
I suppose the correct "out" should be:
[[[[ 6.], [ 8.], [ 10.], [ 12.], [ 14.], [ 16.], [ 18.], [ 20.], [ 22.], [ 24.], [ 26.], [ 28.], [ 30.], [ 32.], [ 34.], [ 36.]]]]
In this example, the "reshaped" tensor is:
[[[[ 1.], [ 2.], [ 3.], [ 4.], [ 5.]]],
[[[ 6.], [ 7.], [ 8.], [ 9.], [ 10.]]],
[[[ 11.],[ 12.],[ 13.],[ 14.],[ 15.]]],
[[[ 16.],[ 17.],[ 18.],[ 19.],[ 20.]]]]
I guess it should be
[[[[ 1.], [ 5.], [ 9.], [ 13.], [ 17.]]],
[[[ 2.], [ 6.], [ 10.], [ 14.], [ 18.]]],
[[[ 3.], [ 7.], [ 11.], [ 15.], [ 19.]]],
[[[ 4.], [ 8.], [ 12.], [ 16.], [ 20.]]]]
and then the result should be correct.
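The interleaved reshape proposed above can be written directly in NumPy (a sketch of the idea, not the repo's actual implementation; the height axis is dropped for brevity, so tensors are [batch, width, channels]):

```python
import numpy as np

def time_to_batch(value, dilation):
    """Move the time axis into the batch axis by interleaving: sample t
    goes to batch row t % dilation, position t // dilation. A width-2
    convolution on the result then pairs each sample with the one
    `dilation` steps earlier, as a dilated causal convolution should."""
    batch, width, channels = value.shape
    reshaped = value.reshape(batch, width // dilation, dilation, channels)
    return reshaped.transpose(0, 2, 1, 3).reshape(
        batch * dilation, width // dilation, channels)
```

On the toy example (values 1..20, dilation 4), the batch rows come out as [1, 5, 9, 13, 17], [2, 6, 10, 14, 18], etc., matching the tensor suggested above.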
Are the magic numbers for weight initialization chosen for particular reasons (e.g. stddev=0.3)? Would we prefer something like tf.contrib.layers.xavier_initializer?
Has anybody gotten a loss lower than ~2? I've tried a couple of configurations (the default, plus 3 and 4 stacks of 10 dilation layers), but the loss does not get any lower, suggesting the network is not learning anymore.
Also, here is what happened after ~30k steps:
I believe this is the same problem as reported in #30. Here is what happens with the weights:
Now running the same network with L2 norm regularization added.
And one more note: training just stops after 44256 steps (this has already happened twice) without any warnings or errors, despite num_steps=50000
I'm trying to understand the differences between the implementation in wavenet.py in this repository and the implementation in @tomlepaine's fast-wavenet. I think I understand the insight in fast-wavenet, but are there drawbacks associated with it that mean this repository wouldn't want to just adopt it as the main implementation? Why keep separate "fast" and (presumably) "slow" models?
There's no way to turn off fast generation that I can find, and a model using biases needs to have it switched off in order to generate. The --fast_generation command-line arg is always True no matter what you do.
In the white paper, they mention conditioning on a particular speaker as an input conditioned globally, and the TTS component as up-sampled (via deconvolution) and conditioned locally. For the latter, they also mention that they tried just repeating the values, but found it worked less well than the deconvolutions.
Is there effort underway to implement either of these? Practically speaking, implementing the local conditioning would allow us to begin to have this implementation speak recognizable words.
How to use TensorBoard?
Here you say that you will quantize the channels to 256 possible amplitude values (as mentioned on page 3 of the original paper). When you run the quantization in the preprocessing step, you cast the result of the mu-law companding transformation to a tf.int32, which can take on 4294967296 different values. Am I mistaken, or should this be cast to a tf.int8 instead?
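For reference, here is a sketch of the mu-law companding and quantization step (pure NumPy; the parameter name mirrors wavenet_params.json). Whatever the container dtype, the encoded values only ever span 0..255:

```python
import numpy as np

def mu_law_encode(audio, quantization_channels=256):
    """Quantize audio in [-1, 1] to `quantization_channels` integer bins."""
    mu = quantization_channels - 1
    # Mu-law companding transformation (ITU-T G.711 style).
    magnitude = np.log1p(mu * np.abs(audio)) / np.log1p(mu)
    signal = np.sign(audio) * magnitude
    # Map [-1, 1] to 0..mu and round to the nearest bin.
    return ((signal + 1) / 2 * mu + 0.5).astype(np.int32)
```

So the tf.int32 container is harmless for correctness; any integer type of 8 bits or wider would hold the same 256 values.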
When I try to run train.py, even with the --help flag, I receive the error:
File "train.py", line 40 parser.add_argument('--num_steps', type=int, default-NUM_STEPS, SyntaxError: non-keyword arg after keyword arg
and am thus unable to train the network.
All dependencies (tensorflow, FFmpeg, etc.) are installed.
This is related to #104.
I'm thinking ahead, perhaps to when this repo supports conditioning on labels. Since most interesting audio sets may not be as neatly organized as VCTK, you could potentially trim your audio and create labels via analysis.
Even without labels, it's useful to be able to train on a subset of a folder, or even subsets of individual audio files. One workflow I am developing is to isolate a segment of interest, then find audio chunks that are similar to it (using mel bands or other DSP stats useful in music information retrieval). It can be a bit heavy-handed, but it can help weed out noise or other undesirable material.
It might be beyond the scope of what this repo is aiming for, in which case I'll just keep developing it separately. But one useful feature I'd propose to start with is simply letting you specify your training set in a text file rather than just a directory: maybe a JSON file with paths to audio files and a list of subsegments to pull out. Some supporting scripts could be used to generate it, e.g. a function that takes an audio file and a time interval as input, and produces a JSON of T seconds of audio segments from a directory that are most similar to the input.
The performance of the network on GPUs seems to be lagging behind the CPU performance.
I suspect that this is because the 2D convolution isn't designed to work efficiently if the height of the input is 1.
It shouldn't be too difficult to write some custom code to perform an efficient 1D convolution.
For example, an FFT could be used for this.
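The FFT idea can be sketched in a few lines of NumPy (linear convolution by zero-padding both inputs to the full output length, multiplying spectra, and inverting):

```python
import numpy as np

def conv1d_fft(signal, kernel):
    """Full linear 1D convolution computed via the real FFT."""
    n = signal.size + kernel.size - 1
    return np.fft.irfft(np.fft.rfft(signal, n) * np.fft.rfft(kernel, n), n)
```

For the very short filters WaveNet uses (width 2), a direct implementation may still win; the FFT route pays off mainly for long kernels.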
The current implementation computes the short-time RMSE amplitude, applies a given threshold on its value (the units are raw amplitude), and trims the audio from the beginning until a frame above the threshold is found, and from the last frame above the threshold until the end. If the result is empty, that file is discarded.
At some point, we might want to implement a better algorithm, such as using a threshold relative to the maximum amplitude in the example or applying a smoothing filter. The algorithm should also be configurable.
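A relative-threshold variant with smoothing, as suggested above, might look like this (a pure-NumPy sketch; the ratio, frame length, and smoothing window are illustrative knobs, not the repo's current parameters):

```python
import numpy as np

def trim_silence_relative(audio, ratio=0.05, frame_length=2048, smooth=5):
    """Keep frames louder than `ratio` times the loudest frame, after
    smoothing the per-frame RMS energy with a moving average to
    suppress single-frame clicks."""
    if audio.size == 0:
        return audio
    frame_length = min(frame_length, audio.size)
    n_frames = audio.size // frame_length
    frames = audio[:n_frames * frame_length].reshape(n_frames, frame_length)
    energy = np.sqrt(np.mean(frames ** 2, axis=1))
    # Moving-average smoothing of the energy curve.
    window = min(smooth, n_frames)
    energy = np.convolve(energy, np.ones(window) / window, mode="same")
    loud = np.nonzero(energy > ratio * energy.max())[0]
    if loud.size == 0:
        return audio[0:0]
    return audio[loud[0] * frame_length:(loud[-1] + 1) * frame_length]
```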
It would be great to be able to continue training from someone else's trained model, or to resume my own paused one.
Hi,
I followed all the instructions in the README for training a model (on the default dataset). Then I used the generation script to generate a few seconds of sound, again according to the instructions, but unfortunately I got silence.
Any idea what I might have gotten wrong?
Running train.py on Nvidia GRID K2, 1536 cores, 3.5 GB memory.
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GRID K2
major: 3 minor: 0 memoryClockRate (GHz) 0.745
pciBusID 0000:00:04.0
Total memory: 3.50GiB
Free memory: 3.45GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GRID K2, pci bus id: 0000:00:04.0)
Trying to restore saved checkpoints from ./logdir/train/2016-09-20T19-23-04 ... No checkpoint found.
step 0 - loss = 7.461, (13.180 sec/step)
Storing checkpoint to ./logdir/train/2016-09-20T19-23-04 ... Done.
Then the BFC allocator runs out of memory:
W tensorflow/core/common_runtime/bfc_allocator.cc:270] **************************************************************************************__**********xx
W tensorflow/core/common_runtime/bfc_allocator.cc:271] Ran out of memory trying to allocate 194.79MiB. See logs for memory state.
W tensorflow/core/framework/op_kernel.cc:940] Resource exhausted: OOM when allocating tensor with shape[199462,256]
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 3463 get requests, put_count=3233 evicted_count=1000 eviction_rate=0.30931 and unsatisfied allocation rate=0.38406
Do you know if there is a setting I can specify to work around GPU memory limitations ?
Compared to the https://github.com/basveeling/wavenet example output, their implementation sounds much clearer.
Has anybody managed to produce clear output?
It's not ideal that the length of the samples varies wildly.
This could be fixed by cropping (or even padding) them to a fixed size.
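A sketch of that fix (NumPy; `target_len` is a hypothetical knob, e.g. a few seconds' worth of samples at 16 kHz):

```python
import numpy as np

def to_fixed_length(audio, target_len):
    """Crop a random window or zero-pad so every sample has the same length."""
    if audio.size >= target_len:
        start = np.random.randint(0, audio.size - target_len + 1)
        return audio[start:start + target_len]
    # Pad short clips at the end with silence.
    return np.pad(audio, (0, target_len - audio.size))
```

Random cropping also doubles as cheap data augmentation, since each epoch can see a different window of each long file.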
The current setup with some parameters specified as CLI flags and others in the JSON file is not optimal. There are two competing needs:
We should find a way to accommodate both. I propose:
It still doesn't feel optimal, though. Any suggestions?
What is the desirable size for an input frame? I think mini-batches would be beneficial, so we could decrease the frame size to get a larger mini-batch.
Isn't that necessary?
I'd like to add unit tests before extending the implementation further.
That should prevent us from pushing non-working versions of the code in the future.
TensorFlow has a nice API for tests, which I'd like to use for this: https://www.tensorflow.org/versions/r0.10/api_docs/python/test.html#testing
Is it necessary to write a script that checks prerequisites, to avoid import errors such as "ImportError: No module named librosa"?
Traceback (most recent call last):
File "train.py", line 171, in
main()
File "train.py", line 167, in main
coord.join(threads)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/coordinator.py", line 325, in join
" ".join(stragglers))
RuntimeError: ('Coordinator stopped with threads still running: %s', 'Thread-4 Thread-6 Thread-8')
I tried the solution in http://stackoverflow.com/questions/36210162/tensorflow-stopping-threads-via-coordinator-seems-not-to-work, but it did not work.
Should we use YAML instead of JSON for the parameters?
👍 : More human-friendly, more compact.
👍 : We can add comments explaining the meaning of each parameter.
👎 : Not native, we would need to add pyYAML to the dependencies.
It's pretty annoying to have to wait for the generation of the entire audio waveform before being able to inspect the output.
It would make sense to store the current output periodically.
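One way to do this, sketched below with the standard-library wave module (the `checkpoint_every` parameter and function names are hypothetical, not the repo's actual API): flush a snapshot of the waveform-so-far at a fixed interval inside the generation loop.

```python
import wave
import numpy as np

def write_wav(path, samples, sample_rate=16000):
    """Write float samples in [-1, 1] as 16-bit mono PCM."""
    with wave.open(path, 'wb') as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes((np.clip(samples, -1, 1) * 32767).astype('<i2').tobytes())

def generate(n_samples, next_sample, checkpoint_every=1000, path='partial.wav'):
    """Sample-by-sample generation loop that periodically flushes the
    waveform so progress can be auditioned before the full clip is done."""
    waveform = []
    for i in range(n_samples):
        waveform.append(next_sample(waveform))
        if (i + 1) % checkpoint_every == 0:
            write_wav(path, np.array(waveform))
    return np.array(waveform)
```

Since the whole file is rewritten at each checkpoint, the cost is negligible next to the per-sample network evaluation.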
When I try to run generate.py per the readme, I get this:
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0)
Restoring model from model.ckpt-250
Traceback (most recent call last):
File "generate.py", line 86, in <module>
main()
File "generate.py", line 66, in main
feed_dict={samples: window})
File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 710, in run
run_metadata_ptr)
File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 908, in _run
feed_dict_string, options, run_metadata)
File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 958, in _do_run
target_list, options, run_metadata)
File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 978, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: Output dimensions must be positive
[[Node: wavenet/dilated_stack/layer1/conv_filter/BatchToSpace = BatchToSpace[T=DT_FLOAT, block_size=2, _device="/job:localhost/replica:0/task:0/gpu:0"](wavenet/dilated_stack/layer1/conv_filter, wavenet/dilated_stack/layer1/conv_filter/BatchToSpace/crops)]]
Caused by op u'wavenet/dilated_stack/layer1/conv_filter/BatchToSpace', defined at:
File "generate.py", line 86, in <module>
main()
File "generate.py", line 51, in main
next_sample = net.predict_proba(samples)
File "/home/ubuntu/jupyter_base/project/tensorflow-wavenet/wavenet.py", line 154, in predict_proba
raw_output = self._create_network(encoded)
File "/home/ubuntu/jupyter_base/project/tensorflow-wavenet/wavenet.py", line 112, in _create_network
self.dilation_channels)
File "/home/ubuntu/jupyter_base/project/tensorflow-wavenet/wavenet.py", line 51, in _create_dilation_layer
name="conv_filter")
File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 228, in atrous_conv2d
block_size=rate)
File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 308, in batch_to_space
block_size=block_size, name=name)
File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 703, in apply_op
op_def=op_def)
File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2317, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1239, in __init__
self._traceback = _extract_stack()