Comments (7)
@fatchord I find it almost like magic that the upsampling network finds a reasonable interpolation method all by itself. Convolution with a Gaussian kernel is a reasonable interpolation method.
I trained a network with simplified upsampling and it produces very similar results with fewer trainable parameters.
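To make the "Gaussian kernel as interpolation" idea concrete, here is a minimal NumPy sketch. It is not the code from either model: the function name `gaussian_upsample`, the default `sigma = factor / 2` and the truncation at three sigma are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_upsample(x, factor, sigma=None):
    """Upsample a 1-D signal by `factor`: zero-stuff, then smooth with a Gaussian."""
    if sigma is None:
        sigma = factor / 2.0  # assumed default, not taken from any trained model
    # Zero-stuffing: place each input sample `factor` steps apart.
    stuffed = np.zeros(len(x) * factor)
    stuffed[::factor] = x
    # Gaussian kernel truncated at +/- 3 sigma (odd length, so it is centred).
    half = int(np.ceil(3 * sigma))
    t = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (t / sigma) ** 2)
    # Scale so that a constant input keeps roughly the same level after upsampling.
    kernel *= factor / kernel.sum()
    return np.convolve(stuffed, kernel, mode="same")

y = gaussian_upsample(np.ones(10), factor=4)
```

Away from the edges, a constant input stays close to constant, which is the basic sanity check for an interpolation kernel.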
from wavernn.
@geneing I probably would've agreed had someone not mentioned to me that my upsampling "is basically a gaussian convolution with a time-shift". So I had a look at the kernel weights after training and this is what I found:
That first kernel is going to be more important overall, and I reckon it does indeed look something like a Gaussian, but shifted to the right a bit. I checked out another model and found more or less the same thing.
What do you make of it?
@fatchord here are the kernel responses and the total impulse response of the upsampling layers (my kernels are a bit smoother since my model differs a bit from yours):
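For anyone who wants to compute a total impulse response themselves, a rough sketch: push a unit impulse through every stage in turn. Everything specific here is an assumption for illustration. Each stage is modelled as "repeat each sample, then convolve", the scales `(5, 5, 11)` are hypothetical, and flat box kernels stand in for trained weights.

```python
import numpy as np

def upsample_stage(x, scale, kernel):
    """One stage: repeat every sample `scale` times, then smooth with `kernel`."""
    stretched = np.repeat(x, scale)
    return np.convolve(stretched, kernel, mode="same")

def total_impulse_response(scales, kernels, length=9):
    """Feed a unit impulse through all stages and return the combined response."""
    x = np.zeros(length)
    x[length // 2] = 1.0  # single impulse in the middle of the input
    for scale, kernel in zip(scales, kernels):
        x = upsample_stage(x, scale, kernel)
    return x

# Hypothetical setup: three stages, box kernels of width 2 * scale + 1 that sum to 1.
scales = (5, 5, 11)
kernels = [np.full(2 * s + 1, 1.0 / (2 * s + 1)) for s in scales]
h = total_impulse_response(scales, kernels)
```

Replacing the box kernels with weights read out of a trained model would give the response of that model, provided its stages really are linear.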
btw, linear interpolation is just convolution with a 'triangular' filter so linear upsampling might indeed be just as ok 👍
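A quick NumPy check of the triangular-filter claim, as a sketch: zero-stuff the signal, convolve with a triangle of half-width `factor`, and compare against `np.interp`. The function name `linear_upsample_by_conv` is made up for this example.

```python
import numpy as np

def linear_upsample_by_conv(x, factor):
    """Linear interpolation written as zero-stuffing plus a triangular filter."""
    # Output grid runs from the first input sample to the last one.
    stuffed = np.zeros((len(x) - 1) * factor + 1)
    stuffed[::factor] = x
    # Triangle: weight 1 at the centre, falling linearly to 0 at +/- factor.
    tri = 1.0 - np.abs(np.arange(-factor + 1, factor)) / factor
    full = np.convolve(stuffed, tri)  # "full" convolution
    # Drop the leading and trailing overhang so the output aligns with `stuffed`.
    return full[factor - 1 : factor - 1 + len(stuffed)]

x = np.array([0.0, 2.0, 1.0, 5.0])
y = linear_upsample_by_conv(x, factor=4)
reference = np.interp(np.arange(len(y)) / 4, np.arange(len(x)), x)
```

Here `y` and `reference` agree to floating-point precision.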
@geneing so you just use the model="linear" interpolation method? I think the 1D ResNet needs more compute than the upsampling does.
@geneing Maybe I'm totally off the mark here but hear me out... the fact that the Gaussian is shifted forward in time intrigues me.
I mean, what if one tried shifting the conditioning features back in time by the same amount as the offset in the Gaussian? Would the Gaussian end up centred? Could one interpret the offset as saying 'this model is prioritising future conditioning features over current ones'? Why not have two upsampling networks and give both a slice of current and future conditioning features?
Sorry, that's a lot of questions but I'm just curious what you'd think about that line of reasoning.
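One part of this can be checked without training anything: for a single linear convolution, an offset in the kernel is interchangeable with the same offset applied to the input. The sketch below shows only that identity. It says nothing about what a retrained network would learn, and the kernel width, sigma and offset `d = 3` are arbitrary assumptions.

```python
import numpy as np

def delay(signal, d):
    """Delay a 1-D signal by `d` samples by prepending `d` zeros."""
    return np.concatenate([np.zeros(d), signal])

# Centred Gaussian kernel (13 taps, sigma = 2), normalised to sum to 1.
t = np.arange(-6, 7)
centred = np.exp(-0.5 * (t / 2.0) ** 2)
centred /= centred.sum()

rng = np.random.default_rng(0)
features = rng.standard_normal(50)  # stand-in for one conditioning channel
d = 3  # arbitrary offset in samples

a = np.convolve(features, delay(centred, d))  # offset kernel, original features
b = np.convolve(delay(features, d), centred)  # centred kernel, offset features
```

`a` and `b` are identical, so in the purely linear case shifting the features by the kernel's offset would be equivalent to centring the kernel. Whether that carries over to the full model with its nonlinear parts is an open question.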
@geneing, @fatchord Amazon recently published a paper on implementing a universal neural vocoder: https://arxiv.org/abs/1811.06292
Their architecture is quite simple: a unidirectional RNN followed by a dense layer and then a softmax (with 10-bit mu-law), conditioned on outputs from an up-sampling network (they use RNNs for upsampling). From reading the paper, it seems the most important thing is to have a varied dataset (74 speakers, 17 different languages, etc.).
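For reference, a sketch of what 10-bit mu-law quantisation usually means: compand the waveform with mu = 1023, then round to one of 1024 classes for the softmax. This is the textbook formula, assumed here rather than taken from the paper, and the function names are made up.

```python
import numpy as np

def mulaw_encode(x, bits=10):
    """Map audio in [-1, 1] to integer classes in [0, 2**bits - 1]."""
    mu = 2 ** bits - 1
    # Compand: compress large amplitudes, expand small ones.
    y = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    # Shift [-1, 1] to [0, mu] and round to the nearest class.
    return np.floor((y + 1) / 2 * mu + 0.5).astype(np.int64)

def mulaw_decode(q, bits=10):
    """Invert mulaw_encode, up to quantisation error."""
    mu = 2 ** bits - 1
    y = 2 * q.astype(np.float64) / mu - 1
    return np.sign(y) * ((1 + mu) ** np.abs(y) - 1) / mu

x = np.linspace(-1.0, 1.0, 101)
q = mulaw_encode(x)
x_hat = mulaw_decode(q)
```

With 10 bits the round-trip error stays below 0.01 across the whole range, and it is much smaller near zero where most speech samples sit.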