Comments (73)
@bearlu007 Here is some of my code that you could use as a reference:
- 55d to 20d:
import numpy as np

def reduce_dim(features):
    """Reduce dimension from 55d to 20d.

    Keep features[0:18] and features[36:38] only.
    :param features: 55d
    :return: 20d
    """
    N, D = features.shape
    assert D == 55, "Dimension error. %sx%s" % (N, D)
    features = np.concatenate((features[:, 0:18], features[:, 36:38]), axis=1)
    assert features.shape[1] == 20, "Dimension error. %s" % str(features.shape)
    return features
- Convert 20d to 55d at test time:
N = input.shape[0]  # number of frames in the predicted 20d array
features = np.zeros((N, 55))
features[:, 0:18] = input[:, 0:18]
features[:, 36:38] = input[:, 18:20]
from lpcnet.
So it works. Here are my samples:
https://yadi.sk/d/mBUJVSCzVVd2fQ
I achieved the result in the following steps:
- Load the pre-trained model.
- Take a WAV sample from a trained Tacotron-2 without a vocoder (01.wav).
- Convert it to 16-bit 16 kHz mono raw PCM (sox taco2-out.wav -b 16 -s -c 1 -r 16k -t raw - > input.s16).
- Compile the data processing program (./compile.sh) and run it (./dump_data input.s16 exc.s8 features.f32 pred.s16 pcm.s16) to get the features.f32 file.
- Synthesize speech with LPCNet (./test_lpcnet.py features.f32 > pcm.txt). It works quite slowly...
- Convert pcm.txt to PCNet-out.wav (ffmpeg -f s16le -ar 16k -ac 1 -i pcm.txt PCNet-out.wav).
So, am I right? But why is it so slow?
P.S. With the RNN vocoder I got better results...
So if I'm right, I'll try to connect Tacotron-2 and LPCNet. Or would it be a better choice to use something else instead of Tacotron-2?
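For the last step in the list above, here is a minimal Python sketch of an alternative to the ffmpeg command. It assumes the synthesized output is headerless signed 16-bit little-endian mono PCM at 16 kHz, which is what the ffmpeg flags describe. The function name raw_s16_to_wav is hypothetical and only used for illustration.

```python
import wave


def raw_s16_to_wav(raw_path, wav_path, sample_rate=16000, channels=1):
    """Wrap headerless signed 16-bit little-endian PCM in a WAV container.

    Hypothetical helper: the defaults mirror the 16 kHz mono format used
    in the steps above.
    """
    with open(raw_path, "rb") as f:
        pcm = f.read()
    # Drop trailing bytes that do not form a whole 16-bit sample frame.
    frame_bytes = 2 * channels
    pcm = pcm[: len(pcm) - (len(pcm) % frame_bytes)]
    with wave.open(wav_path, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)  # 2 bytes = 16 bits per sample
        w.setframerate(sample_rate)
        w.writeframes(pcm)
    return len(pcm) // frame_bytes  # number of sample frames written
```

Usage would look like raw_s16_to_wav("pcm.txt", "PCNet-out.wav"), assuming pcm.txt really holds raw samples as in the steps above.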
from lpcnet.
Thanks for your response. Yes, it works. Of course, I synthesized the sound from Tacotron 2 to demonstrate the result (to show progress, so to speak). I tested LPCNet for Korean and Russian. The results are impressive. I will develop an implementation of Tacotron 2 with a closer connection to LPCNet to make an end-to-end TTS system. If Tacotron 2 runs on the server (without the WaveNet vocoder) and LPCNet runs on the clients, it solves many problems and reduces server load by up to 10 times.
from lpcnet.
@gosha20777 What acoustic features did you use when you trained the TTS model? I've trained with both 55-dimension and 21-dimension features; however, the results are not good.
from lpcnet.
Are you training end-to-end or are you just learning the LPCNet features from text? Also, make sure that the LPC features are not predicted, but rather computed directly from the predicted cepstral features.
from lpcnet.
LPCNet is basically one half of a TTS system. It takes an acoustic feature vector every 10 ms and outputs speech samples. For TTS, you also need a network that takes in characters and outputs these acoustic feature vectors.
from lpcnet.
@jmvalin Hi, I have trained a taco2 model to predict the 18-band Bark-scale cepstrum and 2 pitch parameters.
Can you tell me how to compute the LPC from the Bark-scale cepstrum, or which part of denoise.c does this work?
Thank you.
from lpcnet.
@changeforan To compute the LPC coefficients, look for the _celt_lpc() function in denoise.c. The process starts from Ex, computed by compute_band_energy(), so you'd need to invert a few more steps, but that shouldn't be too hard.
from lpcnet.
@jmvalin Thanks for your quick response, but I am still confused.
It seems like you compute the LPC at line 399 and assign them to features[39:55] at line 448, but if features[0:18] are the Bark-scale coefficients, they were computed after line 399.
After reading your paper, I think features[39:55] should be computed from features[0:18]:
18-band Bark-frequency cepstrum ----> PSD ----> auto-correlation ----> LPC
Am I right?
from lpcnet.
The LPC are the same as if they'd been computed on features[0:18]. The spectrum on which they're computed in the C code is the same that's used to compute the cepstrum and the operation is reversible.
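The cepstrum -> PSD -> auto-correlation -> LPC chain discussed above can be sketched in NumPy as follows. This is only an illustration under loud assumptions, not the code in denoise.c: it assumes an orthonormal DCT and log10 band energies, uses evenly spaced placeholder band centres instead of the real Bark layout, and ignores any offset, lag windowing or scaling the C code may apply. All function names are hypothetical.

```python
import numpy as np


def idct_ortho(c):
    """Inverse of an orthonormal DCT-II (assumed cepstrum convention)."""
    n = len(c)
    k = np.arange(n)
    # basis[i, j] = cos((i + 0.5) * j * pi / n); the j = 0 column is scaled by sqrt(0.5)
    basis = np.cos((k[:, None] + 0.5) * k[None, :] * np.pi / n)
    basis[:, 0] *= np.sqrt(0.5)
    return np.sqrt(2.0 / n) * (basis @ c)


def levinson_durbin(r, order):
    """Solve the normal equations for A(z) = 1 + a1 z^-1 + ... ; returns a1..a_order."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a[1:]


def cepstrum_to_lpc(cepstrum, band_bins=None, n_fft=320, order=16):
    """Cepstrum -> band energies -> interpolated PSD -> autocorrelation -> LPC."""
    log_energy = idct_ortho(np.asarray(cepstrum, dtype=np.float64))
    band_energy = 10.0 ** log_energy  # assumes log10 band energies
    n_bins = n_fft // 2 + 1
    if band_bins is None:
        # Placeholder: evenly spaced band centres, NOT the real Bark layout.
        band_bins = np.linspace(0, n_bins - 1, len(band_energy))
    psd = np.interp(np.arange(n_bins), band_bins, band_energy)
    # The inverse FFT of a power spectrum is the autocorrelation sequence.
    r = np.fft.irfft(psd, n=n_fft)[: order + 1]
    r[0] *= 1.0001  # small noise floor for numerical stability (assumed value)
    return levinson_durbin(r, order)
```

With order=16 the result has the same length as features[39:55]. To match the values produced by the C code, the band layout and scaling would have to be taken from the repository itself.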
from lpcnet.
Well, the way it's normally supposed to work is that you train Tacotron (or whatever network) to directly output features that LPCNet can use. No need to run the synthesis twice (though in this case I guess it was easier for testing purposes).
from lpcnet.
I've got features from an English multi-speaker dataset, about 8 hours.
from lpcnet.
With the original 55 dimension features or other features ?
from lpcnet.
Hmm. I'm not sure... But in my opinion it was 20-dim features.
Try training for a LONG TIME. I trained it for about 5 days on 2x Nvidia 1080 Ti. I used the Horovod library to parallelize it.
from lpcnet.
I can give u a pretrained model if u want.
from lpcnet.
I can't understand what the 120-dim features are or how you extract them. I'd appreciate some explanation. In my opinion, the paper claims to use 20-dim features, while the code actually seems to use 55-dim features.
from lpcnet.
Oh no! Not 120 dim but 20 dim! I'm so sorry :)
from lpcnet.
In the code, it seems to be 21-dim features rather than 20-dim. I've tried to predict the 21-dim features; however, the results do not sound stable. My backbone model is not from the taco series, but a traditional RNN model.
from lpcnet.
@attitudechunfeng I have reviewed the code and found that, features[18:36] is assigned to zero, features[36] and features[37] are about pitch. features[38] is not used at all. features[39:55] are about lpc.
from lpcnet.
So it means that I only need to predict the [0:18] and [36:37] features, 20 dims in total? Do you have good results using these features? @changeforan
from lpcnet.
So it means that I only need to predict the [0:18] and [36:37] features, 20 dims in total? Do you have good results using these features? @changeforan
With the Taco2 model, yes.
from lpcnet.
FYI, I don't think features[38] is useful for anything. OTOH, features[18:36] could potentially be useful for TTS.
from lpcnet.
@attitudechunfeng the 21st dim does not need to be predicted
from lpcnet.
@hdmjdp What do you mean? Can you explain it in more detail?
from lpcnet.
@attitudechunfeng It means you do not need to predict the period, so the net outputs 20 dims.
from lpcnet.
I tried to predict LPCNet parameters directly using a Tacotron model. The generated voice is not very good, and the attention seems very strange. Here are some attention plots and samples (in Chinese). Has anyone else run into this, and does anyone know how to explain it?
More:
tacotron_lpcnet.zip
from lpcnet.
@candlewill Maybe you used the wrong features, as jmvalin said. My alignment is very good, and compared to the mel spectrogram, it is much easier to get the alignment.
from lpcnet.
Thanks @jmvalin and @azraelkuan. I predicted all of the 55d features when doing end-to-end training. I will try changing the features to predict.
from lpcnet.
@azraelkuan
Looks great!
Could you share your synthesized speech from Tacotron + LPCNet?
LPCNet acoustic feature
features[:18] : 18-dim Bark scale cepstrum
features[18:36] : Not used
features[36:37] : pitch period(what is this value?)
features[37:38] : pitch correlation(what is this value?)
features[39:55] : LPC(calculated by cepstrum)
window_size (=n_fft) = 320 (is it right?)
frame_shift(=hop_size) = 160 (is it right?)
And did you train Tacotron to predict the 20-dim feature (concatenating the 18-dim cepstrum and 2 pitch params) instead of the 80-dim mel-spectrogram?
(In that case, the decoder LSTM input will be the 20-dim concatenated feature.)
Or is only the 18-dim cepstrum the input of the decoder LSTM, with the 2 pitch params predicted by a dense projection, like the stop token?
Could you explain the structure in more detail, or give tips for training?
(e.g. window_size, hop_size(=frame shift), and normalization of feature)
I would appreciate your reply.
from lpcnet.
Feature: the 20-dim concatenated feature; I do not split them. I cannot share the samples, sorry.
from lpcnet.
@azraelkuan what is your repo of tacotron?
from lpcnet.
@hdmjdp https://github.com/keithito/tacotron
from lpcnet.
I changed the features to predict, and then the attention could be learnt well. Here are some samples in 16k PCM format generated from an end2end+lpcnet model.
e2e_lpcnet_samples.zip
from lpcnet.
@azraelkuan why not use tacotron2?
from lpcnet.
@candlewill how to convert chinese char to vector?
from lpcnet.
I changed the features to predict, and then the attention could be learnt well. Here are some samples in 16k PCM format generated from an end2end+lpcnet model.
e2e_lpcnet_samples.zip
May I know how you changed your features for modeling and prediction?
from lpcnet.
I changed the features to predict, and then the attention could be learnt well. Here are some samples in 16k PCM format generated from an end2end+lpcnet model.
e2e_lpcnet_samples.zip
May I know how you changed your features for modeling and prediction?
@candlewill Thanks
from lpcnet.
@bearlu007 Here is some of my code that you could use as a reference:
- 55d to 20d:
def reduce_dim(features):
    """Reduce dimension from 55d to 20d.

    Keep features[0:18] and features[36:38] only.
    :param features: 55d
    :return: 20d
    """
    N, D = features.shape
    assert D == 55, "Dimension error. %sx%s" % (N, D)
    features = np.concatenate((features[:, 0:18], features[:, 36:38]), axis=1)
    assert features.shape[1] == 20, "Dimension error. %s" % str(features.shape)
    return features
- Convert 20d to 55d at test time:
features = np.zeros((N, 55))
features[:, 0:18] = input[:, 0:18]
features[:, 36:38] = input[:, 18:20]
Clear enough. Thanks a lot .
from lpcnet.
@azraelkuan I have a question about the predicted features. When training with tacotron, do you only use LPCNET features? Or LPCNET features and Linear spectrogram?
from lpcnet.
@attitudechunfeng only lpcnet features, 20 dimension
from lpcnet.
Thanks for your quick reply. After how many steps does the alignment become good?
from lpcnet.
@attitudechunfeng About 5k steps. I use the real LPC features in the decoder step during training.
from lpcnet.
@azraelkuan Can this repo not tell when to stop?
from lpcnet.
@hdmjdp u can add a stop token to predict it
from lpcnet.
@azraelkuan How do I add it in the decoder cell?
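As a rough illustration of the stop-token idea mentioned above: a common approach (as in Tacotron 2) is to add a dense projection with a sigmoid on top of the decoder output and train it against a 0/1 target per frame. The NumPy sketch below only shows how such targets could be built and how predicted features could be trimmed at inference. It does not show the wiring inside any particular Tacotron repo, and both function names are hypothetical.

```python
import numpy as np


def make_stop_targets(n_frames, max_frames):
    """Targets are 0 while speech continues and 1 from the last real frame on."""
    targets = np.zeros(max_frames, dtype=np.float32)
    targets[n_frames - 1:] = 1.0
    return targets


def trim_at_stop(features, stop_logits, threshold=0.5):
    """Cut decoder output at the first frame whose stop probability exceeds threshold."""
    logits = np.asarray(stop_logits, dtype=np.float64)
    probs = 1.0 / (1.0 + np.exp(-logits))  # sigmoid
    hits = np.nonzero(probs > threshold)[0]
    # If the stop token never fires, keep everything the decoder produced.
    end = hits[0] + 1 if hits.size else len(features)
    return features[:end]
```

The targets would typically be trained with a sigmoid cross-entropy loss alongside the feature regression loss.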
from lpcnet.
@jmvalin If I want to normalize the cepstral coefficients, how should I choose the normalization range? The magnitude of cepstral coefficients seems to vary a lot.
from lpcnet.
Why do you want to normalize the cepstral coefficients?
from lpcnet.
I tried to combine tacotron with LPCNet, which succeeded in a big data set, but failed in a small data set. (The dataset extraction feature only takes one round.) The tacotron output may have a period greater than 3.1, which I think will cause problems in training the LPCNet network (although training does not report an error). So I plan to normalize the cepstrum and pitch parameters.
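One way the normalization described above could be sketched is per-dimension mean-variance normalization, with statistics fitted on the training features and inverted before the predictions are handed to LPCNet. This is an assumption about what such a scheme might look like, not something recommended in this thread; the function names are hypothetical.

```python
import numpy as np


def fit_stats(features, eps=1e-8):
    """Per-dimension mean and standard deviation over all training frames."""
    mean = features.mean(axis=0)
    std = features.std(axis=0) + eps  # eps avoids division by zero
    return mean, std


def normalize(features, mean, std):
    """Scale features to roughly zero mean and unit variance per dimension."""
    return (features - mean) / std


def denormalize(normalized, mean, std):
    """Undo normalize() so LPCNet sees features in their original range."""
    return normalized * std + mean
```

The same mean and std must be saved with the model and reused at synthesis time; otherwise the denormalized features will not match what LPCNet was trained on.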
from lpcnet.
@jmvalin Hi, in your makefile, you give the A53's compile options. Does this mean that this repo can run in real time on an A53 chip? We find it runs much slower than real time. Why?
from lpcnet.
LPCNet is not yet real-time on the A53. That's a pretty slow chip. We've managed real-time performance on an iPhone6 though. So it should run in real-time on most modern smartphones. Just not on RaspberryPi yet. That may eventually be achievable, but that's not what we're working on atm.
from lpcnet.
@jmvalin Thanks. We tested LPCNet on a phone with an A73 chip, and it cannot run in real time yet. I will try training it with 32*1 sparse blocks, so it can use 17 registers. What do you think?
from lpcnet.
Hi @candlewill,
While testing, do you predict the 20-dim features using Tacotron, convert them back to 55-dim by zero-padding the other 35 dimensions, and then synthesize directly with LPCNet?
from lpcnet.
Hi Team - (@candlewill or @azraelkuan if you can help out that would be amazing)
I'm getting started with Speech Synthesis and TTS. This might be a naive question, so please bear with my ignorance.
Given a predicted 80-dimensional mel-spectrogram from say DeepVoice or Tacotron, what are the steps to post-process it so that it can be fed directly as an input (18-Bark Scale Cepstral Coefficients and 2 pitch params) for LPCNet?
Goal: numpy array (.npy file) from TTS -> features.f32, without generating a waveform and converting that to a raw audio file to be fed into LPCNet.
Assuming that my base TTS model is not trained e2e for LPCNet features, let's say I use the function below:
def reduce_dim(features):
    """Reduce dimension from 55d to 20d.

    Keep features[0:18] and features[36:38] only.
    :param features: 55d
    :return: 20d
    """
    N, D = features.shape
    assert D == 55, "Dimension error. %sx%s" % (N, D)
    features = np.concatenate((features[:, 0:18], features[:, 36:38]), axis=1)
    assert features.shape[1] == 20, "Dimension error. %s" % str(features.shape)
    return features
to reduce my predicted 80-dimensional mel-spectrogram down to 20d. Where in this repo should I start in order to generate a test_features.f32 from a numpy array? I've been looking into dump_data.c but am a bit lost. Any pointers (e.g. correcting my naive assumptions, which file and line in the repo should be used, a strategy for converting an npy array into features.f32 directly, etc.) would be super appreciated! Thanks y'all
from lpcnet.
@pgmbayes You can refer to my repo. You can turn on the tacotron2 macro to incorporate it with a Deep Voice or Tacotron repo.
from lpcnet.
@pgmbayes Why not just predict the 18 Bark-scale cepstral coefficients and 2 pitch params using Tacotron or Deep Voice?
from lpcnet.
Hi @candlewill, how many epochs would it take to get samples like e2e_lpcnet_samples.zip? Thank you.
from lpcnet.
@HallidayReadyOne I trained it for 120 epochs, which is the default parameter. What's more important is that before using LPCNet, you should make sure your end2end model can predict the features LPCNet uses well.
from lpcnet.
@candlewill Thank you for the kind reply. Yep, the text2feature model is important. I have trained a Tacotron model to predict the lpcnet-used features. The attention alignment is quite good now. However, the samples from LPCNet (about 18 epochs) are unstable.
from lpcnet.
Hi @candlewill, how many steps and what batch size did you use when you trained the end2end model to predict the lpcnet-used features? Thanks!
from lpcnet.
I can give u a pretrained model if u want.
Can you share the pretrained model
from lpcnet.
@jmvalin Thanks for your quick response, but I am still confused.
It seems like you compute the LPC at line 399 and assign them to features[39:55] at line 448, but if features[0:18] are the Bark-scale coefficients, they were computed after line 399.
After reading your paper, I think features[39:55] should be computed from features[0:18]:
18-band Bark-frequency cepstrum ----> PSD ----> auto-correlation ----> LPC
Am I right?
@changeforan
How can I get the 20-dim features from text using Tacotron or Tacotron 2 so that they can be fed to LPCNet? Is there any repo or set of steps that I can follow?
from lpcnet.
@pgmbayes You can refer to my repo. You can turn on the tacotron2 macro to incorporate it with a Deep Voice or Tacotron repo.
How can we pass Tacotron features so that they are converted to features.f32?
Suppose we are writing features from Tacotron 2 as *.npy or *.pkl or even a raw binary file.
from lpcnet.
@cahuja1992
the script is very simple.
import numpy as np
npy_data = np.load("mel_220k_0.npy")
npy_data = npy_data.reshape((-1,))
npy_data.tofile("mel_220k_0.s32")
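One thing to watch with the script above is that tofile writes the array in whatever dtype it already has. Below is a hedged sketch that expands a predicted 20-dim array to the 55-dim layout discussed in this thread and forces 32-bit floats. It assumes LPCNet reads the feature file as raw float32 with 55 values per frame, and it leaves the LPC slots as zeros, as in the snippet posted earlier. The function names are hypothetical.

```python
import numpy as np


def write_lpcnet_features(pred_20d, path):
    """Expand (n_frames, 20) predictions to (n_frames, 55) float32 and dump them raw."""
    pred_20d = np.asarray(pred_20d)
    n_frames, dim = pred_20d.shape
    assert dim == 20, "expected 20-dim features, got %d" % dim
    features = np.zeros((n_frames, 55), dtype=np.float32)
    features[:, 0:18] = pred_20d[:, 0:18]    # Bark-scale cepstrum
    features[:, 36:38] = pred_20d[:, 18:20]  # pitch period and pitch correlation
    features.tofile(path)                    # raw float32, no header
    return features


def read_lpcnet_features(path):
    """Read a raw float32 feature file back as (n_frames, 55)."""
    return np.fromfile(path, dtype=np.float32).reshape(-1, 55)
```

Reading the file back with read_lpcnet_features is a quick way to confirm that the frame count is what you expect before running synthesis.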
from lpcnet.
@cahuja1992
the script is very simple.
import numpy as np
npy_data = np.load("mel_220k_0.npy")
npy_data = npy_data.reshape((-1,))
npy_data.tofile("mel_220k_0.s32")
From Tacotron we should be using model.mel_outputs, which comes out with dimensions (1, 1000, 80) for one audio clip. In order to match the dimensions for LPCNet, what should the Tacotron 2 parameters be?
The default parameters are as follows:
num_mels=80,
num_freq=1025,
sample_rate=20000,
frame_length_ms=50,
frame_shift_ms=12.5,
preemphasis=0.97,
min_level_db=-100,
ref_level_db=20,
from lpcnet.
Hi, @candlewill , I have listened to your samples. They are better than those that I generated. I have used tacotron 2 to predict the 20 dim features and trained LPCNet with my own data. It seems that the samples with predicted features have problems concerning the pitch compared to samples generated with ground-truth features. I would like to ask if you could share some information with me about the training of tacotron 2, for example, the loss function.
from lpcnet.
@jmvalin
I am trying to integrate Tacotron and LPCNet. For that I am doing end-to-end training and followed the steps below, but I am not getting even a fair result, and the synthesized voice contains noise.
Please let me know if I am missing anything or doing something wrong here.
I am trying the steps below to generate features from Tacotron and use them to generate speech from LPCNet.
Training Tacotron for LPCNet
1. Change hparams.py with the following parameters:
num_mels=20,
sample_rate=22050 (as the LJSpeech dataset has 22050 Hz sampling)
2. Download LJSpeech:
https://data.keithito.com/data/speech/LJSpeech-1.1.tar.bz2
3. Start training:
python3 preprocess.py --base_dir /ws/sandbox/tacatron --dataset ljspeech
python3 train.py --input /ws/sandbox/tacatron/training/train.txt
4. Checkpoints are created every 1000 iterations
at ~/tacotron/logs-tacotron/model.ckpt-1000
use checkpoint with less error or after some
5. Now, using metadata.csv of LJSpeech, generate the sentences array for eval.py
(this contains the text for all the sample LJSpeech wav files).
Also convert all wav files to PCM and merge them in the same order as metadata.csv.
6. Modify synthesizer.py
to dump a 55-dimension numpy array for all the text.
wav = self.session.run(self.wav_output, feed_dict=feed_dict)
mel_features, wav = self.session.run([self.model.mel_outputs, self.wav_output], feed_dict=feed_dict)
features = mel_features[0][:,:55]
f = open("mel_op.npy","ab")
features.tofile(f)
f.close()
=> mel_op.npy contains 55-dim features for the text.
7. Run eval.py:
python3 eval.py --checkpoint /tacotron/logs-tacotron/model.ckpt-123000
This will generate mel_op.npy for all text present in the sentences array.
Training LPCNet using features generated from Tacotron
Here we have to use the concatenated PCM file and mel_op.npy.
1. Convert mel_op.npy to mel_op.f32:
import numpy as np
import pickle
npy_data = np.fromfile("mel_op.npy")
npy_data = npy_data.reshape((-1,))
npy_data.tofile("mel_op.f32")
2. Merge all the wav files of any one folder of the LJSpeech dataset and generate a single PCM file.
Then use the following to generate features.f32 and data.u8:
make dump_data taco=1
./dump_data -train merge-LJ028.pcm features.f32 data.u8 (features.f32 and data.u8 are autogenerated)
Use only data.u8, together with mel_op.f32 from step 1.
3. Training:
./src/train_lpcnet.py mel_op.f32 data.u8
This will generate the lpcnet*.h5 file.
Usage
Generate test_features.f32 from Tacotron (npy -> f32)
(edit the path of the .h5 file in test_lpcnet.py)
./src/test_lpcnet.py test_features.f32 test.s16
play test.s16
(Note: the .h5 path is hard-coded in test_lpcnet.py; modify it for your .h5 file.)
from lpcnet.
@jmvalin Thanks. We tested LPCNet on a phone with an A73 chip, and it cannot run in real time yet. I will try training it with 32*1 sparse blocks, so it can use 17 registers. What do you think?
@hdmjdp Any progress with 32*1 sparse block? I've tried with A73 chip, when increasing the sparsity, it can reach about 1.0+ realtime speed, however, still a little slow.
from lpcnet.
@candlewill Maybe you used the wrong features, as jmvalin said. My alignment is very good, and compared to the mel spectrogram, it is much easier to get the alignment.
Hi @azraelkuan , for the tacotron model, what did you use as the input? phone or pinyin or English words? Thanks!
from lpcnet.
@candlewill Did you train LPCNet with the 55-dim features? And were the 55-dim features generated just with LPCNet's dump_data, without any other processing?
from lpcnet.
@alokprasad I have tried your idea above. I merged all the mel_op.f32 files extracted by Tacotron 2 into a single final mel_op.f32, but I found that there is a mismatch between mel_op.f32 and data.u8: the number of frames in mel_op.f32 is different from the number of frames in data.u8. I want to know how you solved it.
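A quick way to quantify such a mismatch is to compare frame counts derived from the two file sizes. The sketch below assumes 55 float32 values per feature frame and, for data.u8, 160 samples per frame with 4 bytes stored per sample. That last figure is an assumption about the data.u8 layout and should be checked against the dump_data and train_lpcnet.py versions in use. The function name is hypothetical.

```python
import os


def count_frames(features_path, data_path, n_features=55,
                 frame_size=160, bytes_per_sample=4):
    """Return (feature_frames, data_frames) estimated from file sizes.

    bytes_per_sample is an ASSUMED data.u8 layout; verify it against
    the LPCNet version being used.
    """
    feat_frames = os.path.getsize(features_path) // (n_features * 4)  # float32
    data_frames = os.path.getsize(data_path) // (frame_size * bytes_per_sample)
    return feat_frames, data_frames
```

If the two counts differ, one possible cause is that the features were produced at a different frame rate than dump_data uses, for example a 12.5 ms shift at 22050 Hz versus 10 ms at 16 kHz.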
from lpcnet.
Thanks @jmvalin and @azraelkuan. I predicted all of the 55d features when doing end-to-end training. I will try changing the features to predict.
Hi! Have you resolved this problem?
from lpcnet.
You should try the Text Speaker" app. This is the best text to speech app. It has so many natural sounding voices to chose from. It is useful to listen to study files and much more. It can even extract text from scanned pages and websites and read them out loud. I use it most often to create mp3 files of my study files so I can listen to them on the go. Great product.https://www.deskshare.com/text-to-speech-software.aspx
from lpcnet.
@Ben654987 Please stop pasting your ad here.
from lpcnet.