huyanxin / phasen
An unofficial PyTorch implementation of Microsoft's PHASEN
I got an error with torch 1.10.0 and fixed it by changing the layer type:
phase_conv2(Conv1d) -> phase_conv2(Conv2d)
What loss value should a well-fitted model reach? I am using AISHELL data at SNRs from -5 to 20 dB, and the phase loss is currently rather large.
I want to run this script, but my computer does not have a GPU. I tried to train on the CPU, but it failed. How can I make it work on a CPU?
I got this error:
outputs, wav = data_parallel(model, (inputs, ))
File "torch/nn/parallel/data_parallel.py", line 190, in data_parallel
output_device = device_ids[0]
IndexError: list index out of range
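One possible workaround, sketched under assumptions: the traceback is consistent with `data_parallel` finding an empty device list on a machine without CUDA, so the call could be guarded and the model invoked directly on the CPU. The helper name `run_model` and the toy `nn.Linear` model are hypothetical stand-ins, not part of the repository.

```python
import torch
from torch import nn
from torch.nn.parallel import data_parallel

def run_model(model, inputs):
    """Use data_parallel only when CUDA devices exist; otherwise call the model directly."""
    if torch.cuda.is_available() and torch.cuda.device_count() > 0:
        device = torch.device("cuda:0")
        return data_parallel(model.to(device), (inputs.to(device),))
    # No GPU: data_parallel would index an empty device list, so skip it.
    return model(inputs)

# Hypothetical stand-in model, for illustration only.
toy = nn.Linear(4, 2)
result = run_model(toy, torch.randn(3, 4))
```

In the training script, the same guard would wrap the existing `data_parallel(model, (inputs, ))` call.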
I am trying to reproduce PHASEN, but I have a problem with data preprocessing. When an audio signal is shorter than 4 seconds, what should I do?
I found this in your code:
wave_inputs = np.concatenate([wave_inputs, wave_inputs[:segement_length-wave_inputs.shape[0]]])
wave_s1 = np.concatenate([wave_s1, wave_s1[:segement_length-wave_s1.shape[0]]])
What confuses me is that when segement_length-wave_inputs.shape[0] > wave_inputs.shape[0], the code won't work.
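The failure case described above arises because a slice such as `wave[:n]` returns at most `len(wave)` samples, so a single concatenation can at most double the length. A minimal sketch of one alternative, assuming cyclic repetition is an acceptable way to pad (the helper name `pad_by_repeating` is hypothetical):

```python
import numpy as np

def pad_by_repeating(wave, segment_length):
    """Cyclically repeat `wave` until it is exactly `segment_length` samples long."""
    if len(wave) == 0:
        raise ValueError("cannot pad an empty signal")
    if len(wave) >= segment_length:
        return wave[:segment_length]
    reps = -(-segment_length // len(wave))  # ceiling division
    return np.tile(wave, reps)[:segment_length]

short = np.arange(3)
padded = pad_by_repeating(short, 8)
```

The same function would be applied to both `wave_inputs` and `wave_s1` so that the noisy and clean signals stay aligned.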
Does the author plan to release a trained model, so that we can use it more conveniently?
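Hello. After splitting the audio into 4-second segments and enhancing each one, there are clicking sounds or dropouts where the segments are joined. Changing the segment length from 4 s to 1 s makes this worse. How can this be removed? Is it caused by the audio being discontinuous at the joins?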
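One common way to reduce such boundary artifacts is to process overlapping segments and crossfade them instead of concatenating hard cuts. The sketch below is not the repository's method; `enhance_in_chunks`, `enhance_fn`, `seg_len`, and `overlap` are hypothetical names, and `enhance_fn` is assumed to return a chunk of the same length as its input.

```python
import numpy as np

def enhance_in_chunks(wave, enhance_fn, seg_len, overlap):
    """Enhance overlapping chunks of `wave` and blend them by weighted overlap-add."""
    if not 0 < overlap < seg_len:
        raise ValueError("overlap must satisfy 0 < overlap < seg_len")
    hop = seg_len - overlap
    # Strictly positive linear ramp, so the normaliser below is never zero.
    ramp = np.linspace(0.0, 1.0, overlap + 2)[1:-1]
    out = np.zeros(len(wave))
    norm = np.zeros(len(wave))
    for start in range(0, len(wave), hop):
        chunk = wave[start:start + seg_len]
        end = start + len(chunk)
        weight = np.ones(len(chunk))
        k = min(overlap, len(chunk))
        if start > 0:
            weight[:k] = ramp[:k]              # fade in over the shared region
        if end < len(wave):
            weight[-overlap:] *= ramp[::-1]    # fade out into the next chunk
        out[start:end] += weight * enhance_fn(chunk)
        norm[start:end] += weight
        if end >= len(wave):
            break
    return out / norm

# With an identity "enhancer" the blended output reproduces the input.
wave = np.sin(0.01 * np.arange(1000))
restored = enhance_in_chunks(wave, lambda c: c, seg_len=256, overlap=64)
```

Whether this removes the clicks in practice depends on how differently the model treats neighbouring segments; a longer `overlap` gives a smoother transition at the cost of more computation.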
It throws this error:
operands could not be broadcast together with shapes (514,399) (257,397)
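Line 100: duration=item['duration'] raises a KeyError.
I checked target_list and it has no 'duration' entry, so the data-processing step probably went wrong. However, using the lst from the code still raises the error. Which step went wrong?
Hello, and thank you for the reproduction work. When I train this model on my own data, the loss does not decrease. How should I go about finding the cause?
My data is clean recordings containing both Chinese and English, with MUSAN noise added to create the training data. I use mixloss; the mixloss value stays at about 40 and the SI-SNR stays between 7 and 8, with neither improving.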
I got NaN when training with Mix loss (not a speech denoising task), and fixed it by adding gradient clipping as follows:
loss.backward()
nn.utils.clip_grad_norm_(self.estimator.parameters(), 10.0) # add this to clip grad
self.optimizer.step()
The input to "PHASEN..rnn" has shape [B,T,D*C] in PHASEN.forward, so shouldn't batch_first=True be set? (The default is batch_first=False.)
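To illustrate why this matters, here is a small sketch using `nn.LSTM` as a hypothetical stand-in for the model's recurrent layer (the actual layer type and sizes are assumptions). With `batch_first=False`, a `[B, T, F]` tensor is read as `[T, B, F]`, so the first batch item is treated as an earlier time step and influences the output for the second item.

```python
import torch
from torch import nn

torch.manual_seed(0)
B, T, F = 2, 5, 8
x = torch.randn(B, T, F)

def leaks_across_batch(rnn, x):
    """Return True if perturbing batch item 0 changes the output for batch item 1."""
    with torch.no_grad():
        base, _ = rnn(x)
        x2 = x.clone()
        x2[0] += 1.0
        perturbed, _ = rnn(x2)
    return not torch.allclose(base[1], perturbed[1])

rnn_default = nn.LSTM(input_size=F, hidden_size=4)                        # reads input as [T, B, F]
rnn_batch_first = nn.LSTM(input_size=F, hidden_size=4, batch_first=True)  # reads input as [B, T, F]

leak_default = leaks_across_batch(rnn_default, x)
leak_batch_first = leaks_across_batch(rnn_batch_first, x)
```

Both layers return an output of shape `[2, 5, 4]` here, so a shape check alone would not reveal the difference.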
Hi, I ported conv_stft to TensorFlow like this:
import numpy as np
import tensorflow as tf
from scipy.signal import get_window

def init_kernels(win_len, win_inc, fft_len, win_type=None, invers=False):
    if win_type == 'None' or win_type is None:
        window = np.ones(win_len)
    else:
        window = get_window(win_type, win_len, fftbins=True)**0.5
    N = fft_len
    fourier_basis = np.fft.rfft(np.eye(N))[:win_len]
    real_kernel = np.real(fourier_basis)
    imag_kernel = np.imag(fourier_basis)
    kernel = np.concatenate([real_kernel, imag_kernel], 1).T
    if invers:
        kernel = np.linalg.pinv(kernel).T
    kernel = kernel * window
    # tf.nn.conv1d expects filters as [width, in_channels, out_channels],
    # whereas torch's F.conv1d expects [out_channels, in_channels, width].
    kernel = kernel.T[:, None, :]
    return tf.convert_to_tensor(kernel, tf.float32)

class ConvSTFT(tf.keras.layers.Layer):
    # 'hann' is SciPy's name for the Hann window.
    def __init__(self, win_len=400, win_inc=200, fft_len=512, win_type='hann', feature_type='real', fix=True):
        super(ConvSTFT, self).__init__()
        self.fft_len = fft_len
        kernel = init_kernels(win_len, win_inc, self.fft_len, win_type)
        print('................', kernel.shape)
        # Keep the STFT basis fixed unless fix=False.
        self.weight = tf.Variable(kernel, trainable=not fix)
        self.feature_type = feature_type
        self.stride = win_inc
        self.win_len = win_len
        self.dim = self.fft_len

    def call(self, inputs):
        # inputs: [batch, samples, 1] (TensorFlow's default NWC layout).
        # Use TensorFlow's own conv1d here, not torch.nn.functional.conv1d.
        outputs = tf.nn.conv1d(inputs, self.weight, stride=self.stride, padding='VALID')
        print("...............", outputs)
        dim = self.dim // 2 + 1
        # Channels are last in NWC, so split real/imag along the final axis.
        real = outputs[:, :, :dim]
        imag = outputs[:, :, dim:]
        return [real, imag]
Is it right?
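Hi, I am using Mixloss, and the loss becomes NaN as soon as training starts.
1. I have already set the LR very small (0.00001);
2. There is no division by zero. What else could be the cause?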