Comments (12)

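WelkinYang commented on June 16, 2024

Hello, I don't recommend experimenting with this configuration. I recommend setting memory_efficient_training to true, setting size to 384, and raising the batch size to 22 or more.
Diffusion-based acoustic models are not sensitive to data length, but a small batch seems to keep the diffusion loss from converging, so I recommend training with the settings above for 1 million steps, which should give normal results. Hope your experiments go well~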

from learn2sing2.0.

Liujingxiu23 commented on June 16, 2024

@WelkinYang Thanks for your reply! With memory_efficient_training set to true I get an error message. I'll try to resolve it myself first.

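WelkinYang commented on June 16, 2024

> @WelkinYang Thanks for your reply! With memory_efficient_training set to true I get an error message. I'll try to resolve it myself first.

The batch size is probably too small, so every item in a batch ends up shorter than the configured maximum length. Increasing the batch size should fix it.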
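
To make the failure mode above concrete, here is a minimal sketch. It is not code from the repository: `random_window` is a hypothetical helper, and it assumes that memory-efficient training pads a batch to its longest item and then crops one window of `size` frames. Under that assumption, a batch whose longest item is shorter than `size` has no valid window, and a larger batch makes it more likely that at least one item is long enough.

```python
import random


def random_window(batch_lengths, size, rng=random):
    """Pick one crop window for a batch padded to its longest item.

    Hypothetical helper (an assumption, not the repository's code).
    Raises if no item in the batch reaches `size` frames.
    """
    max_len = max(batch_lengths)
    if max_len < size:
        raise ValueError(
            f"longest item has {max_len} frames, need at least {size}"
        )
    # randint is inclusive on both ends, so the window always fits.
    start = rng.randint(0, max_len - size)
    return start, start + size


# One item with 400 frames is enough for size=384.
start, end = random_window([120, 400, 390], size=384)
```

With `batch_lengths=[120, 200]` and `size=384` the same call raises `ValueError`, which matches the kind of error a too-small batch could trigger.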

Liujingxiu23 commented on June 16, 2024

@WelkinYang I tried it and it works now. Thanks! I'll come back and share the results in a few days.

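Liujingxiu23 commented on June 16, 2024

@WelkinYang Training is in progress. It will probably be several more days before I have results.
I have one more question. For an ordinary speaker (non-singing, speech only), how should the input note be handled during training? I extracted the actual pitch from the waveform, averaged it over each phone's duration to get phone-level pitch, and then mapped that back to a note using the note-to-pitch mapping, and used the result as input. I'm not sure whether this is appropriate.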

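WelkinYang commented on June 16, 2024

> @WelkinYang Training is in progress. It will probably be several more days before I have results. I have one more question. For an ordinary speaker (non-singing, speech only), how should the input note be handled during training? I extracted the actual pitch from the waveform, averaged it over each phone's duration to get phone-level pitch, and then mapped that back to a note using the note-to-pitch mapping, and used the result as input. I'm not sure whether this is appropriate.

Yes, that is the right way to handle it. The mel-spectrogram is also averaged at the phoneme level, so the two line up.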
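
For readers who want to try the procedure described above, here is a rough sketch. It is not taken from the repository. The function names, the use of 0 to mark unvoiced frames, the `rest_note` placeholder, and the standard MIDI mapping (A4 = 440 Hz = note 69) are all assumptions made for illustration; the note-to-pitch table used in the actual experiments may differ.

```python
import math


def hz_to_midi(f0_hz):
    """Map a frequency in Hz to the nearest MIDI note (A4 = 440 Hz = 69)."""
    return int(round(69 + 12 * math.log2(f0_hz / 440.0)))


def phone_level_notes(f0, durations, rest_note=0):
    """Average voiced frame-level F0 per phone, then map the mean to a note.

    f0: per-frame F0 in Hz, with 0 marking unvoiced frames (assumed).
    durations: frames per phone; must sum to len(f0).
    rest_note: value for phones with no voiced frames (assumed placeholder).
    """
    if sum(durations) != len(f0):
        raise ValueError("durations must sum to the number of F0 frames")
    notes, start = [], 0
    for dur in durations:
        # Keep only voiced frames so silence does not drag the mean down.
        voiced = [f for f in f0[start:start + dur] if f > 0]
        if voiced:
            notes.append(hz_to_midi(sum(voiced) / len(voiced)))
        else:
            notes.append(rest_note)
        start += dur
    return notes


f0 = [0, 0, 220, 220, 440, 440, 440]
print(phone_level_notes(f0, durations=[2, 2, 3]))  # [0, 57, 69]
```

The mean here is taken in Hz, following the description above. Averaging in the log-frequency domain is another option when a phone spans a wide pitch range.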

Liujingxiu23 commented on June 16, 2024

@WelkinYang Sorry, one more question. Is there anywhere I can download MusicXML or MIDI files, to be used for inference only?

MaxMax2016 commented on June 16, 2024

@Liujingxiu23 This site has MIDI files for download: https://www.vsqx.top/

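Liujingxiu23 commented on June 16, 2024

@dtx525942103 Got it! Thanks.
@WelkinYang In my current training results, singing synthesis itself is acceptable. For Learn2Sing synthesis the audio quality is fine, but the output is shaky. The biggest differences from the original method on my side are that I use a HiFi-GAN model and do not use f0, which may have some effect. Next I plan to check the data, add f0, and so on.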

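WelkinYang commented on June 16, 2024

> @dtx525942103 Got it! Thanks. Also, my training results so far are not very good: neither the singing itself nor the l2s results are good. I plan to check the data to see whether something was processed incorrectly. I wonder how other people's experiments have gone.

Roughly how much data do you have, and what does the loss look like during training? The experiments in the paper used 100 songs, which is not much, but that was already enough for very good results.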

MaxMax2016 commented on June 16, 2024

@Liujingxiu23 Would you mind sharing your contact information? I'm also working on singing voice cloning and would like to exchange ideas with you.

Liujingxiu23 commented on June 16, 2024

@dtx525942103 [email protected]
