
Comments (4)

astorfi commented on August 20, 2024

@ku2482 Thanks for your question.

Regarding your questions:

  1. 0.8 seconds, as is also mentioned in the paper.
  2. No. For the paper, no CMVN was used. CMVN is just an available feature in the SpeechPy library.
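
For readers unfamiliar with the term in item 2, CMVN (cepstral mean and variance normalization) rescales each feature coefficient to zero mean and unit variance across the frames of an utterance. The following is a minimal pure-Python sketch of the global variant. It is not SpeechPy's implementation, and the function name `cmvn` and the `eps` parameter are assumptions made for illustration.

```python
import math


def cmvn(features, eps=1e-8):
    """Global CMVN sketch (hypothetical helper, not the SpeechPy API).

    `features` is a list of frames; each frame is a list of coefficients.
    Each coefficient is normalized across all frames to zero mean and
    (approximately) unit variance.
    """
    num_frames = len(features)
    num_coeffs = len(features[0])

    # Per-coefficient mean over all frames.
    means = [
        sum(frame[c] for frame in features) / num_frames
        for c in range(num_coeffs)
    ]
    # Per-coefficient (population) standard deviation over all frames.
    stds = [
        math.sqrt(
            sum((frame[c] - means[c]) ** 2 for frame in features) / num_frames
        )
        for c in range(num_coeffs)
    ]
    # eps guards against division by zero for constant coefficients.
    return [
        [(frame[c] - means[c]) / (stds[c] + eps) for c in range(num_coeffs)]
        for frame in features
    ]


# Toy input: 2 frames with 2 coefficients each.
frames = [[1.0, 10.0], [3.0, 30.0]]
normalized = cmvn(frames)
```

As stated above, this step was not applied to the features used in the paper.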

from 3d-convolutional-speaker-recognition.

toshikwa commented on August 20, 2024

@astorfi Thanks for answering.

I think every 0.81-second audio file results in an (80, 40) feature, and you concatenate 20 such features to make a (20, 80, 40) feature for the development phase. Is that right?
I don't know how many (20, 80, 40) features per speaker you use in the paper.
Do you use just one (20, 80, 40) feature per speaker, giving a dataset of shape (511, 20, 80, 40)?

Anyway, I appreciate your work and kindness.


astorfi commented on August 20, 2024

@ku2482 Yes, that's quite correct.

For the second part, (20, 80, 40) features are fed to the network, where "20" is the number of spoken utterances for the speaker. However, there is no restriction on the number of (20, 80, 40) features for any speaker. The rule of thumb is "the more, the better" for background model generation. You can pick 20 spoken utterances at random for data augmentation (although all of them need to belong to the same speaker).
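
The sampling described above can be sketched as follows. This is only an illustration using plain Python lists, not the repository's actual input pipeline: the helper names `make_input` and `shape`, the zero-filled placeholder features, and the count of 35 utterances are all assumptions.

```python
import random

NUM_UTTERANCES = 20  # utterances stacked per network input
NUM_FRAMES = 80      # temporal frames per utterance
NUM_COEFFS = 40      # feature coefficients per frame


def shape(nested):
    """Return the dimensions of a regular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


def make_input(speaker_utterances, rng):
    """Pick 20 utterances of ONE speaker at random and stack them.

    `speaker_utterances` is a list of (80, 40) feature matrices that all
    belong to the same speaker. Calling this repeatedly yields different
    (20, 80, 40) inputs, which is the augmentation idea described above.
    """
    if len(speaker_utterances) < NUM_UTTERANCES:
        raise ValueError("need at least 20 utterances from the speaker")
    # Sample without replacement so no utterance repeats within one input.
    return rng.sample(speaker_utterances, NUM_UTTERANCES)


# Hypothetical placeholder data: 35 zero-filled utterances for one speaker.
rng = random.Random(0)
utterances = [
    [[0.0] * NUM_COEFFS for _ in range(NUM_FRAMES)] for _ in range(35)
]

sample_a = make_input(utterances, rng)
sample_b = make_input(utterances, rng)
```

Because each call draws a fresh random subset, one speaker with more than 20 utterances can contribute many distinct (20, 80, 40) inputs rather than just one.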


toshikwa commented on August 20, 2024

@astorfi Thank you so much!!

I have actually resolved all my questions, and now I can understand your script.
Your work is really great!!

I'll close this issue, and again, thank you!!

