
Viseme support (ueazspeech, closed)

lucoiso commented on July 23, 2024
Viseme support

from ueazspeech.

Comments (11)

lucoiso commented on July 23, 2024


Log Output Example: 
[...]
[2022.11.06-19.06.49:180][156]LogAzSpeech: Display: OnVisemeReceived - Viseme Id: 6
[2022.11.06-19.06.49:180][156]LogAzSpeech: Display: OnVisemeReceived - Viseme Audio Offset: 33500000
[2022.11.06-19.06.49:180][156]LogAzSpeech: Display: OnVisemeReceived - Viseme Animation: 
[2022.11.06-19.06.49:180][156]LogAzSpeech: Display: OnVisemeReceived - Viseme Id: 20
[2022.11.06-19.06.49:181][156]LogAzSpeech: Display: OnVisemeReceived - Viseme Audio Offset: 34000000
[2022.11.06-19.06.49:181][156]LogAzSpeech: Display: OnVisemeReceived - Viseme Animation: 
[2022.11.06-19.06.49:181][156]LogAzSpeech: Display: OnVisemeReceived - Viseme Id: 19
[2022.11.06-19.06.49:181][156]LogAzSpeech: Display: OnVisemeReceived - Viseme Audio Offset: 34500000
[2022.11.06-19.06.49:181][156]LogAzSpeech: Display: OnVisemeReceived - Viseme Animation: 
[...]
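
The offsets in this log look like the Speech SDK's raw audio offsets, which are expressed in ticks of 100 nanoseconds. A minimal sketch of pairing each Id with its offset and converting to milliseconds, assuming that interpretation holds for these values (`parse_viseme_log` is a hypothetical helper, not part of the plugin):

```python
import re

TICKS_PER_MS = 10_000  # 1 tick = 100 ns, so 10,000 ticks = 1 ms (assumed unit of the raw log values)

ID_RE = re.compile(r"Viseme Id: (\d+)")
OFFSET_RE = re.compile(r"Viseme Audio Offset: (\d+)")


def parse_viseme_log(lines):
    """Pair each 'Viseme Id' line with the next 'Viseme Audio Offset' line."""
    events = []
    current_id = None
    for line in lines:
        id_match = ID_RE.search(line)
        if id_match:
            current_id = int(id_match.group(1))
            continue
        offset_match = OFFSET_RE.search(line)
        if offset_match and current_id is not None:
            ticks = int(offset_match.group(1))
            events.append((current_id, ticks / TICKS_PER_MS))
            current_id = None
    return events


log = [
    "LogAzSpeech: Display: OnVisemeReceived - Viseme Id: 6",
    "LogAzSpeech: Display: OnVisemeReceived - Viseme Audio Offset: 33500000",
    "LogAzSpeech: Display: OnVisemeReceived - Viseme Animation: ",
    "LogAzSpeech: Display: OnVisemeReceived - Viseme Id: 20",
    "LogAzSpeech: Display: OnVisemeReceived - Viseme Audio Offset: 34000000",
    "LogAzSpeech: Display: OnVisemeReceived - Viseme Animation: ",
    "LogAzSpeech: Display: OnVisemeReceived - Viseme Id: 19",
    "LogAzSpeech: Display: OnVisemeReceived - Viseme Audio Offset: 34500000",
    "LogAzSpeech: Display: OnVisemeReceived - Viseme Animation: ",
]

events = parse_viseme_log(log)
print(events)  # [(6, 3350.0), (20, 3400.0), (19, 3450.0)]
```

Under that assumption the three visemes above land 50 ms apart, starting at 3.35 s into the audio.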

lucoiso commented on July 23, 2024

Will start working on this soon using this branch: https://github.com/lucoiso/UEAzSpeech/tree/feature/VISEME-22

eggcaker commented on July 23, 2024

+1

DrinkSlurp commented on July 23, 2024

Upvote!
This would be great! I would do it myself, but as a total rookie I would probably end up launching nukes ;-)
I'd rather wait for @lucoiso's expert hands to get to it...

eggcaker commented on July 23, 2024

image

fingerx commented on July 23, 2024

Any news on this? Thanks.

eggcaker commented on July 23, 2024

Will start working on this soon using this branch: https://github.com/lucoiso/UEAzSpeech/tree/feature/VISEME-22

Will the visemes be integrated with MetaHuman's facial visemes?

lucoiso commented on July 23, 2024

@fingerx

I'm still researching this, but the animation output only seems to be returned when the synthesis comes from SSML 🤔

Using SSML to Voice with SSML containing <mstts:viseme type="FacialExpression"/>, I got Id and offset as 0, and the entire animation output was returned normally:

LogAzSpeech: Display: OnVisemeReceived - Viseme Id: 0
LogAzSpeech: Display: OnVisemeReceived - Viseme Audio Offset: 0ms
LogAzSpeech: Display: OnVisemeReceived - Viseme Animation: {"FrameIndex":405,"BlendShapes":[[0,0.004,0,0.029,0,0.142,0.236,0,0.004,0.143,0,0,0.142,0.236,0.055,0.021,0,0.175,0.122,0.118,0.061,0.008,0.003,0.02,0.029,0.013,0.012,0.043,0.039,0.092,0.074,0.053,0.047,0.014,0.075,0.017,0.018,0.201,0.195,0.015,0.015,0.086,0.086,0.097,0,0,0.016,0.041,0.044,0.029,0.029,0,0.013,0,0],[0,0.008,0,0.03,0,0.139,0.233,0,0.008,0.144,0,0,0.139,0.233,0.055,0.021,0,0.175,0.121,0.118,0.061,0.008,0.003,0.02,0.029,0.013,0.012,0.043,0.039,0.092,0.074,0.053,0.047,0.014,0.075,0.017,0.018,0.201,0.196,0.015,0.015,0.084,0.084,0.097,0,0,0.016,0.041,0.044,0.029,0.029,0,0.015,0,0],[0,0.01,0,0.03,0,0.136,0.231,0,0.01,0.144,0,0,0.136,0.231,0.055,0.021,0,0.174,0.121,0.118,0.061,0.008,0.003,0.02,0.029,0.013,0.012,0.043,0.039,0.092,0.074,0.053,0.047,0.014,0.075,0.017,0.018,0.202,0.196,0.015,0.015,0.083,0.083,0.096,0,0,0.016,0.041,0.044,0.029,0.029,0,0.015,0,0],[0,0.012,0,0.032,0,0.134,0.23,0,0.012,0.146,0,0,0.134,0.23,0.055,0.021,0,0.174,0.12,0.119,0.06,0.008,0.003,0.02,0.029,0.012,0.012,0.043,0.039,0.092,0.074,0.053,0.047,0.014,0.075,0.017,0.018,0.202,0.196,0.015,0.015,0.083,0.083,0.096,0,0,0.016,0.041,0.044,0.029,0.029,0,0.015,0,0],[0,0.014,0,0.034,0,0.132,0.229,0,0.014,0.148,0,0,0.132,0.229,0.055,0.021,0,0.174,0.12,0.119,0.06,0.008,0.003,0.021,0.03,0.012,0.012,0.043,0.039,0.092,0.074,0.053,0.047,0.014,0.075,0.017,0.018,0.202,0.196,0.015,0.015,0.082,0.082,0.096,0,0,0.016,0.041,0.044,0.029,0.029,0,0.015,0,0],[0,0.015,0,0.035,0,0.13,0.228,0,0.015,0.149,0,0,0.13,0.228,0.055,0.021,0,0.178,0.129,0.12,0.069,0.008,0.003,0.02,0.029,0.014,0.012,0.043,0.039,0.092,0.074,0.055,0.049,0.014,0.075,0.017,0.018,0.198,0.192,0.015,0.015,0.082,0.082,0.095,0,0,0.016,0.041,0.044,0.029,0.029,0,0.014,0,0],[0,0.015,0,0.036,0,0.129,0.228,0,0.015,0.15,0,0,0.129,0.228,0.055,0.021,0,0.18,0.133,0.118,0.074,0.008,0.003,0.022,0.03,0.014,0.012,0.043,0.039,0.092,0.074,0.056,0.05,0.014,0.075,0.017,0.018,0.197,0.191,0.015,0.015,0.082,0.082,0.095,0,0,0.016,0.041,0.044,0.029,0.029,0,0.014,0,0],[0,0.016,0,0.038,0,0.128,0.227,0,0.016,0.152,0,0,0.128,0.227,0.056,0.021,0,0.179,0.132,0.116,0.075,0.008,0.003,0.024,0.032,0.012,0.012,0.043,0.039,0.092,0.074,0.057,0.051,0.014,0.075,0.017,0.018,0.198,0.192,0.015,0.015,0.082,0.082,0.095,0,0,0.016,0.041,0.044,0.029,0.029,0,0.015,0,0],[0,0.016,0,0.04,0,0.128,0.225,0,0.016,0.154,0,0,0.128,0.225,0.056,0.021,0,0.178,0.13,0.115,0.074,0.008,0.003,0.025,0.033,0.011,0.012,0.043,0.039,0.092,0.074,0.056,0.051,0.014,0.075,0.017,0.018,0.2,0.195,0.015,0.015,0.082,0.082,0.095,0,0,0.016,0.041,0.044,0.029,0.029,0,0.015,0,0],[0,0.017,0,0.042,0,0.127,0.22,0,0.017,0.156,0,0,0.127,0.22,0.057,0.021,0,0.174,0.123,0.113,0.067,0.008,0.003,0.026,0.035,0.01,0.012,0.043,0.039,0.092,0.074,0.055,0.05,0.014,0.075,0.017,0.018,0.203,0.197,0.015,0.015,0.083,0.083,0.094,0,0,0.016,0.041,0.044,0.029,0.029,0,0.015,0,0.001]]}

With Text to Voice, however, I get the Id and Offset but the Animation string is empty:

LogAzSpeech: Display: OnVisemeReceived - Viseme Id: 19
LogAzSpeech: Display: OnVisemeReceived - Viseme Audio Offset: 100ms
LogAzSpeech: Display: OnVisemeReceived - Viseme Animation: 
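
Since the Animation string can be either a JSON payload or empty depending on how the synthesis was requested, any consumer probably needs to handle both cases. A minimal sketch, assuming the payload always has the `FrameIndex` and `BlendShapes` keys seen in the log above (`parse_animation` is a hypothetical helper, and the sample payload is shortened to three values per frame for readability; the frames in the log above carry 55 values each):

```python
import json


def parse_animation(animation):
    """Return (frame_index, frames), or None when the Animation string is empty."""
    if not animation.strip():
        # Plain-text synthesis: only Id and offset are available.
        return None
    payload = json.loads(animation)
    frames = payload["BlendShapes"]
    # Every frame in one payload is expected to have the same number of values.
    widths = {len(frame) for frame in frames}
    if len(widths) > 1:
        raise ValueError(f"inconsistent frame widths: {sorted(widths)}")
    return payload["FrameIndex"], frames


# Shortened, illustrative payload (not a complete frame from the service).
sample = '{"FrameIndex":405,"BlendShapes":[[0,0.004,0],[0,0.008,0]]}'

print(parse_animation(sample))  # (405, [[0, 0.004, 0], [0, 0.008, 0]])
print(parse_animation(""))      # None
```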

fingerx commented on July 23, 2024

Could it output 3D blend shapes, as described in the docs? Thanks.
https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-speech-synthesis-viseme?tabs=3dblendshapes&pivots=programming-language-csharp

@lucoiso thanks
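
For reference, the blend shape output discussed in this thread is requested by placing an `mstts:viseme` element inside the `voice` element of the SSML. A minimal sketch of building such a document, assuming the voice name `en-US-JennyNeural` and the helper name `build_viseme_ssml` (both chosen here for illustration only):

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_viseme_ssml(text, voice="en-US-JennyNeural", lang="en-US"):
    """Wrap text in SSML that asks for FacialExpression (blend shape) visemes."""
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="http://www.w3.org/2001/mstts" '
        f'xml:lang="{lang}">'
        f'<voice name="{voice}">'
        '<mstts:viseme type="FacialExpression"/>'
        f"{escape(text)}"  # escape &, <, > so the document stays well-formed
        "</voice></speak>"
    )


ssml = build_viseme_ssml("Tom & Jerry")

# Sanity check: the result parses as XML and carries the viseme request.
root = ET.fromstring(ssml)
viseme = root[0][0]
print(viseme.tag)             # {http://www.w3.org/2001/mstts}viseme
print(viseme.attrib["type"])  # FacialExpression
```

The resulting string would then be passed to the plugin's SSML to Voice path rather than Text to Voice.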

fingerx commented on July 23, 2024

@lucoiso I tested the branch and the animation output is always an empty string. I don't know why. Thanks.

fingerx commented on July 23, 2024

@lucoiso Thanks for the work. The blend shape output seems to lag behind the spoken audio, and the blend shape JSON shown in the UE Output Log window is not complete.
https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-speech-synthesis-viseme?tabs=3dblendshapes&pivots=programming-language-cpp
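
One possible way to reason about the lag is to schedule each frame against the audio timeline instead of applying it when the event arrives. A rough sketch, assuming (as I understand the linked documentation) that frames are produced at 60 FPS and that `FrameIndex` counts the frames preceding the current chunk; `frame_times_ms` is a hypothetical helper and this is not how the plugin is known to work:

```python
FPS = 60  # assumed blend shape frame rate


def frame_times_ms(frame_index, frame_count, fps=FPS):
    """Playback time in ms of each frame in a chunk starting at frame_index."""
    return [(frame_index + i) * 1000.0 / fps for i in range(frame_count)]


# The chunk logged earlier in this thread: FrameIndex 405 with 10 frames.
times = frame_times_ms(405, 10)
print(times[0], times[-1])  # 6750.0 6900.0
```

With these assumptions, that chunk would cover roughly 6.75 s to 6.90 s of the audio, so a player could hold each frame until playback reaches its timestamp.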
