Comments (9)

csukuangfj commented on June 20, 2024

@martinshkreli

Could you describe how you got the int8 models?

danpovey commented on June 20, 2024

Fangjun will get back to you about it, but: hi, Martin Shkreli!
We might need more hardware info and details about what differed between those two runs.

martinshkreli commented on June 20, 2024

Hi guys, thanks again for the wonderful repo. I followed this link to download the model:
https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/vits.html#download-the-model

Then I used that file (vits-ljs.int8.onnx) for inference with the Python script (offline-tts.py). This was on an 8xA100 instance.
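For reference, here is a minimal sketch of what loading the int8 model through the sherpa-onnx Python API looks like (roughly what offline-tts.py wires up from command-line flags); the file paths and output name are assumptions, not taken from this thread:

import sherpa_onnx
import soundfile as sf

# Sketch, assuming the sherpa-onnx Python TTS API; all paths are hypothetical.
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="./vits-ljs/vits-ljs.int8.onnx",
            lexicon="./vits-ljs/lexicon.txt",
            tokens="./vits-ljs/tokens.txt",
        ),
        num_threads=1,
        provider="cpu",
    ),
)
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate("Hello from the int8 VITS model.")
sf.write("output.wav", audio.samples, samplerate=audio.sample_rate)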

martinshkreli commented on June 20, 2024

@martinshkreli

Could you describe how you got the int8 models?

Hi Fangjun, I just wanted to try to get your attention one more time. Sorry if I am being annoying!

csukuangfj commented on June 20, 2024

The int8 model is obtained via the following code:

from onnxruntime.quantization import QuantType, quantize_dynamic

# Dynamic quantization: weights are stored as 8-bit integers,
# activations remain floating point and are quantized at runtime.
quantize_dynamic(
    model_input=filename,
    model_output=filename_int8,
    weight_type=QuantType.QUInt8,
)

Note that it uses

weight_type=QuantType.QUInt8,

It is a known issue with onnxruntime that QUInt8 is slower than QInt8 on CPU.

For instance, if you search on Google, you can find similar issue reports.
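If QUInt8 is the bottleneck on CPU, one thing worth trying is re-quantizing the float32 model with signed weights; quantize_dynamic also accepts QuantType.QInt8. A hedged sketch (the paths are hypothetical, and as a later comment in this thread notes, the QInt8 conversion may not succeed for this model):

from onnxruntime.quantization import QuantType, quantize_dynamic

# Re-quantize the float32 model with signed int8 weights.
# Hypothetical paths; success is not guaranteed for this model.
quantize_dynamic(
    model_input="vits-ljs.onnx",
    model_output="vits-ljs.qint8.onnx",
    weight_type=QuantType.QInt8,
)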


csukuangfj commented on June 20, 2024

The int8 model mentioned in this issue is about 4x smaller in file size than the float32 model.

If memory matters, the int8 model is preferred.
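A quick way to check the size claim locally, assuming both files have already been downloaded (the paths are hypothetical):

import os

# Hypothetical paths; adjust to where the models were downloaded.
fp32_bytes = os.path.getsize("vits-ljs/vits-ljs.onnx")
int8_bytes = os.path.getsize("vits-ljs/vits-ljs.int8.onnx")
print(f"float32: {fp32_bytes / 1e6:.1f} MB")
print(f"int8:    {int8_bytes / 1e6:.1f} MB")
print(f"ratio:   {fp32_bytes / int8_bytes:.1f}x")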

beqabeqa473 commented on June 20, 2024

Hi @csukuangfj, do you know how to optimize the speed of an int8 model? I was experimenting with this several months ago, but I was not able to convert to QInt8, and QUInt8 is really slow on CPU.
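(Not an answer from this thread, but for anyone experimenting: generic onnxruntime session settings such as thread count and graph optimization level are sometimes worth trying for CPU inference. Whether they help this particular int8 VITS model is untested; a sketch:)

import onnxruntime as ort

# Generic CPU tuning knobs; benefit for an int8 VITS model is untested here.
so = ort.SessionOptions()
so.intra_op_num_threads = 4
so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

sess = ort.InferenceSession(
    "vits-ljs.int8.onnx",  # hypothetical model path
    sess_options=so,
    providers=["CPUExecutionProvider"],
)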

nshmyrev commented on June 20, 2024

You don't need to optimize speed; you need to pick an MB-iSTFT-VITS model. They are an order of magnitude faster than raw VITS with the same quality.
