kxxt / aspeak
A simple text-to-speech client for Azure TTS API.

License: MIT License


aspeak's Introduction

🗣️ aspeak


A simple text-to-speech client for Azure TTS API. 😆

Note

Starting from version 6.0.0, aspeak by default uses the RESTful API of Azure TTS. If you want to use the WebSocket API, you can specify --mode websocket when invoking aspeak or set mode = "websocket" in the auth section of your profile.

Starting from version 4.0.0, aspeak is rewritten in Rust. The old Python version is available on the python branch.

You can sign up for an Azure account and then choose a payment plan as needed (or stick to the free tier). The free tier includes a quota of 0.5 million characters per month, free of charge.

Please refer to the Authentication section to learn how to set up authentication for aspeak.

Installation

Download from GitHub Releases (Recommended for most users)

Download the latest release from here.

After downloading, extract the archive and you will get a binary executable file.

You can put it in a directory that is in your PATH environment variable so that you can run it from anywhere.

Install from AUR (Recommended for Arch Linux users)

From v4.1.0, you can install aspeak-bin from the AUR.

Install from PyPI

Installing from PyPI will also install the Python binding of aspeak for you. Check Library Usage#Python for more information on using the Python binding.

pip install -U aspeak==6.0.0

Currently, prebuilt wheels are only available for the x86_64 architecture. Due to some technical issues, I haven't uploaded the source distribution to PyPI yet. So to build the wheel from source, you need to follow the instructions in Install from Source.

Because of manylinux compatibility issues, the wheels for Linux are not available on PyPI. (But you can still build them from source.)

Install from Source

CLI Only

The easiest way to install aspeak from source is to use cargo:

cargo install aspeak -F binary

Alternatively, you can also install aspeak from AUR.

Python Wheel

To build the python wheel, you need to install maturin first:

pip install maturin

After cloning the repository and changing into its directory, you can build the wheel by running:

maturin build --release --strip -F python --bindings pyo3 --interpreter python --manifest-path Cargo.toml --out dist-pyo3
maturin build --release --strip --bindings bin -F binary --interpreter python --manifest-path Cargo.toml --out dist-bin
bash merge-wheel.bash

If everything goes well, you will get a wheel file in the dist directory.

Usage

Run aspeak help to see the help message.

Run aspeak help <subcommand> to see the help message of a subcommand.

Authentication

The authentication options should be placed before any subcommand.

For example, to utilize your subscription key and an official endpoint designated by a region, run the following command:

$ aspeak --region <YOUR_REGION> --key <YOUR_SUBSCRIPTION_KEY> text "Hello World"

If you are using a custom endpoint, you can use the --endpoint option instead of --region.

To avoid repetition, you can store your authentication details in your aspeak profile. Read the following section for more details.

From v5.2.0, you can also set the authentication secrets via the following environment variables:

  • ASPEAK_AUTH_KEY for authentication using subscription key
  • ASPEAK_AUTH_TOKEN for authentication using authorization token
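For instance, a wrapper script can inject the subscription key into aspeak's environment like this (a sketch; the actual invocation is left commented out because it requires a valid key and an installed aspeak binary):

```python
import os
import subprocess

# Copy the current environment and add the aspeak auth variable.
# "YOUR_SUBSCRIPTION_KEY" is a placeholder, not a real secret.
env = {**os.environ, "ASPEAK_AUTH_KEY": "YOUR_SUBSCRIPTION_KEY"}

# subprocess.run(["aspeak", "--region", "eastus", "text", "Hello"],
#                env=env, check=True)
```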

From v4.3.0, you can let aspeak use a proxy server to connect to the endpoint. For now, only HTTP and SOCKS5 proxies are supported (no HTTPS support yet). For example:

$ aspeak --proxy http://your_proxy_server:port text "Hello World"
$ aspeak --proxy socks5://your_proxy_server:port text "Hello World"

aspeak also respects the HTTP_PROXY (or http_proxy) environment variable.

Configuration

aspeak v4 introduces the concept of profiles. A profile is a configuration file where you can specify default values for the command line options.

Run the following command to create your default profile:

$ aspeak config init

To edit the profile, run:

$ aspeak config edit

If you have trouble running the above command, you can edit the profile manually:

First, get the path of the profile by running:

$ aspeak config where

Then edit the file with your favorite text editor.

The profile is a TOML file; check the comments in it for more information about available options. The default profile looks like this:

# Profile for aspeak
# GitHub: https://github.com/kxxt/aspeak

# Output verbosity
# 0   - Default
# 1   - Verbose
# The following output verbosity levels are only supported on debug build
# 2   - Debug
# >=3 - Trace
verbosity = 0

#
# Authentication configuration
#

[auth]
# Endpoint for TTS
# endpoint = "wss://eastus.tts.speech.microsoft.com/cognitiveservices/websocket/v1"

# Alternatively, you can specify the region if you are using official endpoints
# region = "eastus"

# Synthesizer Mode, "rest" or "websocket"
# mode = "rest"

# Azure Subscription Key
# key = "YOUR_KEY"

# Authentication Token
# token = "Your Authentication Token"

# Extra http headers (for experts)
# headers = [["X-My-Header", "My-Value"], ["X-My-Header2", "My-Value2"]]

# Proxy
# proxy = "socks5://127.0.0.1:7890"

# Voice list API url
# voice_list_api = "Custom voice list API url"

#
# Configuration for text subcommand
#

[text]
# Voice to use. Note that it takes precedence over the locale
# voice = "en-US-JennyNeural"
# Locale to use
locale = "en-US"
# Rate
# rate = 0
# Pitch
# pitch = 0
# Role
# role = "Boy"
# Style, "general" by default
# style = "general"
# Style degree, a floating-point number between 0.1 and 2.0
# style_degree = 1.0

#
# Output Configuration
#

[output]
# Container Format. Only wav/mp3/ogg/webm are supported.
container = "wav"
# Audio Quality. Run `aspeak list-qualities` to see available qualities.
#
# If you choose a container format that does not support the quality level you specified here, 
# we will automatically select the closest level for you.
quality = 0
# Audio Format(for experts). Run `aspeak list-formats` to see available formats.
# Note that it takes precedence over container and quality!
# format = "audio-16khz-128kbitrate-mono-mp3"

If you want to use a profile other than your default profile, you can use the --profile argument:

aspeak --profile <PATH_TO_A_PROFILE> text "Hello"

If you want to temporarily disable the profile, you can use the --no-profile argument:

aspeak --no-profile --region eastus --key <YOUR_KEY> text "Hello"

Pitch and Rate

  • rate: The speaking rate of the voice.
    • If you use a float value (say 0.5), the value will be multiplied by 100% and become 50.00%.
    • You can use the following values as well: x-slow, slow, medium, fast, x-fast, default.
    • You can also use percentage values directly: +10%.
    • You can also use a relative float value (with f postfix), 1.2f:
      • According to the Azure documentation,
      • A relative value, expressed as a number that acts as a multiplier of the default.
      • For example, a value of 1f results in no change in the rate. A value of 0.5f results in a halving of the rate. A value of 3f results in a tripling of the rate.
  • pitch: The pitch of the voice.
    • If you use a float value (say -0.5), the value will be multiplied by 100% and become -50.00%.
    • You can also use the following values: x-low, low, medium, high, x-high, default.
    • You can also use percentage values directly: +10%.
    • You can also use a relative value, (e.g. -2st or +80Hz):
      • According to the Azure documentation,
      • A relative value, expressed as a number preceded by "+" or "-" and followed by "Hz" or "st" that specifies an amount to change the pitch.
      • The "st" indicates the change unit is semitone, which is half of a tone (a half step) on the standard diatonic scale.
    • You can also use an absolute value: e.g. 600Hz

Note: Unreasonably high/low values will be clipped to reasonable values by Azure Cognitive Services.
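The float-to-percentage rule above can be illustrated with a small sketch. This is a hypothetical helper, not aspeak's actual implementation; it only shows how the documented rate values map onto strings:

```python
# Named rate values accepted alongside floats and percentages.
NAMED_RATES = {"x-slow", "slow", "medium", "fast", "x-fast", "default"}

def format_rate(value):
    """Illustrative mapping of a user-supplied rate to a prosody string."""
    if isinstance(value, float):
        # A bare float is multiplied by 100% (0.5 -> "50.00%").
        return f"{value * 100:.2f}%"
    s = str(value).strip()
    if s in NAMED_RATES or s.endswith("%"):
        # Named values and explicit percentages pass through unchanged.
        return s
    if s.endswith("f"):
        # The 'f' postfix marks a relative multiplier of the default rate.
        return s[:-1]
    raise ValueError(f"unrecognized rate value: {value!r}")

print(format_rate(0.5))    # 50.00%
print(format_rate("1.2f")) # 1.2
```

The same scheme applies to pitch, except the named values are x-low/low/medium/high/x-high/default and relative values use Hz or st units.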

Examples

The following examples assume that you have already set up authentication in your profile.

Speak "Hello, world!" to default speaker.

$ aspeak text "Hello, world"

SSML to Speech

$ aspeak ssml << EOF
<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='en-US'><voice name='en-US-JennyNeural'>Hello, world!</voice></speak>
EOF
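If you generate SSML from a script, the document above can be assembled programmatically. A minimal sketch (`build_ssml` is a hypothetical helper; the text is assumed to be already XML-escaped):

```python
def build_ssml(text, voice="en-US-JennyNeural", lang="en-US"):
    # Wrap plain text in the same minimal SSML envelope used above.
    return (
        "<speak version='1.0' "
        "xmlns='http://www.w3.org/2001/10/synthesis' "
        f"xml:lang='{lang}'>"
        f"<voice name='{voice}'>{text}</voice></speak>"
    )

ssml = build_ssml("Hello, world!")
```

The resulting string can then be piped to `aspeak ssml` on stdin.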

List all available voices.

$ aspeak list-voices

List all available voices for Chinese.

$ aspeak list-voices -l zh-CN

Get information about a voice.

$ aspeak list-voices -v en-US-SaraNeural
Output
Microsoft Server Speech Text to Speech Voice (en-US, SaraNeural)
Display name: Sara
Local name: Sara @ en-US
Locale: English (United States)
Gender: Female
ID: en-US-SaraNeural
Voice type: Neural
Status: GA
Sample rate: 48000Hz
Words per minute: 157
Styles: ["angry", "cheerful", "excited", "friendly", "hopeful", "sad", "shouting", "terrified", "unfriendly", "whispering"]

Save synthesized speech to a file.

$ aspeak text "Hello, world" -o output.wav

If you prefer mp3/ogg/webm, you can use the -c mp3/-c ogg/-c webm options.

$ aspeak text "Hello, world" -o output.mp3 -c mp3
$ aspeak text "Hello, world" -o output.ogg -c ogg
$ aspeak text "Hello, world" -o output.webm -c webm

List available quality levels

$ aspeak list-qualities
Output
Qualities for MP3:
  3: audio-48khz-192kbitrate-mono-mp3
  2: audio-48khz-96kbitrate-mono-mp3
 -3: audio-16khz-64kbitrate-mono-mp3
  1: audio-24khz-160kbitrate-mono-mp3
 -2: audio-16khz-128kbitrate-mono-mp3
 -4: audio-16khz-32kbitrate-mono-mp3
 -1: audio-24khz-48kbitrate-mono-mp3
  0: audio-24khz-96kbitrate-mono-mp3

Qualities for WAV:
 -2: riff-8khz-16bit-mono-pcm
  1: riff-24khz-16bit-mono-pcm
  0: riff-24khz-16bit-mono-pcm
 -1: riff-16khz-16bit-mono-pcm

Qualities for OGG:
  0: ogg-24khz-16bit-mono-opus
 -1: ogg-16khz-16bit-mono-opus
  1: ogg-48khz-16bit-mono-opus

Qualities for WEBM:
  0: webm-24khz-16bit-mono-opus
 -1: webm-16khz-16bit-mono-opus
  1: webm-24khz-16bit-24kbps-mono-opus

List available audio formats (For expert users)

$ aspeak list-formats
Output
amr-wb-16000hz
audio-16khz-128kbitrate-mono-mp3
audio-16khz-16bit-32kbps-mono-opus
audio-16khz-32kbitrate-mono-mp3
audio-16khz-64kbitrate-mono-mp3
audio-24khz-160kbitrate-mono-mp3
audio-24khz-16bit-24kbps-mono-opus
audio-24khz-16bit-48kbps-mono-opus
audio-24khz-48kbitrate-mono-mp3
audio-24khz-96kbitrate-mono-mp3
audio-48khz-192kbitrate-mono-mp3
audio-48khz-96kbitrate-mono-mp3
ogg-16khz-16bit-mono-opus
ogg-24khz-16bit-mono-opus
ogg-48khz-16bit-mono-opus
raw-16khz-16bit-mono-pcm
raw-16khz-16bit-mono-truesilk
raw-22050hz-16bit-mono-pcm
raw-24khz-16bit-mono-pcm
raw-24khz-16bit-mono-truesilk
raw-44100hz-16bit-mono-pcm
raw-48khz-16bit-mono-pcm
raw-8khz-16bit-mono-pcm
raw-8khz-8bit-mono-alaw
raw-8khz-8bit-mono-mulaw
riff-16khz-16bit-mono-pcm
riff-22050hz-16bit-mono-pcm
riff-24khz-16bit-mono-pcm
riff-44100hz-16bit-mono-pcm
riff-48khz-16bit-mono-pcm
riff-8khz-16bit-mono-pcm
riff-8khz-8bit-mono-alaw
riff-8khz-8bit-mono-mulaw
webm-16khz-16bit-mono-opus
webm-24khz-16bit-24kbps-mono-opus
webm-24khz-16bit-mono-opus

Increase/Decrease audio qualities

# Less than default quality.
$ aspeak text "Hello, world" -o output.mp3 -c mp3 -q=-1
# Best quality for mp3
$ aspeak text "Hello, world" -o output.mp3 -c mp3 -q=3

Read text from file and speak it.

$ cat input.txt | aspeak text

or

$ aspeak text -f input.txt

with custom encoding:

$ aspeak text -f input.txt -e gbk

Read from stdin and speak it.

$ aspeak text

maybe you prefer:

$ aspeak text -l zh-CN << EOF
我能吞下玻璃而不伤身体。
EOF

Speak Chinese.

$ aspeak text "你好,世界!" -l zh-CN

Use a custom voice.

$ aspeak text "你好,世界!" -v zh-CN-YunjianNeural

Custom pitch, rate and style

$ aspeak text "你好,世界!" -v zh-CN-XiaoxiaoNeural -p 1.5 -r 0.5 -S sad
$ aspeak text "你好,世界!" -v zh-CN-XiaoxiaoNeural -p=-10% -r=+5% -S cheerful
$ aspeak text "你好,世界!" -v zh-CN-XiaoxiaoNeural -p=+40Hz -r=1.2f -S fearful
$ aspeak text "你好,世界!" -v zh-CN-XiaoxiaoNeural -p=high -r=x-slow -S calm
$ aspeak text "你好,世界!" -v zh-CN-XiaoxiaoNeural -p=+1st -r=-7% -S lyrical

Advanced Usage

Use a custom audio format for output

Note: Some audio formats are not supported when outputting to speaker.

$ aspeak text "Hello World" -F riff-48khz-16bit-mono-pcm -o high-quality.wav

Library Usage

Python

The new version of aspeak is written in Rust, and the Python binding is provided by PyO3.

Here is a simple example:

from aspeak import SpeechService

service = SpeechService(region="eastus", key="YOUR_AZURE_SUBSCRIPTION_KEY")
service.speak_text("Hello, world")

First you need to create a SpeechService instance.

When creating a SpeechService instance, you can specify the following parameters:

  • audio_format(Positional argument): The audio format of the output audio. Default is AudioFormat.Riff24KHz16BitMonoPcm.
    • You can get an audio format by providing a container format and a quality level: AudioFormat("mp3", 2).
  • endpoint: The endpoint of the speech service.
  • region: Alternatively, you can specify the region of the speech service instead of typing the boring endpoint url.
  • key: The subscription key of the speech service.
  • token: The auth token for the speech service. If you provide a token, the subscription key will be ignored.
  • headers: Additional HTTP headers for the speech service.
  • mode: Choose the synthesizer to use. Either rest or websocket.
    • In websocket mode, the synthesizer will connect to the endpoint when the SpeechService instance is created.

After that, you can call speak_text() to speak the text or speak_ssml() to speak the SSML. Or you can call synthesize_text() or synthesize_ssml() to get the audio data.

For synthesize_text() and synthesize_ssml(), if you provide an output, the audio data will be written to that file and the function will return None. Otherwise, the function will return the audio data.

Here are the common options for speak_text() and synthesize_text():

  • locale: The locale of the voice. Default is en-US.
  • voice: The voice name. Default is en-US-JennyNeural.
  • rate: The speaking rate of the voice. It must be a string that fits the requirements as documented in this section: Pitch and Rate
  • pitch: The pitch of the voice. It must be a string that fits the requirements as documented in this section: Pitch and Rate
  • style: The style of the voice.
    • You can get a list of available styles for a specific voice by executing aspeak -L -v <VOICE_ID>
    • The default value is general.
  • style_degree: The degree of the style.
    • According to the Azure documentation , style degree specifies the intensity of the speaking style. It is a floating point number between 0.01 and 2, inclusive.
    • At the time of writing, style degree adjustments are supported for Chinese (Mandarin, Simplified) neural voices.
  • role: The role of the voice.
    • According to the Azure documentation , role specifies the speaking role-play. The voice acts as a different age and gender, but the voice name isn't changed.
    • At the time of writing, role adjustments are supported for these Chinese (Mandarin, Simplified) neural voices: zh-CN-XiaomoNeural, zh-CN-XiaoxuanNeural, zh-CN-YunxiNeural, and zh-CN-YunyeNeural.

Rust

Add aspeak to your Cargo.toml:

$ cargo add aspeak

Then follow the documentation of aspeak crate.

There are 4 examples in the repository for quick reference.

aspeak's People

Contributors

attila-lin, dependabot[bot], everythingsuckz, flt6, kxxt, mend-bolt-for-github[bot]


aspeak's Issues

Faster audio output/processing

Is it possible to use this in real-time communication? Compared with plain Azure it's slower, and I use the DeepL API to talk with foreigners. I'd like to get the audio within 200 ms and output it to a sound device, if that's feasible.

ERROR: Cannot install aspeak because these package versions have conflicting dependencies.

To fix this you could try to:

  1. loosen the range of package versions you've specified
  2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: Cannot install aspeak==0.1, aspeak==0.1.1, aspeak==0.2.0, aspeak==0.2.1, aspeak==0.3.0, aspeak==0.3.1, aspeak==0.3.2, aspeak==1.0.0, aspeak==1.1.0, aspeak==1.1.1, aspeak==1.1.2, aspeak==1.1.3, aspeak==1.1.4, aspeak==1.2.0, aspeak==1.3.0, aspeak==1.3.1, aspeak==1.4.0, aspeak==1.4.1, aspeak==1.4.2, aspeak==2.0.0, aspeak==2.0.1, aspeak==2.1.0, aspeak==3.0.0, aspeak==3.0.1 and aspeak==3.0.2 because these package versions have conflicting dependencies.

STYLE seems to have no effect in 1.30

Input command:
aspeak -f input.txt -v zh-CN-XiaoxiaoNeural -S newscast -F Riff24Khz16BitMonoPcm -o output.wav
The audio is produced successfully, but the style is completely different from the newscast style demoed on the official site; it sounds like the default style.

Error: 0: Websocket error 1: HTTP error: 200 OK

Error:
0: Websocket error
1: HTTP error: 200 OK

Location:
/Users/mick/.cargo/registry/src/mirrors.tuna.tsinghua.edu.cn-df7c3c540f42cdbd/aspeak-4.2.0/src/main.rs:70

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.

How can I use a variable for the input parameter?

print(type(names1["hot50_cn_topic_" + str(i)][0:280])) # result is str

input1 = names1["hot50_cn_topic_" + str(i)][0:280]
input2 = "近日在全国多地,许多新冠感染者们已陆续转阴,回归到正常的生活和工作中。"

#os.system('aspeak -t names1["hot50_cn_topic_" + str(i) ][0:280] -o "./1/{}{}{}".format(year, month, day)+str(i)+".mp3" -l zh-CN') # result: -t names1["hot50_cn_topic" + str(i)][0:280] is passed literally, not substituted
#os.system('aspeak -t input1 -v zh-CN-YunjianNeural -R YoungAdultMale -o "{}".mp3'.format(out1)) # result: -t input1 is passed literally
#os.system('aspeak -t input2 -v zh-CN-YunjianNeural -R YoungAdultMale -o "{}".mp3'.format(out1)) # result: -t input2 is passed literally

os.system('aspeak -t """近日在全国多地,许多新冠感染者们已陆续转阴,回归到正常的生活和工作中。""" -v zh-CN-YunjianNeural -R YoungAdultMale -o "{}".mp3'.format(out1)) # this one works

How can I replace the literal text after -t with a variable?
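One way to avoid the quoting problem above is to skip the shell entirely and pass the text as its own argument with subprocess. A sketch (the command list mirrors the working command above; the run call is commented out so it does not require aspeak to be installed):

```python
import subprocess

text = "近日在全国多地,许多新冠感染者们已陆续转阴,回归到正常的生活和工作中。"

# Each list element is passed to aspeak verbatim as one argv entry,
# so no shell quoting or string formatting is needed.
cmd = [
    "aspeak",
    "-t", text,
    "-v", "zh-CN-YunjianNeural",
    "-R", "YoungAdultMale",
    "-o", "output.mp3",
]

# subprocess.run(cmd, check=True)  # uncomment to actually invoke aspeak
```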

Error: Speech synthesis canceled: CancellationReason.Error

When I run:
aspeak -l zh-CN -f 从美丽的室友开始.txt -o 从美丽的室友开始.wav
(the txt file has 90k characters and is 279 KB), it shows:
Error: Speech synthesis canceled: CancellationReason.Error Connection was closed by the remote host. Error code: 1007. Error details: Websocket message size cannot exceed 65536 bytes USP state: 3. Received audio size: 0 bytes.
When I used only the first paragraph (4k characters, 12.8 KB), it worked well.

IMO, it may be caused by the file being too large.

Running in python script

How can I run this in my python script, just that without text being typed in, there would be used translated_text instead:
def speak_paste():
    try:
        spoken_text = driver1.find_element_by_xpath("/html/body/div/div[2]/div[3]/span").text
        test_str = (spoken_text)
        res = " ".join(lookp_dict.get(ele, ele) for ele in test_str.split())
        pyperclip.copy(res)
        translator = deepl.Translator('')
        result = translator.translate_text((res), target_lang="ru", formality="less", preserve_formatting="1")
        translated_text = result.text
This is basically using subtitle text to translate it with deepl, but I want just to pass this translated text to azure tts to synthesize it.

need help

C:\Users\公司>aspeak
Traceback (most recent call last):
File "c:\users\公司\appdata\local\programs\python\python37\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "c:\users\公司\appdata\local\programs\python\python37\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\公司\AppData\Local\Programs\Python\Python37\Scripts\aspeak.exe\__main__.py", line 5, in <module>
File "c:\users\公司\appdata\local\programs\python\python37\lib\site-packages\aspeak\__main__.py", line 1, in <module>
from .cli import main
File "c:\users\公司\appdata\local\programs\python\python37\lib\site-packages\aspeak\cli\__init__.py", line 1, in <module>
from .main import main
File "c:\users\公司\appdata\local\programs\python\python37\lib\site-packages\aspeak\cli\main.py", line 11, in <module>
from .parser import parser
File "c:\users\公司\appdata\local\programs\python\python37\lib\site-packages\aspeak\cli\parser.py", line 5, in <module>
from .value_parsers import pitch, rate, format
File "c:\users\公司\appdata\local\programs\python\python37\lib\site-packages\aspeak\cli\value_parsers.py", line 26
if (result := try_parse_float(arg)) and result[0]:
^
SyntaxError: invalid syntax

Could not extract token from webpage

This error was raised today:

def _get_auth_token() -> str:
    """
    Get a trial auth token from the trial webpage.
    """
    response = requests.get(TRAIL_URL)
    if response.status_code != 200:
        raise errors.TokenRetrievalError(status_code=response.status_code)
    text = response.text

    # We don't need bs4, because a little of regex is enough.

    match = re.search(r'\s+var\s+localizedResources\s+=\s+\{((.|\n)*?)\};', text, re.M)
    retrieval_error = errors.TokenRetrievalError(message='Could not extract token from webpage.',
                                                 status_code=response.status_code)
    if match is None:
        raise retrieval_error
    token = re.search(r'\s+token:\s*"([^"]+)"', match.group(1), re.M)
    if token is None:
        raise retrieval_error
    return token.group(1)

https://azure.microsoft.com/en-us/services/cognitive-services/text-to-speech/#overview works normal in browser in same machine.

aspeak -t "hello world"

Traceback (most recent call last):
File "/usr/local/bin/aspeak", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.9/dist-packages/aspeak/cli/main.py", line 150, in main
result = main_text(funcs, args, audio_format)
File "/usr/local/lib/python3.9/dist-packages/aspeak/cli/main.py", line 77, in main_text
result = speech_function_selector(funcs, preprocess_text(args.text, args), audio_format)
File "/usr/local/lib/python3.9/dist-packages/aspeak/cli/main.py", line 51, in speech_function_selector
return _pure_text_to_speech(text_or_ssml, audio_format=audio_format, **options)
File "/usr/local/lib/python3.9/dist-packages/aspeak/api/functional.py", line 32, in _pure_text_to_speech
return provider.text_to_speech(text, cfg, output)
File "/usr/local/lib/python3.9/dist-packages/aspeak/api/provider.py", line 40, in text_to_speech
return speechsdk.SpeechSynthesizer(speech_config=cfg, audio_config=output).speak_text(text)
File "/usr/local/lib/python3.9/dist-packages/azure/cognitiveservices/speech/speech.py", line 1563, in __init__
self._impl = self._get_impl(impl.SpeechSynthesizer, speech_config, audio_config, auto_detect_source_language_config)
File "/usr/local/lib/python3.9/dist-packages/azure/cognitiveservices/speech/speech.py", line 1667, in _get_impl
_impl = synth_type._from_config(speech_config._impl, None if audio_config is None else audio_config._impl)
RuntimeError: Exception with an error code: 0x38 (SPXERR_AUDIO_SYS_LIBRARY_NOT_FOUND)
[CALL STACK BEGIN]

Error: 0: Websocket error 1: TLS error: native-tls error: connection closed via error 2: native-tls error: connection closed via error 3: connection closed via error

Error:
0: Websocket error
1: TLS error: native-tls error: connection closed via error
2: native-tls error: connection closed via error
3: connection closed via error

Location:
/Users/mick/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/aspeak-4.0.0/src/main.rs:73

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.
mick@192 ~ %

reqwest-0.11.14.crate: 4 vulnerabilities (highest severity is: 9.1) - autoclosed

Vulnerable Library - reqwest-0.11.14.crate

Found in HEAD commit: e6fca34b37d722f0a140069426f1e09e8054a0c0

Vulnerabilities

CVE Severity CVSS Dependency Type Fixed in (reqwest version) Remediation Available
WS-2023-0045 High 9.1 remove_dir_all-0.5.3.crate Transitive N/A*
WS-2023-0082 High 7.5 detected in multiple dependencies Transitive N/A*
WS-2023-0081 High 7.5 detected in multiple dependencies Transitive N/A*
WS-2023-0083 High 7.5 detected in multiple dependencies Transitive N/A*

*For some transitive vulnerabilities, there is no version of direct dependency with a fix. Check the "Details" section below to see if there is a version of transitive dependency where vulnerability is fixed.

Details

WS-2023-0045

Vulnerable Library - remove_dir_all-0.5.3.crate

A safe, reliable implementation of remove_dir_all for Windows

Library home page: https://crates.io/api/v1/crates/remove_dir_all/0.5.3/download

Dependency Hierarchy:

  • reqwest-0.11.14.crate (Root Library)
    • tokio-native-tls-0.3.1.crate
      • native-tls-0.2.11.crate
        • tempfile-3.3.0.crate
          • remove_dir_all-0.5.3.crate (Vulnerable Library)

Found in HEAD commit: e6fca34b37d722f0a140069426f1e09e8054a0c0

Found in base branch: main

Vulnerability Details

The remove_dir_all crate is a Rust library that offers additional features over the Rust standard library fs::remove_dir_all function. It suffers the same class of failure as the code it was layering over: TOCTOU race conditions, with the ability to cause arbitrary paths to be deleted by substituting a symlink for a path after the type of the path was checked.

Publish Date: 2023-02-24

URL: WS-2023-0045

CVSS 3 Score Details (9.1)

Base Score Metrics:

  • Exploitability Metrics:
    • Attack Vector: Network
    • Attack Complexity: Low
    • Privileges Required: None
    • User Interaction: None
    • Scope: Unchanged
  • Impact Metrics:
    • Confidentiality Impact: None
    • Integrity Impact: High
    • Availability Impact: High

For more information on CVSS3 Scores, click here.

Suggested Fix

Type: Upgrade version

Origin: GHSA-mc8h-8q98-g5hr

Release Date: 2023-02-24

Fix Resolution: remove_dir_all - 0.8.0

Step up your Open Source Security Game with Mend here

WS-2023-0082

Vulnerable Libraries - openssl-sys-0.9.80.crate, openssl-0.10.45.crate

openssl-sys-0.9.80.crate

FFI bindings to OpenSSL

Library home page: https://crates.io/api/v1/crates/openssl-sys/0.9.80/download

Dependency Hierarchy:

  • reqwest-0.11.14.crate (Root Library)
    • tokio-native-tls-0.3.1.crate
      • native-tls-0.2.11.crate
        • openssl-0.10.45.crate
          • openssl-sys-0.9.80.crate (Vulnerable Library)

openssl-0.10.45.crate

OpenSSL bindings

Library home page: https://crates.io/api/v1/crates/openssl/0.10.45/download

Dependency Hierarchy:

  • reqwest-0.11.14.crate (Root Library)
    • tokio-native-tls-0.3.1.crate
      • native-tls-0.2.11.crate
        • openssl-0.10.45.crate (Vulnerable Library)

Found in HEAD commit: e6fca34b37d722f0a140069426f1e09e8054a0c0

Found in base branch: main

Vulnerability Details

openssl X509NameBuilder::build returned object is not thread safe

Publish Date: 2023-03-25

URL: WS-2023-0082

CVSS 3 Score Details (7.5)

Base Score Metrics:

  • Exploitability Metrics:
    • Attack Vector: Network
    • Attack Complexity: Low
    • Privileges Required: None
    • User Interaction: None
    • Scope: Unchanged
  • Impact Metrics:
    • Confidentiality Impact: High
    • Integrity Impact: None
    • Availability Impact: None

For more information on CVSS3 Scores, click here.

Suggested Fix

Type: Upgrade version

Origin: GHSA-3gxf-9r58-2ghg

Release Date: 2023-03-25

Fix Resolution: openssl - 0.10.48


WS-2023-0081

Vulnerable Libraries - openssl-sys-0.9.80.crate, openssl-0.10.45.crate

openssl-sys-0.9.80.crate

FFI bindings to OpenSSL

Library home page: https://crates.io/api/v1/crates/openssl-sys/0.9.80/download

Dependency Hierarchy:

  • reqwest-0.11.14.crate (Root Library)
    • tokio-native-tls-0.3.1.crate
      • native-tls-0.2.11.crate
        • openssl-0.10.45.crate
          • openssl-sys-0.9.80.crate (Vulnerable Library)

openssl-0.10.45.crate

OpenSSL bindings

Library home page: https://crates.io/api/v1/crates/openssl/0.10.45/download

Dependency Hierarchy:

  • reqwest-0.11.14.crate (Root Library)
    • tokio-native-tls-0.3.1.crate
      • native-tls-0.2.11.crate
        • openssl-0.10.45.crate (Vulnerable Library)

Found in HEAD commit: e6fca34b37d722f0a140069426f1e09e8054a0c0

Found in base branch: main

Vulnerability Details

openssl X509Extension::new and X509Extension::new_nid null pointer dereference

Publish Date: 2023-03-25

URL: WS-2023-0081

CVSS 3 Score Details (7.5)

Base Score Metrics:

  • Exploitability Metrics:
    • Attack Vector: Network
    • Attack Complexity: Low
    • Privileges Required: None
    • User Interaction: None
    • Scope: Unchanged
  • Impact Metrics:
    • Confidentiality Impact: None
    • Integrity Impact: None
    • Availability Impact: High

For more information on CVSS3 Scores, click here.

Suggested Fix

Type: Upgrade version

Origin: GHSA-6hcf-g6gr-hhcr

Release Date: 2023-03-25

Fix Resolution: openssl - 0.10.48


WS-2023-0083

Vulnerable Libraries - openssl-sys-0.9.80.crate, openssl-0.10.45.crate

openssl-sys-0.9.80.crate

FFI bindings to OpenSSL

Library home page: https://crates.io/api/v1/crates/openssl-sys/0.9.80/download

Dependency Hierarchy:

  • reqwest-0.11.14.crate (Root Library)
    • tokio-native-tls-0.3.1.crate
      • native-tls-0.2.11.crate
        • openssl-0.10.45.crate
          • openssl-sys-0.9.80.crate (Vulnerable Library)

openssl-0.10.45.crate

OpenSSL bindings

Library home page: https://crates.io/api/v1/crates/openssl/0.10.45/download

Dependency Hierarchy:

  • reqwest-0.11.14.crate (Root Library)
    • tokio-native-tls-0.3.1.crate
      • native-tls-0.2.11.crate
        • openssl-0.10.45.crate (Vulnerable Library)

Found in HEAD commit: e6fca34b37d722f0a140069426f1e09e8054a0c0

Found in base branch: main

Vulnerability Details

openssl SubjectAlternativeName and ExtendedKeyUsage::other allow arbitrary file read

Publish Date: 2023-03-25

URL: WS-2023-0083

CVSS 3 Score Details (7.5)

Base Score Metrics:

  • Exploitability Metrics:
    • Attack Vector: Network
    • Attack Complexity: Low
    • Privileges Required: None
    • User Interaction: None
    • Scope: Unchanged
  • Impact Metrics:
    • Confidentiality Impact: High
    • Integrity Impact: None
    • Availability Impact: None

For more information on CVSS3 Scores, click here.

Suggested Fix

Type: Upgrade version

Origin: GHSA-9qwg-crg9-m2vc

Release Date: 2023-03-25

Fix Resolution: openssl - 0.10.48


routine options not available on new rust aspeak?

Hi,

I like aspeak. After using it for one whole day, I now get an "AssertionError". Am I perhaps blocked by MSFT?

I checked the closed issues and found that you are rewriting aspeak in Rust.
I downloaded the latest 4.0 beta 1 and found that the "-f", "-o" and "--mp3" options don't work and seem unsupported.
Were these options renamed or removed?

My environment is an Intel MBP (macOS 12.6.3 Monterey), Python 3.11.2.
I downloaded aspeak-x86_64-apple-darwin.

Add support for `Send`

Hi, the use of RefCell makes the library hard to use in an async runtime.

It would be nice to improve this part.

RuntimeError in ssml_to_speech_async

A RuntimeError occurred while calling ssml_to_speech_async on an instance of SpeechToFileService.

CRITICAL: Traceback (most recent call last):
  ......
  File "E:\***\tts.py", line 17, in tts
    return provider.ssml_to_speech_async(ssml,path=path)  # type: ignore
  File "F:\ProgramData\Miniconda3\lib\site-packages\aspeak\api\api.py", line 110, in wrapper
    self._setup_synthesizer(kwargs['path'])
  File "F:\ProgramData\Miniconda3\lib\site-packages\aspeak\api\api.py", line 139, in _setup_synthesizer
    self._synthesizer = speechsdk.SpeechSynthesizer(self._config, self._output)
  File "F:\ProgramData\Miniconda3\lib\site-packages\azure\cognitiveservices\speech\speech.py", line 1598, in __init__
    self._impl = self._get_impl(impl.SpeechSynthesizer, speech_config, audio_config,
  File "F:\ProgramData\Miniconda3\lib\site-packages\azure\cognitiveservices\speech\speech.py", line 1703, in _get_impl
    _impl = synth_type._from_config(speech_config._impl, None if audio_config is None else audio_config._impl)
RuntimeError: Exception with an error code: 0x8 (SPXERR_FILE_OPEN_FAILED)
[CALL STACK BEGIN]

    > pal_string_to_wstring
    - pal_string_to_wstring
    - synthesizer_create_speech_synthesizer_from_config
    - synthesizer_create_speech_synthesizer_from_config
    - 00007FFE37F772C4 (SymFromAddr() error: attempted to access an invalid address)
    - 00007FFE37FC76A8 (SymFromAddr() error: attempted to access an invalid address)
    - 00007FFE37FC87A8 (SymFromAddr() error: attempted to access an invalid address)
    - PyArg_CheckPositional
    - Py_NewReference
    - PyEval_EvalFrameDefault
    - Py_NewReference
    - PyEval_EvalFrameDefault
    - PyFunction_Vectorcall
    - PyFunction_Vectorcall
    - PyMem_RawStrdup
    - Py_NewReference

[CALL STACK END]

tts.py

from aspeak import SpeechToFileService, AudioFormat, FileFormat

provider = None
fmt = AudioFormat(FileFormat.MP3, -1)

def init():
    global provider
    provider = SpeechToFileService(locale="zh-CN", audio_format=fmt)

def tts(ssml: str, path: str):
    if provider is None:
        init()
    return provider.ssml_to_speech_async(ssml, path=path)  # type: ignore

The thing is, this error seems to occur randomly, and only after I have created (and finished) over 20 ssml_to_speech_async calls.
The error also does not seem to be catchable with try.

AssertionError... Again...

I really hope this issue gets resolved! What a pity that I do not understand how to solve this problem...

Traceback (most recent call last):
  File "/opt/homebrew/bin/aspeak", line 8, in <module>
    sys.exit(main())
  File "/opt/homebrew/lib/python3.9/site-packages/aspeak/cli/main.py", line 138, in main
    speech = create_speech(args.locale, args.voice, audio_format)
  File "/opt/homebrew/lib/python3.9/site-packages/aspeak/api/api.py", line 138, in __init__
    super().__init__(locale, voice, audio_format, output)
  File "/opt/homebrew/lib/python3.9/site-packages/aspeak/api/api.py", line 43, in __init__
    self._config()
  File "/opt/homebrew/lib/python3.9/site-packages/aspeak/api/api.py", line 51, in _config
    assert token is not None
AssertionError


Failed to synthesize speech!


result = speech.text_to_speech(
    str, voice='en-US-JennyNeural', style='excited')
print(result)

The result is None.

Error: Speech synthesis canceled: CancellationReason.Error

Error: Speech synthesis canceled: CancellationReason.Error
WebSocket upgrade failed: Unspecified connection error (200). USP state: 2. Received audio size: 0 bytes.

This suddenly started happening today; I do not know what is going on. Is it a problem with my network?

WebsocketError with ErrorCode 429

The free trial API is heavily rate limited by Microsoft.

If you are constantly hitting this with error code 429, please consider registering an Azure account and using aspeak with an Azure subscription key. (There is a free tier.)

Alternatively, you can try the Edge TTS endpoint, which comes with fewer features (you need at least aspeak v4.2.0). You can figure out what options are needed by looking at edge-tts's code.

Example

Hello!

I really need some examples of how to use aspeak 4.1 with Python. I used to have 3.1 working, but I upgraded and got lost in the new 4.x version.

Can someone please provide an example of how to produce an MP3 output file from an input.txt file?

Thanks in advance.
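For reference, here is a minimal, hedged sketch of the input.txt-to-MP3 flow using the pre-4.0 Python API quoted in the issues above (SpeechToFileService, AudioFormat, FileFormat). The 4.x binding may expose a different interface, and the text_to_speech parameter names used here are assumptions:

```python
def read_text(path: str) -> str:
    """Read the whole input file as UTF-8, stripping surrounding whitespace."""
    with open(path, encoding="utf-8") as f:
        return f.read().strip()

def synthesize_to_mp3(text: str, out_path: str) -> None:
    """Synthesize text to an MP3 file via the pre-4.0 aspeak API (sketch)."""
    # Imported lazily so read_text stays usable without aspeak installed.
    # SpeechToFileService / AudioFormat / FileFormat appear in the reports
    # above; the text_to_speech(path=...) signature is an assumption.
    from aspeak import SpeechToFileService, AudioFormat, FileFormat

    fmt = AudioFormat(FileFormat.MP3, quality=1)
    provider = SpeechToFileService(locale="en-US", audio_format=fmt)
    provider.text_to_speech(text, path=out_path)

# Usage (requires aspeak installed and working Azure connectivity):
# synthesize_to_mp3(read_text("input.txt"), "output.mp3")
```

Note that the 4.x Python binding shipped with the Rust rewrite changed the API surface, so check the repository's Library Usage docs for the current call names.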

certifi-2022.9.24-py3-none-any.whl: 1 vulnerabilities (highest severity is: 6.8) - autoclosed

Vulnerable Library - certifi-2022.9.24-py3-none-any.whl

Python package for providing Mozilla's CA Bundle.

Library home page: https://files.pythonhosted.org/packages/1d/38/fa96a426e0c0e68aabc68e896584b83ad1eec779265a028e156ce509630e/certifi-2022.9.24-py3-none-any.whl

Path to dependency file: /tmp/ws-scm/aspeak

Path to vulnerable library: /tmp/ws-scm/aspeak,/requirements.txt

Vulnerabilities

CVE: CVE-2022-23491
Severity: Medium
CVSS: 6.8
Dependency: certifi-2022.9.24-py3-none-any.whl
Type: Direct
Fixed in (certifi version): certifi - 2022.12.07

Details

CVE-2022-23491

Vulnerable Library - certifi-2022.9.24-py3-none-any.whl

Python package for providing Mozilla's CA Bundle.

Library home page: https://files.pythonhosted.org/packages/1d/38/fa96a426e0c0e68aabc68e896584b83ad1eec779265a028e156ce509630e/certifi-2022.9.24-py3-none-any.whl

Path to dependency file: /tmp/ws-scm/aspeak

Path to vulnerable library: /tmp/ws-scm/aspeak,/requirements.txt

Dependency Hierarchy:

  • certifi-2022.9.24-py3-none-any.whl (Vulnerable Library)

Found in base branch: main

Vulnerability Details

Certifi is a curated collection of Root Certificates for validating the trustworthiness of SSL certificates while verifying the identity of TLS hosts. Certifi 2022.12.07 removes root certificates from "TrustCor" from the root store. These are in the process of being removed from Mozilla's trust store. TrustCor's root certificates are being removed pursuant to an investigation prompted by media reporting that TrustCor's ownership also operated a business that produced spyware. Conclusions of Mozilla's investigation can be found in the linked google group discussion.

Publish Date: 2022-12-07

URL: CVE-2022-23491

CVSS 3 Score Details (6.8)

Base Score Metrics:

  • Exploitability Metrics:
    • Attack Vector: Network
    • Attack Complexity: Low
    • Privileges Required: High
    • User Interaction: None
    • Scope: Changed
  • Impact Metrics:
    • Confidentiality Impact: None
    • Integrity Impact: High
    • Availability Impact: None


Suggested Fix

Type: Upgrade version

Origin: https://www.cve.org/CVERecord?id=CVE-2022-23491

Release Date: 2022-12-07

Fix Resolution: certifi - 2022.12.07


AssertionError

Hi
I have been using this configuration since October 2022:

speech = SpeechToFileService(voice="pt-BR-AntonioNeural", audio_format=AudioFormat(FileFormat.MP3, quality=1))

but today this error made the system stop:

File "C:\Users\XX\AppData\Local\Programs\Python\Python310\lib\site-packages\aspeak\api\api.py", line 162, in __init__
    super().__init__(locale, voice, audio_format, None)
File "C:\Users\XX\AppData\Local\Programs\Python\Python310\lib\site-packages\aspeak\api\api.py", line 43, in __init__
    self._config()
File "C:\Users\XX\AppData\Local\Programs\Python\Python310\lib\site-packages\aspeak\api\api.py", line 51, in _config
    assert token is not None
AssertionError

Can you help me?

ImportError from urllib3: cannot import name 'Mapping' from 'collections'

/Users/meeia ~ aspeak -t "Hello, world"
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/bin/aspeak", line 5, in <module>
    from aspeak.__main__ import main
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/aspeak/__main__.py", line 1, in <module>
    from .cli import main
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/aspeak/cli/__init__.py", line 1, in <module>
    from .main import main
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/aspeak/cli/main.py", line 8, in <module>
    from .voices import list_voices
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/aspeak/cli/voices.py", line 1, in <module>
    import requests
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/requests/__init__.py", line 43, in <module>
    import urllib3
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/urllib3/__init__.py", line 8, in <module>
    from .connectionpool import (
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/urllib3/connectionpool.py", line 29, in <module>
    from .connection import (
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/urllib3/connection.py", line 39, in <module>
    from .util.ssl_ import (
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/urllib3/util/__init__.py", line 3, in <module>
    from .connection import is_connection_dropped
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/urllib3/util/connection.py", line 3, in <module>
    from .wait import wait_for_read
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/urllib3/util/wait.py", line 1, in <module>
    from .selectors import (
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/urllib3/util/selectors.py", line 14, in <module>
    from collections import namedtuple, Mapping
ImportError: cannot import name 'Mapping' from 'collections' (/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/collections/__init__.py)
