
OpenAI TTS custom component for HA

License: GNU General Public License v3.0


OpenAI TTS Custom Component for Home Assistant

This custom component integrates OpenAI's Text-to-Speech (TTS) service with Home Assistant, allowing users to convert text into spoken audio. The service supports multiple languages and voices and offers customizable options such as the speech model.

Description

The OpenAI TTS component for Home Assistant uses the OpenAI API to generate spoken audio from text. It can be used in automations, assistants, scripts, or any other Home Assistant component that supports TTS. An OpenAI API key is required.

Features

  • Text-to-speech conversion using OpenAI's API
  • Support for multiple languages and voices
  • Customizable speech model (see https://platform.openai.com/docs/guides/text-to-speech for supported voices and models)
  • Integration with Home Assistant's assistants, automations, and scripts

Sample

https://www.youtube.com/watch?v=oeeypI_X0qs

Sample Home Assistant service

service: tts.speak
target:
  entity_id: tts.openai_nova_engine
data:
  cache: true
  media_player_entity_id: media_player.bedroom_speaker
  message: My speech has improved now!
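For example, the service call above can be wired into an automation. This is a minimal sketch; the trigger time and the entity IDs (tts.openai_nova_engine, media_player.bedroom_speaker) are placeholders that depend on your own setup:

```yaml
automation:
  - alias: "Morning greeting via OpenAI TTS"
    trigger:
      - platform: time
        at: "07:30:00"
    action:
      - service: tts.speak
        target:
          entity_id: tts.openai_nova_engine
        data:
          cache: true
          media_player_entity_id: media_player.bedroom_speaker
          message: Good morning!
```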

HACS installation (preferred)

  1. Go to the sidebar HACS menu

  2. Click on the 3-dot overflow menu in the upper right and select the "Custom Repositories" item.

  3. Copy/paste https://github.com/sfortis/openai_tts into the "Repository" textbox and select "Integration" for the category entry.

  4. Click on "Add" to add the custom repository.

  5. You can then click on the "OpenAI TTS Speech Services" repository entry and download it. Restart Home Assistant to apply the component.

  6. Add the integration via the UI, provide your API key, and select the required model and voice. Multiple instances may be configured.

Manual installation

  1. Ensure you have a custom_components folder within your Home Assistant configuration directory.

  2. Inside the custom_components folder, create a new folder named openai_tts.

  3. Place the repo files inside openai_tts folder.

  4. Restart Home Assistant.

  5. Add the integration via the UI, provide your API key, and select the required model and voice. Multiple instances may be configured.

openai_tts's People

Contributors: dvejsada, jkpe, raldone01, sfortis

openai_tts's Issues

Speed is Int instead of Float

Speed is currently configured as an integer, but per the API it should be a float in the range 0.25 to 4.0 (default 1.0).

I updated my own code to use a float instead of an int and it worked a treat.
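A minimal sketch of the suggested fix, assuming a standalone validation helper (the function name is hypothetical, not from the component's actual code):

```python
def validate_speed(value) -> float:
    """Coerce the speed option to float and enforce the API range.

    Per the OpenAI speech API, speed must be a float between
    0.25 and 4.0 (default 1.0), not an integer.
    """
    speed = float(value)
    if not 0.25 <= speed <= 4.0:
        raise ValueError(f"speed must be between 0.25 and 4.0, got {speed}")
    return speed
```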

Invalid API key length

I'm getting a 70 character key back when I create a key owned by a Service Account.

Works fine if I increase the limit in this:

if not (51 <= len(api_key) <= 56):

Traceback (most recent call last):
  File "/config/custom_components/openai_tts/config_flow.py", line 63, in async_step_user
    await validate_user_input(user_input)
  File "/config/custom_components/openai_tts/config_flow.py", line 28, in validate_user_input
    await validate_api_key(user_input.get(CONF_API_KEY))
  File "/config/custom_components/openai_tts/config_flow.py", line 24, in validate_api_key
    raise WrongAPIKey("Invalid API key length")
custom_components.openai_tts.config_flow.WrongAPIKey: Invalid API key length

CONF_API_KEY length 51 vs 56

The CONF_API_KEY config value is checked to be 51 characters long, but the new project-based OpenAI keys are 56 characters long. I also didn't succeed in using my old key to generate voice (I got an error from the API call), so I changed the code in custom_components to allow 56 characters, and then it worked with my project API key.
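Since key lengths vary across the formats mentioned in these issues (51 for classic keys, 56 for project keys, 70 for service-account keys), one possible approach is a looser sanity check. This is only a sketch of that idea, not the component's actual validation:

```python
def looks_like_openai_key(api_key: str) -> bool:
    """Loose sanity check for an OpenAI API key.

    Key formats differ (user, project, service-account), so instead of
    pinning an exact length, verify the "sk-" prefix and accept a
    generous length range. A definitive check would be a test call
    to the API itself.
    """
    return api_key.startswith("sk-") and 40 <= len(api_key) <= 200
```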

Language selection?

The UI setup of the integration only asks for the API key, speed, model, and voice.
In the YAML version I was able to configure the language; how do I do that now?

Unable to produce TTS output

I have added my OpenAI API key in the configuration, but whenever I try to call tts.openai_tts_say, for example:

service: tts.openai_tts_say
data:
  entity_id: media_player.living_room_speaker
  message: This is a test message

I receive the following error:

HTTP error from OpenAI: 429 Client Error: Too Many Requests for url: https://api.openai.com/v1/audio/speech

I haven't seen any other steps to take in order to start using the service. Any idea on what the issue might be?

[Feature] Custom endpoint URL

Hello,

I'm trying to get https://github.com/ther3zz/TTS working with Home Assistant (specifically this fork, as it supports multi-speaker models / XTTSv2). I've spent hours trying to bodge MaryTTS into working with it, without success, and I've given up (I even tried making a proxy via a PHP Laravel app, but MaryTTS seems broken and isn't POSTing the actual text field).

I was wondering if openai_tts could be modified to provide an option to specify a custom host that exposes an OpenAI-compatible endpoint. Alternatively, how much work would it be to create a separate project specifically for Coqui TTS? I'd be willing to donate to get this working, as I want all my AI running locally. I pretty much just want to point HA at Coqui TTS, give it the endpoint URL and speaker settings, and have it work.

This is how I've been running the forked TTS:

docker-compose.yml

    coqui-ai-tts:
        build: ./TTS
        container_name: tts
        restart: no
        environment:
            - COQUI_TOS_AGREED=1
        # To run the model, use the below entrypoint
        entrypoint: /bin/bash -c 'python3 TTS/server/server.py --list_models && python3 TTS/server/server.py --model_path /root/.local/share/tts/tts_models--multilingual--multi-dataset--xtts_v2/model.pth --config_path /root/.local/share/tts/tts_models--multilingual--multi-dataset--xtts_v2/config.json --use_cuda true'
        
        # To download the model, use the below entrypoint
        #entrypoint: /bin/bash -c 'python3 TTS/server/server.py --list_models && python3 TTS/server/server.py --model_name tts_models/multilingual/multi-dataset/xtts_v2 --use_cuda true'
        volumes:
            - ./volumes/tts_models:/app/tts_models
            - ./volumes/tts:/root/.local/share/tts
        ports:
            - 5002:5002
        deploy:
            resources:
                reservations:
                    devices:
                        - driver: nvidia
                          device_ids: ['0']
                          capabilities: [gpu]

Once the TTS server has been started, a request to the server has the following parameters:

GET /api/tts?text=SOMETEXTHERE&speaker_id=Claribel%20Dervla&style_wav=&language_id=en

Parameters:

text
speaker_id
style_wav
language_id

I don't think there is an API endpoint to return the list of supported languages or speakers, but perhaps as a temporary means, this can be manually specified until an endpoint is added upstream?
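For reference, the GET request above can be sketched in Python. This is only an illustration of the query-string encoding; the base URL matches the docker-compose port mapping above, and the helper name is hypothetical:

```python
import urllib.parse


def build_tts_url(base: str, text: str, speaker_id: str,
                  language_id: str, style_wav: str = "") -> str:
    """Build the Coqui TTS server GET URL with URL-encoded parameters.

    Encoding matters here: speaker IDs like "Claribel Dervla"
    contain spaces that must not appear raw in the query string.
    """
    query = urllib.parse.urlencode({
        "text": text,
        "speaker_id": speaker_id,
        "style_wav": style_wav,
        "language_id": language_id,
    })
    return f"{base}/api/tts?{query}"


# e.g. build_tts_url("http://localhost:5002", "Hello", "Claribel Dervla", "en")
```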

Thanks!

Missing agent

(screenshot)

Could you advise how to set up the AGENT? Also, in your video you call TTS; is this needed? If so, how do I set it up?

Added a configuration but no audio

I've added a configuration as per instructions, but I can't hear any audio. Is there any setting missing?

Here is what it looks like:
The device:
(screenshot)

The service call:

service: tts.speak
data:
  cache: true
  media_player_entity_id: media_player.vlc_telnet
  message: Test
target:
  entity_id: tts.openai_tts_nova

When I run it, I hear nothing, but the UI shows it as succeeded.

This is the only log entry that may be related; I'm not 100% sure, because it is about ID3 tags and that shouldn't stop playback:

2024-06-19 09:09:58.141 ERROR (MainThread) [homeassistant.components.tts] ID3 tag error: can't sync to MPEG frame

The media_player seems to be working normally, as I can use other TTS services (like Google Translate's google_say) and hear them.

Thanks for your help.

Unable to use environment variable

Hi, first I want to say great job on this and thank you for making it!

One small issue I found, though, that I was wondering if you could look into: configuration.yaml is not happy with me using an environment variable for the API key.

Failed to restart Home Assistant
The system cannot restart because the configuration is not valid: Error loading /config/configuration.yaml: OPENAIAPI

This seems to work with other items in configuration.yaml, so I assume it's related to the integration, but it might not be.

tts:
  - platform: openai_tts
    api_key: !env_var OPENAIAPI
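As a possible workaround (standard Home Assistant YAML behavior, not specific to this integration), the key can be stored in secrets.yaml and referenced with !secret, which is handled by HA's core YAML loader; the key value shown is a placeholder:

```yaml
# secrets.yaml
openai_api_key: sk-your-key-here

# configuration.yaml
tts:
  - platform: openai_tts
    api_key: !secret openai_api_key
```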

Enable voice selection in options (feature request)

I'd like to be able to define the voice in the "options" field (and the other fields too, but primarily the voice), as seen in Developer tools > Services > call tts.openai_tts_say, as well as in Node-RED. I tried making some changes, but I don't know much Python, so I'm not sure of the level of effort for this change. I think it would also make sense to have defaults, or to be able to define them in configuration.yaml.

Wrong API key

I've tried inputting the OpenAI API key in the "Add text-to-speech engine" dialog when adding the integration, but the input window just reloads. I've tried multiple keys that work in other integrations.

Using the new 56 character API keys.
Just installed openai_tts today so have the latest version.

I get the following in the logs..

Logger: custom_components.openai_tts.config_flow
Source: custom_components/openai_tts/config_flow.py:56
integration: openai_tts ([documentation](https://github.com/sfortis/openai_tts/), [issues](https://github.com/sfortis/openai_tts/issues))
First occurred: 18:13:10 (4 occurrences)
Last logged: 18:14:28

Wrong or no API key provided.
Traceback (most recent call last):
  File "/config/custom_components/openai_tts/config_flow.py", line 56, in async_step_user
    await validate_input(user_input)
  File "/config/custom_components/openai_tts/config_flow.py", line 21, in validate_input
    raise WrongAPIKey
custom_components.openai_tts.config_flow.WrongAPIKey

InvalidDataError

Running HA core 2024.6.1

It installs fine, but when sending requests I get the error below in the log:
music_assistant.common.models.errors.InvalidDataError

The MP3 files created are 1 KB in size, hence the invalid data :(
