Make Google Cloud STT/TTS timeout configurable #136575
base: dev
Please take a look at the requested changes, and use the Ready for review button when you are done, thanks 👍
Hey there @lufton, @tronikos, mind taking a look at this pull request as it has been labeled with an integration ( Code owner commandsCode owners of
Honestly, I don't think this is an option that should work this way. It means you, as a user, have to take into account what you are sending (e.g., the output of a template, which could be unknown up front). Instead, we could maybe consider a different default, error handling, or logic to automatically set a sensible timeout depending on the requested content? I don't feel this is a burden we should put on an end user.
Thanks @frenck for the feedback! I agree with your description of the end state; that would be great indeed. However, I don't think that having this as an optional configurable parameter creates any extra burden for users compared to the current state: their experience does not change, except that power users can raise the timeout if it fails for their use cases. So I see this PR as an incremental improvement and not the final solution. But maybe it's too small a step, and somebody needs to work on the proper end-state solution instead.
Something to consider is how we define the timeout based on the input. The response time will depend on the selected model, the language, and probably the text, too. If we say "for each X chars add Y ms to the timeout" and undershoot it, we end up with the same issue for the end user, just with extra steps on our side. So we need to somehow estimate the response time with a generous extra buffer, while making sure we don't end up with unreasonably long timeouts...
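To make the "for each X chars add Y ms" idea concrete, here is a minimal sketch of a length-based estimate with a floor and a ceiling. The function name and every number in it (10 s base, 20 ms per character, 60 s cap) are hypothetical placeholders, not measurements of Google Cloud TTS and not anything proposed in this PR; real values would have to come from timing actual requests per model and language.

```python
def estimate_timeout(
    text: str,
    base_s: float = 10.0,      # hypothetical floor, equal to the current hard-coded timeout
    per_char_s: float = 0.02,  # hypothetical "Y ms per character" (20 ms here)
    max_s: float = 60.0,       # hypothetical cap so the timeout never grows unreasonably
) -> float:
    """Return a timeout in seconds that grows linearly with text length.

    The result is never below base_s and never above max_s.
    """
    estimate = base_s + per_char_s * len(text)
    return min(estimate, max_s)
```

The cap addresses the "unreasonably long timeouts" concern, while the floor keeps short requests at today's behavior. Undershooting `per_char_s` would still reproduce the original failure, which is the risk described above.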
@@ -19,7 +19,9 @@
 CONF_GAIN = "gain"
 CONF_PROFILES = "profiles"
 CONF_TEXT_TYPE = "text_type"
+CONF_TIMEOUT= "tts_timeout"
space before =
@@ -19,7 +19,9 @@
 CONF_GAIN = "gain"
 CONF_PROFILES = "profiles"
 CONF_TEXT_TYPE = "text_type"
+CONF_TIMEOUT= "tts_timeout"
Remove tts_ prefix and use it in stt.py too.
@@ -25,7 +25,8 @@
 "gain": "Default volume gain (in dB) of the voice",
 "profiles": "Default audio profiles",
 "text_type": "Default text type",
-"stt_model": "STT model"
+"stt_model": "STT model",
+"tts_timeout": "TTS timeout, seconds",
Remove the tts_ and TTS prefixes and use it in stt.py too.
@@ -25,7 +25,8 @@
 "gain": "Default volume gain (in dB) of the voice",
 "profiles": "Default audio profiles",
 "text_type": "Default text type",
-"stt_model": "STT model"
+"stt_model": "STT model",
+"tts_timeout": "TTS timeout, seconds",
For consistency with other options, e.g. gain above, change the label to "Timeout (in seconds)".
…v/core into google_cloud_tts_timeout
Hey @tronikos, thank you for the feedback, and pardon the delay - I got a bit distracted with a side quest :)
@@ -215,8 +217,11 @@ async def _async_get_tts_audio(
 ),
 )

-response = await self._client.synthesize_speech(request, timeout=10)

+timeout = options[CONF_TIMEOUT]
Replace it with options.get(CONF_TIMEOUT, DEFAULT_TIMEOUT) and inline it?
Oops, that was dumb of me - fixed, thanks!
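For readers following the thread, a small sketch of why the suggested options.get(CONF_TIMEOUT, DEFAULT_TIMEOUT) is safer than options[CONF_TIMEOUT]: entries saved before the option existed have no such key, so a plain subscript would raise KeyError. The key string "timeout" and the helper name resolve_timeout are assumptions for illustration; the default of 10 seconds comes from the PR description.

```python
CONF_TIMEOUT = "timeout"  # assumed key name once the tts_ prefix is dropped
DEFAULT_TIMEOUT = 10      # seconds; same as the previously hard-coded value


def resolve_timeout(options: dict) -> float:
    """Return the configured timeout, falling back to the default.

    Hypothetical helper: the review suggests inlining this expression
    directly at the call site instead of naming it.
    """
    return options.get(CONF_TIMEOUT, DEFAULT_TIMEOUT)


# Options saved before the setting existed: the key is simply absent.
legacy_options = {"gain": 0.0}
# Options where a user raised the timeout for long announcements.
tuned_options = {"gain": 0.0, CONF_TIMEOUT: 30}
```

With legacy_options, resolve_timeout falls back to DEFAULT_TIMEOUT, whereas legacy_options[CONF_TIMEOUT] would raise KeyError.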
Proposed change
Hello, this is my first ever PR for Home Assistant - I sure hope I've done everything correctly, haha!
The existing implementation has a hard-coded 10s timeout when waiting for the generated speech from Google Cloud TTS.
Because of that, the integration fails (error "400 Request contains an invalid argument") when you try to generate speech for inputs larger than roughly 600-800 characters, since Google Cloud TTS takes longer than 10 seconds to return a response for text of that size. I've been hitting this issue constantly when trying to announce long responses coming from my LLM, for example.
The proposed change makes the TTS timeout configurable in the UI. The default value is 10 seconds (same as the existing implementation), but users can now increase it if they need to generate longer output. The tests have been updated, too.
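The failure mode described above can be illustrated with a self-contained simulation. This is only a sketch: fake_synthesize is a hypothetical stand-in whose latency grows with text length, and asyncio.wait_for stands in for the timeout argument the integration passes to the Google Cloud client. It does not call the real API and does not reproduce the exact error message.

```python
import asyncio
from typing import Optional


async def fake_synthesize(text: str, seconds_per_char: float = 0.0001) -> bytes:
    """Hypothetical stand-in for a TTS request; latency scales with input length."""
    await asyncio.sleep(seconds_per_char * len(text))
    return b"audio"


async def synthesize_with_timeout(text: str, timeout: float) -> Optional[bytes]:
    """Return the audio bytes, or None if the request exceeds the timeout."""
    try:
        return await asyncio.wait_for(fake_synthesize(text), timeout=timeout)
    except asyncio.TimeoutError:
        return None


long_text = "a" * 1000  # simulated latency: about 0.1 s

# A fixed, too-short timeout fails for long input...
too_short = asyncio.run(synthesize_with_timeout(long_text, timeout=0.01))
# ...while a user-raised timeout lets the same request complete.
raised_limit = asyncio.run(synthesize_with_timeout(long_text, timeout=1.0))
```

In this simulation too_short is None and raised_limit holds the audio bytes, mirroring how a configurable timeout lets long announcements through without changing behavior for short ones.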
