Make Google Cloud STT/TTS timeout configurable #136575

Open: 10 commits to merge into base branch dev

@eslavnov eslavnov commented Jan 26, 2025

Proposed change

Hello, this is my first ever PR for Home Assistant - I sure hope I've done everything correctly, haha!

The existing implementation has a hard-coded 10s timeout when waiting for the generated speech from Google Cloud TTS.
Because of that, the integration fails (error "400 Request contains an invalid argument") when you try to generate speech for inputs longer than roughly 600-800 characters, since Google Cloud TTS then takes more than 10 seconds to return a response. For example, I've been hitting this issue constantly when trying to announce long responses coming from my LLM.

The proposed change makes TTS Timeout configurable in the UI. The default value is set at 10 seconds (same as the existing implementation), but now users can increase it if needed to generate longer output. The tests have been updated, too.
[Screenshot: 2025-01-26 12_14_39-Settings – Home Assistant]

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New integration (thank you!)
  • New feature (which adds functionality to an existing integration)
  • Deprecation (breaking change to happen in the future)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • I have followed the perfect PR recommendations
  • The code has been formatted using Ruff (ruff format homeassistant tests)
  • Tests have been added to verify that the new code works.

If the code communicates with devices, web services, or third-party tools:

  • The manifest file has all fields filled out correctly.
    Updated and included derived files by running: python3 -m script.hassfest.
  • New or updated dependencies have been added to requirements_all.txt.
    Updated by running python3 -m script.gen_requirements_all. (no changes)
  • For the updated dependencies - a link to the changelog, or at minimum a diff between library versions is added to the PR description. (no changes)

@eslavnov eslavnov requested a review from tronikos as a code owner January 26, 2025 14:46

@home-assistant home-assistant bot left a comment

Hi @eslavnov

It seems you haven't yet signed a CLA. Please do so here.

Once you do that we will be able to review and accept this pull request.

Thanks!

@home-assistant

Please take a look at the requested changes, and use the Ready for review button when you are done, thanks 👍

Learn more about our pull request process.

@home-assistant

Hey there @lufton, @tronikos, mind taking a look at this pull request as it has been labeled with an integration (google_cloud) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of google_cloud can trigger bot actions by commenting:

  • @home-assistant close: Closes the pull request.
  • @home-assistant rename Awesome new title: Renames the pull request.
  • @home-assistant reopen: Reopens the pull request.
  • @home-assistant unassign google_cloud: Removes the current integration label and assignees on the pull request; add the integration domain after the command.
  • @home-assistant add-label needs-more-information: Adds a label (needs-more-information, problem in dependency, problem in custom component) to the pull request.
  • @home-assistant remove-label needs-more-information: Removes a label (needs-more-information, problem in dependency, problem in custom component) from the pull request.

@frenck
Member

frenck commented Feb 6, 2025

Honestly, I don't think this is an option that should work this way. It means you have to take into account what you are sending as a user (e.g., output of a template, could be unknown up front).

Instead, we could maybe consider a different default, error handling or logic to automatically set a sensible timeout depending on requested content?

I don't feel this is a burden we should put on an end-user.

@frenck frenck marked this pull request as draft February 6, 2025 19:59
@eslavnov
Author

eslavnov commented Feb 7, 2025

Thanks @frenck for the feedback! I agree with your description of the endstate, this would be great indeed.

However, I don't think that having this as an optional configurable parameter creates any extra burden for the users compared to the current state: their experience does not change except for allowing power users to change the timeout if it fails on their use cases.

So I see this PR as an incremental improvement and not the end solution. But maybe it's a step too small and somebody needs to work on the proper endstate solution instead.

@eslavnov
Author

eslavnov commented Feb 7, 2025

Something to consider is how do we define the timeout based on the input? The response time will depend on the selected model, language and probably text, too. If we say "for each X chars add Y ms to the timeout" and undershoot it, we end up with the same issue for the end-user just with extra steps on our side. So we need to somehow estimate the response time with some generous extra buffer, while making sure we don't end up with unreasonably long timeouts...
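
As a rough sketch of the "for each X chars add Y ms" idea, with a buffer and a cap, something like the following could work. All numbers are assumptions for illustration only: the per-character rate is loosely derived from the "10 seconds at roughly 600-800 characters" observation in the PR description, and the buffer factor and upper bound are placeholders that would need real measurements per model and language. `estimate_timeout` is a hypothetical helper, not part of the integration.

```python
def estimate_timeout(
    text: str,
    seconds_per_char: float = 0.015,  # assumed rate, roughly 10 s per ~700 chars
    buffer_factor: float = 2.0,       # assumed safety margin against undershooting
    min_seconds: float = 10.0,        # the current hard-coded value as a floor
    max_seconds: float = 120.0,       # assumed cap to avoid unreasonably long waits
) -> float:
    """Return a timeout in seconds that scales with the input length."""
    estimated = len(text) * seconds_per_char * buffer_factor
    # Clamp the estimate between the floor and the cap.
    return max(min_seconds, min(estimated, max_seconds))


if __name__ == "__main__":
    for length in (100, 1000, 100_000):
        print(length, estimate_timeout("x" * length))
```

Short inputs keep the existing 10-second behavior, mid-sized inputs get a proportional timeout, and very long inputs are capped. The open question from the comment above remains: the rate would differ by model, language and text, so any fixed constant is a guess.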

@@ -19,7 +19,9 @@
 CONF_GAIN = "gain"
 CONF_PROFILES = "profiles"
 CONF_TEXT_TYPE = "text_type"
+CONF_TIMEOUT= "tts_timeout"
Member

space before =

@@ -19,7 +19,9 @@
 CONF_GAIN = "gain"
 CONF_PROFILES = "profiles"
 CONF_TEXT_TYPE = "text_type"
+CONF_TIMEOUT= "tts_timeout"
Member

Remove tts_ prefix and use it in stt.py too.

@@ -25,7 +25,8 @@
 "gain": "Default volume gain (in dB) of the voice",
 "profiles": "Default audio profiles",
 "text_type": "Default text type",
-"stt_model": "STT model"
+"stt_model": "STT model",
+"tts_timeout": "TTS timeout, seconds",
Member

Remove tts_ and TTS prefixes and use it in stt.py too.

@@ -25,7 +25,8 @@
 "gain": "Default volume gain (in dB) of the voice",
 "profiles": "Default audio profiles",
 "text_type": "Default text type",
-"stt_model": "STT model"
+"stt_model": "STT model",
+"tts_timeout": "TTS timeout, seconds",
Member

@tronikos tronikos Feb 8, 2025

For consistency with other options, e.g. gain above, change to Timeout (in seconds)

@eslavnov eslavnov changed the title Make Google Cloud TTS timeout configurable Make Google Cloud STT/TTS timeout configurable Feb 21, 2025
@eslavnov eslavnov requested a review from tronikos February 21, 2025 13:18
@eslavnov eslavnov marked this pull request as ready for review February 21, 2025 13:18
@eslavnov eslavnov marked this pull request as draft February 21, 2025 13:23
@eslavnov eslavnov marked this pull request as ready for review February 21, 2025 13:28
@eslavnov
Author

Hey @tronikos, thank you for the feedback, and pardon the delay - I got a bit distracted with a side quest :)
I've implemented all your proposed changes and updated the docs PR too - let me know if anything is missing!

@@ -215,8 +217,11 @@ async def _async_get_tts_audio(
),
)

response = await self._client.synthesize_speech(request, timeout=10)

timeout = options[CONF_TIMEOUT]
Member

Replace it with options.get(CONF_TIMEOUT, DEFAULT_TIMEOUT) and inline it?
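
A minimal sketch of why the `.get` form matters, assuming the options behave like a plain dict and that entries created before this option existed do not contain the key. The constant values here are assumptions: `"timeout"` follows the earlier request to drop the `tts_` prefix, and 10 matches the default stated in the PR description. `resolve_timeout` is a hypothetical helper used only to make the example testable.

```python
CONF_TIMEOUT = "timeout"  # assumed key name after dropping the tts_ prefix
DEFAULT_TIMEOUT = 10      # seconds, the default stated in the PR description


def resolve_timeout(options: dict) -> float:
    """Return the configured timeout, falling back to the default."""
    return options.get(CONF_TIMEOUT, DEFAULT_TIMEOUT)


if __name__ == "__main__":
    old_options = {"gain": 0.0}  # options saved before the timeout existed
    new_options = {"gain": 0.0, CONF_TIMEOUT: 30}

    print(resolve_timeout(old_options))  # falls back to DEFAULT_TIMEOUT
    print(resolve_timeout(new_options))  # uses the stored value

    try:
        old_options[CONF_TIMEOUT]  # direct indexing fails on old options
    except KeyError as err:
        print(f"KeyError: {err}")
```

Inlined at the call site, this would amount to passing `timeout=options.get(CONF_TIMEOUT, DEFAULT_TIMEOUT)` to `synthesize_speech`.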

Author

Oops, that was dumb of me - fixed, thanks!

@home-assistant home-assistant bot marked this pull request as draft February 22, 2025 08:51
@eslavnov eslavnov requested a review from tronikos February 22, 2025 09:58
@eslavnov eslavnov marked this pull request as ready for review February 22, 2025 09:59