MQTT not reconnecting if disconnected from broker #31057
Comments
Hey there @home-assistant/core, mind taking a look at this issue as it's been labeled with an integration ( |
Suspect the behavior is related to the paho.mqtt lib after looking at this issue and this issue. |
Could you find a solution? I'm still affected by this. |
No solution yet as far as I can tell. It's really annoying because any time my mqtt broker restarts, home-assistant 'gives up' immediately instead of properly attempting reconnects. The only recourse is to manually restart home-assistant to force a fresh connection to the mqtt broker. It's a non-ideal user experience. |
@balloob do you happen to have any thoughts on this or hints about digging into a root-cause? I can pretty easily reproduce this issue simply by restarting my mqtt broker. |
Looking through the code, it is apparent that _mqtt_on_disconnect is being called:

def _mqtt_on_disconnect(self, _mqttc, _userdata, result_code: int) -> None:
    """Disconnected callback."""
    self.connected = False

    # When disconnected because of calling disconnect()
    if result_code == 0:
        return

    tries = 0

    while True:
        try:
            if self._mqttc.reconnect() == 0:
                self.connected = True
                _LOGGER.info("Successfully reconnected to the MQTT server")
                break
        except OSError:
            pass

        wait_time = min(2 ** tries, MAX_RECONNECT_WAIT)
        _LOGGER.warning(
            "Disconnected from MQTT (%s). Trying to reconnect in %s s",
            result_code,
            wait_time,
        )
        # It is ok to sleep here as we are in the MQTT thread.
        time.sleep(wait_time)
        tries += 1

The observed behavior from home-assistant logs is that Home Assistant first logs,
which seems to correspond with Immediately after that home-assistant logs,
Which corresponds to a call in _mqtt_on_connect:

    import paho.mqtt.client as mqtt

    if result_code != mqtt.CONNACK_ACCEPTED:
        _LOGGER.error(
            "Unable to connect to the MQTT broker: %s",
            mqtt.connack_string(result_code),
        )
        self._mqttc.disconnect()
        return

What appears to be happening is that the It seems to reason that the What do you all think? |
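For anyone following along, here is a minimal toy model of how the two callbacks quoted above could interact to produce the sequence described in the issue (reconnect, then connection refused, then nothing further). It is only a sketch of one possible reading, not the actual Home Assistant or paho code: FakeClient, Bridge, pump and simulate are hypothetical names, the retry loop is collapsed to a single attempt, and the refusal code 3 ("server unavailable" in MQTT 3.1.1) is an assumed example value.

```python
CONNACK_ACCEPTED = 0
CONNACK_REFUSED_SERVER_UNAVAILABLE = 3  # assumed example refusal code


class FakeClient:
    """Hypothetical stand-in for the MQTT client; no network involved."""

    def __init__(self, connack_code):
        self.connack_code = connack_code
        self.pending = []
        self.on_connect = None
        self.on_disconnect = None

    def reconnect(self):
        # The socket-level reconnect succeeds; the CONNACK arrives later.
        self.pending.append(self.connack_code)
        return 0

    def disconnect(self):
        # A client-initiated disconnect is reported with result code 0.
        self.on_disconnect(0)

    def pump(self):
        # Deliver queued CONNACK results, as a network loop would.
        while self.pending:
            self.on_connect(self.pending.pop(0))


class Bridge:
    """Simplified versions of the two callbacks quoted above."""

    def __init__(self, client):
        self.client = client
        self.connected = False
        self.events = []
        client.on_connect = self.on_connect
        client.on_disconnect = self.on_disconnect

    def on_connect(self, result_code):
        if result_code != CONNACK_ACCEPTED:
            self.events.append("connect refused")
            self.client.disconnect()
            return
        self.events.append("connect accepted")

    def on_disconnect(self, result_code):
        self.connected = False
        if result_code == 0:
            # Treated as a deliberate disconnect, so no retry happens.
            self.events.append("gave up")
            return
        if self.client.reconnect() == 0:
            self.connected = True
            self.events.append("reconnected")


def simulate(connack_code):
    """Drop the connection once, then let the broker answer with connack_code."""
    client = FakeClient(connack_code)
    bridge = Bridge(client)
    bridge.on_disconnect(1)  # unexpected drop, e.g. a broker restart
    client.pump()
    return bridge.events, bridge.connected
```

In this model, a broker that accepts the TCP connection but refuses the CONNACK (for example while a cluster node is still starting) sends the client down the "deliberate disconnect" branch, and nothing ever retries after that.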
I did a lot of different local testing with changes implemented in
|
@billimek Which external MQTT server is used, what version of it, and how do you restart it? |
@erikkastelec the external MQTT server/broker is vernemq 1.10.2 (latest) running in cluster (high availability) mode with 3 replicas. They run as three separate Kubernetes pods, and a restart means restarting one pod at a time, waiting until each is healthy before restarting the next. Home Assistant, and all my other MQTT clients, connect to it via a single LoadBalancerIP. I can say with confidence that I can reproduce this 100% of the time. Interestingly, the unexpected behavior on restarts of the vernemq MQTT server is not present with any of the other MQTT client library implementations that I'm also running. In other words, these other MQTT clients properly detect a down connection and attempt to reconnect in an expected way when a multi-node vernemq MQTT cluster is restarted:
Since my last comment, I ran further experiments with vernemq as just a single-replica MQTT server, and the restart behavior from the paho client (Home Assistant) was different: the disconnect resulting from a restart went down the correct path of repeated reconnect attempts until the vernemq server was able to handle requests once again. |
Thanks! The reconnect logic in Home Assistant does not seem correct, I'll have a stab at improving it, it would be great if you could help test it. |
Would you mind giving this a try: emontnemery@c265c1e ? |
Hey, that did it! When the vernemq cluster pods began to restart, the homeassistant client connection dropped as expected. The new code yielded the following log statements in home-assistant,
And home assistant automatically reconnected and is behaving as expected. I assume paho has reconnect logic built-in. This looks great! I did add a comment to the commit about the log message. |
Great, thanks for testing! Right, paho has reconnect logic built in, and the reconnect logic in HA was broken. |
Seems like reconnection is no longer working. Anyone else having this problem? |
Home Assistant release with the issue: 0.104.3
Last working Home Assistant release (if known): not known
Operating environment (Hass.io/Docker/Windows/etc.): docker
Integration: MQTT Broker
Description of problem:
When the external MQTT broker service is restarted, Home Assistant doesn't seem to reconnect, or seems to give up too quickly. The only remedy to re-establish connection with the MQTT broker is to restart Home Assistant itself which seems unnecessary.
I see the following in the home assistant logs, which suggests that it reconnects and then one second later sees a connection refused and then appears to stop trying anything further related to MQTT:
For comparison, I observed the behavior of three other different MQTT client applications at the same time the MQTT broker was being restarted:
Perhaps Home Assistant needs some sort of retry/backoff logic in situations when it loses connection to an MQTT broker instead of giving up the first time it fails to reconnect? If I can puzzle this out, I'm happy to raise a PR.
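To make the retry/backoff idea concrete, here is a minimal sketch of what such logic could look like. It is not Home Assistant's implementation: reconnect_with_backoff, connect, max_wait and max_tries are hypothetical names, and the 60 second cap is an assumed value chosen only for illustration.

```python
import time


def reconnect_with_backoff(connect, max_wait=60, max_tries=None, sleep=time.sleep):
    """Call connect() until it returns True, doubling the wait between tries.

    Returns the number of failed attempts before success, or None if
    max_tries attempts all failed. The wait is capped at max_wait seconds.
    """
    tries = 0
    while max_tries is None or tries < max_tries:
        try:
            if connect():
                return tries
        except OSError:
            # Broker unreachable; fall through to the backoff sleep.
            pass
        sleep(min(2 ** tries, max_wait))
        tries += 1
    return None
```

The important part is that a failed attempt leads to another attempt after a growing delay (1 s, 2 s, 4 s, ... up to the cap) rather than to giving up.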
Problem-relevant configuration.yaml entries and (fill out even if it seems unimportant):
Additional information:
I already looked through #8589 from 2-3 years ago as well as #10133, but Home Assistant doesn't appear to be reconnecting in situations when the MQTT broker restarts.