Docker network connection timeouts to host over time #8861
Comments
This seems to improve if you use the recommended workaround.
Further update on this. While the above does prolong the deterioration, it still eventually happens. After 4-5 days, timeouts start occurring with increasing frequency, eventually reaching a point where they happen every few calls, requiring a full restart of WSL and Docker to get things working again.
We have the same issue
We have a service running on the host. If I try to hit host.docker.internal from within a Linux container, I can always get it to trip up eventually, after say 5,000 curl requests to http://host.docker.internal/service (it times out for one request). If I hit http://host.docker.internal/service from the host instead, it works flawlessly even after 10,000 curl requests. Sometimes, intermittently, and we can't figure out why, it starts to fail much more frequently (maybe every 100 curl requests). Something is up with the networking...
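In case it helps anyone script the same kind of test without curl, here is a rough Python equivalent; the port, path, and timeout are placeholders for whatever service you're polling on the host, not values from our setup.

```python
# Hammer a host service via host.docker.internal from inside a container
# until a request fails, and report how many calls succeeded before that.
import urllib.request

URL = "http://host.docker.internal:8080/service"  # placeholder port and path

count = 0
while True:
    count += 1
    try:
        with urllib.request.urlopen(URL, timeout=5) as resp:
            resp.read()
    except OSError as exc:  # URLError and socket timeouts are both OSError subclasses
        print(f"request {count} failed after {count - 1} successes: {exc}")
        break
```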
In my limited testing, I created a loopback adapter with the IP 10.0.75.2 and used that instead. It's much more reliable. It's an ugly workaround, but it might at least help show where the issue is.
Hey guys, this is still happening pretty consistently. Is anyone looking at the reliability/performance of these things? Is this the wrong place to post this?
I was able to send this via their support and have them reproduce the issue. They diagnosed the cause, but said it would involve some major refactoring, so they didn't have a target fix date. Below is the issue as described by them:
Woah, thanks for this update @rg9400. Glad you got it on their radar. So your workaround is to restart Docker and run wsl --shutdown? I've been trying to use another IP (loopback adapter) as opposed to host.docker.internal, or whatever host.docker.internal points to. But I'm not 100% sure that solves the problem permanently. Maybe it's just a new IP, so it will work for a little while and then deteriorate again over time. Based on your explanation of the root cause, that might indeed be the case.
Yeah, for now I am just living with it and restarting WSL/Docker every now and then when the connection timeouts become too frequent and unbearable.
What can we do to get this worked on? Is there work happening on it, or a ticket we can follow? This still bugs us quite consistently.
I want to keep this thread alive, as this is a massive pain for folks, especially because they don't know it's happening. This needs to become more reliable. Here is a newer diagnostic ID: F4D29FA0-6778-40B8-B312-BADEA278BB3B/20210521171355. I also discovered that just killing vpnkit.exe in Task Manager reduces the problem. It restarts almost instantly and connections resume much better, without having to restart containers or anything. But the problem eventually reoccurs.
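If killing vpnkit.exe by hand gets tedious, a tiny script can do it. This is only a sketch of that manual mitigation, not an official fix: it is Windows-only and assumes that briefly dropping vpnkit's connections is acceptable in your environment.

```python
# Force-kill vpnkit.exe so Docker Desktop restarts it (Windows only).
# Existing container connections through vpnkit will drop momentarily.
import subprocess

result = subprocess.run(
    ["taskkill", "/IM", "vpnkit.exe", "/F"],
    capture_output=True,
    text=True,
)
print(result.stdout or result.stderr)
```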
We have about 15 services in our docker-compose file and all of them do an ... I'm not using the ..., but none of this seems to change the behavior.
This happens on macOS too, in fact quite reliably after ~7 minutes and ~13,000 requests of hitting an HTTP server:
Server:
Client (siege):
Output:
What's interesting is that it gets progressively worse from there; the timeouts happen more and more frequently. Restarting the HTTP server doesn't help, but restarting it on another port does (e.g. from 8019 -> 8020). From there you get another ~7 minutes of 100% success before it starts degrading again. I tried adding an IP alias to my loopback adapter and hitting that instead.
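The original server and siege snippets didn't survive here, so for reference, a stand-in server with a configurable port (so it can be restarted on, say, 8020 instead of 8019) could be as simple as the sketch below. It is not the original setup, just a stdlib server that behaves similarly for this kind of load test.

```python
# Minimal HTTP server whose port comes from the command line,
# so it can be "restarted on another port" as described above.
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok\n")

    def log_message(self, fmt, *args):
        pass  # stay quiet while the client hammers the server

port = int(sys.argv[1]) if len(sys.argv) > 1 else 8019  # placeholder default
HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```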
This issue remains unresolved. The devs indicated it required major rework, but I haven't heard back from them in 6 months on the progress.
I am also affected by this issue. I thought at one point it was because of TCP keepalive on sockets, and the sockets not being closed as fast as they are opened, thus exhausting the maximum number of available sockets. But the problem doesn't go away even if my containers stop opening connections for a while; only a restart of Docker and WSL seems to fix this.
I cannot connect from a container to a host port, even using telnet. I tried to guess the host IP, but I also tried this: ... A telnet connection from the host machine to the same host port works fine. In previous Docker versions it was working fine! It seems to be broken since some update, maybe from 2021-2022.
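For a quick connect test that doesn't depend on telnet being installed in the container, something like the following works; the host name and port are placeholders for whatever service you're trying to reach.

```python
# Plain TCP connect probe from inside the container to a port on the host.
import socket

HOST = "host.docker.internal"  # or the gateway IP you guessed
PORT = 5432                    # placeholder port

try:
    with socket.create_connection((HOST, PORT), timeout=5):
        print(f"connected to {HOST}:{PORT}")
except OSError as exc:
    print(f"connect to {HOST}:{PORT} failed: {exc}")
```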
Having this exact problem on macOS. Restarting Docker fixes the problem (for a while).
We have reports of this occurring across teams on Windows and macOS as well. We have no reports of this issue occurring on Linux. Someone noticed that on macOS, simply waiting ~15 minutes often alleviates the problem.
We're also experiencing this (using host.docker.internal) on Docker Desktop for Windows. Strangely enough, Docker versions up to 4.5.1 seem to work fine, but versions 4.6.x and 4.7.x instantly bring up the problem. Connections work for some time, but then the timeouts start. All checks of ...
I'm experiencing the same problem, with an increasing number of timeouts over time while using host.docker.internal.
I'm also experiencing the same problem. Downgrading to 4.5.1 looks like it solves the issue.
I suspect I am running into this at the moment. If having to restart the VM that Docker is running in, rebooting in essence, is not a blocker, what is? Hardware damage?
This is absolutely a blocker for me, as I cannot run scheduled tasks reliably.
The following workaround resolved the issue for me.
Adding an archive in case the post or site goes down.
While this is useful information, I am not sure that it's actually related to this bug. The error described in the post is "Connection reset by peer," whereas the problem in this issue is "Connection timed out." The exact error may differ depending on which software you're using, but the key thing is that you send packets that just never arrive. The connection isn't reset; it just stops moving data. There are reproduction steps here, and I'm happy to be proven wrong if someone can run the Python reproduction above and confirm that the problem doesn't occur on recent versions of Docker Desktop with the idle time set to ... If you would like changes in the behavior of ...
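The Python reproduction referenced above isn't preserved in this thread. A rough sketch of that style of idle-connection test, with the host, port, and idle time all placeholders rather than the original values, might look like this:

```python
# Open one TCP connection to a service on the host, let it sit idle,
# then try to use it again and see whether the reply ever arrives.
import socket
import time

HOST = "host.docker.internal"  # run from inside a container
PORT = 8080                    # placeholder: an HTTP service on the host
IDLE_SECONDS = 300             # placeholder: the idle period being tested

request = (
    f"GET / HTTP/1.1\r\nHost: {HOST}\r\nConnection: keep-alive\r\n\r\n"
).encode()

with socket.create_connection((HOST, PORT), timeout=10) as sock:
    sock.sendall(request)
    print("first response:", len(sock.recv(65536)), "bytes")

    time.sleep(IDLE_SECONDS)  # leave the connection idle

    sock.sendall(request)     # the send usually succeeds (it is just buffered)
    sock.settimeout(30)
    try:
        print("second response:", len(sock.recv(65536)), "bytes")
    except socket.timeout:
        print("second request black-holed: no data received within 30s")
```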
I also replied a few months ago with that fix, and my problem was a connection timeout for an nginx reverse proxy and a PING command, not a connection reset.
I'm thinking this is a port saturation issue, similar to what's described here. I recently restarted my Docker service, but once the problem crops up again, I'll try going through some of these troubleshooting steps.
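One way to eyeball the port-saturation theory while the timeouts are happening is to count TCP connection states. This is only a rough check, and it assumes netstat is available (it is on Windows and macOS; on Linux it may require net-tools).

```python
# Count TCP connection states from netstat output; a very large number of
# TIME_WAIT or CLOSE_WAIT entries may hint at port/socket exhaustion.
import subprocess
from collections import Counter

out = subprocess.run(["netstat", "-an"], capture_output=True, text=True).stdout
states = Counter(
    line.split()[-1]
    for line in out.splitlines()
    if line.strip().startswith(("TCP", "tcp"))
)
print(states.most_common(10))
```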
I'm about 90% sure this issue applies to me as well, but it's devilishly difficult to tell for sure. I'll refer to a tool for reproduction that I wrote in my observations below:
We are running Docker version ... We are running a Node-RED flow which queries an MSSQL server every 5 minutes, and randomly the connection to the SQL server gets a 30,000 ms timeout; the next attempt will be successful.
We are experiencing the same issue: almost every 10 minutes, SQL queries from our containers get slower, then it resolves until the next 10-minute period. Docker Desktop version v4.17.0. Is there any update on this?
I had also been experiencing this for several months. Doing this workaround appears to have fixed the issue.
Got this issue with Windows 11 on WSL and Docker version 23.0.3, build 3e7cbfd. We are running a server, so this error becomes untenable.
Please note that an experimental build of vpnkit has been released in this parallel issue, which attempts to resolve what may be the underlying problem here. Users experiencing this should install the experimental builds if possible and report back to @djs55 in the vpnkit issue as to whether the problem is resolved, and whether you notice any side effects.
Per my testing of the experimental build, the issue is significantly improved but not resolved. There are still timeouts, just far fewer. When running thousands of curls, I still notice stuck handshakes that don't instantly close but take a minute or two to resolve. The difference is that most such instances do clear out before the timeout. I just wanted to confirm that connections are still getting stuck, even if the overall symptoms are a lot better.
I believe I am facing this same problem on macOS Sonoma 14.1.1, running Docker Desktop for Mac (Apple Silicon) 4.25.2. I want to try downgrading to 4.5.0 (it's insane that the issue has been going on that long). Does anybody have an install file? The oldest available here is 4.9.1. EDIT: Docker Desktop for macOS (Apple Silicon) can be downloaded here. EDIT 2: Confirmed, downgrading fixed the issue. I've been running with stable connections for weeks now.
Facing the same issue on Debian 12. Checked ufw logs and whitelisted the container's IP address with ...
Pretty stuck on this, as I am not using Docker Desktop, only Docker Engine on Ubuntu. Reverting to 4.5.0 (Docker Engine 20.10.12) breaks everything, so if anyone has other workarounds, let me know.
If you're running on Linux, I don't think you'll experience this exact issue. It seems to only happen when running Docker on macOS or Windows.
Expected behavior
I would expect services running inside Docker containers on the WSL backend to be able to communicate reliably with applications running on the host, even with frequent polling.
Actual behavior
Due to #8590, I have to run some applications that require high download speeds on the host. I have multiple applications inside Docker containers, running inside a Docker bridge network, that poll this host application every few seconds. When launching WSL, the applications are able to communicate reliably, but this connection deteriorates over time, and after 1-2 days I notice frequent `connection timed out` responses from the application running on the host. Running `wsl --shutdown` and restarting the Docker daemon fixes the issue temporarily. Shifting applications out of Docker and onto the host fixes their communication issues as well. It may be related to the overall network issues linked above. To be clear, it can still connect; it just starts timing out more and more often the longer the network/containers have been up.
Information
I have had this problem ever since starting to use Docker for Windows with the WSL2 backend.
Steps to reproduce the behavior