Pod IP not removed from Service EndPoint when ReadinessProbe failed #3725
You have to add more details and a reproducer; it is not easy to understand from the comments what is failing there |
I apologize; what I wished to say is that the Pod IP was not removed from the service endpoints. I use an Nginx Deployment with a ReadinessProbe, with this container:
and a service like:
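The original manifests were lost from the thread; a minimal sketch of what such a Deployment and Service might look like (the name, probe path, and probe timings here are hypothetical, chosen to match the behavior described later in the thread):

```yaml
# Hypothetical reproducer: nginx Deployment whose readiness depends on a file
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-probe-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-probe-demo
  template:
    metadata:
      labels:
        app: nginx-probe-demo
    spec:
      containers:
      - name: nginx
        image: nginx
        readinessProbe:
          httpGet:
            path: /healthz   # served only while the file exists in the web root
            port: 80
          periodSeconds: 2
          failureThreshold: 2
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-probe-demo
spec:
  selector:
    app: nginx-probe-demo
  ports:
  - port: 80
    targetPort: 80
```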
and when this ReadinessProbe fails, the Pod IP is shown as a "NotReadyAddress" in the Endpoints object:
BUT the Pod IP 10.32.204.60 was not removed from the Service Endpoints:
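For reference, the relevant state can be inspected like this against a running cluster (the service and label names are hypothetical, matching the sketch above). Note that since v1.19 kube-proxy consumes EndpointSlices rather than the legacy Endpoints object, so both are worth checking:

```shell
# Legacy Endpoints object: a failing pod should appear under notReadyAddresses
kubectl get endpoints nginx-probe-demo -o yaml

# EndpointSlices: check the per-endpoint "ready" condition
kubectl get endpointslices -l kubernetes.io/service-name=nginx-probe-demo -o yaml

# The pod's own Ready condition, for comparison
kubectl get pods -l app=nginx-probe-demo -o wide
```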
With kind 0.23 and kindest/node:1.30.2, everything is OK: the Pod IP is removed from the Service Endpoints when the ReadinessProbe fails. Is my English clear? |
Just to understand: this works in Kubernetes versions 1.30 and 1.31, and only fails with node image 1.31.0? |
After further investigation, I found that whatever the Kubernetes version is, the problem seems to be the VirtualBox environment.
Any explanation? |
Is kubectl the same version? What difference does it make for kind to run on top of VirtualBox or a VM? It just uses docker containers. Are you doing something out of the ordinary, like adding custom nodes or a different kind configuration? |
Kubectl is the same version |
Do you observe this without calico? We don't really provide support for third-party CNIs (it's supposed to be possible to install them, but we're not tracking down bugs with all of them). |
With calico/cilium/kindnet I have the same behavior. I've tried this simple setup
with one control-plane and three workers. |
can you upload a tarball with the logs of the cluster that has the issue with |
full-logs.tar.gz |
Manifest used
Alternatively, I create/delete /usr/share/nginx/html/healthz to act on the ReadinessProbe. |
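Creating and deleting that file is a simple way to flip the probe by hand. A sketch of that workflow, assuming a deployment and service named as in the earlier hypothetical example:

```shell
# Make the probe fail: remove the file the readinessProbe fetches
kubectl exec deploy/nginx-probe-demo -- rm -f /usr/share/nginx/html/healthz

# Watch the Endpoints object; after periodSeconds x failureThreshold
# the pod IP should move to notReadyAddresses
kubectl get endpoints nginx-probe-demo -o yaml -w

# Make the probe pass again
kubectl exec deploy/nginx-probe-demo -- sh -c 'echo ok > /usr/share/nginx/html/healthz'
```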
That does not add up: the nginx container starts at 18:44
and there are no more logs after that. You have period 2 and threshold 2, so it should start failing at 18:44:05, but there are no logs there. I noticed that your environment has only 2 GB of RAM in the VM; it would not be surprising if the problem is that your VMs are constrained and everything is slower in that environment. |
I'm so sorry to waste your time, but the problem remains the same with 8 GB! The time was around 11:40/11:50 UTC.
|
@bob2204 it looks like the kubelet is continuously restarting... if you have the cluster running, can you verify that? |
None of the three kubelets is continuously restarting.
|
I am also having the same problem; is your problem solved? Kubernetes version: v1.31.2 |
I have the same problem: even if the pod is not ready, its IP address is being added to the service endpoints. Client Version: v1.31.5

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: dev-cluster
nodes:
- role: control-plane
- role: worker
- role: worker

apiVersion: v1
kind: Pod
metadata:
name: my-pod
labels:
app: my-app
spec:
terminationGracePeriodSeconds: 1
containers:
- name: probe-demo
image: nginx
startupProbe:
httpGet:
path: /
port: 80
periodSeconds: 1
failureThreshold: 30
livenessProbe:
httpGet:
path: /live.html
port: 80
periodSeconds: 10
failureThreshold: 30
readinessProbe:
httpGet:
path: /ready.html
port: 80
periodSeconds: 10
failureThreshold: 20
---
apiVersion: v1
kind: Service
metadata:
name: my-app
spec:
selector:
app: my-app
ports:
- port: 80
targetPort: 80 |
I think this is starting to become a magnet for symptoms that do not necessarily have the same root cause. kind does not do anything exceptional to Kubernetes components, so all of this should be opened on the kubernetes repo; besides, I will likely be the one triaging them. I will make an exception with this last one. @chsakell you can do |
Here's the logs exported with the following commands: kind export logs --name dev-cluster
tar zcvf kind-logs.tar.gz . |
This would be a bug in the main Kubernetes project; service endpoints and pods are not implemented here. We implement cluster bootstrapping, a default PV driver, and NetworkPolicy / the pod network (NOT endpoints / services; the network bridges / node-to-node pod IP routing): github.com/kubernetes/kubernetes. I don't mind discussing here, but there's a better chance of finding the root issue if it's reported to the project.

Also, please aim for a minimal reproducer to help contributors find the cause quickly. E.g., does it still happen with a single node? If so, then use that.

Aside: more generally, unless you're implementing distributed behaviors related to multi-node, I highly recommend single-node clusters, for simplicity, reduced overhead, and not over-reporting the host's resources, which are ultimately shared by the nodes. |
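Following that advice, a single-node reproducer config would be a sketch like this (the cluster name is arbitrary; omitting the nodes list entirely also yields a single control-plane node by default):

```yaml
# kind config for a minimal single-node cluster
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: single-node-repro
nodes:
- role: control-plane
```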
Like BenTheElder, I think it's a bug/feature of K8s. I have the same behavior with a vanilla cluster on 1.31 and 1.32. kind is not guilty ;-) |
Is there any issue in traffic routing due to that, or will traffic be routed only to ready pods? |
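For context on that question: by default kube-proxy routes traffic only to endpoints whose ready condition is true, so not-ready pods receive no Service traffic regardless of whether their IP still appears in the Endpoints object. A Service can opt out of that behavior with the real `publishNotReadyAddresses` field; the Service below is a hypothetical sketch showing where it goes:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  publishNotReadyAddresses: true   # route to pods even when they are not Ready
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 80
```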
@chsakell something is odd with your environment: I see pods like kindnet restarting, and the probe pod you run also fails probes and is restarted... Let's not complicate this issue further. kind is not doing anything special with endpoints or pod readiness, so if you have a repro, please open an issue in kubernetes/kubernetes with all the details and exact steps and tag me there. /close |
@aojea: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Hello
With kind 0.24 and node image 1.31.0, the Pod IP is not removed from the Service Endpoints when the ReadinessProbe fails, although it is shown as a NotReadyAddress in the Endpoints object!
This was fine with kind 0.23 and node image 1.30.2.
Is this normal ?
Best Regards