
new cluster stuck in CREATE_IN_PROGRESS using auto_scaling_enabled=true #429

Open · Lappihuan opened this issue Sep 16, 2024 · 7 comments

@Lappihuan

Creating a new cluster (in my case with v1.30.4) gets stuck in CREATE_IN_PROGRESS when using auto_scaling_enabled=true; without auto_scaling_enabled it goes to CREATE_COMPLETE and HEALTHY.

There are no autoscaler pods in the magnum-system namespace, nor do I see a release in the helm-releases.
I don't know how to confirm this, but I suspect this chart is way out of date, so it never gets past this line.
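A sketch of the checks above, run against the management cluster (it assumes the autoscaler lands in magnum-system and, if a chart is involved, that Helm manages it):

kubectl -n magnum-system get pods | grep autoscaler   # no autoscaler pod shows up
helm -n magnum-system list                            # and no release owning an autoscaler chart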

for v1.30.x the supported version of cluster-autoscaler would be 9.37.0+
https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler#releases
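For anyone trying to reproduce this, the cluster was created along these lines (a sketch; the cluster and template names here are made up, only the label matters):

openstack coe cluster create test-cluster \
  --cluster-template k8s-v1.30.4 \
  --labels auto_scaling_enabled=true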

@MaximMonin

My cluster goes to HEALTHY with v1.30.7 and auto_scaling_enabled=true, but the autoscaler pods did not start due to a scheduling issue. To fix it, all control-plane nodes of the management cluster should be labeled with:
kubectl label node controlnodename openstack-control-plane=enabled
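A minimal sketch of applying and verifying this across the whole control plane (the node names below are examples; the real ones come from kubectl get nodes):

kubectl get nodes -l node-role.kubernetes.io/control-plane
kubectl label node os-ctl-a os-ctl-b os-ctl-c openstack-control-plane=enabled
kubectl -n magnum-system get pods -w   # the autoscaler pod should leave Pending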

@nguyenhuukhoi
Contributor

I have this problem too, but kubectl label node controlnodename openstack-control-plane=enabled is not working for me.
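If the label alone does not help, the scheduler's events usually say why. A sketch for reading them (assumes a single autoscaler pod in magnum-system):

POD=$(kubectl -n magnum-system get pods -o name | grep autoscaler)
kubectl -n magnum-system describe "$POD" | grep -A10 Events   # look for FailedScheduling and its reason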

@nguyenhuukhoi
Contributor

I can make it work if I set the openstack-control-plane=enabled label on worker nodes.

@MaximMonin

> I can make it work if I set the openstack-control-plane=enabled label on worker nodes.

It seems to me that you do not quite understand the architecture of the solution: the autoscaler is created on the management cluster.

root@os-stage-ctl-b:/home/ubuntu# kubectl get pods -A
NAMESPACE                           NAME                                                             READY   STATUS      RESTARTS     AGE
capi-kubeadm-bootstrap-system       capi-kubeadm-bootstrap-controller-manager-647c4d77dc-cgc2p       1/1     Running     0            9d
capi-kubeadm-control-plane-system   capi-kubeadm-control-plane-controller-manager-67fc9db87c-p7d8t   1/1     Running     0            9d
capi-system                         capi-controller-manager-685f8c946f-6hhjr                         1/1     Running     0            9d
capo-system                         capo-controller-manager-6bdf5576d4-smqpx                         1/1     Running     1 (9d ago)   9d
cert-manager                        cert-manager-5c887c889d-7fh4r                                    1/1     Running     0            9d
cert-manager                        cert-manager-cainjector-58f6855565-mmznl                         1/1     Running     0            9d
cert-manager                        cert-manager-webhook-6647d6545d-bqc7g                            1/1     Running     0            9d
kube-system                         coredns-ccb96694c-dfcxr                                          1/1     Running     1 (9d ago)   9d
kube-system                         local-path-provisioner-5cf85fd84d-nws4s                          1/1     Running     1 (9d ago)   9d
magnum-system                       kube-rgjp9-autoscaler-697dcb57c8-hhkns                           1/1     Running     0            2d
root@os-stage-ctl-b:/home/ubuntu# kubectl get nodes
NAME             STATUS   ROLES                       AGE   VERSION
os-stage-ctl-a   Ready    control-plane,etcd,master   9d    v1.31.4+k3s1
os-stage-ctl-b   Ready    control-plane,etcd,master   9d    v1.31.4+k3s1
os-stage-ctl-c   Ready    control-plane,etcd,master   9d    v1.31.4+k3s1

@nguyenhuukhoi
Contributor

nguyenhuukhoi commented Jan 8, 2025 via email

@MaximMonin

> What do you mean? I see that the cause is the autoscaler cannot be scheduled on a control-plane node. I mean the management cluster.

Ok, it seems it depends on the management cluster installation. In my case a k3s HA cluster is able to schedule it onto the os-stage-ctl-* control-plane nodes (k3s uses them as worker nodes too).
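The difference shows up in the node taints; a quick check (node name taken from the listing above, a kubeadm-based management cluster would have different names):

kubectl describe node os-stage-ctl-a | grep -i taints
# k3s server nodes are schedulable by default (no taint), while kubeadm control-plane
# nodes usually carry node-role.kubernetes.io/control-plane:NoSchedule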

@nguyenhuukhoi
Contributor

Hello, from
https://github.com/openstack/magnum/blob/1c3d7d070b60a36ccfb7c753b26f12609d818cec/magnum/drivers/common/templates/kubernetes/fragments/enable-auto-scaling.sh#L124C3-L135C29

it allowed us to schedule on control-plane nodes with the control-plane=enabled label.

But I see that

https://github.com/vexxhost/magnum-cluster-api/blob/main/magnum_cluster_api/resources.py#L94

does have tolerations that allow us to schedule on the control plane.
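To compare against what is actually deployed, you can dump the autoscaler Deployment's scheduling constraints on the management cluster (a sketch; the deployment name is generated per cluster):

kubectl -n magnum-system get deploy -o yaml | grep -B2 -A8 -e nodeSelector -e tolerations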
