Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: StatefulSets are not reevaluated after rescheduling #67

Merged

Conversation

miguelbernadi
Copy link
Contributor

Context

Noe caches the Pods it has already processed to avoid reprocessing
them unless there are changes. We do this by storing the
NamespacedName of the Pod, which includes the namespace and Pod
name.

For Deployments, rescheduling the Pod actually changes its name as the
termination is generated. For StatefulSets, the name is kept the same
no matter how many reschedules happen.

Our caching approach prevents us form reevaluating the location of
StatefulSet pods if they were previously scheduled, and we've seen
this allocate StatefulSet pods to nodes whose architectures where not
supported by the Pod.

Bugfix approach

The intention of the cache is to reduce load over Registries and Noe
itself. In our case so far StatefulSets are much less common than
Deployments, so we choose to avoid the cache for StatefulSets.

This is a stopgap measure to prevent misscheduling of current
workloads until a better approach to prevent reprocessing of Pods can
be devised that supports StatefulSets.

What we change

We add a skipCache(*v1.Pod) function that checks if the Pod is owned
by a StatefulSet. If so, the caching mechanism is skipped for this
workload.

@miguelbernadi
Copy link
Contributor Author

First commit is #66 and will be removed once that PR is merged.

pkg/controllers/pod.go Outdated Show resolved Hide resolved
pkg/controllers/pod.go Outdated Show resolved Hide resolved
 # Context

Noe caches the Pods it has already processed to avoid reprocessing
them unless there are changes. We do this by storing the
`NamespacedName` of the Pod, which includes the namespace and Pod
name.

For Deployments, rescheduling the Pod actually changes its name as the
termination is generated. For StatefulSets, the name is kept the same
no matter how many reschedules happen.

Our caching approach prevents us form reevaluating the location of
StatefulSet pods if they were previously scheduled, and we've seen
this allocate StatefulSet pods to nodes whose architectures where not
supported by the Pod.

 # Bugfix approach

The intention of the cache is to reduce load over Registries and Noe
itself. In our case so far StatefulSets are much less common than
Deployments, so we choose to avoid the cache for StatefulSets.

This is a stopgap measure to prevent misscheduling of current
workloads until a better approach to prevent reprocessing of Pods can
be devised that supports StatefulSets.

 # What we change

We add a `skipCache(*v1.Pod)` function that checks if the Pod is owned
by a StatefulSet. If so, the caching mechanism is skipped for this
workload.

Change-Id: Ie7bdf2d863f433e7998b1b92cea129c6bda4dd8f
@miguelbernadi miguelbernadi force-pushed the fix-statefulset-reschedule-bug branch from c5e2458 to 33ae035 Compare January 8, 2024 11:40
@joaoqalves joaoqalves self-requested a review January 8, 2024 11:45
@miguelbernadi miguelbernadi merged commit 8957c00 into adevinta:main Jan 8, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants