Upgrade 1.7.2 to 1.8.0 at Ubuntu hanging due to missing allow-change-held-packages flag #3289

Open
toschneck opened this issue Jun 28, 2024 · 14 comments
Labels
customer-request sig/cluster-management Denotes a PR or issue as being assigned to SIG Cluster Management. triage/not-reproducible Indicates an issue can not be reproduced as described.

Comments

@toschneck
Member

What happened?

During an upgrade of a customer environment, the following error message appeared and blocked the upgrade:
[Screenshot: Zoom Meeting 2024-06-28 15-57-32]

Env:

  • BareMetal
  • Single Node
  • Kubeone 1.7.2

Expected behavior

KubeOne should set the --allow-change-held-packages flag when it calls apt install ... kubelet kubeadm ... during the upgrade procedure.

How to reproduce the issue?

Potentially: set up a single-node KubeOne 1.7 cluster, then try to upgrade it with KubeOne 1.8.

What KubeOne version are you using?

$ kubeone version
{
  "kubeone": {
    "major": "1",
    "minor": "7",
    "gitVersion": "1.7.2",
    "gitCommit": "00fd09d91da76e307f016afb3b4f42ad6281eb2c",
    "gitTreeState": "",
    "buildDate": "2024-01-05T15:30:12Z",
    "goVersion": "go1.21.3",
    "compiler": "gc",
    "platform": "linux/amd64"
  },
  "machine_controller": {
    "major": "1",
    "minor": "57",
    "gitVersion": "v1.57.4",
    "gitCommit": "",
    "gitTreeState": "",
    "buildDate": "",
    "goVersion": "",
    "compiler": "",
    "platform": "linux/amd64"
  }
}

Provide your KubeOneCluster manifest here (if applicable)

# paste manifest here
apiVersion: kubeone.k8c.io/v1beta2
kind: KubeOneCluster
name: first-cluster
versions:
  kubernetes: '1.28.9'
cloudProvider:
  none: {}

controlPlane:
  hosts:
    - publicAddress: ''
      privateAddress: '10.49.3.94'
      sshUsername: user 
      #sshPrivateKeyFile: '.ssh/id_rsa'
      taints: []
        #- key: "node-role.kubernetes.io/master"
        #  effect: "NoSchedule"

          #staticWorkers:
          #  hosts:
          #    - publicAddress: '1.2.3.5'
          #      privateAddress: '172.18.0.2'
          #      sshUsername: root
          #      sshPrivateKeyFile: '/home/me/.ssh/id_rsa'

# Provide the external address of your load balancer or the public addresses of
# the first control plane nodes.
apiEndpoint:
  host: ''
  port: 6443

machineController:
  deploy: false
addons:
  enable: true
  path: "./addons"
#  addons:
#  - name: default-storage-class
helmReleases:
   # releaseName can be omitted, in that case it defaults to .chart
  - releaseName: metallb
    # chart is a required field, simply a chart name to deploy
    chart: metallb
    # where to find the chart, can be a remote server or local directory.
    # --repo flag of the `helm upgrade` command.
    repoURL: https://metallb.github.io/metallb
    # namespace to deploy helm release to. --namespace flag of the
    # `helm upgrade` command.
    namespace: metallb-system
    # use specific version instead of latest available, which is highly
    # recommended, but version can be omitted. --version flag of the
    # `helm upgrade` command.
    version: 0.14.5
    # provide optional overrides for chart values, --values flag of the
    # `helm upgrade` command
    values:
      #- valuesFile: values.yaml   # it can be a file
      - inline:                       # or directly specified inline YAML
          ignoreExcludeLB: true

# needed for kvirt
clusterNetwork:
  cni:
    canal: {}
  podSubnet: 10.250.0.0/16
  serviceSubnet: 172.50.0.0/16

What cloud provider are you running on?

Bare Metal

What operating system are you running in your cluster?

Ubuntu 22.04.4

Additional information

@toschneck toschneck added kind/bug Categorizes issue or PR as related to a bug. sig/cluster-management Denotes a PR or issue as being assigned to SIG Cluster Management. labels Jun 28, 2024
@toschneck
Member Author

/label customer-request

@toschneck
Member Author

maybe related to #1578

@toschneck
Member Author

@xmudrii the English translation of the output in the screenshot is:

Reading package lists...
Building dependency tree...
Reading state information...
The following packages were automatically installed and are no longer required:
    .... See screenshot for list of packages
Use 'sudo apt autoremove' to remove them.
The following held packages will be changed:
cri-tools kubeadm kubectl kubelet
The following packages will be upgraded:
cri-tools kubeadm kubectl kubelet
E: Held packages were changed and -y was used without --allow-change-held-packages.
4 upgraded, 0 newly installed, 0 to remove and 35 not upgraded.
ssh. installing kubeadm

The most relevant line is:

E: Held packages were changed and -y was used without --allow-change-held-packages.

Somehow the packages being changed are the ones managed by KubeOne; I guess a broken or interrupted earlier kubeone apply could have caused it. In any case, adding the --allow-change-held-packages flag to the apt install ... command would fix it. Maybe only some Ubuntu versions or apt configurations require it, but I think it's safe, because in this command we change the held packages intentionally.
From man 8 apt-get:

        --allow-change-held-packages
           Force yes; this is a dangerous option that will cause apt to continue without prompting if it is changing held packages. It should not be
           used except in very special situations. Using it can potentially destroy your system! Configuration Item:
           APT::Get::allow-change-held-packages. Introduced in APT 1.1.

A possibly safer approach would be to set this flag only when kubeone apply --force-install or kubeone apply --force-upgrade is used.
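
The conditional approach suggested here could look roughly like the following sketch. It is not KubeOne's actual implementation: the function name build_apt_install_cmd and its "force" argument are hypothetical, the package list is shortened, and the function only prints the command instead of running it.

```shell
# Hypothetical sketch, not KubeOne code: print the apt-get command for a
# Kubernetes package upgrade, adding --allow-change-held-packages only when
# the operator explicitly asked to force the operation.
build_apt_install_cmd() {
  version="$1"   # Kubernetes version, e.g. 1.28.9
  force="$2"     # "true" when a force flag was passed (assumption)
  flags="--no-install-recommends -y"

  if [ "$force" = "true" ]; then
    flags="$flags --allow-change-held-packages"
  fi

  # Only print the command; running it is left to the caller.
  printf 'sudo apt-get install %s kubelet=%s-* kubeadm=%s-* kubectl=%s-*\n' \
    "$flags" "$version" "$version" "$version"
}
```

Calling build_apt_install_cmd 1.28.9 true prints a command that includes the flag, while build_apt_install_cmd 1.28.9 false prints the same command without it, so the dangerous option stays opt-in.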

@kron4eg
Member

kron4eg commented Jul 10, 2024

I'm unable to reproduce the issue. I initialized Kubernetes 1.27.15 with KubeOne 1.7.2, then upgraded to
Kubernetes 1.28.11 with KubeOne 1.8.1. Everything worked perfectly normally.

apiVersion: kubeone.k8c.io/v1beta2
kind: KubeOneCluster

versions:
  kubernetes: <VERSION>

cloudProvider:
  external: true

addons:
  enable: true
  addons:
  - name: default-storage-class

@kron4eg kron4eg added triage/not-reproducible Indicates an issue can not be reproduced as described. and removed kind/bug Categorizes issue or PR as related to a bug. labels Jul 10, 2024
@dgsponer

I can reproduce the issue on Ubuntu.
To upgrade the cluster, I ran this beforehand:
sudo apt-get install --no-install-recommends -y 'kubelet=1.30.3-*' 'kubeadm=1.30.3-*' 'kubectl=1.30.3-*' kubernetes-cni cri-tools --allow-change-held-packages
In other words, just add --allow-change-held-packages.

@xmudrii
Member

xmudrii commented Jul 25, 2024

@dgsponer Can you please briefly explain exactly what steps you took to reproduce this issue? I'm wondering about:

  • Did you upgrade KubeOne prior to running kubeone apply, and if yes, which version did you use and which version do you use now?
  • From which Kubernetes version did you upgrade your cluster?
  • What Ubuntu version do you use?

@dgsponer

Hi, sure

Plain Ubuntu Server 24.04 LTS, installed from the live CD.
OpenSSH enabled.
User added to sudoers so that sudo does not prompt for a password.

Cluster installed with KubeOne 1.8.0 and Kubernetes 1.29.6.

Today I tested upgrading to 1.30.3 and hit the same issue.
After KubeOne had installed the repo, I injected this into the process:
sudo apt-get install --no-install-recommends -y 'kubelet=1.30.3-*' 'kubeadm=1.30.3-*' 'kubectl=1.30.3-*' kubernetes-cni cri-tools --allow-change-held-packages

Then the upgrade went fine.

@xmudrii
Member

xmudrii commented Jul 25, 2024

As a side note, Ubuntu 24.04 is not yet supported by KubeOne; it will be supported in the next minor release (1.9). We'll try to reproduce the issue again and see if we can find something. Did you by any chance do a system or package upgrade between installing the cluster and upgrading it?

@dgsponer

We are in close contact with some of your colleagues for a project. We are working on the bootstrap for Ubuntu and have some machines ready to try to reproduce it there. I will keep you informed.

@kron4eg
Member

kron4eg commented Jul 25, 2024

I wonder if kubeone reset has been used in between?

@dgsponer

dgsponer commented Jul 29, 2024

Hi kron4eg,

Nope, I didn't.
I found the time to set up a 22.04.4 cluster with 1.29.6 and redo the same procedure. On 22.04.4 there is no such message and the upgrade goes well.

After some short research: on 24.04, a few packages are marked as held.

apt-mark showhold
containerd.io
cri-tools
kubeadm
kubectl
kubelet
kubernetes-cni

As a workaround for 24.04:

apt-mark showhold > hold
sudo apt-mark unhold $(cat hold)

DO THE UPGRADE

sudo apt-mark hold $(cat hold)
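
The same workaround can be wrapped in two small helper functions, sketched below. This is only an illustration to be run on the affected node: the function names release_holds and restore_holds and the APT_MARK override variable are hypothetical, and the only change from the commands above is a guard that skips apt-mark when no packages are recorded as held.

```shell
# APT_MARK is a hypothetical override point so the helpers can be dry-run
# or tested with a stub; by default the real apt-mark is used.
APT_MARK="${APT_MARK:-sudo apt-mark}"

# Save the currently held packages to a file, then release the holds.
release_holds() {
  hold_file="$1"
  $APT_MARK showhold > "$hold_file"
  # Only call unhold when at least one package is actually held.
  if [ -s "$hold_file" ]; then
    $APT_MARK unhold $(cat "$hold_file")
  fi
}

# Re-apply the holds that release_holds recorded in the file.
restore_holds() {
  hold_file="$1"
  if [ -s "$hold_file" ]; then
    $APT_MARK hold $(cat "$hold_file")
  fi
}
```

Usage would follow the same order as the manual steps: release_holds hold, run the upgrade, then restore_holds hold.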

@kron4eg
Member

kron4eg commented Jul 29, 2024

Yes, those packages are correctly held. This is how it's supposed to be. During the upgrade they will be unheld.

containerd.io
cri-tools
kubeadm
kubectl
kubelet
kubernetes-cni
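
To compare a node against this list, a small check like the following sketch could be used. It assumes the six package names above are the complete expected set, which this thread does not confirm for every KubeOne version; the function name missing_holds and the EXPECTED_HOLDS variable are hypothetical.

```shell
# Expected holds, copied from the list in this thread (assumed to be complete).
EXPECTED_HOLDS="containerd.io cri-tools kubeadm kubectl kubelet kubernetes-cni"

# Read `apt-mark showhold` output on stdin and print every expected
# package that is NOT currently held, one per line.
missing_holds() {
  held="$(cat)"
  for pkg in $EXPECTED_HOLDS; do
    # -x matches whole lines only, -F treats the name as a fixed string.
    if ! printf '%s\n' "$held" | grep -qxF "$pkg"; then
      printf '%s\n' "$pkg"
    fi
  done
}
```

On a node it would be fed from the real command, for example apt-mark showhold | missing_holds; empty output means all expected packages are held.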

@kubermatic-bot
Contributor

Issues go stale after 90d of inactivity.
After a further 30 days, they will turn rotten.
Mark the issue as fresh with /remove-lifecycle stale.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@kubermatic-bot kubermatic-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 28, 2024
@xmudrii
Member

xmudrii commented Oct 28, 2024

/remove-lifecycle stale

@kubermatic-bot kubermatic-bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 28, 2024
5 participants