Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change post-install to pre-install for create-upgrade-db job #139

Closed

Conversation

dedene
Copy link

@dedene dedene commented Dec 19, 2024

What this PR does / why we need it:

I could not get a fresh install running with following helm values (and an external postgresql cluster):

zabbixServer:
  enabled: true
  zabbixServerHA:
    enabled: true
postgresql:
  enabled: false

After some investigation, when zabbixServerHA is enabled, the zabbix server pod is waiting to become active with the initContainer init-wait-for-database-schema, but the job to create the initial schema has the helm hook set to post-install. As the helm installer does not complete because the zabbix server pod does not become active and healthy, it does not work without manually inserting this upgrade-db job.

I believe changing the helm hook annotation from post-install to pre-install will fix this.

Checklist

[Place an '[x]' (no spaces) in all applicable fields. Please remove unrelated fields.]

  • DCO signed
  • Variables are documented in values.yaml with Helm-Docs comments, as we build README.md using this utility

@dedene dedene requested a review from aeciopires as a code owner December 19, 2024 09:20
@aeciopires
Copy link
Member

aeciopires commented Dec 19, 2024

Hello @dedene!

Thanks for your contribution. But I think the problem is another.

I reported to @fibbs that we can not download the Docker image registry.inqbeo.de/zabbix-dev/zabbix-server-ha-label-manager when the zabbixServer.zabbixServerHA.enabled is true.

At this moment, I recommend disabling HA mode until this Docker Image is available in the new Docker Registry.

@aeciopires aeciopires added bug Something isn't working good first issue Good for newcomers labels Dec 19, 2024
@aeciopires aeciopires requested a review from fibbs December 19, 2024 14:57
@dedene
Copy link
Author

dedene commented Dec 19, 2024

@aeciopires Actually it did: what I observed was that the job was not being created by Helm. After manually creating the job

# Source: zabbix/templates/job-create-upgrade-db.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: zabbix-create-upgrade-db
  labels:
    app: zabbix-create-upgrade-db
    app.kubernetes.io/name: create-upgrade-db
    helm.sh/chart: zabbix-6.1.2
    app.kubernetes.io/instance: zabbix-create-upgrade-db
    app.kubernetes.io/managed-by: Helm-create-upgrade-db
  annotations:
    "helm.sh/hook": post-install,post-upgrade
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  backoffLimit: 0
  ttlSecondsAfterFinished: 120
  template:
    spec:
      serviceAccountName: zabbix-ha-helper
      containers:
      - name: create-upgrade-db
        image: "registry.inqbeo.de/zabbix-dev/zabbix-server-create-upgrade-db:7.0-latest"
        imagePullPolicy: IfNotPresent
        securityContext:
          {}
        resources:
          {}
        env:
          - name: ZBX_SERVER_DEPLOYMENT_NAME
            value: zabbix-zabbix-server


          - name: DB_SERVER_HOST
            valueFrom:
              secretKeyRef:
                name: zabbixdb-pguser-zabbix
                key: host
          - name: DB_SERVER_PORT
            valueFrom:
              secretKeyRef:
                name: zabbixdb-pguser-zabbix
                key: port
                optional: true
          - name: POSTGRES_USER
            valueFrom:
              secretKeyRef:
                name: zabbixdb-pguser-zabbix
                key: user
                optional: true
          - name: POSTGRES_PASSWORD
            valueFrom:
              secretKeyRef:
                name: zabbixdb-pguser-zabbix
                key: password
          - name: POSTGRES_DB
            valueFrom:
              secretKeyRef:
                name: zabbixdb-pguser-zabbix
                key: dbname
                optional: true
        args:
          - init_and_upgrade_db
      imagePullSecrets:
      restartPolicy: Never

it worked. Also for me docker pull registry.inqbeo.de/zabbix-dev/zabbix-server-create-upgrade-db:7.0-latest works.

@aeciopires
Copy link
Member

aeciopires commented Dec 19, 2024

Hum... interesting

I tested the docker pull http://registry.inqbeo.de/zabbix-dev/zabbix-server-create-upgrade-db:7.0-latest command and it is working, but last week it didn't work. I saw image not found.

Thanks for sharing more details.

I will wait for @fibbs' opinion before merging his fix, because he was implemented this job.

@fibbs
Copy link
Contributor

fibbs commented Dec 19, 2024

Hum... interesting

I tested the docker pull http://registry.inqbeo.de/zabbix-dev/zabbix-server-create-upgrade-db:7.0-latest command and it is working, but last week it didn't work. I saw image not found.

Thanks for sharing more details.

I will wait for @fibbs' opinion before merging his fix, because he was implemented this job.

You may have tested 7.2-latest which indeed wasn't there until about a day after appearal of the 7.2 server images.

@fibbs
Copy link
Contributor

fibbs commented Dec 19, 2024

Let me please test your desired modification locally, as I believe I had a good reason for implementing it as a post-install hook, but I don't remember the reason why it wouldn't work that way.
Indeed, if you use helm with the --wait flag, the described problem may happen, but in a normal installation case it wouldn't happen. Give me until the weekend, I will check.

@fibbs
Copy link
Contributor

fibbs commented Dec 30, 2024

Hey, took some time for me to get back into this. Christmas time has been more busy than expected, as every year...

I did a simple test and I can confirm the error reported by you, even though everything works just fine when simply using helm install as follows:

  • deploy a simple Postgresql pod (I did this to have the same starting situation as described by you)
  • create a secret with host, password, username, db name etc.
  • use values.yaml below to deploy zabbix with the command helm install zabbix2 -f zabbix-values_minimal.yaml zabbix-community/zabbix

As expected, at least by myself, the following happened:

  1. all manifests get deployed
  2. after some seconds, the zabbix_server pod starts into "init" phase and waits there
  3. Job "zabbix2-create-upgrade-db" starts, detects successful connection to the DB, and does its job
  4. After that, zabbix_server pod(s) connect to the DB, everything fine!

When using helm install --wait ..., I have exactly the problem described by you. And by the way, the same happens with a simple ArgoCD deployment (see #140, I am testing with Argo because of it), the Job does not get started at all, ArgoCD wants the deployment to be "ready and running" before it starts the post-install job.

Having the issue confirmed, I am wondering whether just changing the job to be a pre-install one will work or lead into other problems. I remember I have chosen pre-install because of wanting to make sure at the time of database initialization no (probably previously running) server containers are trying to access the db anymore, but this would clearly be a Upgrade situation which shouldn't interfere with the Install situation.

Will do some more tests....

@aeciopires
Copy link
Member

Congratulations on the troubleshooting, @fibbs. Well done
A simple word can affect a deploy/update flow.
I'm curious to see the final result

@fibbs
Copy link
Contributor

fibbs commented Dec 30, 2024

I have created another PR #142 which will fix this problem. I am sorry this PR will have to be closed and not accepted, but the changes made in mine were necessary so the entire Helm Chart would continue to work.

@aeciopires
Copy link
Member

Congratulations @fibbs!
I will review your PR now.
We can mention jose in the changelog of the next release. Do you agree?

@fibbs
Copy link
Contributor

fibbs commented Dec 31, 2024

As mentioned, I am closing this in favor of my other pull request. If any more problems during installation, please open an Issue first. It's easier to refer to an issue rather than to a PR.

@fibbs fibbs closed this Dec 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants