Impossible to upgrade to Cassandra 5.x.x #742

Open
aaarranz opened this issue Dec 27, 2024 · 3 comments
Labels
bug Something isn't working

aaarranz commented Dec 27, 2024

What happened?

When trying to upgrade my CassandraDatacenter from version 4.1.2 to 5.0.2 I got this webhook validation error:

admission webhook "vcassandradatacenter.kb.io" denied the request: CassandraDatacenter write rejected, attempted to use unsupported Cassandra version '5.0.2'

Curiously, after I downgraded to cass-operator:v1.22.1, the operation was allowed.

What did you expect to happen?

The cluster to be automatically updated to Cassandra version 5.0.2.

How can we reproduce it (as minimally and precisely as possible)?

Starting from a CassandraDatacenter with serverVersion: 4.1.2, change it to serverVersion: 5.0.2:

apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  annotations:
    k8ssandra.io/resource-hash: kI/Eh2KZtPLeXJhkxMeOcnYI+ZSBytSzPLKDtBzO+6k=
  creationTimestamp: "2023-06-14T22:12:48Z"
  finalizers:
  - finalizer.cassandra.datastax.com
  generation: 3637
  labels:
    app.kubernetes.io/component: cassandra
    app.kubernetes.io/name: k8ssandra-operator
    app.kubernetes.io/part-of: k8ssandra
    k8ssandra.io/cleaned-up-by: k8ssandracluster-controller
    k8ssandra.io/cluster-name: cluster1
    k8ssandra.io/cluster-namespace: k8ssandra-operator
  name: dc2
  namespace: k8ssandra-operator
  resourceVersion: "1374935376"
  uid: 9b40472b-1599-49e4-b142-c44bf717a159

spec:
  additionalServiceConfig:
    additionalSeedService: {}
    allpodsService: {}
    dcService: {}
    nodePortService: {}
    seedService: {}
  clusterName: cluster1
  config:
    cassandra-env-sh:
      additional-jvm-opts:
      - -Dcassandra.allow_alter_rf_during_range_movement=true
      - -Dcassandra.system_distributed_replication=dc2:3
      - -Dcassandra.jmx.authorizer=org.apache.cassandra.auth.jmx.AuthorizationProxy
      - -Djava.security.auth.login.config=$CASSANDRA_HOME/conf/cassandra-jaas.config
      - -Dcassandra.jmx.remote.login.config=CassandraLogin
      - -Dcom.sun.management.jmxremote.authenticate=true
    cassandra-yaml:
      authenticator: PasswordAuthenticator
      authorizer: CassandraAuthorizer
      num_tokens: 16
      role_manager: CassandraRoleManager
    jvm-server-options:
      initial_heap_size: 1073741824
      max_heap_size: 1073741824
    jvm11-server-options:
      garbage_collector: G1GC
  configBuilderResources: {}
  managementApiAuth: {}
  podTemplateSpec:
    metadata: {}
    spec:
      containers:
      - env:
        - name: LOCAL_JMX
          value: "no"
        - name: METRIC_FILTERS
          value: deny:org.apache.cassandra.metrics.Table deny:org.apache.cassandra.metrics.table
            allow:org.apache.cassandra.metrics.table.live_ss_table_count allow:org.apache.cassandra.metrics.Table.LiveSSTableCount
            allow:org.apache.cassandra.metrics.table.live_disk_space_used allow:org.apache.cassandra.metrics.table.LiveDiskSpaceUsed
            allow:org.apache.cassandra.metrics.Table.Pending allow:org.apache.cassandra.metrics.Table.Memtable
            allow:org.apache.cassandra.metrics.Table.Compaction allow:org.apache.cassandra.metrics.table.read
            allow:org.apache.cassandra.metrics.table.write allow:org.apache.cassandra.metrics.table.range
            allow:org.apache.cassandra.metrics.table.coordinator allow:org.apache.cassandra.metrics.table.dropped_mutations
        - name: MANAGEMENT_API_HEAP_SIZE
          value: "67108864"
        name: cassandra
        resources: {}
      - env:
        - name: MEDUSA_MODE
          value: GRPC
        - name: MEDUSA_TMP_DIR
          value: /var/lib/cassandra
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: CQL_USERNAME
          valueFrom:
            secretKeyRef:
              key: username
              name: cluster1-medusa
        - name: CQL_PASSWORD
          valueFrom:
            secretKeyRef:
              key: password
              name: cluster1-medusa
        image: docker.io/k8ssandra/medusa:0.22.2
        imagePullPolicy: IfNotPresent
        livenessProbe:
          exec:
            command:
            - /bin/grpc_health_probe
            - --addr=:50051
          failureThreshold: 10
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: medusa
        ports:
        - containerPort: 50051
          name: grpc
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - /bin/grpc_health_probe
            - --addr=:50051
          failureThreshold: 10
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          limits:
            memory: 8Gi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - mountPath: /etc/cassandra
          name: server-config
        - mountPath: /var/lib/cassandra
          name: server-data
        - mountPath: /etc/medusa
          name: cluster1-medusa
        - mountPath: /etc/podinfo
          name: podinfo
        - mountPath: /etc/medusa-secrets
          name: k8ssandra-staging-medusa-key
      initContainers:
      - name: server-config-init
        resources: {}
      - env:
        - name: MEDUSA_MODE
          value: RESTORE
        - name: MEDUSA_TMP_DIR
          value: /var/lib/cassandra
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: CQL_USERNAME
          valueFrom:
            secretKeyRef:
              key: username
              name: cluster1-medusa
        - name: CQL_PASSWORD
          valueFrom:
            secretKeyRef:
              key: password
              name: cluster1-medusa
        image: docker.io/k8ssandra/medusa:0.22.2
        imagePullPolicy: IfNotPresent
        name: medusa-restore
        resources:
          limits:
            memory: 8Gi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - mountPath: /etc/cassandra
          name: server-config
        - mountPath: /var/lib/cassandra
          name: server-data
        - mountPath: /etc/medusa
          name: cluster1-medusa
        - mountPath: /etc/podinfo
          name: podinfo
        - mountPath: /etc/medusa-secrets
          name: k8ssandra-staging-medusa-key
      volumes:
      - configMap:
          name: cluster1-medusa
        name: cluster1-medusa
      - name: k8ssandra-staging-medusa-key
        secret:
          secretName: k8ssandra-staging-medusa-key
      - downwardAPI:
          items:
          - fieldRef:
              fieldPath: metadata.labels
            path: labels
        name: podinfo
  resources:
    limits:
      cpu: "1"
      memory: 3Gi
    requests:
      cpu: 500m
      memory: 1500Mi
  serverType: cassandra
  serverVersion: 4.1.2
  size: 3
  storageConfig:
    additionalVolumes:
    - mountPath: /opt/management-api/configs
      name: metrics-agent-config
      volumeSource:
        configMap:
          items:
          - key: metrics-collector.yaml
            path: metrics-collector.yaml
          name: cluster1-dc2-metrics-agent-config
    cassandraDataVolumeClaimSpec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 30Gi
      storageClassName: standard
  superuserSecretName: cluster1-superuser
  systemLoggerResources: {}
  tolerations:
  - effect: NoSchedule
    key: nextpax-stateful-app
    operator: Equal
    value: non-spot
  users:
  - secretName: cluster1-superuser
    superuser: true
  - secretName: cluster1-medusa
    superuser: true

cass-operator:v1.22.4

cass-operator version

v1.22.4

Kubernetes version

v1.30.6

Method of installation

kubectl edit CassandraDatacenter

Anything else we need to know?

No response

┆Issue is synchronized with this Jira Story by Unito
┆Issue Number: CASS-84

aaarranz added the bug label Dec 27, 2024

burmanm commented Jan 13, 2025

You probably have two installations of cass-operator in the same cluster, and the validation failed in the older one. "Downgrading" probably overwrote the older webhook's priority.
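
One way to check for a second installation is to look for operator Deployments across all namespaces. The following is a minimal Python sketch, not part of cass-operator. It assumes the output of `kubectl get deployments --all-namespaces -o json` has been parsed into a dict and that the operator image name contains `cass-operator`. The function name and the sample data are hypothetical.

```python
import json


def find_operator_deployments(deployment_list, image_marker="cass-operator"):
    """Return (namespace, name, image) for each Deployment running the operator.

    `deployment_list` is the parsed JSON from
    `kubectl get deployments --all-namespaces -o json`.
    `image_marker` is an assumption: a substring expected in the operator image.
    """
    found = []
    for dep in deployment_list.get("items", []):
        pod_spec = dep.get("spec", {}).get("template", {}).get("spec", {})
        for container in pod_spec.get("containers", []):
            image = container.get("image", "")
            if image_marker in image:
                meta = dep["metadata"]
                found.append((meta["namespace"], meta["name"], image))
    return found


# Hypothetical sample data shaped like kubectl's JSON output.
sample_deployments = json.loads("""
{
  "items": [
    {"metadata": {"namespace": "k8ssandra-operator", "name": "cass-operator"},
     "spec": {"template": {"spec": {"containers": [
       {"name": "manager", "image": "k8ssandra/cass-operator:v1.22.4"}]}}}},
    {"metadata": {"namespace": "legacy", "name": "cass-operator"},
     "spec": {"template": {"spec": {"containers": [
       {"name": "manager", "image": "k8ssandra/cass-operator:v1.10.0"}]}}}},
    {"metadata": {"namespace": "default", "name": "web"},
     "spec": {"template": {"spec": {"containers": [
       {"name": "nginx", "image": "nginx:1.27"}]}}}}
  ]
}
""")

for namespace, name, image in find_operator_deployments(sample_deployments):
    print(f"{namespace}/{name}: {image}")
```

More than one result in different namespaces would be consistent with the situation described here, though it does not by itself show which webhook rejected the request.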

aaarranz commented

Thanks @burmanm, you were right.

We had a legacy cass-operator installation in another namespace, and it seems its ValidatingWebhookConfiguration was still active and applied to all datacenters regardless of the namespace.

This appears to have caused conflicts between the different cass-operator versions running in different Kubernetes namespaces.

After manually removing the legacy ValidatingWebhookConfiguration, I was able to successfully upgrade my cluster.
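
For anyone hitting the same error, a stale configuration like this can be spotted by listing every webhook whose rules cover the `cassandradatacenters` resource. A minimal Python sketch follows, assuming the output of `kubectl get validatingwebhookconfigurations -o json` has been parsed into a dict. The function name and the sample configuration names are hypothetical, and the check only looks at the `resources` field of each rule.

```python
def find_cassdc_webhooks(webhook_list):
    """List webhooks whose rules include the cassandradatacenters resource.

    `webhook_list` is the parsed JSON from
    `kubectl get validatingwebhookconfigurations -o json`.
    """
    matches = []
    for config in webhook_list.get("items", []):
        for hook in config.get("webhooks", []):
            resources = {
                resource
                for rule in hook.get("rules", [])
                for resource in rule.get("resources", [])
            }
            if "cassandradatacenters" not in resources:
                continue
            service = hook.get("clientConfig", {}).get("service", {})
            matches.append({
                "config": config["metadata"]["name"],
                "webhook": hook["name"],
                # Namespace of the Service that answers the admission request.
                "service_namespace": service.get("namespace"),
                # An empty or missing namespaceSelector matches every namespace.
                "all_namespaces": not hook.get("namespaceSelector"),
            })
    return matches


# Hypothetical sample data: two configurations registering the same webhook name.
sample_webhooks = {
    "items": [
        {"metadata": {"name": "current-validating-webhook-configuration"},
         "webhooks": [{
             "name": "vcassandradatacenter.kb.io",
             "clientConfig": {"service": {"namespace": "k8ssandra-operator",
                                          "name": "webhook-service"}},
             "rules": [{"resources": ["cassandradatacenters"]}],
             "namespaceSelector": {},
         }]},
        {"metadata": {"name": "legacy-validating-webhook-configuration"},
         "webhooks": [{
             "name": "vcassandradatacenter.kb.io",
             "clientConfig": {"service": {"namespace": "legacy",
                                          "name": "webhook-service"}},
             "rules": [{"resources": ["cassandradatacenters"]}],
             "namespaceSelector": {},
         }]},
    ]
}

for entry in find_cassdc_webhooks(sample_webhooks):
    print(entry)
```

Two entries pointing at Services in different namespaces, both with `all_namespaces` set to true, would match the conflict described above. Which configuration is safe to delete still has to be decided by hand.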

burmanm commented Jan 13, 2025

Yeah, there are some cases in Kubernetes which make the per-namespace installation of operators a bit tricky to manage. If the cluster can use a cluster-scoped installation, I'd recommend that.
