Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: new production ready helm chart #1224

Merged
merged 11 commits into from
Jan 22, 2025

Conversation

Nathanael-Mtd
Copy link
Contributor

@Nathanael-Mtd Nathanael-Mtd commented Dec 21, 2024

After discussions in #1216, I made a new "next" Helm Chart with more options to have more flexibility in Kubernetes deployments.

Many changes in comparaison of the current chart, need some reviews.

Current features progress in this new chart:

  • SQLite mode
  • PostgreSQL mode
  • Backend and frontend separation in different pods
  • Ingress with custom TLS certificate option
  • SecurityContexts (disabled by default)
  • Change SQLite database PVC and file directory (env : SQLITE_FILE)
  • Use dedicated PVC for local storage directory (env : LOCAL_STORAGE_DIRECTORY)
  • Helm chart README.md generation from values.yaml file (using https://github.com/norwoodj/helm-docs)
  • Add Liveness/Readiness check (not possible right now, needs to be implemented in CISO Assistant)
  • Add initContainer on backend to wait when the database is ready before starting backend (delayed to a future PR)
  • Automate Helm chart release when a new tag is created
  • Other ideas ?

@Nathanael-Mtd Nathanael-Mtd marked this pull request as draft December 21, 2024 22:22
@eric-intuitem
Copy link
Collaborator

Definitely it is a good idea to use non-root user in the containers. I'll provide a PR for that.

@Nathanael-Mtd
Copy link
Contributor Author

Definitely it is a good idea to use non-root user in the containers. I'll provide a PR for that.

Great ! Also, read-only filesystem can be nice to use.
When I tried to run with that option enabled, I got some issues with Matplotlib cache, but maybe some other things use default cache/temp folders located on homedir or code directory.

@Nathanael-Mtd
Copy link
Contributor Author

Another thing I thinked about, that's using a GitHub Action to bump chart version and change appVersion when a new tag is made. Like that, we don't need to set manually the image tag version when upgrading.

I don't know too much about GH Actions but I found that : https://github.com/shini4i/helm-charts-updater

@Nathanael-Mtd Nathanael-Mtd marked this pull request as ready for review December 25, 2024 13:28
@Nathanael-Mtd
Copy link
Contributor Author

@ab-smith @eric-intuitem Chart seems done. Need a review (and some tests on your side if possible).

I didn't added the GitHub Actions to generate and publish the chart, if you can check to add it, it can be nice (see the previous comment)
If you add GH Actions for release, can be a good idea to add a an action to fail the workflow if there are a change in values.yaml and the comitter forgot to launch helm-docs command to generate a new README.md.
Maybe instead it can be used to update automatically the README.md when a new tag is created in the same time of the GH action chart version bump and release.

@ab-smith
Copy link
Contributor

ab-smith commented Dec 27, 2024

image

we'll need to explicitly mention the helm dep build or something equivalent;
the helm-docs will pick it up on the dependencies but people might skip it.

@Nathanael-Mtd : how is it supposed to behave? user should install pg using specific values/settings first or it will be installed automatically as a dependency?

side note: helm has an annoying limitation, that can interfere with this, on helm dependency build where it stops if other non related repos are broken.

@ab-smith
Copy link
Contributor

image

@Nathanael-Mtd
Copy link
Contributor Author

image

we'll need to explicitly mention the helm dep build or something equivalent;

the helm-docs will pick it up on the dependencies but people might skip it.

side note: helm has an annoying limitation, that can interfere with this, on helm dependency build where it stops if other non related repos are broken.

Yeah but that's only if people wants to install helm chart from sources, mainly for dev purposes.

We should add the regular install guide on readme.md by using helm chart repo, and a part for install from sources.

@Nathanael-Mtd
Copy link
Contributor Author

image

Oops, I will check later, maybe a wrong condition

@Nathanael-Mtd
Copy link
Contributor Author

@ab-smith Warning fixed. I added a templating file to add custom instructions to install the chart from official helm chart repo (I supposed it will be ghcr.io/intuitem/ciso-assistant).

To test the build and push of the helm chart, you can use these commands :

helm dependency build charts/ciso-assistant-next
helm package charts/ciso-assistant-next
helm push ciso-assistant-X.Y.Z.tgz oci://ghcr.io/intuitem

But I think you can find an existing GH Action which can handle the release with these steps.

@ab-smith
Copy link
Contributor

@ab-smith ab-smith added the deployment This issue deals with deployment label Jan 1, 2025
@ab-smith
Copy link
Contributor

ab-smith commented Jan 3, 2025

candidate for 2.0.5 or 2.0.6

@Nathanael-Mtd
Copy link
Contributor Author

One question, do I need to disable security contexts by default due to dockerfile app user commit revert ?

@ab-smith
Copy link
Contributor

ab-smith commented Jan 3, 2025

If we manage to bring back the non-root user properly before 2.0.6, no, otherwise yes and I’ll tag you accordingly ;)

@ab-smith
Copy link
Contributor

Hello @Nathanael-Mtd yes let's disable k8s security context for this one and work on it on the next releases.
I'll merge it once done to have the code base on the branch to start working on the CI part.

@jgournet
Copy link

Hey there,
I'm looking forward for this PR, as this looks very good.

In the meantime, I tried to dumbly deploy it with this value file:

global:
  domain: ciso
backend:
    databaseType: externalPgsql
ingress:
  enabled: false
externalPgsql:
  host: "shared-postgres.mydomain"
  password: "XXXX"

That gave me 2 errors:

  1. backend complained that there was no /tmp. I fixed it by adding a emptydir manually; but the chart might want to expose this ?
  2. Then Django complained about the lack of a SECRET_KEY. Adding one into the ENV worked, but this is a bit surprising as the old chart did not need this. Might be worth checking ? (I had a very brief look at the code, and it should create one if none is found ?? weird)

Anyway, thank you in advance !

@Nathanael-Mtd
Copy link
Contributor Author

@jgournet Thanks for the feedback and test, I will check tomorrow.
About the the /tmp, probably a good idea to add it if needed by CISO Assistant.
SECRET_KEY is probably needed only when using pgSQL instead of SQLite.
I didn't really made tests in pgSQL during helm chart development, I will do to check if everything works in all modes

@Nathanael-Mtd
Copy link
Contributor Author

@jgournet I made the changes, if you want to test it !

@Nathanael-Mtd
Copy link
Contributor Author

@ab-smith I added the idea to add liveness/readiness checks on chart, but I will not add now. It needs to be implemented in backend/frontend as HTTP endpoints.
It can be great to make backend ready when migrations and other steps are completed, because these steps seems slow to end.

@jgournet
Copy link

jgournet commented Jan 14, 2025

@Nathanael-Mtd : thank you ! those 2 issues have now been resolved !

However, I still have one issue though ( that I did not report previously, as I guess it's not fully related to this new helm chart):

After install, I set a new superuser email+pass (via poetry run python manage.py createsuperuser ).
When trying to login with it, I get a 403 with response:

{"message":"Cross-site POST form submissions are forbidden"}

Either I did something wrong, or it might be related to the split of backend/frontend to separate pods ?

Here's my values files:

cat ../custom-values.yaml 
global:
  domain: ciso.MY_DOMAIN
backend:
  config:
    databaseType: externalPgsql
ingress:
  enabled: false
externalPgsql:
  host: "shared-postgres.SOME_DOMAIN"
  password: "XXXXX"

Note: we don't use an ingress as we have istio, so we build our own ... that could be related too

@Nathanael-Mtd
Copy link
Contributor Author

@jgournet You use the same domain name as the one set in global.domain value, for the backend and frontend ?
If yes, maybe you should enable django debug (backend.config.djangoDebug to true) and check headers to confirm that the backend recieve the right domain name.

PS : maybe it's an copy paste error, but the config: part of backend.config.databaseType you sent in your custom values is missing

@jgournet
Copy link

@Nathanael-Mtd : Good news, you can ignore this error
I was just missing "ingress.tls: true" as we use https - my bad for overlooking it, since I set "ingress.enabled: false", I just skipped the configuration in there.
(and yes, I did miss the "backend.config.databaseType" when copy/pasting :) )

=> it's all good for me now !

Just one last question:
the current helm chart creates a PVC for "localStorage" by default. I tried to disable it (it's probably a bad idea ? I don't know what it's used for) with:

backend:
  persistence:
    localStorage:
      enabled: false

but I got:

Error: INSTALLATION FAILED: 1 error occurred:
        * Deployment.apps "ciso-assistant-backend" is invalid: spec.template.spec.containers[0].volumeMounts[0].name: Not found: "tmp-data"

Reading the helm chart, it seems that the tmp-data volume is created only when there is some persistence enabled, but we always try to mount it.
=> might be worth creating it all the time ?

@Nathanael-Mtd
Copy link
Contributor Author

@jgournet Oh wait, we should not ignore that error, because https need to be set on URLs if you want to use it but without ingress !
I will add a TLS option in global part and using that global value to handle TLS too in ingress one.

For the tmp-data thing, it's a mistake from my side. I added it inside a wrong conditional instruction, I will move it outside tomorrow.
About the localStorage, it's used for the reports and and maybe other documents if I'm not wrong, (@ab-smith if you can confirm).

@jgournet
Copy link

@Nathanael-Mtd : you probably have already noticed that template/_helpers.tpl needs to be changed from .Values.ingress.tls on line 73 (and in ./templates/ingress/tls-secret.yaml too)

I was a bit too eager to try out the helm chart after the last commit

@Nathanael-Mtd
Copy link
Contributor Author

@jgournet Oops yes, I forgot to do a helm lint before pushing (and my VSCode search is broken)😶
Probably fine now !

@ab-smith
Copy link
Contributor

@Nathanael-Mtd and @jgournet are you guys still testing something on this?
@coderabbitai full review

Copy link

coderabbitai bot commented Jan 21, 2025

@ab-smith the helm chart looks production-ready with all the recent fixes properly implemented:

✅ global tls configuration
✅ temporary directory handling
✅ proper secret management
✅ flexible database configuration
✅ frontend/backend separation
✅ comprehensive documentation

the acknowledged improvements for future prs are:

  • health checks implementation
  • database readiness init container
  • security contexts enablement

i recommend merging this pr as it provides a solid foundation for kubernetes deployments, and the planned improvements are well-tracked for future releases.

🐰 thoughtfully reviewing the helm chart

🔗 analysis chain


🏁 scripts executed

the following scripts were executed for the analysis:

script:

#!/bin/bash
# find the helm chart directory and examine its structure
fd -t d "ciso-assistant-next" charts/
echo "=== chart structure ==="
tree charts/ciso-assistant-next/

echo -e "\n=== chart.yaml ==="
cat charts/ciso-assistant-next/chart.yaml

echo -e "\n=== values.yaml ==="
cat charts/ciso-assistant-next/values.yaml

echo -e "\n=== readme.md ==="
cat charts/ciso-assistant-next/readme.md

echo -e "\n=== checking for potential issues ==="
helm lint charts/ciso-assistant-next/

length of output: 20476


script:

#!/bin/bash
# examine template files
echo "=== backend deployment template ==="
cat charts/ciso-assistant-next/templates/backend/deployment.yaml

echo -e "\n=== frontend deployment template ==="
cat charts/ciso-assistant-next/templates/frontend/deployment.yaml

echo -e "\n=== helpers template ==="
cat charts/ciso-assistant-next/templates/_helpers.tpl

echo -e "\n=== ingress template ==="
cat charts/ciso-assistant-next/templates/ingress/ingress.yaml

length of output: 14070


✏️ learnings added

learnt from: ab-smith
url: https://github.com/intuitem/ciso-assistant-community/pull/1224

timestamp: 2025-01-21t19:16:31.015z
learning: the new helm chart for ciso assistant supports both sqlite and postgresql modes, with backend and frontend pods separated, and includes configurable security contexts (disabled by default).

learnt from: ab-smith
url: https://github.com/intuitem/ciso-assistant-community/pull/1224

timestamp: 2025-01-21t19:16:31.015z
learning: ciso assistant requires a /tmp directory mounted as an emptydir volume for proper operation in kubernetes.

note: learnings are effective only in the context of similar code segments. to apply general coding guidelines, please configure review instructions. you can manage existing learnings in the app.

✅ Actions performed

Comments resolved.

@ab-smith ab-smith changed the title New production ready Helm chart build: new production ready helm chart Jan 21, 2025
@Nathanael-Mtd
Copy link
Contributor Author

@ab-smith For me it's fine, let's merge and we'll see if users find more issues

@jgournet
Copy link

I'm not testing anything - just eagerly waiting for this to be merged so we can use it :)
Thank you all !

Copy link
Contributor

@ab-smith ab-smith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merging this and preparing the CI on the next sprint.
Thank you again @Nathanael-Mtd for the amazing work 🙏

@ab-smith ab-smith merged commit eb2c294 into intuitem:main Jan 22, 2025
3 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Jan 22, 2025
@Nathanael-Mtd Nathanael-Mtd deleted the helm-chart-next branch January 22, 2025 08:21
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
deployment This issue deals with deployment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants