Sam/2 stage ami nix #953

Closed
wants to merge 116 commits into from


@samrose samrose commented Apr 22, 2024

This PR will need a minor follow-up prior to approval/merge, for the GitHub Actions that are dedicated specifically to merging to develop

Documentation of changes in #953

Conventional AMI approach

The existing/conventional AMI build approach installs postgres from the postgresql-common ubuntu/debian package at the time of the AMI build. In addition, it builds extensions and wrappers from source at the point of AMI build, and installs them as ‘.deb’ packages.

Flowcharts (3)

Nix packaged postgresql bundle approach

In the nix approach, we use the postgresql provided by nixpkgs (currently pinned at version 15.6 via commit a76c4553d7e741e17f289224eda135423de0491d of the nixpkgs-unstable branch, locked via https://github.com/supabase/postgres/blob/sam/2-stage-ami-nix/flake.lock#L114 )

Nixpkgs sources postgres from the URL given at https://github.com/NixOS/nixpkgs/blob/a76c4553d7e741e17f289224eda135423de0491d/pkgs/servers/sql/postgresql/generic.nix#L52

The nixpkgs package applies the following patches for aarch64-linux pg 15.6:

https://github.com/NixOS/nixpkgs/blob/a76c4553d7e741e17f289224eda135423de0491d/pkgs/servers/sql/postgresql/patches/disable-resolve_symlinks.patch

https://github.com/NixOS/nixpkgs/blob/a76c4553d7e741e17f289224eda135423de0491d/pkgs/servers/sql/postgresql/patches/less-is-more.patch

https://github.com/NixOS/nixpkgs/blob/a76c4553d7e741e17f289224eda135423de0491d/pkgs/servers/sql/postgresql/patches/hardcode-pgxs-path.patch

https://github.com/NixOS/nixpkgs/blob/a76c4553d7e741e17f289224eda135423de0491d/pkgs/servers/sql/postgresql/patches/specify_pkglibdir_at_runtime.patch

https://github.com/NixOS/nixpkgs/blob/a76c4553d7e741e17f289224eda135423de0491d/pkgs/servers/sql/postgresql/patches/findstring.patch

https://github.com/NixOS/nixpkgs/blob/a76c4553d7e741e17f289224eda135423de0491d/pkgs/servers/sql/postgresql/patches/locale-binary-path.patch

https://github.com/NixOS/nixpkgs/blob/a76c4553d7e741e17f289224eda135423de0491d/pkgs/servers/sql/postgresql/patches/socketdir-in-run-13.patch

When a PR is submitted to the supabase/postgres repo updating any of the nix packages maintained there, a build of the entire bundle is triggered on supported systems (x86_64-linux and aarch64-linux as of this writing). When the nix CI workflow is initiated, nix sources from our binary cache (currently located in a publicly readable AWS S3 bucket at https://nix-postgres-artifacts.s3.amazonaws.com ) and checks for any component dependency that has an exact match and has already been built successfully. Nix sources that built version from the cache, and only builds the items that have changed. If nix cannot build a changed item, the build fails. If the build succeeds, nix performs flake “checks” (scripted tests with dependencies managed by nix). An example of the “check” is seen here

Our CI implementation of nix has only 2 trusted public keys and 2 specified nix caches: ours and the upstream nixpkgs community cache. These are configured for CI at https://github.com/supabase/postgres/blob/sam/2-stage-ami-nix/docker/nix/Dockerfile#L5 and on the AMI at https://github.com/supabase/postgres/blob/sam/2-stage-ami-nix/scripts/nix-provision.sh#L20

The check run at https://github.com/supabase/postgres/actions/runs/9468138054/job/26083806922?pr=953#step:6:813 starts the database and enables several extensions post-build. If this test fails, the build also fails. If the nix build and check succeed, the build uploads the artifacts to the nix cache for re-use before the workflow stops.
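To make the cache behaviour described above concrete, here is a toy model of "reuse exact matches, build only what changed". It is a sketch under loud assumptions: the key function, the package names and the versions are all hypothetical, and this is not how Nix actually computes store paths. It only illustrates the idea that a package's key depends on its own inputs and on the keys of its dependencies.

```python
import hashlib


def store_key(name, version, dep_keys):
    """Toy key: hash of the package's own inputs plus its dependencies' keys."""
    h = hashlib.sha256()
    h.update(f"{name}@{version}".encode())
    for key in sorted(dep_keys):
        h.update(key.encode())
    return h.hexdigest()[:16]


def plan_build(packages, cache):
    """Return (keys, to_build).

    `packages` maps name -> (version, [dependency names]) and must list
    dependencies before the packages that use them. `cache` is a set of
    keys that were already built and uploaded.
    """
    keys = {}
    to_build = []
    for name, (version, deps) in packages.items():
        keys[name] = store_key(name, version, [keys[d] for d in deps])
        if keys[name] not in cache:
            to_build.append(name)
    return keys, to_build


# Hypothetical bundle; names and versions are illustrative only.
bundle = {
    "postgresql": ("15.6", []),
    "pgvector": ("0.6.0", ["postgresql"]),
    "wrappers": ("0.3.0", ["postgresql"]),
}

keys, first_run = plan_build(bundle, cache=set())
cache = set(keys.values())  # a successful build uploads every artifact

# A PR bumps one extension: only that package misses the cache.
bundle["pgvector"] = ("0.7.0", ["postgresql"])
_, second_run = plan_build(bundle, cache)

# A PR bumps postgres itself: everything that depends on it misses too.
bundle["postgresql"] = ("15.7", [])
_, third_run = plan_build(bundle, cache)

print(first_run, second_run, third_run)
```

In this model, changing a leaf package rebuilds only that package, while changing a shared dependency invalidates everything built on top of it, which matches the "exact match" wording above.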

Flowcharts (2)

In the debian/ubuntu postgresql-common package, the “postgres” user is created, and postgres is installed to locations that are conventional for debian/ubuntu. In the nix approach, we explicitly create the “postgres” linux user, and then we use the nix profile install method to install the nix-built binaries for postgres into the nix profile for the “postgres” user (located at /home/postgres/.nixprofile on the AMI machine). We then alias the installed file locations to the conventional debian/ubuntu locations for a postgres installation. The nix profile command gives us an imperative way to install, uninstall, and upgrade packages that we build with nix going forward, allowing us to integrate our nix-built packages with debian/ubuntu distributions.
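The aliasing step can be pictured as creating symlinks from the conventional locations to the files in the nix profile. The sketch below is only an illustration: the function name `alias_binaries` and both directory layouts are hypothetical stand-ins, it runs entirely inside a temporary directory, and it is not the Ansible task used in this PR.

```python
import tempfile
from pathlib import Path


def alias_binaries(profile_bin: Path, conventional_bin: Path) -> list:
    """Symlink each file in profile_bin into conventional_bin.

    Returns the names that were linked. Anything already present at the
    conventional location is left untouched.
    """
    conventional_bin.mkdir(parents=True, exist_ok=True)
    linked = []
    for target in sorted(profile_bin.iterdir()):
        link = conventional_bin / target.name
        if link.is_symlink() or link.exists():
            continue
        link.symlink_to(target)
        linked.append(target.name)
    return linked


# Demo in a throwaway directory so nothing on the real system is touched.
root = Path(tempfile.mkdtemp())
profile_bin = root / "home/postgres/profile/bin"    # stand-in for the nix profile
conventional_bin = root / "usr/lib/postgresql/bin"  # stand-in for a debian-style path
profile_bin.mkdir(parents=True)
for name in ("postgres", "psql"):
    (profile_bin / name).write_text("#!/bin/sh\n")

linked = alias_binaries(profile_bin, conventional_bin)
relinked = alias_binaries(profile_bin, conventional_bin)  # second run is a no-op
print(linked, relinked)
```

Skipping existing entries keeps the step safe to re-run, which is usually what a provisioning task wants.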

2 Stage AMI approach

The Ansible and Packer code has been forked in parallel in the same repo, so that both the nix-built approach and the existing ubuntu/debian package approach can be supported in parallel. This will allow continued production rollouts under the old method, while also allowing targeted rollouts with the nix-built AMI.

The existing build still lives under the same ansible folder, and its companion packer hcl files have been retained. The nix AMI build has parallel packer files with nix inserted into the name, and a new folder, ansible-nix. Both builds are started with the same command-line recipe.

Description of 2 stage approach

The previous packer/ansible build used the ebssurrogate builder ( https://developer.hashicorp.com/packer/integrations/hashicorp/amazon/latest/components/builder/ebssurrogate ) exclusively. The new nix-based build retains the ebssurrogate approach to build and configure everything except for the postgres bundle.

Nix builds are already “sandboxed” and isolated at build time, and the results are stored in a read-only directory called the “nix store”. Nix has never needed to support building inside a chroot, as the ebssurrogate packer build does, so running nix in a chroot has never been supported. Therefore, a second stage of the AMI build was introduced. It securely sources the private “stage1” AMI built by the stage 1 ebssurrogate approach, and then installs the nix-built supabase postgres/extensions/wrappers bundle from the binary cache using the conventional github.com/hashicorp/amazon packer plugin. This stage is limited to installing, configuring and testing postgres, using files either uploaded in the first stage or sourced from the nix cache (other than the stage 2 ansible playbook and unit test files). The workflow that performs these 2 stages is located at https://github.com/supabase/postgres/blob/sam/2-stage-ami-nix/.github/workflows/ami-release-nix.yml

As more Supabase projects are packaged in nix, they will be moved into this 2nd stage for installation and configuration. In the 2nd stage we run migration and unit tests, plus linux user/group assignment checks with a temporarily installed copy of osquery: https://github.com/supabase/postgres/blob/sam/2-stage-ami-nix/ansible-nix/tasks/stage2/playbook.yml#L71 and https://github.com/supabase/postgres/blob/sam/2-stage-ami-nix/ansible-nix/files/permission_check.py

The 2nd stage also creates path aliases to the nix-installed binaries so that files and configurations are still where they are expected to be as much as possible. This allows post-AMI-build init scripts like https://github.com/supabase/infrastructure/blob/develop/init-scripts/project/00-init.sh to continue to succeed in running.

We are maintaining documentation on how to work with the nix portion of supabase/postgres at https://github.com/supabase/postgres/tree/sam/2-stage-ami-nix/nix/docs and will continue to expand that as much as possible.

Current Progress on adoption in https://github.com/supabase/postgres

There is an umbrella draft PR at #953 which includes the building of an aarch64-linux AMI

Docker image PR #986

Docker AIO Image PR #987


samrose added 30 commits April 19, 2024 17:21
@samrose samrose force-pushed the sam/2-stage-ami-nix branch 2 times, most recently from f178fde to 6f43499 Compare June 12, 2024 21:24
@samrose samrose marked this pull request as ready for review June 12, 2024 23:55
@samrose samrose requested review from a team as code owners June 12, 2024 23:55

@pashkinelfe pashkinelfe left a comment


Please see comments on the PR code.
Also, I see that much of the supabase machinery is copied into the *nix directories (maybe with changes). It's not clear to me how we can be sure these remain identical as future commits land in the supabase root and in *nix.


samrose commented Jun 14, 2024

Please see comments on the PR code. Also, I see that much of the supabase machinery is copied into the *nix directories (maybe with changes). It's not clear to me how we can be sure these remain identical as future commits land in the supabase root and in *nix.

@pashkinelfe good point, and thanks for the review. The version file (the vars) will need to be de-duplicated, with the *nix folder sourcing from that single file. This way versions will remain in sync. The build process is otherwise different (with some similarities), and this parallel AMI generation and configuration is intended to be relatively temporary, with the goal of merging them and being able to temporarily roll out either version.

If there is some reason they cannot be merged in a reasonable amount of time, I agree that a refactor should target smarter deduplication for long-term maintainability.


@darora darora left a comment


I'm assuming there are no changes between ansible and ansible-nix.
Worth adding a CI check that if one of them is updated, the other one should be updated too? Someone or the other's going to forget otherwise, and they'll go out of sync.

-e POSTGRES_PASSWORD=${{ env.POSTGRES_PASSWORD }} \
-p ${{ env.POSTGRES_PORT }}:5432 \
--name supabase_postgres \
-d supabase/postgres:latest

Isn't this testing an upstream build, rather than the one built locally?

Would prefer to also use an exact version string rather than latest, just in case it starts falling back to a public image somehow


This (and all docker/docker-aio work) is going to be moved to a PR that will follow #1012


samrose commented Jun 24, 2024

I'm assuming there are no changes between ansible and ansible-nix.

@darora just fwiw there are indeed a number of changes between ansible and ansible-nix dirs. In the documentation of changes above I wrote

The Ansible and Packer code has been forked in parallel in the same repo, so that both the nix-built approach and the existing ubuntu/debian package approach can be supported in parallel. This will allow continued production rollouts under the old method, while also allowing targeted rollouts with the nix-built AMI.

The goal is to make it possible to deploy either one in the short term, then to eventually merge them into one once this all stabilizes.

@darora wrote

Worth adding a CI check that if one of them is updated, the other one should be updated too? Someone or the other's going to forget otherwise, and they'll go out of sync.

I agree I should create a CI check, because in some cases changes will need to be addressed in both. I am going to consolidate the vars file right now so that there will only be one of those.
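A minimal sketch of what such a CI check could look like, assuming the check only needs the list of paths changed in a PR. The function names and the example file paths are hypothetical, and this is not a check that exists in the repo; it just flags the case where exactly one of the two parallel trees was touched.

```python
from pathlib import PurePosixPath

# The two parallel trees named in this PR.
PAIRED_DIRS = ("ansible", "ansible-nix")


def touched_dirs(changed_paths):
    """Return which of the paired directories contain a changed path."""
    touched = set()
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        if parts and parts[0] in PAIRED_DIRS:
            touched.add(parts[0])
    return touched


def sync_warning(changed_paths):
    """Return a warning if exactly one paired directory changed, else None."""
    touched = touched_dirs(changed_paths)
    if len(touched) != 1:
        return None
    changed = next(iter(touched))
    other = next(d for d in PAIRED_DIRS if d != changed)
    return f"{changed}/ changed but {other}/ did not; check whether it needs the same update"


# Hypothetical changed-file lists, for illustration only.
only_one = sync_warning(["ansible/tasks/example.yml", "README.md"])
both = sync_warning(["ansible/tasks/example.yml", "ansible-nix/tasks/example.yml"])
neither = sync_warning(["README.md"])
print(only_one, both, neither)
```

In a workflow, the path list could come from `git diff --name-only` against the base branch. Returning a warning rather than failing outright leaves room for changes that legitimately apply to only one tree.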


samrose commented Jun 25, 2024

This PR is superseded by PR #1012. Please continue the review there once that PR has been taken out of draft mode.

@samrose samrose closed this Jun 25, 2024