
Create pipeline to push zip file with dependencies to an S3 bucket #683

Closed
constanca-m opened this issue Apr 11, 2024 · 3 comments · Fixed by #689

constanca-m commented Apr 11, 2024

Description

This issue comes from a comment thread on a PR that uses Terraform to install ESF (elastic/terraform-elastic-esf#1).

The current approach in the Terraform files:

  • Download this repository's files based on the release version
  • Use this handler as the Lambda function handler, using a Docker environment to install all dependencies

The desired approach: have all dependencies in a zip file and push it to an S3 bucket.

Steps

Step 1

Create a new Buildkite pipeline in this directory.

Each version release (or commit?) triggers the creation of a new zip file with all the dependencies. This zip file needs to be pushed to an S3 bucket that will be used by customers. The S3 bucket needs to be read-only.

The zip file will have the following structure:

  • It will have a directory for each package listed in requirements.txt. Each package is generated with the command pip install --target ./package <REQUIREMENT>.
  • It will have the file with the handler code at the root. Because this file imports from other files, we will push the handlers directory as well (see the packaging sketch below).

Reference: https://docs.aws.amazon.com/lambda/latest/dg/python-package.html#python-package-create-dependencies.
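
A minimal packaging sketch of the commands such a pipeline could run, assuming Python, zip and the AWS CLI are available in the CI environment. The bundle name, the bucket name and the main_aws.py / handlers/ paths are placeholders, not final values:

# Install the runtime dependencies into ./package
pip install --target ./package -r requirements.txt

# Zip the dependencies, then add the handler entry point and the handlers directory at the zip root
cd package && zip -r ../esf-dependencies.zip . && cd ..
zip -r esf-dependencies.zip main_aws.py handlers/

# Push the bundle to the customer-facing S3 bucket
aws s3 cp esf-dependencies.zip s3://<esf-artifacts-bucket>/esf-dependencies.zip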

Step 2
Refactor the Terraform files:

  • Remove the need to download this repository.
  • Remove the module for Lambda.
  • Insert a new aws_lambda_function resource that reads the zip file from the S3 bucket:
resource "aws_lambda_function" "esf" {
// stuff

  s3_bucket = aws_s3_bucket.esf_bucket.id
  s3_key    = aws_s3_object.esf_zip_bundle.key

// stuff
}

Originally posted by @girodav in elastic/terraform-elastic-esf#1 (comment)

Tasks

  1. Add new workflow to automate release: [Github workflow] Add new tag on version.py update #685
  2. Trigger new workflow to push dependencies to S3 bucket in case of new release: Upload dependencies to S3 bucket on new tag release #689
  3. Add new dependencies: Add new dependencies to zip files: share, shippers and storage #692
constanca-m self-assigned this Apr 11, 2024
constanca-m commented:

Hey @girodav and @axw, can I have your thoughts on this to make sure everything is correct?


girodav commented Apr 12, 2024

Hey Constança, thanks for opening this issue. Some comments below.

Create a new buildkite pipeline in this directory.

I don't think there is any need to create a Buildkite pipeline, since ESF does not need to be released as part of the Elastic stack. So feel free to keep using GitHub Actions as we already do, unless you find some benefit in moving to Buildkite.

Each version release (or commit?) triggers the creation of a new zip file with all the dependencies. This zip file needs to be pushed to an S3 bucket that will be used by customers.

We currently track releases with git tags, so the workflow could be triggered by the creation of a new git tag. We also track the version in version.py, which is currently updated manually. There is already a related issue about how to automate updates to this file and how to handle version bumps in general: #540. I'd consider it a preliminary task for this issue.
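
For example, a tag-triggered workflow could use something like the following trigger (a minimal sketch; the tag pattern is an assumption and would need to match the actual release tags):

on:
  push:
    tags:
      - 'v*'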

I would also make sure that the solution is extensible enough to be able to add automated deployment to SAR as well, in a future release.

Remove the module for lambda.
Insert a new resource aws_lambda_function that reads from the S3 bucket with the zip file:

This is more of an option than a requirement: the current AWS Lambda Terraform module terraform-aws-modules/lambda/aws can still be used if it simplifies the implementation. It just needs to be configured to use pre-built packages stored on S3 (see the sketch after the link below).

https://registry.terraform.io/modules/terraform-aws-modules/lambda/aws/latest#lambda-function-with-existing-package-prebuilt-stored-in-s3-bucket
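
A minimal sketch of that setup, using the module's documented inputs for an existing package stored in S3; the function_name, handler and runtime values are placeholders, and the bucket/key references reuse the resource names from the snippet in the issue description:

module "esf_lambda" {
  source = "terraform-aws-modules/lambda/aws"

  function_name = "esf"                     // placeholder
  handler       = "main_aws.lambda_handler" // assumption: adjust to the actual entry point
  runtime       = "python3.9"               // placeholder

  // Skip building the package locally and point to the pre-built zip in S3
  create_package = false
  s3_existing_package = {
    bucket = aws_s3_bucket.esf_bucket.id
    key    = aws_s3_object.esf_zip_bundle.key
  }
}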

Where should the S3 bucket be placed? Under which account? In any specific region?

You can use the same account where we store SAR artifacts.

Do all packages used in import statements in the handlers files need to be in the dependency zip?

Technically no, the AWS Lambda Python runtime already includes some of them (e.g. boto3). However, we should stick to what is in requirements.txt to be sure we use the same versions everywhere. You should include only the dependencies used at runtime (i.e. only requirements.txt). This part also depends on whether #204 is going to be prioritized or not.

constanca-m commented:

Thank you @girodav for such a detailed answer. I am working on setting up a workflow in GitHub Actions like you mentioned. It seems a bit tricky to test, so I will do it in a private repository first, and then I will open a PR and link it to this issue as well as to #540.

It won't take care of the SAR currently, but it seems easy to adapt the workflow by setting the right trigger:

on:
  push:
    branches:
      - 'main'
    paths:
      - 'version.py'
