ZIP file is not recreated if running in a separate stage #37

Closed
ocervell opened this issue Nov 27, 2019 · 5 comments · Fixed by #38
Labels
bug (Something isn't working) · P3 (medium priority issues) · triaged (Scoped and ready for work)

Comments

@ocervell

ocervell commented Nov 27, 2019

When one of the files in the source_directory changes, the zip file is not recreated.

It seems we fixed part of this in #32 with PR #35 by forcing an update of the object in the GCS bucket, but the archive_file (zip) itself is not being updated, so in the end the Cloud Function is not forced to update either.

The upstream issue for the archive_file failing to update is hashicorp/terraform-provider-archive#39. It seems to be linked to running plan and apply in different contexts (think of a Cloud Build pipeline composed of two configs that effectively 'forgets' the .terraform/ folder between plan and apply).

I see 3 ways we can work around this in the module (while waiting for an upstream fix):

  • compute a hash of all files in the source directory and force the archive_file data source to be recomputed whenever any of those hashes changes; I'm not certain how to do this for every file in the source_directory (for one file it would be straightforward - for multiple files we'd need a custom shell script to list them all).

  • append a random id to the zip filename and force it to regenerate when needed, e.g. by passing a force_update variable to the module (see the sketch after this list).

  • document that plan and apply should be run in the same context, which some users will struggle with.
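
For illustration, a minimal sketch of the second option, assuming a hypothetical force_update input alongside the module's existing source_directory variable (the names and output path are illustrative, not the module's actual interface):

variable "force_update" {
  description = "Change this value to force the archive (and the function) to be rebuilt."
  type        = string
  default     = "0"
}

resource "random_id" "suffix" {
  # A new id is generated whenever force_update changes.
  keepers = {
    force_update = var.force_update
  }
  byte_length = 4
}

data "archive_file" "main" {
  type        = "zip"
  source_dir  = var.source_directory
  output_path = "${path.module}/archive-${random_id.suffix.hex}.zip"
}

Bumping force_update would then produce a new archive path and force the Cloud Function to redeploy.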

Any help is appreciated here :)

@morgante morgante changed the title ZIP file is not recreated ZIP file is not recreated if running in a separate stage Nov 27, 2019
@morgante
Contributor

I'm not sure if we should try to work around what is clearly a bug in the upstream provider.

Given that, we should go with option 3. Plan can still be run in its own context, but you should run a full apply (instead of just applying the plan file).

@aaron-lane aaron-lane added bug Something isn't working triaged Scoped and ready for work labels Nov 27, 2019
@aaron-lane aaron-lane added P3 medium priority issues and removed triaged Scoped and ready for work labels Dec 5, 2019
@ocervell
Author

ocervell commented Dec 7, 2019

Actually, even when doing a plan and apply, I still get the issue.

To trigger the recreation of the archive, I have to either:

  • Manually remove the previously created archive (rm <path/to/archive>.zip) prior to running apply.
  • Manually run terraform taint on the Cloud Function resource prior to running apply.

terraform-google-slo module
For my specific use case, I'm using this module from terraform-google-scheduled-function, itself called by terraform-google-slo, so the zip file is buried deep in .terraform/modules/, which makes it hard to remove with a script or a pipeline. The SLO module is almost useless if I cannot re-deploy the Cloud Function when the code changes (part of my code is an SLO JSON config that is updated regularly).

Workaround 1
We could simply have an archive_path_suffix input variable appended to the output_path of data.archive_file.main, which would force recreating the archive because its path changes.
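
A rough sketch of this, assuming the variable is named archive_path_suffix as proposed (the default value and output path layout are illustrative):

variable "archive_path_suffix" {
  description = "Optional suffix appended to the archive output path to force re-creation."
  type        = string
  default     = ""
}

data "archive_file" "main" {
  type        = "zip"
  source_dir  = var.source_directory
  output_path = "${path.module}/archive${var.archive_path_suffix}.zip"
}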

Workaround 2
This one has the advantage of not changing the interface of our module at all.

The following code computes a hash of all files under the given folder and embeds it in the zip file path to force archive recreation:

data "external" "hash" {
  program = ["bash", "${path.module}/scripts/shasum.sh", "${path.module}/configs", "${timestamp()}"]
}

data "archive_file" "main" {
  type        = "zip"
  output_path = pathexpand("archive-${data.external.hash.result.shasum}.zip")
  source_dir  = pathexpand("${path.module}/configs")
}

The shasum.sh script looks like this:

#!/bin/bash
# Strip any trailing slash from the folder argument.
FOLDER_PATH=${1%/}
# Hash every file in the folder, then hash the list of hashes into a single digest.
SHASUM=$(shasum "${FOLDER_PATH}"/* | shasum | awk '{print $1}')
# The external data source expects a JSON object on stdout.
echo -n "{\"shasum\":\"${SHASUM}\"}"

Workaround 3
This one doesn't need a custom shell script at all, so it might be better than all of the above:

resource "random_uuid" "main" {
  keepers = {
    for filename in fileset("${path.module}/configs", "**/*"):
    filename => filemd5("${path.module}/${filename}")
  }
}

data "archive_file" "main" {
  type              = "zip"
  output_path = pathexpand("archive-${random_uuid.main.result}.zip")
  source_dir    = pathexpand("${path.module}/configs")
}

@aaron-lane
Contributor

I think workaround 3 seems reasonable. We need to keep in mind it should be based on var.source_directory rather than a hard-coded module pathname.
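
For reference, a minimal sketch of workaround 3 rewritten against var.source_directory (the output path is illustrative):

resource "random_uuid" "main" {
  # Regenerate the UUID whenever any file under the source directory changes.
  keepers = {
    for filename in fileset(var.source_directory, "**/*") :
    filename => filemd5("${var.source_directory}/${filename}")
  }
}

data "archive_file" "main" {
  type        = "zip"
  source_dir  = var.source_directory
  output_path = "${path.module}/archive-${random_uuid.main.result}.zip"
}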

@taylorludwig
Contributor

@ocervell I tried to reproduce this but am having a hard time hitting the same scenario.

Using our simple examples, the archive is re-created for me on either a plan or an apply if any files in the source_directory change. In fact, looking at the stat output for the file's creation and modification times, the archive is re-created on every plan or apply even when there are no file changes.

It happens as one of the first steps during an apply/plan, prior to any resources being evaluated for changes - as expected, since that's when data sources are typically read.

Are the files in the source_directory by chance generated by other Terraform resources?

I see you mention using this module from within terraform-google-scheduled-function and terraform-google-slo, as well as this comment in the parent issue hashicorp/terraform-provider-archive#39 (comment).

If that is the case, it makes sense to me. The order during any apply/plan would be:

  1. Archive is created
  2. Your other resource adds files to the source_directory
  3. Object and function are created

Since the archive is created prior to step 2, a subsequent apply would be required to create the function with the up-to-date source files.

One option I could see us doing is adding an optional archive_depends_on variable that gets passed into the depends_on of the archive_file.
Terraform does support this, but it has the negative side effect explained in the docs (Terraform would never converge, since it would never know whether the archive was up to date until after the apply logic of your other resources has finished).
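
A sketch of one way this could be wired up, assuming the variable is named archive_depends_on; since depends_on cannot reference variables directly, the dependency is threaded through a null_resource trigger:

variable "archive_depends_on" {
  description = "Opaque values (e.g. resource ids) the archive creation should wait on."
  type        = list(string)
  default     = []
}

resource "null_resource" "archive_dependencies" {
  triggers = {
    dependencies = join(",", var.archive_depends_on)
  }
}

data "archive_file" "main" {
  type        = "zip"
  source_dir  = var.source_directory
  output_path = "${path.module}/archive.zip"

  # Deferring the read to apply time is exactly the convergence caveat noted above.
  depends_on = [null_resource.archive_dependencies]
}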

I also tried Workaround 3, but even with files being created during a terraform apply, the only way I could guarantee the files ended up in the zip was to add an explicit depends_on to the archive_file.

It feels like archive_file should ultimately be a resource and not a data source, since it generates a changed file rather than just reading external data (similar to the local_file resource).

@taylorludwig
Contributor

Looking a bit further into archive_file and possible workarounds, I came across this:

hashicorp/terraform-provider-archive#11 (comment)

Sounds like a valid solution that could solve the various problems, especially if we add a file_triggers or module_depends_on input variable we can pass on to the null_resource for the dependency ordering.
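
A rough sketch of that approach as I read it (building the zip with a null_resource and local-exec instead of archive_file), where the file_triggers name and the paths are assumptions:

variable "file_triggers" {
  description = "Map of file hashes that should force the archive to be rebuilt."
  type        = map(string)
  default     = {}
}

resource "null_resource" "archive" {
  # Rebuild the zip whenever any trigger value changes.
  triggers = var.file_triggers

  provisioner "local-exec" {
    command = "cd ${var.source_directory} && zip -r ${abspath(path.module)}/archive.zip ."
  }
}

The GCS bucket object would then reference the generated zip path and depend on null_resource.archive, so the upload only happens after the zip has been rebuilt.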

I'll give it a try tomorrow, as well as in the slo module, and see how it goes.
