
Procedural resource discovery and management #33465

Open

froazin opened this issue Jul 2, 2023 · 7 comments

froazin commented Jul 2, 2023

Terraform Version

Terraform v1.5.1
on linux_amd64
+ provider registry.terraform.io/hashicorp/http v3.4.0
+ provider registry.terraform.io/newrelic/newrelic v3.25.0
+ provider registry.terraform.io/opsgenie/opsgenie v0.6.26

Your version of Terraform is out of date! The latest version
is 1.5.2. You can update by downloading from https://www.terraform.io/downloads.html

Use Cases

To understand the use cases, it's important to understand a little about the implementation here: I have a couple of applications that procedurally generate *.auto.tfvars.json files by querying our CMDB prior to terraform being run in CI pipelines.

Procedurally claim and manage resources

Resources can be imported into CMDBs through discovery and other methods, so these resources will already exist. In this use case, we not only discover these resources but, where applicable, manage them too.

Build onto existing solutions in bulk

This use case fits best where Terraform is being used to manage CIs (configuration items) rather than traditional resources, such as teams in OpsGenie or sub-accounts in New Relic. To enable people within an organisation to self-service, we want to allow a senior member of staff to create a team or service in OpsGenie, or a sub-account in New Relic, safe in the knowledge that the new resource will be picked up by Terraform and have everything set up for them: the integration between New Relic and OpsGenie, service incident rules in OpsGenie, workloads for each of their services in New Relic, and so on.

Attempted Solutions

For simplicity, I'll just include Services in OpsGenie, as the same principle applies to everything else.

Step-by-step

1. Pre-execution

Before terraform is run, another application is called that queries our CMDB, cleans up the result and writes the output to *.auto.tfvars.json files in the root of the terraform project.
Example:

{
  "services": [
    {
      "attributes": {
        "Key": "SVC-1",
        "Name": "Test Service 1",
        "Created": "___",
        "Updated": "___",
        "Description": "This is a test service",
        "Tier": "Tier 3",
        "Service_ID": "___",
        "Revision": "___",
        "Service_Owners": {
          "opsgenieTeam": {
            "id": "___",
            "name": "___"
          }
        }
      },
      "id": "1",
      "label": "Test Service 1",
      "name": "Test Service 1",
      "objectKey": "SVC-1",
      "objectTypeId": "1",
      "objectTypeName": "Service",
      "workspaceId": "___"
    }
  ]
}

2. The variable structure

Terraform reads this input into a variable declared like so:

variable "services" {
  description = "The Services to create."
  type = list(
    object(
      {
        attributes = object(
          {
            Key            = string
            Name           = string
            Created        = string
            Updated        = string
            Description    = optional(string)
            Tier           = optional(string)
            Service_ID     = optional(string)
            Revision       = optional(string)
            Service_Owners = optional(
              object(
                {
                  opsgenieTeam = object(
                    {
                      id   = string
                      name = string
                    }
                  )
                }
              )
            )
          }
        )
        id             = string
        label          = string
        name           = string
        objectKey      = string
        objectTypeId   = string
        objectTypeName = string
        workspaceId    = string
      }
    )
  )
}

3. The root module

The root module calls the OpsGenie Service submodule for each service object like so:

# Create the OpsGenie Services and necessary additional components
# Only include services that have an owner configured to avoid errors.
module "opsgenie_service" {
  for_each = {
    for service in var.services : service.id => {
      id          = service.attributes.Service_ID
      name        = service.attributes.Name
      description = service.attributes.Description
      team_id     = service.attributes.Service_Owners.opsgenieTeam.id
    } if lookup(service.attributes, "Service_Owners", null) != null
  }

  source = "./modules/opsgenie_service"

  id          = each.value.id
  name        = each.value.name
  description = each.value.description
  team_id     = each.value.team_id
}

4. Import and manage the service object

The OpsGenie Service submodule should import each service that already exists and then manage it, like so:

import {
  to = opsgenie_service.this
  id = var.id
}

resource "opsgenie_service" "this" {
  name    = var.name
  team_id = var.team_id
}

# Do more stuff with this service...
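
For completeness, the submodule would also need to declare the inputs it references. A minimal sketch of its variables.tf follows; these declarations are assumed, as they are not shown in the original:

variable "id" {
  description = "The ID of the existing OpsGenie service to import."
  type        = string
}

variable "name" {
  description = "The name of the service."
  type        = string
}

variable "description" {
  description = "An optional description of the service."
  type        = string
  default     = null
}

variable "team_id" {
  description = "The ID of the OpsGenie team that owns the service."
  type        = string
}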

Issues encountered

1. Can't import to non-root module

An import block cannot be run as part of a non-root module. While this can be worked around by running the import in the root module before calling the submodule (as sketched after the error output below), it's messy, and it would be better for the import to be contained within the same module that will be managing the resource. Additionally, since import blocks don't support for_each, that workaround doesn't account for the inherent proceduralism in this implementation, whereas calling an import from a submodule would.

│ Error: Invalid import configuration
│ 
│   on modules/opsgenie_service/main.tf line 10:
│   10: import {
│ 
│ An import block was detected in "module.opsgenie_service". Import blocks are only allowed in the root module.
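
For reference, a minimal sketch of that root-module workaround; because import blocks don't support for_each, one block has to be written out by hand per service (the module key and service ID below are illustrative):

# Workaround: hand-written import in the root module, targeting the
# resource address inside the module instance.
import {
  to = module.opsgenie_service["1"].opsgenie_service.this
  id = "00000000-0000-0000-0000-000000000000" # illustrative service ID
}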

2. Variables not allowed

Variables are not allowed in import blocks, giving the following error:

│ Error: Variables not allowed
│ 
│   on modules/opsgenie_service/main.tf line 12, in import:
│   12:   id = var.id
│ 
│ Variables may not be used here.

3. Value for import must be known

This is likely the result of using a variable to declare the import ID. However, the value here is known, as it is a static value given by the JSON input generated in step 1.

│ Error: Unsuitable value type
│ 
│   on modules/opsgenie_service/main.tf line 12, in import:
│   12:   id = var.id
│ 
│ Unsuitable value: value must be known

Proposal

Allow import blocks in submodules

In cases where the ID is a known, static value, it should be possible to run imports in a submodule, even in submodules that are called in a for_each loop, given that all the information needed to complete a plan exists. This would be an awesome first step towards enabling proceduralism in Terraform runs.

Allow variables as import IDs

In cases where the variable is a static, known value, it should be allowed in an import block. For edge cases where a variable is modified between runs, it should be acceptable to destroy the previously imported resource and replace it with the newly imported version. This is how the codebase remains declarative: where the input value changes, it should be treated as a declaration of intent, no different to manually writing out the ID in your codebase. Moreover, allowing variables as import IDs promotes good coding practice by keeping potentially sensitive information out of the codebase.
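
A minimal sketch of the proposed usage, assuming variables were permitted in import blocks; the sensitive flag illustrates the point about keeping IDs out of the codebase:

variable "service_id" {
  description = "The ID of the existing service to import."
  type        = string
  sensitive   = true # keeps the raw ID out of the codebase and CLI output
}

import {
  to = opsgenie_service.this
  # Proposed: allow a known, static variable here instead of a literal.
  id = var.service_id
}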

Pre-execution queries

This is a potential, hypothetical solution for future discussion. It would be really nice if something similar to a http data block could be marked in such a way that it is not run a second time during the apply stage, eliminating the need to use external applications to generate JSON input.
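
A purely hypothetical sketch of what that marking could look like; the plan_only argument below does not exist and is shown only to illustrate the idea, and the endpoint is invented:

data "http" "services" {
  url = "https://example-cmdb.internal/api/services" # hypothetical endpoint

  # Hypothetical flag: read this data source once during planning and
  # reuse the result during apply, rather than ever deferring the read.
  plan_only = true
}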


kmoe (Member) commented Jul 4, 2023

Thanks for the detailed write-up. It's helpful to understand the use case in detail.

The issue tracker should contain one issue per issue. We tend to think of an "issue" as a problem statement, rather than a proposed solution, but the boundary is often permeable.

In this case, "Allow variables as import IDs" is covered by #33228. I've created #33474 for "Allow import blocks in sub modules", as this is also an independent feature.

Subtracting those, what remains is an interesting problem that slightly exceeds the limits of what Terraform currently considers its responsibility. Import blocks are currently the outermost interface that we offer for coupling with other infra management software, since Terraform is not a program that monitors, or picks up changes from, a CMDB.

cc @omarismail

> It would be really nice if something similar to a http data block could be marked in such a way that it is not run a second time during the apply stage.

What problems are caused by evaluating data sources during the apply stage?

froazin (Author) commented Jul 4, 2023

Hi @kmoe, thanks for getting back! :)

Agreed: communication with a CMDB is less a Terraform-specific request and more a design consideration for Terraform's potential as a component in a system like the one I describe above. At the moment it's not possible to use data sources in Terraform to dynamically provision resources, as data sources aren't static: they are executed once during planning and again during execution. I can whip together an example if you think that would be helpful? The error you would get would say something along the lines of "resource identifiers must be static".

Ultimately, this specific use case can be resolved by #33228 and #33474 so long as the dependency remains on an external application generating .auto.tfvars.json files prior to each terraform run.

Allowing data sources to be marked in such a way that indicates they should only be run during planning, and not during execution, means that the data retrieved can be used to dynamically provision resources and eliminates the need for an external application to generate the .auto.tfvars.json files. This, combined with the previous two issues you've helpfully separated out, should place Terraform in a good position to accommodate incorporation into a discovery pipeline, without directly identifying it as a core responsibility of the product.
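
To illustrate, a minimal sketch of the pattern this would enable, assuming the hashicorp/http provider and a hypothetical CMDB endpoint returning the JSON from step 1:

data "http" "services" {
  url = "https://example-cmdb.internal/api/services" # hypothetical endpoint
}

locals {
  services = jsondecode(data.http.services.response_body).services
}

module "opsgenie_service" {
  # If Terraform defers the data source read to the apply phase, this
  # fails, because for_each keys must be known at plan time.
  for_each = { for service in local.services : service.id => service }

  source = "./modules/opsgenie_service"

  id          = each.value.attributes.Service_ID
  name        = each.value.attributes.Name
  description = each.value.attributes.Description
  team_id     = each.value.attributes.Service_Owners.opsgenieTeam.id
}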

froazin (Author) commented Jul 4, 2023

Another way to solve this could potentially be to introduce different variable providers, such as indicating that a variable is an http variable:

variable "http" "opsgenie_services" {
  description = "The OpsGenie services to import and manage"
  type = list(
    object(
      {
        name = string
        id = string
      }
    )
  )
}

In this approach, variables as we currently know them might be called "local" variables. For example:

variable "local" "opsgenie_team" {
  description = "The name of the OpsGenie Team that will manage provisioned resources."
  type = string
}

Your tfvars file would look a little different too:

http {
  opsgenie_services = {
    url = "https://api.eu.opsgenie.com/v1/services"
    method = "get"
    headers = {
      Accept = "application/json"
      Authorization = "Basic XXXXXXXX..."
    }
  }
}

local {
  opsgenie_team = "Some Team"
}

apparentlymart (Contributor) commented

Hi @froazin! As others have said, thanks for sharing these use-cases.

You have mentioned a few times in your comments so far a problem of data sources being read twice, with one read during planning and one read during apply.

That doesn't match my understanding of Terraform's current behavior. The intended behavior is that Terraform reads a data source during planning if possible, or during apply if necessary. "If possible" covers a few different rules here, but the general idea is that Terraform will wait until the apply phase if it seems like the result of the data source might be changed by other actions proposed in the plan.

I'm not sure how crucial a part this plays in the use-case you are describing, but if you are seeing Terraform read a data source in both the plan and the apply phase and that's causing problems for your use of import then I'd like to learn more about what you have observed; reading the data both during plan and apply sounds like a bug.
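
For illustration, a minimal sketch of the "during apply if necessary" case; the resource and URL here are assumed for the example:

resource "opsgenie_team" "example" {
  name = "example-team"
}

data "http" "team_details" {
  # Because this URL depends on an attribute that is not known until the
  # team has been created, Terraform defers reading this data source to
  # the apply phase rather than reading it during planning.
  url = "https://api.opsgenie.com/v2/teams/${opsgenie_team.example.id}"
}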

etaham commented Jul 20, 2023

Would it be possible to add a fourth item referenced in the mentioned issue #33537? Specifically, creating a resource if the id in an import doesn't exist.

And finally, a fifth: support for for_each in import blocks. This would require the second item as well.

Thanks!

apparentlymart (Contributor) commented Jul 20, 2023

Hi @etaham,

"Import if it exists and create it otherwise" is intentionally not supported and is unlikely to be supported in future because in general it is not safe. We require the operator to explicitly decide between creating and importing because when importing it's the operator's responsibility to ensure that each object gets bound to no more than one resource address across all of your Terraform configurations.

If you would like to discuss a situation where that safety problem would not apply, please open a new issue to describe your use-case in more detail. Thanks!

asaio commented Aug 1, 2024

> Hi @etaham,
>
> "Import if it exists and create it otherwise" is intentionally not supported and is unlikely to be supported in future because in general it is not safe. We require the operator to explicitly decide between creating and importing because when importing it's the operator's responsibility to ensure that each object gets bound to no more than one resource address across all of your Terraform configurations.
>
> If you would like to discuss a situation where that safety problem would not apply, please open a new issue to describe your use-case in more detail. Thanks!

Hello @apparentlymart, is there a way to disregard creation of other resources given that an import source hasn’t been created in any of the terraform configs? In other words, try to import a resource from another config and if it doesn’t exist disregard creation of the resources that depend on the import target existing in this config?

My use case is similar to #33633 in that we provision resources to a number of accounts in a couple of Terraform configurations. The primary config should create a resource, but if for some reason it didn't, we don't want the secondary config to fail; we just want to disregard its resources that are related to something imported from the primary config.

import {
  to = aws_glue_catalog_database.default_db
  id = "default"
}

resource "aws_glue_catalog_database" "default_db" {
  name = "default"
}

resource "aws_lakeformation_permissions" "default_permissions" {
  principal   = aws_iam_role.sales_role.arn
  permissions = ["ALL"]

  table {
    database_name = "default"
    wildcard      = true
  }
}

If the above import failed, I'd like the aws_lakeformation_permissions to be ignored and the overall terraform apply to be successful.
