
Pre-refreshed resources #2976

Closed
apparentlymart opened this issue Aug 12, 2015 · 8 comments
@apparentlymart
Contributor

There are several resource types that don't actually create anything, but instead either just record some local state inside Terraform (logical providers) or retrieve data about a pre-existing object. The implementations of these usually do nothing except populate the id inside the Create function, and then do all of their interesting work inside Read, and possibly Update.

Some resources of this type include:

  • terraform_remote_state
  • consul_keys (though this one is weird because it can also write, depending on the config)
  • template_file
  • atlas_artifact
  • The various TLS-related logical resources I proposed in TLS utility provider #2778
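In plain Go, the shared shape of these implementations can be sketched roughly as follows. (ResourceData here is a hypothetical, simplified stand-in for the real helper/schema type, and the id value is invented for illustration.)

```go
package main

import "fmt"

// ResourceData is a hypothetical, simplified stand-in for Terraform's
// schema.ResourceData type.
type ResourceData struct {
	ID    string
	Attrs map[string]string
}

// create for a logical resource does no real work: it only assigns an
// id so that Terraform considers the resource to exist, then defers to
// read for everything else.
func create(d *ResourceData) error {
	d.ID = "static-id" // real implementations often hash the inputs
	return read(d)
}

// read does all the interesting work; here it renders a template-like
// input into a computed attribute.
func read(d *ResourceData) error {
	d.Attrs["rendered"] = "rendered: " + d.Attrs["template"]
	return nil
}

func main() {
	d := &ResourceData{Attrs: map[string]string{"template": "hello"}}
	if err := create(d); err != nil {
		panic(err)
	}
	fmt.Println(d.ID, d.Attrs["rendered"])
}
```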

Meanwhile, Terraform allows resource attributes to be interpolated into provider configurations, which is very handy when parts of the config are set dynamically from outside the configuration. For example:

resource "consul_keys" "deploy_config" {
    key {
        name = "deploy_region"
        path = "deploy_config/region"
    }
}
provider "aws" {
    region = "${consul_keys.deploy_config.var.deploy_region}"
}

This capability doesn't work quite right because provider configurations need to be processed in order to run terraform plan, but on the first run the consul_keys resource hasn't been created yet and so its state is empty. This works (after a fashion) if one adds the consul_keys resource first, runs terraform apply, and then adds the provider configuration using the result. But this sort of two-step configuration dance is troublesome when using deployment orchestration tooling.

This limitation could be addressed by allowing particular resources to opt-in to being pre-refreshed, which means that terraform refresh (and all of the implicit paths into the same functionality) treats them as immediately existing.

In other words, terraform refresh will call the Read implementation on such a resource even if there's no record of it in the state already. The Read implementation can then use the resource parameters from the configuration to populate any computed attributes just like it would for a pre-existing resource, making sure that the state is updated before we run terraform plan, and thus that the providers will get the values they are expecting.
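As a rough sketch of that refresh behaviour (plain Go with hypothetical types, not Terraform's actual internals):

```go
package main

import "fmt"

// resource models just the bits of state that matter here: whether the
// resource is already recorded in the state file, and whether it opts
// in to pre-refreshing.
type resource struct {
	name         string
	inState      bool
	preRefreshed bool
	wasRead      bool
}

// refresh normally reads only resources already present in the state;
// under this proposal, pre-refreshed resources are read unconditionally,
// so their computed attributes are populated before terraform plan runs.
func refresh(resources []*resource) {
	for _, r := range resources {
		if r.inState || r.preRefreshed {
			r.wasRead = true // stand-in for calling the Read implementation
			r.inState = true
		}
	}
}

func main() {
	rs := []*resource{
		{name: "consul_keys.deploy_config", preRefreshed: true},
		{name: "aws_instance.web"}, // not yet created, not pre-refreshed
	}
	refresh(rs)
	for _, r := range rs {
		fmt.Println(r.name, r.wasRead)
	}
}
```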

Of course, the providers are needed for terraform refresh too, so Terraform still needs to take care to order the individual resource refresh operations so that e.g. in my above example the AWS provider is not instantiated until the consul_keys resource has been refreshed.

With this concept in place, I think it would then be reasonable to make it a fatal error for a provider to depend on a non-pre-refreshed resource, giving the user better feedback about why this is not a sensible configuration.

Although my primary motivation is addressing the "interpolated provider config" use-case, this new concept has some other benefits:

  • This class of resource would not result in a confusing "create" diff appearing in the plan the first time they are added. If the configuration happens to differ from the current resource state it would still show up as an "update" diff, and if it matches then there would be no diff at all. This provides a more accurate diff to the user.
  • For resource types where "create" and "update" are not significantly different operations, e.g. because a unique id is part of the configuration (such as aws_s3_bucket_object), the resource could be marked as "pre-refreshed" and the Read implementation could check whether a same-named object already exists and record it into the state, causing it to show up as an update rather than a create in the diff. Again, this provides a more accurate diff to the user, and allows the user to make a better decision about whether to apply the plan.
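The diff behaviour described in these two points reduces to a small decision function (a hypothetical simplification, not Terraform's actual plan logic):

```go
package main

import "fmt"

// diffKind classifies what the plan would show for a pre-refreshed
// resource, given whether Read found an existing object and whether the
// configuration matches what it found.
func diffKind(foundExisting, configMatches bool) string {
	switch {
	case !foundExisting:
		return "create"
	case configMatches:
		return "none"
	default:
		return "update"
	}
}

func main() {
	fmt.Println(diffKind(false, false)) // nothing exists yet: create
	fmt.Println(diffKind(true, true))   // already matches config: no diff
	fmt.Println(diffKind(true, false))  // exists but differs: update
}
```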

Overall I think this concept better models a class of resources and helps Terraform to "do the right thing" in more cases, by allowing the provider to give it more information.

@apparentlymart mentioned this issue Aug 13, 2015
@phinze
Contributor

phinze commented Aug 13, 2015

Great write up, @apparentlymart!

This is a concept I've been kicking around the back of my head in various forms, and I think your model may just be the way to go.

I'm going to let this percolate a bit and I'll follow up.

@bitglue

bitglue commented Oct 23, 2015

Seems there's another detail necessary for this to work: having a provider configuration depend on a terraform_remote_state or similar quite frequently makes dependency cycles on terraform destroy. For example, I just got:

Error creating plan: 1 error(s) occurred:

* Cycle: aws_security_group.www_elb (destroy), aws_security_group.db (destroy), aws_security_group.www (destroy), provider.fleet, fleet_unit.www (destroy), aws_db_instance.main (destroy), terraform_remote_state.core (destroy), terraform_remote_state.core

It would seem the condition for this is that not only does a provider depend on the remote state, but so does some other resource that was created. But that's a pretty common scenario.

But since pre-refreshed resources shouldn't need to participate in the create process, they shouldn't need to participate in destroy, either. Thus breaking the cycle.
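That cycle-breaking idea can be sketched as a filter over the destroy graph (a hypothetical simplification of Terraform's graph building, not its real code):

```go
package main

import "fmt"

// node represents a resource in the destroy graph.
type node struct {
	name         string
	preRefreshed bool
}

// destroyNodes returns the nodes that actually need a destroy step.
// Pre-refreshed resources never created anything real, so they are
// omitted, which also removes them from any
// provider -> resource -> destroy cycle.
func destroyNodes(all []node) []node {
	var out []node
	for _, n := range all {
		if !n.preRefreshed {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	nodes := []node{
		{name: "terraform_remote_state.core", preRefreshed: true},
		{name: "aws_db_instance.main"},
	}
	for _, n := range destroyNodes(nodes) {
		fmt.Println(n.name) // only the real resource is destroyed
	}
}
```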

@apparentlymart
Contributor Author

Having played with my prototype in #3060 some more, my conception of this issue has changed slightly.

The concept of loading data to be used in the configuration seems distinct from the concept of creating something where another thing of the same name may already exist. Therefore I think there are two separate ideas here:

  • I think there's room for a new concept, distinct from resource, that belongs to a provider but only supports a "read" operation. I'd like to call this concept data_source. Data sources are provided by resource providers but have a different lifecycle where they only support a "read" operation, and that read operation is carried out only during the terraform refresh action and other cases where that's implied. (that is, not when -refresh=false is applied to plan or to the bare apply).
  • Certain resource types are able to compute an instance id using just the resource configuration, since it's just the value of one or more of the configuration properties. Terraform should be able to check for the existence of conflicting existing resources during plan so that the naming conflict error can be reported at plan time rather than at apply time. (This is different than my earlier proposal where I called for Terraform to effectively "adopt" the existing resource and update it to match the config. I've found in experimentation that this violates the principle of least surprise, and I'd rather it just be an error that gets caught earlier on.)

The first of these issues is a more pertinent concern to me right now, since my Terraform configs are getting a lot of these "data sources" as we use both terraform_remote_state and consul_keys extensively to allow particular configurations to be deployed in many different environments, with the data sources providing settings that are different in each case.

When I have more time to focus on this I intend to break this issue into two separate issues and then prototype the data_source concept in isolation, putting aside the "resources whose ids must be unique" problem for the moment. In the meantime I just wanted to note my intent here in case other folks come across this and wonder what's going on.
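A rough Go sketch of the narrower contract a data_source would satisfy, next to the full resource lifecycle (all names are hypothetical, and the fetched values are invented):

```go
package main

import "fmt"

// Resource is the full CRUD lifecycle that ordinary managed resources
// implement.
type Resource interface {
	Create() error
	Read() error
	Update() error
	Delete() error
}

// DataSource is the proposed narrower contract: read-only, evaluated
// during terraform refresh (and skipped when -refresh=false is given).
type DataSource interface {
	Read() error
}

// remoteState is a toy data source standing in for
// terraform_remote_state; a real implementation would fetch the remote
// state file here.
type remoteState struct {
	outputs map[string]string
}

func (r *remoteState) Read() error {
	r.outputs = map[string]string{"deploy_region": "us-west-2"} // fake data
	return nil
}

func main() {
	var ds DataSource = &remoteState{}
	if err := ds.Read(); err != nil {
		panic(err)
	}
	fmt.Println(ds.(*remoteState).outputs["deploy_region"])
}
```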

@bitglue

bitglue commented Oct 28, 2015

I would point out that we already have a mechanism for "loading data to be used in the configuration": terraform.tfvars. So perhaps if that were made to be more expressive it would be a natural place to do something like loading configuration from remote state on S3, or whatever.

This is basically how I've had to work around this issue anyway: I have a bunch of hairy scripts and rules in make which pull the outputs from one configuration and stuff them into the terraform.tfvars of the next configuration. It works pretty well except for the hair.

@mtougeron
Contributor

This functionality would be extremely useful for us for the scenario I described in #3657 (closed).

@sl1pm4t
Contributor

sl1pm4t commented Nov 3, 2015

+1

@apparentlymart
Contributor Author

As I previously mentioned, I'm splitting this into two separate proposals that each aim to solve half of the problem space presented in this ticket:

@ghost

ghost commented Apr 29, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 29, 2020