
kubernetes namespace already created fails #1406

Open
katlimruiz opened this issue Sep 14, 2021 · 36 comments

Comments

katlimruiz commented Sep 14, 2021

Terraform Version, Provider Version and Kubernetes Version

Terraform version:
Terraform v0.15.5
on windows_amd64
+ provider registry.terraform.io/cloudflare/cloudflare v2.21.0
+ provider registry.terraform.io/hashicorp/azuread v1.5.1
+ provider registry.terraform.io/hashicorp/azurerm v2.62.1
+ provider registry.terraform.io/hashicorp/helm v2.2.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.3.2
+ provider registry.terraform.io/hashicorp/local v2.1.0
+ provider registry.terraform.io/terraform-providers/azuredevops v0.1.5

Affected Resource(s)

kubernetes_namespace

Terraform Configuration Files

resource "kubernetes_namespace" "kubnss" {
for_each = toset(var.namespaces)
metadata {
name = each.key
}
}

Expected Behavior

If the namespace already exists, it should simply skip that resource and move on to the next one

Actual Behavior

Error: namespaces "xxxx-web-xxxxx" already exists

│ with module.production.module.kubernetes_web.kubernetes_namespace.kubnss["myns"],
│ on m/kubernetes/main.tf line 69, in resource "kubernetes_namespace" "kubnss":
│ 69: resource "kubernetes_namespace" "kubnss" {

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
katlimruiz added the bug label Sep 14, 2021
@curtbushko

Good day @katlimruiz,

I do not think this is a bug.

You asked terraform to create a resource and it failed because it was already created. The common practice is to import that already created resource into terraform state so that terraform can manage it.

Thanks!
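
A minimal sketch of such an import, using the full resource address from the error in the issue above (for kubernetes_namespace the import ID is the namespace name; "myns" is the placeholder instance key from that error):

terraform import 'module.production.module.kubernetes_web.kubernetes_namespace.kubnss["myns"]' myns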

@katlimruiz (Author)

However, Terraform's whole idea is to be declarative: if the object exists and there is no diff, it doesn't make sense to recreate it (hence no error). Every other object in TF works like this; why would a k8s namespace differ? Even the k8s cluster works like that. It is only this specific resource that does not.

curtbushko commented Oct 8, 2021

Are you positive that all terraform resources are ignored if they already exist?

I ask this because there have been dozens of times where we have broken the terraform state and I was forced to import buckets, databases, kubernetes services, etc before an apply worked correctly again...

@katlimruiz (Author)

If they are in the state and they already exist, then yes, all Terraform resources work like that; otherwise this would just never work.

This is my experience. When you already have a whole platform in place and you want to use terraform, then yes, it is a pain in the b*tt, because it tells you to import a lot of resources, and importing them is very slow and problematic.

When you create a platform from scratch in terraform, things go much, much easier. There are some objects, like a Kubernetes cluster, that do not communicate their full creation to the cloud provider, so (as even the documentation says) you have to separate cluster creation from applying further Kubernetes resources, otherwise it would not work.

The only times I've seen the state broken are when 1) you use git to store the state, or 2) you made changes to the cloud manually and discrepancies occurred.

curtbushko commented Oct 8, 2021

I definitely agree, getting everything that you already created into terraform state is a big pain in the b*tt! I would really like to see an --auto-import or --ignore-exists flag added to terraform.

One tool that I haven't tried yet but looks interesting is terraformer. It might help with those missing resources.

I've had two cases of major terraform state corruption:

  1. A bad version of the google cloud provider was released for one day. They changed a timestamp integer size for the BigQuery resource and then reverted it. I had to manually remove the timestamp field from the state to get it working, and after that I always lock down my terraform provider versions. :)
  2. Before kubernetes_manifest existed we used a provider called k8sraw that allows you to apply raw yaml files to kubernetes. It has been useful for Istio resources. It has bugs that sometimes corrupt the state :(

jbg commented Oct 12, 2021

@katlimruiz This is not a bug with the kubernetes provider. If you think terraform should magically do terraform import when an object doesn't exist in the state, that would be a feature request on terraform itself.


katlimruiz commented Oct 12, 2021 via email

jbg commented Oct 12, 2021

Did you confirm that the object was in the state after the first apply? Did the first apply complete successfully?

astove commented Dec 3, 2021

I'm seeing the same issue.

chengjianhua commented Jan 19, 2022

The same here, any progress?

@digihunch

An option should exist to bypass namespace creation if the namespace is found to already exist.
The behaviour of failing on an already-existing namespace renders the entire terraform template non-declarative.

@AxelJoly

Same thing happened to me just now.

@pauleikis

Same here. Basically we have to comment out the namespace.tf file after the first apply, otherwise it fails.

jbg commented Jan 25, 2022

Does your first apply fail? If the first apply is successful then the namespace is added to the state and creation of the resource shouldn't be attempted by tf the next time. On the other hand if something prevents the resource from being added to the state in the first apply then it will indeed try to create it again on the next run.

@digihunch

It appears that if the resource (e.g. a namespace) was created by the terraform provider, then it remains declarative: since it is registered in the tf state, a subsequent apply doesn't fail. However, if the resource had been created outside of terraform, then the kubernetes provider will fail on it.

jbg commented Jan 26, 2022

That is normal, and is just how Terraform works. If you create a resource outside of Terraform, then you need to import it into the state before you try to manage it with Terraform.

@mossad-zika

@jbg that's not true; helm_release, for example, doesn't work the same way. If I installed some chart bypassing terraform with helm install and then added it to terraform with no changes, terraform apply will do the job, no error, and this is how things must be

and this is not magic, it is programming, and there is a definition of the "declarative" manner, whether you like it or not

jbg commented Jul 21, 2022

That's not "how things must be". Read up on the design of terraform and why it uses state. The helm_release behaviour you describe is not how most terraform resources work. If you don't want it to be that way, then what you really want is a different tool.

@mossad-zika

@jbg you are literally the only one who thinks this open ticket shouldn't result in a fix. And since the ticket is not closed, apparently you are not in charge of this provider anyhow, so it would be nice if you would stop posting here about "magic" and other nonsense without even providing any links to back up your words

jbg commented Jul 21, 2022

Just trying to help you understand the TF model and why this won't be "fixed". Feel free to keep waiting though! All the best!

@mossad-zika

@jbg I don't need to wait for anything, because there is a properly working Kubernetes "kubectl" Provider

jbg commented Jul 21, 2022

That's an unfortunate side-effect of the kubectl provider calling kubectl apply internally, which doesn't differentiate between create and modify. It's not consistent with TF's provider contract (which is followed by the vast majority of providers; try creating a resource that already exists with any of the major TF providers and you'll see), and it causes problems if people didn't intend to overwrite existing objects with the same kind/namespace/name. The author of the kubectl provider considered fixing this behaviour in gavinbunney/terraform-provider-kubectl#73 (by changing to use kubectl create when TF asks the provider to create a resource); it seems they didn't get around to it, but hopefully it will be fixed in future.

TF's view of the world is the state. If the object doesn't exist in the state, the provider is asked to create it, and that operation is supposed to fail if the object already exists in the "real world". This is how almost every provider works, and it's the reason why the import operation exists. I'm curious what you think terraform import is for, if you think that the incorrect behaviour of the kubectl provider is "properly working".

oferchen commented Feb 17, 2023

I came across this issue as well, applied the following workaround. IMO this is not a bug and I also don't think a feature is justified for something that can be worked around.

data "kubernetes_all_namespaces" "allns" {}

resource "kubernetes_namespace" "this" {
  for_each = toset([ for k in var.namespaces : k if !contains(data.kubernetes_all_namespaces.allns.namespaces, k) ]) # allns.namespaces is the list of existing namespace names
  metadata {
    name = each.key
  }
  depends_on = [data.kubernetes_all_namespaces.allns] # potentially more if you want to refresh list of NS
}

d3vpasha commented Mar 9, 2023

@oferchen I don't agree with you. Terraform is a declarative language, and it is supposed to ignore a resource that already exists in the desired state. So an error because a resource exists goes against Terraform.

nsainaney commented Apr 1, 2023

It all comes down to the behaviour one would like. For instance:

kubectl create ns foo
kubectl create ns foo # <---- error


kubectl apply -f ns.yaml
kubectl apply -f ns.yaml # <---- no error

So it comes down to whether you want the create or the apply behaviour, but it appears that this provider only supports create. I was hoping to use the cleaner language features of terraform to deploy our applications; however, these limitations do not make it a good alternative to tools like helm or kustomize.

There doesn't seem to be a clean way in terraform to deploy your app to, say, develop and then again to staging without the staging apply destroying the develop app first.

jbg commented Apr 1, 2023

> It all comes down to the behaviour one would like. For instance:
>
> kubectl create ns foo
> kubectl create ns foo # <---- error
>
> kubectl apply -f ns.yaml
> kubectl apply -f ns.yaml # <---- no error
>
> So it comes down to whether you want the create or the apply behaviour, but it appears that this provider only supports create. I was hoping to use the cleaner language features of terraform to deploy our applications; however, these limitations do not make it a good alternative to tools like helm or kustomize.

The big difference between ad-hoc applying of yaml to your cluster and Terraform is Terraform's state. Learning about it will explain why it works the way it does, and why all correctly-written providers work the same way as this one. Try creating an S3 bucket that already exists with the AWS provider, for example.

Start here to learn about state in Terraform: https://developer.hashicorp.com/terraform/language/state

If a resource is added to your TF config and it does not exist in the state, it will be created. If you want to work with an existing object, you simply need to import it first. It's not a limitation, since it doesn't limit you; it merely forces you to be explicit, which is not a bad thing when you're managing production infra.

https://developer.hashicorp.com/terraform/cli/import

> There doesn't seem to be a clean way in terraform to deploy your app to, say, develop and then again to staging without the staging apply destroying the develop app first.

Of course there is, this is a common use case. If you want to duplicate your whole config, look into workspaces. If you want to duplicate only part of it, define that part in a module and then use the module twice in your config, with a variable for namespace and whatever else needs to vary.
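
A minimal sketch of that second-module-block pattern (the module path and the namespace variable name here are assumptions):

module "app_develop" {
  source    = "./modules/app"
  namespace = "develop"
}

module "app_staging" {
  source    = "./modules/app"
  namespace = "staging"
}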

nsainaney commented Apr 2, 2023

> There doesn't seem to be a clean way in terraform to deploy your app to, say, develop and then again to staging without the staging apply destroying the develop app first.
>
> Of course there is, this is a common use case. If you want to duplicate your whole config, look into workspaces. If you want to duplicate only part of it, define that part in a module and then use the module twice in your config, with a variable for namespace and whatever else needs to vary.

We provision the GKE cluster in the root module and deploy our app using a child module. We'd like to use the exact same module for all pre-production environments. Changing any variable in the child module (e.g. changing the namespace from develop to staging) triggers a destroy of the other environment (by design). We also tried to see if we could change the state storage based on variables, e.g.:

terraform {
  backend "gcs" {
    bucket = "my-app"
    prefix = var.app_environment == "develop" ? "terraform/develop-app-state" : "terraform/staging-app-state"
  }
}

works, but we'd like to do this for any feature branch, so the app environment is not deterministic. However, variables are not allowed in the backend configuration, so that seems to block us from reusing the same module to provision different kubernetes environments within namespaces.

I haven't explored workspaces so will check that out now.

@nsainaney

@jbg thank you for the RTFM tip. Just tried out workspaces and that was exactly what I was missing.

We have our terraform broken down into modules, so for anyone trying this out, be sure to add -chdir. What worked for me was:

terraform -chdir=gke init
terraform -chdir=app init
terraform -chdir=gke apply ....                                    # Sets up the GKE cluster
terraform -chdir=app workspace new <branch>
terraform -chdir=app apply ....                                    # Deploys the application for preview

jbg commented Apr 2, 2023

> We provision the GKE cluster in the root module and deploy our app using a child module. We'd like to use the exact same module for all pre-production environments. Changing any variable in the child module (e.g. changing the namespace from develop to staging) triggers a destroy of the other environment (by design).

If you want to keep develop around and add staging alongside it, you wouldn't change the namespace from develop to staging on your existing module block (which means develop no longer exists in your config and will be removed upon apply).

Instead, you would add a second module block (referencing the same source) with the different variable. Both module instances can use the same provider(s).

Or, as you've found, you can use workspaces to have multiple separate states in the same backend. This is more appropriate when you want your whole configuration to be duplicated N times.

@ashtonian

What about leaving this resource and behavior as is, later renaming it to kubernetes_namespace_create, while adding a kubernetes_namespace_apply with the desired adoption behavior, so that the user can take advantage of both?

I don't think there is currently a workaround to check whether a namespace exists before creating one with just the kubernetes provider?

jbg commented Jun 19, 2023

> What about leaving this resource and behavior as is, later renaming it to kubernetes_namespace_create, while adding a kubernetes_namespace_apply with the desired adoption behavior, so that the user can take advantage of both?

Terraform resources represent things, not operations.

> I don't think there is currently a workaround to check whether a namespace exists before creating one with just the kubernetes provider?

As of Terraform 1.5.0 you could use import blocks to do this without the separate import step. You do still have to know whether the NS exists in order to know whether you need the import block. I'm curious to better understand the use case where you don't know.
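
A sketch of such an import block, assuming the resource address from the original report (for kubernetes_namespace the import ID is the namespace name):

import {
  to = kubernetes_namespace.kubnss["myns"]
  id = "myns"
}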

joaocc commented Oct 25, 2023

@oferchen, regarding the workaround you proposed, can you please confirm that when you run apply a second time, it doesn't try to remove the namespace (and thus create/recreate it on every other execution), given that kubernetes_all_namespaces will start containing the new namespace once it is created by terraform?

joaocc commented Oct 25, 2023

Regarding the broader discussion, I feel that tools should focus on allowing people to do what they need to do. In some of the points above, it looks like the need to perform this simple operation is seen as less important than the purity of the terraform perspective on things, without providing a simple workaround for this simple but common operation.

On the points above, I would just like to point out that kubernetes_namespace already implements the "kubectl-as-command" logic (creating the same ns twice fails). However, kubernetes_manifest also implements the same logic (while kubectl apply of the same manifest twice succeeds).

In our case, we need to "create a namespace if one doesn't yet exist", as we are deploying terraform resources and gitops/flux resources, and sometimes some are deployed before others.
After spending a non-trivial amount of time, short of using kubectl to deploy a YAML resource for the namespace, we could not find how to do this in a simple manner in pure terraform.

So, using kubectl behaviour as the spec for this module, a possible solution would be to have:

  • kubernetes_namespace implement the imperative approach of kubectl create ns
  • kubernetes_manifest implement the declarative approach of kubectl apply

@oferchen

> @oferchen, regarding the workaround you proposed, can you please confirm that when you run apply a second time, it doesn't try to remove the namespace (and thus create/recreate it on every other execution), given that kubernetes_all_namespaces will start containing the new namespace once it is created by terraform?

The snippet was specifically designed to omit namespaces that already exist. While I do agree that this is an issue that should be mitigated by Terraform, this issue is already almost two years old and there has been no progress, so a workaround makes sense here.

joaocc commented Nov 14, 2023

Hi @oferchen. Would it make more sense to reopen the ticket as something more focused on adding the ability for the provider to support both the kubectl apply and kubectl create behaviours?
As above, one option could be to have kubernetes_manifest offer the kubectl apply behaviour and kubernetes_* offer the kubectl create model, but I understand this might cause confusion with existing code, so a new resource or a flag (see below) might be better suited.

In our case, we ended up having to shift to "alekc/terraform-provider-kubectl", where it behaves like apply (and I can select "apply_only"), as the risk of removing a lot of content from k8s, if we removed a terraform kubernetes_namespace (or one created via kubernetes_manifest) that had been created via kubectl_manifest, was simply too high.

Having a similar option on kubernetes_manifest would make life much easier.
Thx
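
For illustration, a minimal sketch of that alekc kubectl_manifest pattern with apply_only (the namespace name here is an assumed placeholder):

resource "kubectl_manifest" "ns" {
  # apply_only: only ever apply; never issue a delete for this object
  apply_only = true

  yaml_body = <<-YAML
    apiVersion: v1
    kind: Namespace
    metadata:
      name: myns
  YAML
}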

@oferchen

> Hi @oferchen. Would it make more sense to reopen the ticket as something more focused on adding the ability for the provider to support both the kubectl apply and kubectl create behaviours? As above, one option could be to have kubernetes_manifest offer the kubectl apply behaviour and kubernetes_* offer the kubectl create model, but I understand this might cause confusion with existing code, so a new resource or a flag (see below) might be better suited.
>
> In our case, we ended up having to shift to "alekc/terraform-provider-kubectl", where it behaves like apply (and I can select "apply_only"), as the risk of removing a lot of content from k8s, if we removed a terraform kubernetes_namespace (or one created via kubernetes_manifest) that had been created via kubectl_manifest, was simply too high.
>
> Having a similar option on kubernetes_manifest would make life much easier. Thx

@joaocc I think this provider behavior is inconsistent with how Terraform is supposed to behave.
Code is supposed to represent an end state and not an explicit operation, and there should be no distinction between kubectl create and kubectl apply. I don't think I should care what happens between the current state and the end state, as long as the operations happen.

kubernetes_manifest is an explicit yaml resource definition, so I don't think the example applies here.

Thanks
