prow-build-canary-cluster: provisioning scripts #5063

Merged: 13 commits, Apr 4, 2023
125 changes: 125 additions & 0 deletions infra/aws/terraform/prow-build-cluster/.terraform.lock.hcl

46 changes: 46 additions & 0 deletions infra/aws/terraform/prow-build-cluster/Makefile
@@ -0,0 +1,46 @@
# Copyright 2023 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

TF ?= terraform
ASSUME_ROLE ?= true

# Valid values are: canary, prod
PROW_CLUSTER ?= canary

.PHONY: init
init:
$(TF) init \
-backend-config=./tfbackends/$(PROW_CLUSTER).tfbackend

.PHONY: plan
plan:
$(TF) plan \
-var-file=./terraform.$(PROW_CLUSTER).tfvars \
-var="assume_role=$(ASSUME_ROLE)"

.PHONY: apply
apply:
$(TF) apply \
-var-file=./terraform.$(PROW_CLUSTER).tfvars \
-var="assume_role=$(ASSUME_ROLE)"

.PHONY: destroy
destroy:
$(TF) destroy \
-var-file=./terraform.$(PROW_CLUSTER).tfvars \
-var="assume_role=$(ASSUME_ROLE)"

.PHONY: clean
clean:
rm -rf ./.terraform
69 changes: 69 additions & 0 deletions infra/aws/terraform/prow-build-cluster/README.md
@@ -0,0 +1,69 @@
# Provisioning EKS clusters

## Prod vs Canary

These scripts support provisioning two types of EKS clusters. One is meant for hosting prow jobs
on production and the other one is for testing infrastructure changes before promoting them to
production.

Here are some differences between canary and production setups:
* cluster name,
* cluster admin IAM role name,
* secrets-manager IAM policy name,
* canary is missing k8s prow OIDC provider and corresponding role,
* subnet setup is different,
* instance type and autoscaling parameters (mainly to save costs).

## Provisioning Cluster

Running the installation from scratch differs from subsequent invocations of Terraform.
The first run creates a role that other users can assume later. Because of that, an additional
variable has to be set:

```bash
# For provisioning Prod:
export PROW_CLUSTER=prod
# For provisioning Canary:
export PROW_CLUSTER=canary

# Just making sure we don't have state cached locally.
make clean

ASSUME_ROLE=false make init
ASSUME_ROLE=false make apply
```
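
The Makefile derives two per-cluster files from `PROW_CLUSTER`: the backend config passed to
`init` and the var-file passed to `plan`, `apply` and `destroy`. A minimal sketch of checking
the value before invoking `make`, assuming a hypothetical helper `prow_cluster_files` that is
not part of these scripts:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR): resolve the per-cluster files
# that the Makefile targets pass to Terraform.
prow_cluster_files() {
  local cluster="${1:-canary}"   # same default as the Makefile
  case "$cluster" in
    canary|prod) ;;
    *)
      echo "unknown PROW_CLUSTER: $cluster (expected canary or prod)" >&2
      return 1
      ;;
  esac
  echo "./tfbackends/${cluster}.tfbackend"   # used by `make init`
  echo "./terraform.${cluster}.tfvars"       # used by plan/apply/destroy
}
```

Calling `prow_cluster_files "$PROW_CLUSTER"` before `make init` rejects a mistyped cluster
name early instead of letting Terraform fail on a missing file.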

Once the infrastructure is provisioned, the next step is RBAC setup:

```bash
# Fetch & update kubeconfig.
# For Prod:
aws eks update-kubeconfig --region us-east-2 --name prow-build-cluster
# For Canary:
aws eks update-kubeconfig --region us-east-2 --name prow-build-canary-cluster

# create cluster role bindings
kubectl apply -f ./resources/rbac
```

Lastly, run Terraform again without the additional variable. This time, it implicitly assumes
the previously created role and provisions resources on top of the EKS cluster.

```bash
make apply
```

From here on, all subsequent runs should work with the command above.
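
The whole first-time sequence can be summarized in one place. The sketch below only prints
the commands in order so they can be reviewed before anything runs; `bootstrap_plan` is a
hypothetical name, and the cluster names are the ones used in the kubeconfig step above:

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the first-time provisioning commands in order.
# It does not execute anything.
bootstrap_plan() {
  local cluster="${1:-canary}" name
  case "$cluster" in
    prod)   name="prow-build-cluster" ;;
    canary) name="prow-build-canary-cluster" ;;
    *) echo "unknown cluster: $cluster" >&2; return 1 ;;
  esac
  cat <<EOF
export PROW_CLUSTER=${cluster}
make clean
ASSUME_ROLE=false make init
ASSUME_ROLE=false make apply
aws eks update-kubeconfig --region us-east-2 --name ${name}
kubectl apply -f ./resources/rbac
make apply
EOF
}
```

For example, `bootstrap_plan prod` lists the seven steps for the production cluster.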

## Removing cluster

As with installation, cluster removal requires running Terraform twice.
**IMPORTANT**: This is possible only for users with the `AdministratorAccess` policy assigned.

```bash
# First remove the resources running on the cluster and the IAM role. This fails once the assumed role gets deleted.
make destroy

# Clean up the rest.
ASSUME_ROLE=false make destroy
```
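
Because the first pass is expected to end with an error, a wrapper has to tolerate that
failure and still run the second pass. A minimal sketch, assuming a hypothetical function
`destroy_cluster` and a hypothetical `MAKE_CMD` override (neither is part of these scripts):

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the two-pass teardown. MAKE_CMD is an assumed
# override so the flow can be exercised without touching real infrastructure.
destroy_cluster() {
  local make_cmd="${MAKE_CMD:-make}"
  # Pass 1: runs with the assumed role and is expected to fail once
  # Terraform deletes that role, so a non-zero exit is tolerated here.
  "$make_cmd" destroy || echo "first pass ended with an error (expected); continuing" >&2
  # Pass 2: runs without assuming the role; its exit status is returned.
  ASSUME_ROLE=false "$make_cmd" destroy
}
```

This sketch treats any failure of the first pass as the expected one, so it is still worth
reading the Terraform output before relying on the second pass.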
54 changes: 26 additions & 28 deletions infra/aws/terraform/prow-build-cluster/eks.tf
@@ -18,6 +18,31 @@ limitations under the License.
# EKS Cluster
###############################################

locals {
aws_auth_roles_base = [
# Allow access to the Prow-Cluster-Admin IAM role (used with assume role with other IAM accounts).
{
"rolearn" = aws_iam_role.iam_cluster_admin.arn
"username" = "eks-cluster-admin"
"groups" = [
"eks-cluster-admin"
]
},
]

aws_auth_roles = var.is_canary_installation ? local.aws_auth_roles_base : concat(
local.aws_auth_roles_base, [
# Allow access to the Prow-EKS-Admin IAM role (used by Prow directly).
{
"rolearn" = aws_iam_role.eks_admin[0].arn
"username" = "eks-admin"
"groups" = [
"eks-prow-cluster-admin"
]
}
])
}

module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "19.10.0"
@@ -31,34 +56,7 @@ module "eks" {
manage_aws_auth_configmap = true

# Configure aws-auth
- aws_auth_roles = [
- # Allow access to the Prow-EKS-Admin IAM role (used by Prow directly).
- {
- "rolearn" = aws_iam_role.eks_admin.arn
- "username" = "eks-admin"
- "groups" = [
- "eks-prow-cluster-admin"
- ]
- },
- # Allow access to the Prow-Cluster-Admin IAM role (used with assume role with other IAM accounts).
- {
- "rolearn" = aws_iam_role.iam_cluster_admin.arn
- "username" = "eks-cluster-admin"
- "groups" = [
- "eks-cluster-admin"
- ]
- },
- ]
- # Allow EKS access to the root account.
- aws_auth_users = [
- {
- "userarn" = local.root_account_arn
- "username" = "root"
- "groups" = [
- "eks-cluster-admin"
- ]
- },
- ]
+ aws_auth_roles = local.aws_auth_roles

# Allow access to the KMS key used for secrets encryption to the root account.
kms_key_administrators = [
4 changes: 2 additions & 2 deletions infra/aws/terraform/prow-build-cluster/iam.tf
@@ -26,8 +26,8 @@ data "aws_iam_user" "user_pprzekwa" {
}

resource "aws_iam_role" "iam_cluster_admin" {
- name = "Prow-Cluster-Admin"
- description = "IAM role used to delegate access to prow-build-cluster"
+ name = "${local.canary_prefix}Prow-Cluster-Admin"
+ description = "IAM role used to delegate access to ${local.canary_prefix}prow-build-cluster"

assume_role_policy = jsonencode({
Version = "2012-10-17"
10 changes: 10 additions & 0 deletions infra/aws/terraform/prow-build-cluster/kubernetes.tf
@@ -15,6 +15,8 @@ limitations under the License.
*/

module "cluster_autoscaler" {
count = var.assume_role ? 1 : 0

source = "./modules/cluster-autoscaler"
providers = {
kubernetes = kubernetes
@@ -30,6 +32,8 @@ module "cluster_autoscaler" {
}

module "metrics_server" {
count = var.assume_role ? 1 : 0

source = "./modules/metrics-server"
providers = {
kubernetes = kubernetes
@@ -42,6 +46,8 @@ module "metrics_server" {

# AWS Load Balancer Controller (ALB/NLB integration).
resource "helm_release" "aws_lb_controller" {
count = var.assume_role ? 1 : 0

name = "aws-load-balancer-controller"
namespace = "kube-system"
repository = "https://aws.github.io/eks-charts"
@@ -75,6 +81,8 @@ resource "helm_release" "aws_lb_controller" {

# AWS Secrets Manager integration
resource "helm_release" "secrets_store_csi_driver" {
count = var.assume_role ? 1 : 0

name = "secrets-store-csi-driver"
namespace = "kube-system"
repository = "https://kubernetes-sigs.github.io/secrets-store-csi-driver/charts"
@@ -87,6 +95,8 @@ resource "helm_release" "secrets_store_csi_driver" {
}

resource "helm_release" "secrets_store_csi_driver_provider_aws" {
count = var.assume_role ? 1 : 0

name = "aws-secrets-manager"
namespace = "kube-system"
repository = "https://aws.github.io/secrets-store-csi-driver-provider-aws"