
Feat: Terraform init #171

Open
wants to merge 66 commits into main

Conversation

@Bahugunajii Bahugunajii commented Apr 29, 2024

Description

  • Terraform initialization

Checklist

  • Self testing

@Bahugunajii Bahugunajii self-assigned this Apr 29, 2024

echo "Starting user_data script execution"

# No need for 'sudo su' when running as user data, the script runs with root privileges by default

Here you wrote that there's no need for sudo, but all the commands below are still using sudo :P

@Bahugunajii Bahugunajii requested a review from deepansh96 January 2, 2025 09:49
Comment on lines +40 to +54
"traces": {
"buffer_size_mb": 3,
"concurrency": 8,
"insecure": false,
"region_override": "ap-south-1",
"traces_collected": {
"xray": {
"bind_address": "127.0.0.1:2000",
"tcp_proxy": {
"bind_address": "127.0.0.1:2000"
}
}
}
}
}

We should remove the X-Ray tracing feature from tracking. It's mostly used for microservices, and it'll cost us money.

Comment on lines +24 to +25
shared_config_files = [data.dotenv.env_file.env["AWS_CONFIG_FILE"]] # AWS config file path
shared_credentials_files = [data.dotenv.env_file.env["AWS_CREDENTIALS_FILE"]] # AWS credentials file path

It's trying to look for AWS_CONFIG_FILE and AWS_CREDENTIALS_FILE in the env file, but those two are not present in the .env.example file?
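If we do keep these, .env.example would need entries along these lines (the values here are just the usual AWS CLI default paths, as placeholders):

AWS_CONFIG_FILE=~/.aws/config
AWS_CREDENTIALS_FILE=~/.aws/credentials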


cloudflare = {
  source  = "cloudflare/cloudflare" # Official Cloudflare provider
  version = "~> 4.0"                # Uses 4.x version

We should update it. The latest one is 5.0.0: https://registry.terraform.io/providers/cloudflare/cloudflare/latest/docs
The 4.x versions are not recommended; the docs say they might contain bugs.

When this is updated, the block

resource "cloudflare_record" "cdn_cname" {
  zone_id = data.dotenv.env_file.env["CLOUDFLARE_ZONE_ID"]      # The Cloudflare Zone ID from .env file
  name    = data.dotenv.env_file.env["CLOUDFLARE_CNAME"]        # The CNAME record name/subdomain from .env file
  value   = aws_cloudfront_distribution.backend_cdn.domain_name # Points to the CloudFront distribution domain
  type    = "CNAME"                                             # Specifies this is a CNAME record type
  proxied = false                                               # Disables Cloudflare proxying/CDN for this record
}

will also change a little. You can check the Terraform docs for the latest syntax.
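Roughly, under the 5.x provider it would become something like this (just a sketch; double-check the attribute names against the 5.x docs):

resource "cloudflare_dns_record" "cdn_cname" {
  zone_id = data.dotenv.env_file.env["CLOUDFLARE_ZONE_ID"]
  name    = data.dotenv.env_file.env["CLOUDFLARE_CNAME"]
  content = aws_cloudfront_distribution.backend_cdn.domain_name # "value" is renamed to "content" in 5.x
  type    = "CNAME"
  ttl     = 1                                                   # 1 = automatic TTL
  proxied = false
}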

  type = string
}

variable "environments" {

Let's call this environmentSpecificConfig, so we'd use it like this:
var.environmentSpecificConfig[local.environment]
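For example, something like this (the object fields are hypothetical; keep whatever the current environments variable actually holds):

variable "environmentSpecificConfig" {
  type = map(object({
    instance_type    = string
    desired_capacity = number
  }))
}

locals {
  env_config = var.environmentSpecificConfig[local.environment] # pick the block for the current environment
}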

connection {
  type        = "ssh"
  user        = "ubuntu"
  private_key = file("C:/Users/amanb/.ssh/AvantiFellows.pem")

Use the PEM_FILE_PATH env variable here
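Assuming PEM_FILE_PATH is added to the .env file and read through the same dotenv data source used elsewhere, that would look like:

connection {
  type        = "ssh"
  user        = "ubuntu"
  private_key = file(data.dotenv.env_file.env["PEM_FILE_PATH"]) # path comes from .env instead of a hardcoded local path
}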

}

provisioner "local-exec" {
command = "aws ec2 stop-instances --instance-ids ${self.id} --region ap-south-1"

Force aws_default_profile here. See related comment.
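Something along these lines (aws_default_profile here is assumed to be a variable or local defined per that related comment):

provisioner "local-exec" {
  command = "aws ec2 stop-instances --instance-ids ${self.id} --region ap-south-1 --profile ${var.aws_default_profile}"
}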

No need for sudo anywhere in this file. The user data script always runs with root access.

#!/bin/bash

# Specify the log file
LOG_FILE="/var/log/user_data.log"

This is not needed. You're already passing the LOG_FILE variable in from outside, in the ec2-with-asg.tf file.

user_data = base64encode(templatefile("user_data.sh.tpl", { # Bootstrap script with environment variables
    # Template variables
    LOG_FILE              = "/var/log/user_data.log"
    BRANCH_NAME_TO_DEPLOY = data.dotenv.env_file.env["BRANCH_NAME_TO_DEPLOY"]
    TARGET_GROUP_NAME     = aws_lb_target_group.alb_tg.name
    environment_prefix    = local.environment_prefix
    DATABASE_URL          = data.dotenv.env_file.env["DATABASE_URL"]
    SECRET_KEY_BASE       = data.dotenv.env_file.env["SECRET_KEY_BASE"]
    POOL_SIZE             = data.dotenv.env_file.env["POOL_SIZE"]
    BEARER_TOKEN          = data.dotenv.env_file.env["BEARER_TOKEN"]
    PORT                  = data.dotenv.env_file.env["PORT"]
  }))

So I think we should change the way we deploy changes a little bit. I'm not saying what you did is wrong; I did the same thing in the quiz backend. But while I was working on etl-next, a better way to do this came up.

What we're doing currently:

  • The GitHub Action runs.
  • It sets some environment variables and runs the runOnGithubAction script.
  • In runOnGithubAction, we try to SSH into the bastion instance, transfer some env files there, and then run runOnBastion inside the bastion.
  • The runOnBastion script goes through all the EC2 instances, sends them the new env files, and restarts the server on each.

This whole process is not wrong, but it is a little convoluted and complex. There is also the issue of maintaining two versions of the server start code: one in user_data and the other in the runOnBastion file.

Better solution:
Assumption: what if, when we simply reboot an instance, everything gets set up on its own (all libraries, folders, the GitHub pull, etc.)? Assuming that:

  • The GitHub workflow runs.
  • It checks whether any of the EC2 instances are running and stops the running ones.
  • It updates/exports the env variables in the user data of those EC2 instances.
  • It restarts the instances.

And that's it, deployment is done. Those instances will now run their own setup steps.

So what needs to change for this?

  1. user_data file
Content-Type: multipart/mixed; boundary="//"
MIME-Version: 1.0

--//
Content-Type: text/cloud-config; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="cloud-config.txt"

#cloud-config
cloud_final_modules:
- [scripts-user, always]

--//
Content-Type: text/x-shellscript; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="userdata.txt"

#!/bin/bash
# --- YOUR SCRIPT HERE ---

--//

This text needs to be added to the user_data file. It will set up the script to run on every reboot.
Along with this, you need to rethink what the script does when it is re-run on every reboot. For example, we check if the GitHub repo is already there: if it is not, we clone it; if it is, we do a git pull. If the CloudWatch agent is already present, we don't re-download it. Stuff like that (there's a small sketch of these checks after this list).
You can see an example of such a file here.

  2. GitHub Action code: when we push to GitHub, the GitHub Action workflow should read the user_data.sh.tpl file, update it by replacing all variable names with the proper env variable values, and update the EC2 instances with this new user data. I think we'll have to update the launch template of the auto scaling group in this case, since that is what spins up instances. So we set the desired instances to 0 so all instances are shut down, update the user_data in the launch template, and then reset the desired instances to 2, for example (see the rough CLI sketch after this list).
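For item 1, a minimal sketch of the kind of idempotent checks I mean (the app path and repo URL are placeholders):

#!/bin/bash
APP_DIR="/home/ubuntu/app"                                      # placeholder path
if [ -d "$APP_DIR/.git" ]; then
  git -C "$APP_DIR" pull                                        # repo already there: just pull
else
  git clone "https://github.com/<org>/<repo>.git" "$APP_DIR"    # first boot: clone (placeholder URL)
fi

# Only set up the CloudWatch agent on first boot
if ! dpkg -s amazon-cloudwatch-agent >/dev/null 2>&1; then
  : # download + install the agent here; skipped on later reboots
fi

And for item 2, the workflow step could look roughly like this (the launch template / ASG names and the desired capacity of 2 are placeholders, and it assumes the ASG points at the $Latest launch template version):

# Render user_data.sh.tpl with the real env values first, then:
NEW_USER_DATA=$(base64 -w0 rendered_user_data.sh)

# Publish a new launch template version carrying the updated user data
aws ec2 create-launch-template-version \
  --launch-template-name backend-lt \
  --source-version '$Latest' \
  --launch-template-data "{\"UserData\": \"$NEW_USER_DATA\"}"

# Scale the ASG to 0 so old instances go away, then back up so new ones boot with the new user data
aws autoscaling update-auto-scaling-group --auto-scaling-group-name backend-asg --desired-capacity 0
# ...wait for the old instances to terminate...
aws autoscaling update-auto-scaling-group --auto-scaling-group-name backend-asg --desired-capacity 2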
