feat: CDI-2179 - Add databricks-workspace-e2 module #529

Merged
merged 1 commit into from
Oct 31, 2023
64 changes: 64 additions & 0 deletions databricks-workspace-e2/README.md
@@ -0,0 +1,64 @@
## References
* The provider docs are [here](https://databrickslabs.github.io/terraform-provider-databricks/overview/).

<!-- START -->
## Requirements

| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 0.13 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | n/a |
| <a name="provider_databricks"></a> [databricks](#provider\_databricks) | n/a |

## Modules

| Name | Source | Version |
|------|--------|---------|
| <a name="module_databricks_bucket"></a> [databricks\_bucket](#module\_databricks\_bucket) | github.com/chanzuckerberg/cztack//aws-s3-private-bucket | v0.60.1 |

## Resources

| Name | Type |
|------|------|
| [aws_iam_role.databricks](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role) | resource |
| [aws_iam_role_policy.policy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role_policy) | resource |
| [aws_security_group.databricks](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/security_group) | resource |
| [databricks_mws_credentials.databricks](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_credentials) | resource |
| [databricks_mws_networks.networking](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_networks) | resource |
| [databricks_mws_storage_configurations.databricks](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_storage_configurations) | resource |
| [databricks_mws_workspaces.databricks](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_workspaces) | resource |
| [aws_caller_identity.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/caller_identity) | data source |
| [aws_iam_policy_document.databricks-s3](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_iam_policy_document.databricks-setup-assume-role](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_iam_policy_document.policy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_region.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/region) | data source |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_audit_log_bucket_name"></a> [audit\_log\_bucket\_name](#input\_audit\_log\_bucket\_name) | Name of the bucket that receives both cluster logs and audit logs. | `string` | `"czi-audit-logs"` | no |
| <a name="input_databricks_external_id"></a> [databricks\_external\_id](#input\_databricks\_external\_id) | The ID of a Databricks root account. | `string` | n/a | yes |
| <a name="input_env"></a> [env](#input\_env) | The environment / stage, e.g. staging, dev, prod. | `string` | n/a | yes |
| <a name="input_object_ownership"></a> [object\_ownership](#input\_object\_ownership) | Set default owner of all objects within bucket (e.g., bucket vs. object owner) | `string` | `null` | no |
| <a name="input_owner"></a> [owner](#input\_owner) | n/a | `string` | n/a | yes |
| <a name="input_passable_role_arn"></a> [passable\_role\_arn](#input\_passable\_role\_arn) | A role to allow the cross-account role to pass to other accounts | `string` | `""` | no |
| <a name="input_private_subnets"></a> [private\_subnets](#input\_private\_subnets) | List of private subnets. | `list(string)` | n/a | yes |
| <a name="input_project"></a> [project](#input\_project) | A high level name, typically the name of the site. | `string` | n/a | yes |
| <a name="input_service"></a> [service](#input\_service) | The service. Aka databricks-workspace. | `string` | n/a | yes |
| <a name="input_vpc_id"></a> [vpc\_id](#input\_vpc\_id) | ID of the VPC. | `string` | n/a | yes |
| <a name="input_workspace_name_override"></a> [workspace\_name\_override](#input\_workspace\_name\_override) | Override the workspace name. If not set, the workspace name will be set to the project, env, and service. | `string` | `null` | no |

## Outputs

| Name | Description |
|------|-------------|
| <a name="output_role_arn"></a> [role\_arn](#output\_role\_arn) | ARN of the AWS IAM role. |
| <a name="output_workspace_id"></a> [workspace\_id](#output\_workspace\_id) | ID of the workspace. |
| <a name="output_workspace_url"></a> [workspace\_url](#output\_workspace\_url) | URL of the deployed workspace. |
<!-- END -->
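
A minimal sketch of calling this module, based only on the inputs and outputs listed above. It assumes the module is consumed from a relative path and that the caller has already configured the `aws` and `databricks` providers. The module label, the `databricks_account_id` variable, and every literal value are hypothetical placeholders, not values taken from this PR.

```hcl
# Hypothetical caller-side variable holding the Databricks account ID.
variable "databricks_account_id" {
  type        = string
  description = "Placeholder: ID of the Databricks root account."
}

module "databricks_workspace" {
  # Assumed relative path; point this at wherever the module actually lives.
  source = "../databricks-workspace-e2"

  # Required tagging / naming inputs (placeholder values).
  project = "example-project"
  env     = "dev"
  service = "databricks-workspace"
  owner   = "team@example.com"

  # Required networking inputs (placeholder IDs).
  vpc_id          = "vpc-0123456789abcdef0"
  private_subnets = ["subnet-0123456789abcdef0", "subnet-0fedcba9876543210"]

  # Required: the Databricks root account ID.
  databricks_external_id = var.databricks_account_id
}

# Surface one of the documented module outputs.
output "databricks_workspace_url" {
  value = module.databricks_workspace.workspace_url
}
```

Optional inputs such as `workspace_name_override` and `passable_role_arn` are left at their defaults in this sketch.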
282 changes: 282 additions & 0 deletions databricks-workspace-e2/aws_iam_role.tf
@@ -0,0 +1,282 @@
locals {
  cluster_log_bucket_prefix = "databricks-cluster-logs"
}

data "aws_iam_policy_document" "databricks-setup-assume-role" {
  statement {
    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${local.databricks_aws_account}:root"]
    }

    actions = ["sts:AssumeRole"]
    condition {
      test     = "StringLike"
      variable = "sts:ExternalId"
      values   = [var.databricks_external_id]
    }
  }
}

resource "aws_iam_role" "databricks" {
  name               = local.name
  assume_role_policy = data.aws_iam_policy_document.databricks-setup-assume-role.json
  tags               = local.tags
}

data "aws_iam_policy_document" "policy" {
  statement {
    sid = "NonResourceBasedPermissions"
    actions = [
      "ec2:CancelSpotInstanceRequests",
      "ec2:DescribeAvailabilityZones",
      "ec2:DescribeIamInstanceProfileAssociations",
      "ec2:DescribeInstanceStatus",
      "ec2:DescribeInstances",
      "ec2:DescribeInternetGateways",
      "ec2:DescribeNatGateways",
      "ec2:DescribeNetworkAcls",
      "ec2:DescribePlacementGroups",
      "ec2:DescribePrefixLists",
      "ec2:DescribeReservedInstancesOfferings",
      "ec2:DescribeRouteTables",
      "ec2:DescribeSecurityGroups",
      "ec2:DescribeSpotInstanceRequests",
      "ec2:DescribeSpotPriceHistory",
      "ec2:DescribeSubnets",
      "ec2:DescribeVolumes",
      "ec2:DescribeVpcAttribute",
      "ec2:DescribeVpcs",
      "ec2:CreatePlacementGroup",
      "ec2:DeletePlacementGroup",
      "ec2:CreateKeyPair",
      "ec2:DeleteKeyPair",
      "ec2:CreateTags",
      "ec2:DeleteTags",
      "ec2:RequestSpotInstances",
    ]
    resources = ["*"]
    effect    = "Allow"
  }

  statement {
    effect    = "Allow"
    actions   = ["iam:PassRole"]
    resources = ["arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/databricks/*"]
  }

  dynamic "statement" {
    for_each = length(var.passable_role_arn) > 0 ? [1] : []

    content {
      actions = [
        "iam:PassRole"
      ]
      resources = [
        var.passable_role_arn
      ]
    }
  }

  statement {
    sid = "InstancePoolsSupport"
    actions = [
      "ec2:AssociateIamInstanceProfile",
      "ec2:DisassociateIamInstanceProfile",
      "ec2:ReplaceIamInstanceProfileAssociation",
    ]

    resources = ["${local.ec2_arn_base}:instance/*"]

    condition {
      test     = "StringEquals"
      variable = "ec2:ResourceTag/Vendor"
      values   = ["Databricks"]
    }
  }

  statement {
    sid = "AllowEc2RunInstancePerTag"
    actions = [
      "ec2:RunInstances",
    ]

    resources = [
      "${local.ec2_arn_base}:instance/*",
      "${local.ec2_arn_base}:volume/*",
    ]

    condition {
      test     = "StringEquals"
      variable = "aws:RequestTag/Vendor"
      values   = ["Databricks"]
    }
  }

  statement {
    sid = "AllowEc2RunInstanceImagePerTag"
    actions = [
      "ec2:RunInstances",
    ]

    resources = [
      "${local.ec2_arn_base}:image/*",
    ]

    condition {
      test     = "StringEquals"
      variable = "aws:ResourceTag/Vendor"
      values   = ["Databricks"]
    }
  }

  statement {
    sid = "AllowEc2RunInstancePerVPCid"
    actions = [
      "ec2:RunInstances",
    ]

    resources = [
      "${local.ec2_arn_base}:network-interface/*",
      "${local.ec2_arn_base}:subnet/*",
      "${local.ec2_arn_base}:security-group/*",
    ]

    condition {
      test     = "StringEquals"
      variable = "ec2:vpc"
      values   = ["${local.ec2_arn_base}:vpc/${var.vpc_id}"]
    }
  }

  statement {
    sid = "AllowEc2RunInstanceOtherResources"
    actions = [
      "ec2:RunInstances",
    ]

    not_resources = [
      "${local.ec2_arn_base}:image/*",
      "${local.ec2_arn_base}:network-interface/*",
      "${local.ec2_arn_base}:subnet/*",
      "${local.ec2_arn_base}:security-group/*",
      "${local.ec2_arn_base}:volume/*",
      "${local.ec2_arn_base}:instance/*"
    ]
  }

  statement {
    sid = "EC2TerminateInstancesTag"
    actions = [
      "ec2:TerminateInstances",
    ]

    resources = [
      "${local.ec2_arn_base}:instance/*",
    ]

    condition {
      test     = "StringEquals"
      variable = "ec2:ResourceTag/Vendor"
      values   = ["Databricks"]
    }
  }

  statement {
    sid = "EC2AttachDetachVolumeTag"
    actions = [
      "ec2:AttachVolume",
      "ec2:DetachVolume",
    ]

    resources = [
      "${local.ec2_arn_base}:instance/*",
      "${local.ec2_arn_base}:volume/*",
    ]

    condition {
      test     = "StringEquals"
      variable = "ec2:ResourceTag/Vendor"
      values   = ["Databricks"]
    }
  }

  statement {
    sid = "EC2CreateVolumeByTag"
    actions = [
      "ec2:CreateVolume",
    ]

    resources = [
      "${local.ec2_arn_base}:volume/*",
    ]

    condition {
      test     = "StringEquals"
      variable = "aws:RequestTag/Vendor"
      values   = ["Databricks"]
    }
  }

  statement {
    sid = "EC2DeleteVolumeByTag"
    actions = [
      "ec2:DeleteVolume",
    ]

    resources = [
      "${local.ec2_arn_base}:volume/*",
    ]

    condition {
      test     = "StringEquals"
      variable = "ec2:ResourceTag/Vendor"
      values   = ["Databricks"]
    }
  }

  statement {
    actions = [
      "iam:CreateServiceLinkedRole",
      "iam:PutRolePolicy",
    ]

    resources = [
      "arn:aws:iam::*:role/aws-service-role/spot.amazonaws.com/AWSServiceRoleForEC2Spot",
    ]

    condition {
      test     = "StringLike"
      variable = "iam:AWSServiceName"
      values   = ["spot.amazonaws.com"]
    }

    effect = "Allow"
  }

  statement {
    sid = "VpcNonresourceSpecificActions"
    actions = [
      "ec2:AuthorizeSecurityGroupEgress",
      "ec2:AuthorizeSecurityGroupIngress",
      "ec2:RevokeSecurityGroupEgress",
      "ec2:RevokeSecurityGroupIngress",
    ]

    resources = [
      "${local.ec2_arn_base}:security-group/${aws_security_group.databricks.id}",
    ]

    condition {
      test     = "StringEquals"
      variable = "ec2:vpc"
      values   = ["${local.ec2_arn_base}:vpc/${var.vpc_id}"]
    }
  }
}

resource "aws_iam_role_policy" "policy" {
  name   = "extras"
  role   = aws_iam_role.databricks.id
  policy = data.aws_iam_policy_document.policy.json
}
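
The policy above references `local.databricks_aws_account`, `local.name`, `local.tags`, and `local.ec2_arn_base`, none of which are defined in the files shown here. The following is only a sketch of what such locals might look like, not the module's actual definitions. The account ID is the one Databricks documents for its AWS cross-account trust and should be verified against current Databricks documentation. The name format is inferred from the `workspace_name_override` description, and the tag keys are assumptions.

```hcl
# These data sources are listed in the README; they are repeated here only so
# the sketch reads on its own.
data "aws_caller_identity" "current" {}
data "aws_region" "current" {}

locals {
  # Assumption: the AWS account Databricks uses to assume the cross-account role.
  databricks_aws_account = "414351767826"

  # Assumption: name built from project, env, and service.
  name = "${var.project}-${var.env}-${var.service}"

  # Assumption: common prefix for the EC2 ARNs used in the policy statements,
  # e.g. arn:aws:ec2:<region>:<account-id>
  ec2_arn_base = "arn:aws:ec2:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}"

  # Assumption: hypothetical tag keys derived from the module inputs.
  tags = {
    project = var.project
    env     = var.env
    service = var.service
    owner   = var.owner
  }
}
```

With locals shaped like this, a resource such as `"${local.ec2_arn_base}:instance/*"` expands to an ARN scoped to the current region and account.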
33 changes: 33 additions & 0 deletions databricks-workspace-e2/bucket.tf
@@ -0,0 +1,33 @@
data "aws_iam_policy_document" "databricks-s3" {
  statement {
    sid    = "grant databricks access"
    effect = "Allow"
    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${local.databricks_aws_account}:root"]
    }
    actions = [
      "s3:GetObject",
      "s3:GetObjectVersion",
      "s3:PutObject",
      "s3:DeleteObject",
      "s3:ListBucket",
      "s3:GetBucketLocation",
    ]
    resources = [
      "arn:aws:s3:::${local.name}/*",
      "arn:aws:s3:::${local.name}",
    ]
  }
}

module "databricks_bucket" {
  source           = "github.com/chanzuckerberg/cztack//aws-s3-private-bucket?ref=v0.60.1"
  bucket_name      = local.name
  bucket_policy    = data.aws_iam_policy_document.databricks-s3.json
  project          = var.project
  env              = var.env
  service          = var.service
  owner            = var.owner
  object_ownership = var.object_ownership
}
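
The README lists `databricks_mws_credentials.databricks` and `databricks_mws_storage_configurations.databricks` among the module's resources, but their definitions are not part of the files shown here. As a rough sketch only, this is how the IAM role and the bucket above could be registered with the Databricks account. The argument values, including using `var.databricks_external_id` as the account ID and `local.name` for the configuration names, are assumptions rather than the PR's actual code.

```hcl
# Sketch: register the cross-account IAM role with the Databricks account.
resource "databricks_mws_credentials" "databricks" {
  account_id       = var.databricks_external_id # assumption: external ID doubles as the account ID
  credentials_name = local.name                 # assumption: reuse the shared name
  role_arn         = aws_iam_role.databricks.arn

  # Assumption: wait for the inline policy so the role is usable when validated.
  depends_on = [aws_iam_role_policy.policy]
}

# Sketch: register the root bucket created by module.databricks_bucket.
resource "databricks_mws_storage_configurations" "databricks" {
  account_id                 = var.databricks_external_id
  storage_configuration_name = local.name
  bucket_name                = local.name # the bucket module above is given this same name
}
```

Both resources would then typically be referenced from `databricks_mws_workspaces` through their credentials and storage configuration IDs.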