This repository contains code to support this blog https://aws.amazon.com/blogs/storage/choosing-the-right-storage-for-cloud-native-ci-cd-on-amazon-elastic-kubernetes-service-eks/.
You should start by reading the blog to understand the context!
You can reproduce the results discussed in the blog and experiment with your own configurations to find the right solution for your use case.
Note: these resources are beyond the free tier! We recommend reviewing the settings and making sure you understand the costs before deploying.
When your experiments are finished, you can destroy the stack easily using the CDK!
To support our benchmarks, this CDK project will create:
- An EKS Cluster
- EC2 instance worker node(s) (you can optionally choose the type and quantity)
- An FSxL filesystem (you can optionally configure the throughput and size)
- An S3 bucket
- All the IRSA roles and permissions required for these components to function
- Supporting Helm charts:
  - The EBS CSI Driver
  - The FSxL CSI Driver
  - Prometheus Operator
  - Grafana, preloaded with a dashboard and pointing to the Prometheus environment
An AWS Cloud9 environment will contain all the tools and software to use this repository right away. Alternatively, anything with a command line and a text editor should do the trick!
You can follow the getting started guide for Cloud9 here.
If you're using Cloud9, you should already have the CDK installed (use version 2).
Otherwise, you can follow these instructions to install the CDK (use version 2).
After installing the CDK, install the required NPM modules for the project by running:
npm install
Configure your AWS CLI Credentials to work against the account you will deploy to.
If you're in an AWS Cloud9 environment this should already be done for you! If you're not using AWS Cloud9 configure the AWS CLI using these instructions.
Be sure to set the region to match the region you wish to deploy to. For example:
export AWS_REGION=us-east-1
Run a quick test to make sure the credentials are working
aws sts get-caller-identity
This command should succeed and show the identity you're using with AWS.
The CDK requires a place to put assets it builds. Bootstrap this account to handle this by running:
cdk bootstrap
If you're not using Cloud9, you'll need to install the kubectl command. Follow the instructions here.
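For reference, a minimal sketch of installing kubectl on Linux x86_64 from the upstream Kubernetes release endpoint (the linked instructions may use a different source, such as the Amazon EKS documentation):
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
kubectl version --client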
Edit the file bin/eks-cicd-storage.ts. You can edit the content in the following section:
ec2InstanceType: ec2.InstanceType.of(ec2.InstanceClass.MEMORY6_AMD, ec2.InstanceSize.XLARGE8),
ec2InstanceCount: 1,
ec2InstanceStorageGb: 1000,
fsxStorageSizeGb: 1200,
fsxThroughputPerTb: 500,
The ec2InstanceType that is deployed for the cluster can be set using the CDK: https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_ec2.InstanceType.html#example
The ec2InstanceCount configures how many nodes are deployed.
The ec2InstanceStorageGb configures the size of the EBS volume attached to the instance(s).
You can control the amount of storage and the throughput per TB for FSx. Read through this documentation to understand the ratios and impacts: https://docs.aws.amazon.com/fsx/latest/LustreGuide/managing-storage-capacity.html For example, with the defaults above (1,200 GiB at 500 MB/s per TiB), the filesystem provides roughly 600 MB/s of aggregate throughput.
The fsxStorageSizeGb controls the size of the FSxL filesystem.
The fsxThroughputPerTb controls how much FSxL throughput is available per TB.
Kubernetes versions and Helm chart versions, among other settings, can also be configured here if desired.
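If you'd like to preview the resources your configuration will create before deploying, you can ask the CDK for a diff of the stack (standard CDK CLI usage, not specific to this project):
cdk diff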
Once you're comfortable that everything looks good, execute a deployment!
cdk deploy --require-approval never
Leave off the --require-approval never flag if you'd like to be prompted to approve the creation of security groups / IAM roles before the deployment proceeds.
Deployment will take a little while since it creates an EKS cluster and EC2 instance(s), then applies all the baseline manifests and Helm charts needed to run the benchmarks. Grab a coffee!
After the CDK app has finished deploying, it will print out the command required to configure kubectl. The output looks similar to the following:
Outputs:
EksCiCdStorage.eksClusterEksClusterConfigCommandDCE878CA = aws eks update-kubeconfig --name EKS-CiCd-Storage --region us-east-2 --role-arn arn:aws:iam::012345678910:role/EKS-CiCd-Storage-acccess-role
Run the command shown to configure kubectl to access the provisioned cluster.
Verify you have access to the cluster using kubectl by checking for nodes.
kubectl get nodes
The node count should match the number you specified, or one if you went with defaults.
We're using an in-cluster Grafana and have imported a dashboard to see the results of our benchmarks.
We can view the Grafana web interface using a port forward and the credentials created by the Helm chart.
First retrieve the credentials:
GRAF_PASS=$(kubectl -n monitoring get secret grafana -o jsonpath="{.data.admin-password}" | base64 --decode)
echo $GRAF_PASS
Now configure the port forward:
kubectl port-forward -n monitoring service/grafana 3000:80
Open http://localhost:3000 in your browser.
Then sign in using the admin user and the credential output above.
For those interested, the code and manifests underpinning each benchmark are in the benchmarks folder.
Grafana has a preloaded Dashboard showing you pod execution times for each style of build / test.
It's named 'Pod Run Times' and it's located in the 'examples' folder. Scroll within the dashboard to see the results of each benchmark.
Each benchmark supports a number argument up to 30. This is the number of pods to submit in parallel for that benchmark.
If no argument is specified it will create one pod for a baseline.
export RUN_PODS=5
Execute:
npm run ebs-backed-build ${RUN_PODS}
You can watch the pods get created, and eventually complete using the watch command below. When it's finished load the Grafana Dashboard and see your results!
watch -n 5 kubectl -n ebs-backed-workspace get pods
Execute:
npm run memory-backed-build ${RUN_PODS}
You can watch the pods get created, and eventually complete using the watch command below. When it's finished load the Grafana Dashboard and see your results!
watch -n 5 kubectl -n memory-backed-workspace get pods
Execute:
npm run direct-to-s3 ${RUN_PODS}
You can watch the pods get created, and eventually complete using the watch command below. When it's finished load the Grafana Dashboard and see your results!
watch -n 5 kubectl -n workspace-direct-to-s3 get pods
For the FSxL benchmarks, a workspace is persisted between executions using a UUID. This lets us re-use the same physical volume and workspace data when executing the test phase.
Note that if you're re-running the FSxL benchmarks on a regular basis you will need to delete your completed pods to reclaim that UUID for a new pod.
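For example, completed pods can be cleared with something like the following (a sketch; assumes the workspace-to-fsx namespace used by the FSxL benchmarks below):
kubectl -n workspace-to-fsx delete pods --field-selector=status.phase=Succeeded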
Execute:
npm run fsx-backed-build ${RUN_PODS}
You can watch the pods get created, and eventually complete using the watch command below. When it's finished load the Grafana Dashboard and see your results!
watch -n 5 kubectl -n workspace-to-fsx get pods
Execute:
npm run fsx-backed-test ${RUN_PODS}
These execute in the same workspace, but restore the workspace from FSxL and then execute the test cases, so you can watch those pods complete in the workspace-to-fsx namespace as well.
watch -n 5 kubectl -n workspace-to-fsx get pods
In the 'Outputs' section of the CloudFormation template created by the CDK, you'll find the bucket name that the FSxL tests will write their outputs to.
You can view the contents and download data from that bucket, organized by the 'uuid' used for the workspace. Example commands are preceded by $ (use your own bucket name!).
$ export BUCKET="ekscicdstorage-fsxlfilesystemfsxreplicationbucket-j161nywpxr1t"
$ aws s3 ls s3://${BUCKET}/
PRE 2ee34c7e-eab4-4689-890e-467c08c78014/
PRE 513198c6-1630-44af-9b06-351b7e5f0a9d/
PRE 6614d346-baca-4c1b-a932-4d4b0604084e/
PRE daa7de23-a70a-4e7b-bcca-50dc3a028b1d/
PRE e3669a67-b5de-4b8b-87aa-a3de43aa8380/
$ aws s3 ls s3://${BUCKET}/2ee34c7e-eab4-4689-890e-467c08c78014/
2023-06-06 21:43:29 0
2023-06-06 21:55:31 400355 test-results.txt
$ aws s3 cp s3://${BUCKET}/2ee34c7e-eab4-4689-890e-467c08c78014/test-results.txt .
download: s3://${BUCKET}/2ee34c7e-eab4-4689-890e-467c08c78014/test-results.txt to ./test-results.txt
$ tail -n 10 test-results.txt
@aws-accelerator/installer: solutions-helper.ts | 100 | 100 | 100 | 100 |
@aws-accelerator/installer: validate.ts | 100 | 100 | 100 | 100 |
@aws-accelerator/installer: ---------------------------|---------|----------|---------|---------|-------------------
> Lerna (powered by Nx) Successfully ran target test for 9 projects
Done in 364.52s.
Destroy our stack
cdk destroy
Note that there will be two buckets that can't be removed by CloudFormation since they are not empty. Remove them by hand after the 'destroy' operation completes.
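For example (the bucket names below are placeholders; substitute the actual names from the S3 console or the CloudFormation outputs):
aws s3 rb s3://<bucket-name-1> --force
aws s3 rb s3://<bucket-name-2> --force
The --force flag empties each bucket before removing it; if a bucket contains versioned objects you may need to delete the object versions separately.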