Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Update ec2 instance type in unit test workflow file #395

Closed
wants to merge 1 commit into from

Conversation

courtneypacheco
Copy link
Contributor

Resolves #394

Unit tests currently fail due to a minor configuration issue in one of our workflow files. This PR aims to resolve the typo. See referenced git issue.

@mergify mergify bot added CI/CD Affects CI/CD configuration ci-failure labels Jan 16, 2025
@courtneypacheco courtneypacheco force-pushed the fix-unit-test-workflow-file branch from 719ccf7 to 039c12a Compare January 16, 2025 15:08
@mergify mergify bot added ci-failure and removed ci-failure labels Jan 16, 2025
@courtneypacheco courtneypacheco force-pushed the fix-unit-test-workflow-file branch 2 times, most recently from 617a6e4 to 94efa2a Compare January 16, 2025 15:23
@mergify mergify bot added ci-failure and removed ci-failure labels Jan 16, 2025
@courtneypacheco courtneypacheco force-pushed the fix-unit-test-workflow-file branch from 94efa2a to f1fb4bf Compare January 16, 2025 15:47
@mergify mergify bot added ci-failure and removed ci-failure labels Jan 16, 2025
@courtneypacheco courtneypacheco force-pushed the fix-unit-test-workflow-file branch from f1fb4bf to 413e97e Compare January 16, 2025 16:32
@mergify mergify bot added ci-failure and removed ci-failure labels Jan 16, 2025
@courtneypacheco courtneypacheco force-pushed the fix-unit-test-workflow-file branch from 413e97e to d30e45e Compare January 16, 2025 17:09
@mergify mergify bot added ci-failure and removed ci-failure labels Jan 16, 2025
The ec2 instance used to run the unit tests cannot launch because of a configuration issue.

Signed-off-by: Courtney Pacheco <[email protected]>
@mergify mergify bot added the ci-failure label Jan 16, 2025
@courtneypacheco courtneypacheco force-pushed the fix-unit-test-workflow-file branch from d30e45e to c43ede2 Compare January 16, 2025 17:16
@mergify mergify bot removed the ci-failure label Jan 16, 2025
@courtneypacheco courtneypacheco requested review from JamesKunstle, RobotSail and a team January 16, 2025 17:22
Comment on lines +16 to +20
# TEMPORARILY ADDING TESTING BRANCH. WILL REMOVE.
branches:
- "main"
- "release-**"
- "fix-unit-test-workflow-file"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're good with the changes in this file, I will revert these temporary changes

@courtneypacheco courtneypacheco marked this pull request as ready for review January 16, 2025 17:26
ec2-instance-type: ${{ vars.AWS_REGION }}
# TODO: Update the EC2 instance type to use the EC2 runner variant when GPU calls are mocked
# ec2-instance-type: ${{ env.ec2_runner_variant }}
ec2-instance-type: g4dn.12xlarge
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the logic for using this instance type?

subnet-id: subnet-024298cefa3bedd61
security-group-id: sg-06300447c4a5fbef3
iam-role-name: instructlab-ci-runner
aws-resource-tags: >
[
{"Key": "Name", "Value": "instructlab-ci-github-large-runner"},
{"Key": "Name", "Value": "instructlab-ci-github-unit-test-runner"},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name wouldn't match the current pruning. It's currently looking for w+ between github and runner. Either need to adjust here or in the pruning.

@mergify mergify bot added the one-approval label Jan 16, 2025
@danmcp danmcp mentioned this pull request Jan 17, 2025
@@ -128,10 +134,11 @@ jobs:
aws-region: ${{ vars.AWS_REGION }}

- name: "Stop EC2 runner"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't appear to be setup so it will always run the way our other ec2-runner jobs are configured.

Ex: https://github.com/instructlab/instructlab/blob/main/.github/workflows/e2e-nvidia-l4-x1.yml#L150
Where it's a separate step that ways runs and depends on the previous step

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering why that 'if(always())' step was in there. I'll add that to my PR #409

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe be something leftover from an old job config - a lot of them have gotten out-of-sync, I had to make two hotfixes to the E2E Custom job in the Core repo today. @courtneypacheco has a Dev Doc open around consolidating some of these into common actions to make CI easier to maintain: instructlab/dev-docs#179

@JamesKunstle JamesKunstle mentioned this pull request Jan 23, 2025
@JamesKunstle
Copy link
Contributor

I incorporated these changes into #409 and added @courtneypacheco as the co-author on the commit with those changes. Thanks for your contribution Courtney!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI/CD Affects CI/CD configuration one-approval
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CI bug: Unit tests fail due to typo in git workflow
5 participants