DEVPROD-13428: Reset Amboy database in e2e tests when mutations enqueue jobs #616

Merged
merged 3 commits into from
Feb 11, 2025

@minnakt minnakt commented Feb 5, 2025

DEVPROD-13428

Description

Why does the test fail?

The tests in host_buttons.ts fail because the host is sometimes in a decommissioned state. You cannot reprovision or restart Jasper for a decommissioned host.

But the host i-0d0ae8b83366d22be is defined as "running" in the local test data. That shouldn't happen.

True, it seems like this should be impossible because the database should reset after every test that executes a mutation.

What is happening then?

In all of the test failures linked in the corresponding JIRA ticket, the test that runs directly before is host_update_status.ts. This test calls a mutation that creates a job in Amboy called EnqueueTerminateHostJob.

I think the test was flaky due to a race condition between the Amboy job and the database reset. The two possible orderings are:

  • Sometimes it would succeed because the Amboy job ran first and marked the host as decommissioned. Then we cleared the database and the host was reset correctly.
  • Sometimes it would fail because we cleared the database first. Then the Amboy job got picked up and ran a second later, marking the host as decommissioned.

I think that clearing the Amboy database will help avoid future occurrences of similar bugs. By removing newly enqueued jobs from the Amboy database, we can ensure that the data will not change after we reset the Evergreen database.
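
To make the two orderings concrete, here is a minimal in-memory sketch of the race. It is only an illustration: `TestEnv`, `enqueueTerminateHost`, `resetEvergreenData`, `resetAmboyQueue`, and `runPendingJobs` are hypothetical names invented for this example, not the actual Evergreen, Amboy, or test-harness code.

```typescript
// Hypothetical model of the race; none of these names exist in Evergreen or Amboy.
type HostStatus = "running" | "decommissioned";

interface TestEnv {
  hosts: Map<string, HostStatus>; // stands in for the Evergreen database
  pendingJobs: Array<(env: TestEnv) => void>; // stands in for the Amboy queue
}

const HOST_ID = "i-0d0ae8b83366d22be";

function newEnv(): TestEnv {
  return {
    hosts: new Map<string, HostStatus>([[HOST_ID, "running"]]),
    pendingJobs: [],
  };
}

// The mutation only enqueues work; the host changes when the job runs later.
function enqueueTerminateHost(env: TestEnv): void {
  env.pendingJobs.push((e) => e.hosts.set(HOST_ID, "decommissioned"));
}

// Restores the local test data, where the host is "running".
function resetEvergreenData(env: TestEnv): void {
  env.hosts = new Map<string, HostStatus>([[HOST_ID, "running"]]);
}

// Drops any jobs that were enqueued but have not run yet.
function resetAmboyQueue(env: TestEnv): void {
  env.pendingJobs = [];
}

// Simulates the queue picking up and running everything still pending.
function runPendingJobs(env: TestEnv): void {
  const jobs = env.pendingJobs;
  env.pendingJobs = [];
  for (const job of jobs) job(env);
}

// Applies the steps in order and reports the host status the next test would see.
function statusAfter(
  steps: Array<(env: TestEnv) => void>,
): HostStatus | undefined {
  const env = newEnv();
  for (const step of steps) step(env);
  return env.hosts.get(HOST_ID);
}

// Ordering 1: the job runs before the reset, so the reset restores the host.
const jobRunsFirst = statusAfter([
  enqueueTerminateHost,
  runPendingJobs,
  resetEvergreenData,
]);

// Ordering 2: the reset happens first, then the late job overwrites the host.
const resetRunsFirst = statusAfter([
  enqueueTerminateHost,
  resetEvergreenData,
  runPendingJobs,
]);

// With the queue cleared as well, no pending job is left to run after the reset.
const withQueueReset = statusAfter([
  enqueueTerminateHost,
  resetAmboyQueue,
  resetEvergreenData,
  runPendingJobs,
]);

console.log({ jobRunsFirst, resetRunsFirst, withQueueReset });
```

In this model only the second ordering leaves the host decommissioned for the next test, which matches the failures seen in host_buttons.ts. The sketch covers only jobs that are still pending; a job already running when the queue is cleared is outside what it models.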

@minnakt minnakt marked this pull request as ready for review February 5, 2025 21:25
@minnakt minnakt requested a review from a team as a code owner February 5, 2025 21:25

@sophstad sophstad left a comment

Looks great + nice writeup. I probably would've just renamed the test to zzz_host_update_status.ts so it would run last 😉

@minnakt minnakt changed the title from "DEVPROD-13428: Reset amboy database when jobs are enqueued from mutations" to "DEVPROD-13428: Reset amboy database when mutations enqueue jobs in e2e tests" Feb 11, 2025
@minnakt minnakt changed the title from "DEVPROD-13428: Reset amboy database when mutations enqueue jobs in e2e tests" to "DEVPROD-13428: Reset Amboy database in e2e tests when mutations enqueue jobs" Feb 11, 2025

minnakt commented Feb 11, 2025

Merging with the codegen check failing. Fixing it would mean requesting a field that doesn't exist on the backend yet, which would require coordinated deploys, so I'm not updating it, to keep this safe.

@minnakt minnakt merged commit 2b95279 into evergreen-ci:main Feb 11, 2025
1 of 4 checks passed
@minnakt minnakt deleted the DEVPROD-13428 branch February 11, 2025 01:56