Rerun System Tests on failure #524

Closed
wants to merge 2 commits into from

michalinacienciala
Contributor

@michalinacienciala michalinacienciala commented Feb 7, 2023

From time to time the System Tests workflow fails because of connectivity problems with Electrum or other randomly occurring issues (such as problems installing dependencies). This PR adds a job to the System Tests workflow that starts only if one of the previous jobs failed. The job triggers a Rerun workflow, which in turn triggers the System Tests again. The rerun has to go through a separate workflow because a rerun cannot be requested for a workflow or job that is still running.
To avoid getting stuck in an endless retry loop, we've added a retry limit: no more than 3 attempts of the workflow are executed automatically.

Inspired by the solution presented here: https://gist.github.com/philip-gai/e3c02e68a32e8964fa3df2167b15cbff.

TODO:

  • Set up a Personal Access Token (classic) with workflow and read:org scopes
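
For illustration only, the retry-limit decision described above could be sketched roughly as follows. This is not the code from this PR: the helper names (`should_rerun`, `rerun_commands`, `maybe_rerun`) are hypothetical, and the sketch assumes the Rerun workflow receives the run ID and the attempt number (GitHub Actions exposes these as `github.run_id` and `github.run_attempt`) and shells out to the GitHub CLI, first waiting for the failed run to finish and then rerunning its failed jobs.

```python
import subprocess
from typing import List, Optional

MAX_ATTEMPTS = 3  # retry limit named in the PR description


def should_rerun(run_attempt: int, max_attempts: int = MAX_ATTEMPTS) -> bool:
    """Return True while another automatic attempt is still allowed.

    run_attempt is assumed to start at 1, as github.run_attempt does.
    """
    return run_attempt < max_attempts


def rerun_commands(run_id: str) -> List[List[str]]:
    """Build the GitHub CLI calls: wait for the run to finish, then rerun it.

    Waiting first matters because a run that is still in progress
    cannot be rerun.
    """
    return [
        ["gh", "run", "watch", run_id],
        ["gh", "run", "rerun", run_id, "--failed"],
    ]


def maybe_rerun(run_id: str, run_attempt: int,
                execute: bool = False) -> Optional[List[List[str]]]:
    """Return the commands to run, or None once the retry limit is reached.

    With execute=True the commands are actually invoked, which assumes
    that gh is installed and authenticated (hypothetical setup).
    """
    if not should_rerun(run_attempt):
        return None
    commands = rerun_commands(run_id)
    if execute:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands
```

With a limit of 3, failed attempts 1 and 2 each lead to a rerun, while a failed attempt 3 is left as the final result.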

We've seen that from time to time System Tests workflow fails due to problems
with connectivity to Electrum or due to some other randomly occurring issues
(like the problems with installing dependencies). We now add a job to the System
Tests workflow which is started only if one of the previous jobs failed. The job
will trigger a Rerun workflow, which will then trigger the System Tests again.
The reason why we need to rerun via a separate workflow is because reruns cannot
be called on the workflows/jobs which are still running.
We've been implementing workflow retries as a solution to some non-deterministic
failures of the 'System Tests' workflow that are not indicative of problems
with the tested functionality. The solution that we have built so far could
lead to a constant retry loop in case of consistent failures.
Now we're adding a retry limit - we will not automatically execute more than 3
attempts of the workflow.
@nkuba
Member

nkuba commented Feb 23, 2023

IMHO we should hold off on adding general job reruns. We should try to fix all the problems we're experiencing with installation and test execution.
Recent updates to the tbtc-v2.ts library implementing Ethereum and Electrum request retries (#541) should improve the stability of test execution.
We should also identify the other problems and try fixing them one by one.

@pdyraga
Member

pdyraga commented Feb 23, 2023

IMHO we should hold off on adding general job reruns. We should try to fix all the problems we're experiencing with installation and test execution. Recent updates to the tbtc-v2.ts library implementing Ethereum and Electrum request retries (#541) should improve the stability of test execution. We should also identify the other problems and try fixing them one by one.

This work is captured in #531 and is on our sprint board.

@michalinacienciala
Contributor Author

We've decided to close this PR so that we don't introduce a solution that could make us accustomed to the environment configuration issues we're currently facing (described in #531).
