fix: Continue reconciliation after a health check execution error #415

Merged


@cbartz (Collaborator) commented Dec 4, 2024

Applicable spec: n/a

Overview

Catch health check execution errors and continue reconciliation.

Rationale

We had an issue in production where the health check failed with an invoke.exceptions.CommandTimedOut for an SSH command. Because this error was not caught, the whole reconciliation stopped, so no runners were cleaned up or rebuilt.

Juju Events Changes

n/a

Module Changes

  • github_runner_manager/openstack_cloud/health_checks.py: Catch errors on SSH execution and raise an OpenstackHealthCheckError.
  • github_runner_manager/openstack_cloud/openstack_runner_manager.py: Catch the error and map it to an unknown health state. Do not perform cleanup tasks for runners in an unknown health state.
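
The two module changes above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: only the name OpenstackHealthCheckError comes from the description, while HealthState, CommandTimedOut (a local stand-in for the SSH timeout error), check_runner, get_health_state, and reconcile are hypothetical names chosen for the example.

```python
from enum import Enum


class HealthState(Enum):
    """Hypothetical health states; the real enum may differ."""

    HEALTHY = "healthy"
    UNHEALTHY = "unhealthy"
    UNKNOWN = "unknown"


class OpenstackHealthCheckError(Exception):
    """Raised when the health check itself could not be executed."""


class CommandTimedOut(Exception):
    """Local stand-in for an SSH command timeout (assumption for this sketch)."""


def check_runner(run_ssh_command, instance_id):
    """Run the health check; wrap execution errors in a domain error."""
    try:
        return run_ssh_command(instance_id)
    except (CommandTimedOut, OSError) as exc:
        # The check did not run, so we cannot say whether the runner is healthy.
        raise OpenstackHealthCheckError(
            f"Health check could not be executed on {instance_id}"
        ) from exc


def get_health_state(run_ssh_command, instance_id):
    """Map a failed check execution to UNKNOWN instead of propagating it."""
    try:
        healthy = check_runner(run_ssh_command, instance_id)
    except OpenstackHealthCheckError:
        return HealthState.UNKNOWN
    return HealthState.HEALTHY if healthy else HealthState.UNHEALTHY


def reconcile(run_ssh_command, instance_ids):
    """Check every runner; only UNHEALTHY runners are selected for cleanup."""
    states = {i: get_health_state(run_ssh_command, i) for i in instance_ids}
    to_cleanup = [i for i, s in states.items() if s is HealthState.UNHEALTHY]
    return states, to_cleanup
```

With this shape, a timeout on one runner no longer aborts the loop: that runner is reported as unknown and left alone, while the remaining runners are still checked and, if unhealthy, cleaned up.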

Library Changes

n/a

Checklist

  • The charm style guide was applied.
  • The contributing guide was applied.
  • The changes are compliant with ISD054 - Managing Charm Complexity
  • The documentation is generated using src-docs.
  • The documentation for charmhub is updated.
  • The PR is tagged with appropriate label (urgent, trivial, complex).
  • The changelog is updated with changes that affect the users of the charm.
  • The changes do not introduce any regression in code or tests related to LXD runner mode.

@cbartz added the bug and senior-review-required labels Dec 4, 2024
@cbartz requested a review from a team as a code owner December 4, 2024 12:16

github-actions bot (Contributor) commented Dec 5, 2024

Test coverage for 92a43d9

Name                         Stmts   Miss Branch BrPart  Cover   Missing
------------------------------------------------------------------------
src/charm.py                   655    150    140     27    74%   254-256, 322-341, 359-361, 362->366, 392-396, 470-472, 481, 488-490, 511-516, 533-538, 559, 571-577, 592-593, 612-613, 622, 627, 657-658, 660->669, 664->669, 674-681, 715, 719-724, 776, 788->791, 814-826, 830-831, 864-865, 877-894, 918-920, 939-949, 1029-1030, 1032-1033, 1035-1036, 1115->1117, 1182-1183, 1221-1223, 1231-1239, 1315-1348, 1362-1367, 1382-1425, 1433-1434, 1456
src/charm_state.py             450     17     82      3    95%   274-286, 505-509, 631-632, 687-688, 1123->1126, 1130-1131, 1178
src/errors.py                   25      0      0      0   100%
src/event_timer.py              52      6      0      0    88%   105-106, 143-144, 160-161
src/firewall.py                 51     18     10      0    67%   42-43, 66-69, 111-185
src/github_client.py            23      2      4      0    93%   71-72
src/logrotate.py                43      0      2      0   100%
src/lxd_type.py                 35      0      0      0   100%
src/runner_manager_type.py      39      0      0      0   100%
src/runner_type.py              38      0      0      0   100%
src/shared_fs.py                98     17     10      1    83%   60-61, 132-133, 162-163, 171-172, 178-179, 210, 213-214, 226-227, 270-271
src/utilities.py                32      4      6      2    79%   66-69, 111
------------------------------------------------------------------------
TOTAL                         1541    214    254     33    84%

Static code analysis report

Run started:2024-12-05 08:40:08.719921

Test results:
  No issues identified.

Code scanned:
  Total lines of code: 5084
  Total lines skipped (#nosec): 2
  Total potential issues skipped due to specifically being disabled (e.g., #nosec BXXX): 6

Run metrics:
  Total issues (by severity):
  	Undefined: 0
  	Low: 0
  	Medium: 0
  	High: 0
  Total issues (by confidence):
  	Undefined: 0
  	Low: 0
  	Medium: 0
  	High: 0
Files skipped (0):

@yanksyoon merged commit e477fb8 into main Dec 6, 2024
53 of 55 checks passed
@yanksyoon deleted the fix/continue-reconciliation-on-health-check-error-ISD-2786 branch December 6, 2024 02:57