Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suppress most cli errors, but still warn if we get too many #416

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

plars
Copy link
Collaborator

@plars plars commented Nov 20, 2024

Description

When running a job while polling with the cli, there may be an occasional network delay that can cause you to get errors like this:

ERROR: 2024-11-20 19:04:11 client.py:64 -- Timeout while trying to communicate with the server.
WARNING: 2024-11-20 19:04:11 __init__.py:874 -- Unable to retrieve job state.
unknown

These can usually be ignored since it will try again, but can be often be interpreted by the user to think that something is wrong when it isn't. However, there's also a possibility that the server is unreachable for a long time, and we don't want to hide that from the user if it's happening.

I think this takes a pretty balanced approach and silences most of these warnings and errors (except when it's going to be fatal), while running a counter for consecutive timeout/connection errors. It will warn the user that something could be wrong at every interval of $TESTFLINGER_ERROR_THRESHOLD (default 3) consecutive errors, but also indicate that it will keep retrying.

Resolved issues

CERTTF-283

Documentation

Added a reference section to the documentation about the testflinger config command, and the supported configuration settings.

Web service API changes

N/A

Tests

Additional unit tests added

@plars plars requested a review from a team December 3, 2024 14:08
Copy link
Contributor

@boukeas boukeas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes are very useful as it is indeed the case that users may not know how to interpret these messages.

I do believe that a requirement for 10 consecutive failures will effectively suppress all messages, even when there are actual networking issues, thus making these messages ineffective as a diagnostic tool. So my suggestions are to:

  • retrieve the number of messages required to display a warning from the config file, so that we are able to control it more easily
  • reduce the number to 5 or even 3

self.error_count = 0
except (IOError, requests.exceptions.ConnectionError) as exc:
self.error_count += 1
if self.error_count % 10 == 0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The divisor is hard-coded to 10 here, as well as the tests. I think it will give us a lot more flexibility if we could get this value from the configuration file.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback! I've added a config setting, documentation, and made the default 3.

@plars plars force-pushed the suppress-cli-server-errors branch from 6132a08 to 7cd7fe0 Compare December 9, 2024 17:10
@plars plars requested review from boukeas and tang-mm December 9, 2024 17:15
Copy link
Contributor

@tang-mm tang-mm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the reference! I just need a few clarifications on the command usage.
Also, in the previous how to authentication guide, we asked users to reload an .env file to update their environment variable. should we also change that doc to recommend the method using cli?


The Testflinger CLI can be configured on the command line using the parameters described in ``testflinger --help``. However, sometimes it also supports using environment variables, or reading values from a configuration file. This can be helpful for CI/CD environments, or for setting up a config with different default values read from a config file.

The configuration file is read from ``$XDG_CONFIG_HOME/testflinger-cli.conf`` by default, but this can be overridden with the ``--configfile`` parameter.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: does every sub-command of testflinger-cli support this parameter? If used, could you specify if the conf file is used for the current session or only for this single command?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would only alter the config file that is read for the current execution - not permanently. I think that's typical for commands that take a parameter to specify where to read the config file though. Do you have a suggestion for how I could phrase this better?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed it's a typical usage, but since we are referring to multiple override orders, it might be clearer if we list the resolution order altogether. Trying to add a new paragraph here, please correct if it's wrong:

Suggested change
The configuration file is read from ``$XDG_CONFIG_HOME/testflinger-cli.conf`` by default, but this can be overridden with the ``--configfile`` parameter.
The configuration file is read from ``$XDG_CONFIG_HOME/testflinger-cli.conf`` by default, but this can be overridden with the ``--configfile`` parameter.
When a configuration variable is defined in multiple locations, Testflinger resolves the value by applying the following priority order, from highest to lowest:
1. command-line parameter (for example, ``--server <local_server>``)
2. configuration file specified via the ``--configfile`` parameter
3. environment variable
4. default configuration file located at ``$XDG_CONFIG_HOME/testflinger-cli.conf``

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 and 4 are essentially the same thing - it's just a matter of whether it reads the config from the default location or from a new one that you specify, as noted in the text on line 6. I've mostly added this text about resolution order but just combined those two into a single line. See what you think.

There's probably a good argument to be made for changing it to prefer the env var instead of the config file, but that's completely unrelated to this PR. We can discuss that later and decide whether we should do it though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, not sure if I answer this already...

Also, in the previous how to authentication guide, we asked users to reload an .env file to update their environment variable. should we also change that doc to recommend the method using cli?

No, although you can specify those through a command line, it's probably not going to be the preferred way as noted in the authentication howto.


You can show the current values stored in the config file by running ``testflinger config``. If no value is found on the command line, config file, or environment variable, then a safe default value will be used.

To set a configuration value in the config file, you can edit it directly or use ``testflinger config key=value``.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this command update only the default config file?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would typically apply to the default config file, however if you were to specify a different location for the config file, then it would use that one instead (because you told it to). That would be an unusual way to use it, but nothing wrong with it I suppose. It's something you would normally use just out of convenience, rather than having to go locate the config file and manually edit it to something like:

[testflinger-cli]
server = https://some-other-server.local

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked because I suppose testflinger config only outputs the configs in the default file. If --configfile can be used, then perhaps I can check the current valid configs in the temporary file, and update it (it's a creative usage, but just wanted to see if it's technically valid)

@plars plars force-pushed the suppress-cli-server-errors branch from 3cabd62 to 13effc5 Compare January 6, 2025 17:57
@plars plars force-pushed the suppress-cli-server-errors branch from 13effc5 to 30f2ab3 Compare January 6, 2025 18:33
@plars plars requested a review from tang-mm January 6, 2025 18:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants