Config loading more debugging information #775

Open
stan-dot opened this issue Jan 10, 2025 · 9 comments
Labels
bug Something isn't working

Comments

@stan-dot
Contributor

On the i18 cluster, blueapi fails but the pod does not crash. The error logs do not say (1) where the config is loaded from or (2) which part of the config is invalid (rest, stomp, etc.).

This might be some indentation error in the deployment yamls somewhere.

https://k8s-i18-dashboard.diamond.ac.uk/#/log/i18-beamline/i18-blueapi-0/pod?namespace=i18-beamline&container=blueapi

Steps To Reproduce

Logs from blueapi in i18-blueapi-0:
Matplotlib created a temporary cache directory at /tmp/matplotlib-p02eagla because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
2025-01-10 14:46:39,528 INFO: Started server process [1]
2025-01-10 14:46:39,528 INFO: Waiting for application startup.
2025-01-10 14:46:39,529 - Invalid type ApplicationConfig for attribute '_config' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types

device and version

blueapi 0.6.1

Acceptance Criteria

  • error messages are clear
  • i18 blueapi works with that or a later version
@stan-dot added the bug label Jan 10, 2025
@callumforrester
Contributor

@stan-dot your link is now expired

> 2025-01-10 14:46:39,529 - Invalid type ApplicationConfig for attribute '_config' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types

Is this all the logs said or an abbreviation? I note it's missing a log level (DEBUG, INFO, etc)

@stan-dot changed the title from "Config loading fails and sparse debugging information" to "Config loading more debugging" Jan 13, 2025
@stan-dot changed the title from "Config loading more debugging" to "Config loading more debugging information" Jan 13, 2025
@stan-dot
Contributor Author

Ran into the same thing at i20-1: an indentation error where the scratch/root key was at the top level of blueapi instead of inside the worker.

We could use a verifiable schema checker to ensure this is caught early.

Options include: a manual diff util, a versioned JSON schema, an external YAML linter, CI/CD verification with a script, or a tool like helm-schema-gen (https://github.com/karuppiah7890/helm-schema-gen).

Such a tool would check whether an ixx-beamline values.yaml for blueapi fits the object definition of the values.yaml in this chart.
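
As a rough illustration of the "manual diff util" option, the sketch below walks a beamline override and reports any key path that does not exist in the chart defaults. It uses only the standard library and assumes both files have already been parsed into dicts. The function name `unknown_key_paths` and the example values are hypothetical and are not taken from the blueapi chart.

```python
from typing import Any, Iterator


def unknown_key_paths(
    overrides: dict[str, Any], defaults: dict[str, Any], prefix: str = ""
) -> Iterator[str]:
    """Yield dotted paths that exist in `overrides` but not in `defaults`."""
    for key, value in overrides.items():
        path = f"{prefix}.{key}" if prefix else key
        if key not in defaults:
            # The chart has no such key at this level: likely a typo or mis-indent.
            yield path
        elif isinstance(value, dict) and isinstance(defaults[key], dict):
            # Both sides are mappings, so compare their children too.
            yield from unknown_key_paths(value, defaults[key], path)


# Hypothetical chart defaults (already parsed from values.yaml).
chart_defaults = {"blueapi": {"worker": {"stomp": {"host": "localhost", "port": 61613}}}}

# Hypothetical beamline override with `stomp` indented one level too far left.
beamline_values = {"blueapi": {"stomp": {"host": "example-host"}, "worker": {}}}

print(list(unknown_key_paths(beamline_values, chart_defaults)))
```

A check like this only works for sections the chart defaults spell out in full. Free-form sections that default to an empty mapping would produce false positives and would need to be excluded.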

@callumforrester
Contributor

@stan-dot Before we get into solutions I'd like to get my head around the problem. Please could you provide complete logs or a set of steps to reproduce.

@stan-dot
Contributor Author

to reproduce:

  1. Define an invalid values.yaml by modifying a valid one: move the indentation of blueapi.worker.stomp one level left so that the block ends up at blueapi.stomp instead.
  2. Attempt deployment.
  3. Check whether the output is only the message "Invalid type ApplicationConfig for attribute '_config' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types", or whether it plainly tells the user/developer that there is one extra higher-level config key and one missing config section inside the worker definition.

@stan-dot
Contributor Author

another instance at i20-1

logs-from-blueapi-in-i20-1-blueapi-0.log

callumforrester added a commit to epics-containers/p46-services that referenced this issue Jan 16, 2025
Deliberately break stomp config to reproduce a bug, see DiamondLightSource/blueapi#775 (comment)

Will revert after testing.
@callumforrester
Contributor

@stan-dot I can partially reproduce, I did this: epics-containers/p46-services@aa421fb

Here are my logs:
logs-from-blueapi-in-daq-blueapi-0.log

The error message says that the fields auth and host are invalid ('loc': ('auth',), 'loc': ('host',),) but does not say they are part of stomp. That's because here, they actually are not part of stomp. By indenting them to the left, we have made them part of the root config (ApplicationConfig). The error message is pydantic saying they shouldn't be part of the root, it has no way of knowing from the data given that they should in fact be part of stomp. If there were an actual problem with the stomp config the error would specify (e.g. 'loc': ('stomp', 'port'),) but it's not getting that far. I'm not sure what we can do about that, pydantic can't predict a human's intentions, it can only cite problems with specific fields. Since we can't tell the problem is with stomp, can you think of any other contextual information we could print out that would have helped you to figure it out?
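
To make the point about error locations concrete, here is a minimal standard-library sketch. It is not pydantic and not blueapi's real ApplicationConfig: `SCHEMA`, `unexpected_locs` and `possible_parents` are hypothetical names, and the schema is heavily simplified. It shows that keys dedented out of stomp are reported at the root, and that a validator could at best offer a guess about where they were meant to live.

```python
from typing import Any

# Hypothetical, much-simplified schema: section name -> allowed field names.
SCHEMA: dict[str, set[str]] = {
    "stomp": {"host", "port", "auth"},
    "api": {"host", "port"},
}


def unexpected_locs(config: dict[str, Any]) -> list[tuple[str, ...]]:
    """Return the location of every key that the schema does not allow."""
    locs: list[tuple[str, ...]] = []
    for key, value in config.items():
        if key not in SCHEMA:
            # Unknown at the root, so the location has no section in it.
            locs.append((key,))
        elif isinstance(value, dict):
            locs.extend((key, field) for field in value if field not in SCHEMA[key])
    return locs


def possible_parents(field: str) -> list[str]:
    """Guess which sections a stray root-level key might have been meant for."""
    return sorted(section for section, fields in SCHEMA.items() if field in fields)


# `host` and `auth` dedented out of `stomp`, as in the reproduction.
broken = {"stomp": {"port": 61613}, "host": "example-host", "auth": {"username": "guest"}}

for loc in unexpected_locs(broken):
    hint = possible_parents(loc[0]) if len(loc) == 1 else []
    print(loc, "-> possibly intended for:", hint)
```

In this toy schema host is valid in more than one section, so the hint is ambiguous. Any such output could only be a suggestion, never a diagnosis.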

Printing the source from which the configuration is loaded in the error would be a good idea, I agree. How useful is that alone to you?

> another instance at i20-1
> logs-from-blueapi-in-i20-1-blueapi-0.log

I'm not sure how this is related, it's not a config/pydantic error, blueapi seems to be unable to import a function from dodal. Are these the logs you meant to post?

@stan-dot
Contributor Author

the dodal import for skip_device got fixed but the line above has this:
2025-01-15 10:44:53,650 - Invalid type ApplicationConfig for attribute '_config' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types
which is what we discussed above.

I'd like a printout of the loaded config, yes.

@callumforrester
Contributor

Okay, so acceptance criteria:

  • When a blueapi.utils.invalid_config_error.InvalidConfigError is raised on startup it should include:
    • The source(s) of the invalid configuration
    • A human-readable printout of all of the config loaded (YAML?)
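
One possible shape for an error that meets these criteria is sketched below. It is only an illustration under stated assumptions: `ConfigLoadError` is a hypothetical stand-in and not blueapi's InvalidConfigError, the file path is made up, and JSON is used for the printout only because the Python standard library has no YAML writer.

```python
import json
from typing import Any


class ConfigLoadError(Exception):
    """Hypothetical error that reports config sources and the loaded config."""

    def __init__(self, reason: str, sources: list[str], loaded: dict[str, Any]) -> None:
        self.reason = reason
        self.sources = sources
        self.loaded = loaded
        super().__init__(self._render())

    def _render(self) -> str:
        # List every source that contributed to the config, one per line.
        sources = "\n".join(f"  - {s}" for s in self.sources) or "  (defaults only)"
        # JSON stands in for YAML; default=str keeps non-JSON values printable.
        dump = json.dumps(self.loaded, indent=2, sort_keys=True, default=str)
        return f"{self.reason}\nConfig sources:\n{sources}\nLoaded config:\n{dump}"


err = ConfigLoadError(
    "extra fields not permitted: host, auth",
    sources=["/config/config.yaml"],  # hypothetical path
    loaded={"stomp": {"port": 61613}, "host": "example-host"},
)
print(err)
```

Dumping the whole config can expose credentials such as the stomp auth section, so a real implementation would probably want to redact secret fields before printing.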

@stan-dot
Contributor Author

might add a json schema from blueapi and reference it in the helm template (?)
https://github.com/epics-containers/services-template-helm


2 participants