[Feature] Even further job logging improvements #131

Open
1 of 3 tasks
fmigneault opened this issue May 11, 2020 · 0 comments
Labels
feature/docker Issue related to Docker application package execution. feature/job Issues related to job execution, reporting and logging. project/CRIM-DEVOPS Project linked to the CRIM project DevOps/Weaver (https://crim-ca.atlassian.net/browse/GD-47). project/DACCS Related to DACCS project (https://github.com/orgs/DACCS-Climate) triage/feature New requested feature. triage/investigate Exploration tasks or issues requiring more analysis

fmigneault commented May 11, 2020

Context

PR #80 improved the logging of the job's execution percentage.
Still, when the underlying job process runs a CWL package, any logging or output it produces is lost (especially with a docker execution), because the printed lines are not added to the WPS execution log.

To solve this issue, we can add a stdout capture to the CWL process that acts as a hook to extend the WPS log with the app's logs. This gives us log details as fine-grained as whatever the app prints.
https://www.commonwl.org/user_guide/05-stdout/index.html
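As a rough illustration of that hook, the sketch below patches a CWL CommandLineTool definition (loaded as a plain dict) so that it declares a stdout capture file and exposes it as an output. The function name add_stdout_capture, the file name stdout.log and the output id package_stdout are assumptions made for this example, not existing Weaver names. Only the CWL stdout field and the stdout output type come from the CWL guide linked above.

```python
import copy


def add_stdout_capture(cwl_tool, log_name="stdout.log", output_id="package_stdout"):
    """Return a copy of a CWL CommandLineTool dict that captures stdout to a file.

    Hypothetical helper: names and defaults are illustrative only.
    """
    tool = copy.deepcopy(cwl_tool)
    # CWL 'stdout' field: file that receives the command's standard output.
    # An existing value is kept so the package author's choice is not overridden.
    tool.setdefault("stdout", log_name)
    outputs = tool.setdefault("outputs", {})
    if isinstance(outputs, dict):
        # Map form: {output_id: {type: ...}}
        outputs.setdefault(output_id, {"type": "stdout"})
    elif not any(out.get("id") == output_id for out in outputs):
        # List form: [{id: ..., type: ...}]
        outputs.append({"id": output_id, "type": "stdout"})
    return tool
```

The same approach would presumably apply to stderr with the CWL stderr field and output type, if the app logs there instead.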

The only remaining issue lies with the current job monitoring, which uses an internal WPS (see #21). Because of this, the full log is only retrieved at the end of the job execution, instead of gradually as the lines are printed.

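# Current monitoring logic: the package log is only retrieved
# once the WPS execution has completed.
if execution.isSucceded():  # method name as spelled in owslib
    job.status = map_status(STATUS_SUCCEEDED)
    job.status_message = "Job succeeded{}.".format(msg_progress)
    wps_package.retrieve_package_job_log(execution, job)
    job.save_log(logger=task_logger)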

This removes any opportunity for the app to report job 'progression', which is very inconvenient for long-running applications that provide gradual updates.

Short of very dirty hacks, there is no way to pass the job UUID directly to the sub-process (celery task) so that logs can be appended gradually, because the execution itself relies on the owslib WPSExecute request sent to the PyWPS service.

Investigate

All of the 'progressive' logging described above assumes that the log can be retrieved asynchronously while a docker application is running.
This needs further investigation, as the log might not be available until CWL completes the subprocess.
Edit: a quick test using a tail -f std[err|out].log shell call shows that the files are correctly updated while the docker app is running locally. We should only need to retrieve their contents and append them to the job logs, which can be done directly at the WpsPackage level. For remote processes, we cannot do more than what the status monitoring route provides (i.e., what is already done since PR #132).
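The tail -f observation suggests that incremental retrieval could be as simple as remembering a byte offset between reads. Below is a minimal sketch of that idea, assuming the capture file is a local, append-only UTF-8 text file. The function read_new_lines is hypothetical and not part of Weaver.

```python
def read_new_lines(path, offset=0):
    """Return (new_lines, new_offset) for complete lines written after 'offset'.

    Hypothetical helper: assumes an append-only log file on the local disk.
    """
    try:
        with open(path, "rb") as log_file:
            log_file.seek(offset)
            chunk = log_file.read()
    except FileNotFoundError:
        # The app may not have created its log file yet.
        return [], offset
    # Only consume up to the last newline, so a line that is still being
    # written is left for the next call instead of being split in two.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

Calling it repeatedly with the previously returned offset yields only the lines added since the last call, which is what would be appended to the job logs.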

Steps

  • add the 'base' CWL stdout hook to at least retrieve the full log at the end of the execution
  • implement "Remove double process execution monitoring layers" (#21)
  • use an async routine that gradually appends new log entries to the job's full log, to get gradual updates of the job execution (this will require managing concurrent job reads/saves)
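For the last step, one possible shape of the async routine is a background thread that polls the capture file and appends new lines to a lock-protected job log. This is only a sketch under assumptions: the class JobLogFollower, its polling interval and its in-memory line list are all hypothetical, and a real implementation would have to save to the job store instead.

```python
import threading


class JobLogFollower(threading.Thread):
    """Poll a log file and gradually append its new lines to a shared list.

    Hypothetical sketch: the in-memory 'lines' list stands in for the job log.
    """

    def __init__(self, path, interval=0.5):
        super().__init__(daemon=True)
        self.path = path
        self.interval = interval
        self.lines = []
        self._offset = 0
        self._lock = threading.Lock()  # guards concurrent reads/appends
        self._stop_event = threading.Event()

    def _drain(self):
        try:
            with open(self.path, "rb") as log_file:
                log_file.seek(self._offset)
                chunk = log_file.read()
        except FileNotFoundError:
            return  # log file not created yet
        end = chunk.rfind(b"\n") + 1  # keep incomplete last line for later
        if end:
            self._offset += end
            new_lines = chunk[:end].decode("utf-8", errors="replace").splitlines()
            with self._lock:
                self.lines.extend(new_lines)

    def run(self):
        # Event.wait returns False on timeout, True once stop() was called.
        while not self._stop_event.wait(self.interval):
            self._drain()
        self._drain()  # final pass so lines written just before stop are kept

    def snapshot(self):
        """Return a copy of the lines collected so far (safe from any thread)."""
        with self._lock:
            return list(self.lines)

    def stop(self):
        self._stop_event.set()
        self.join()
```

The follower would be started before the package runs and stopped once it completes, while the status route reads snapshot() in between. The lock is the minimal form of the concurrent read/save management mentioned in the step.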

Relates to DAC-535

@fmigneault fmigneault added triage/feature New requested feature. project/DACCS Related to DACCS project (https://github.com/orgs/DACCS-Climate) labels May 11, 2020
@fmigneault fmigneault self-assigned this May 11, 2020
@fmigneault fmigneault added the triage/investigate Exploration tasks or issues requiring more analysis label May 12, 2020
@fmigneault fmigneault added the feature/job Issues related to job execution, reporting and logging. label May 12, 2020
@fmigneault fmigneault added the project/CRIM-DEVOPS Project linked to the CRIM project DevOps/Weaver (https://crim-ca.atlassian.net/browse/GD-47). label May 7, 2024
@github-actions github-actions bot added the feature/docker Issue related to Docker application package execution. label May 7, 2024