suspend/suspend_advanced_auto always returns passed result #1579

Open
1 task done
hanhsuan opened this issue Nov 5, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@hanhsuan
Contributor

hanhsuan commented Nov 5, 2024

Bug Description

When suspend/suspend_advanced_auto runs its test through fwts, the piped command below makes the job always return a pass:

set -o pipefail; checkbox-support-fwts_test -f none -l "$PLAINBOX_SESSION_SHARE"/suspend_single.log -s s3 --s3-sleep-delay=30 --s3-device-check --s3-device-check-delay=45 | tee "$PLAINBOX_SESSION_SHARE"/suspend_single_times.log
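
For background on how a pipe can mask a failure: by default a shell pipeline reports the exit status of its last command (in the command above, that would be tee), while bash's `set -o pipefail` reports the rightmost non-zero status instead. The sketch below reproduces those two rules in Python with stand-in child processes. The names `run_pipeline`, `status_without_pipefail` and `status_with_pipefail` are hypothetical and not part of Checkbox, and the sketch only illustrates the general mechanism, not the root cause of this issue.

```python
import subprocess
import sys


def run_pipeline(producer_exit_code):
    """Run `producer | consumer` and return (producer_rc, consumer_rc).

    The producer stands in for the test command and the consumer for `tee`;
    both are hypothetical stand-ins implemented as tiny Python child processes.
    """
    producer = subprocess.Popen(
        [sys.executable, "-c",
         f"import sys; print('log line'); sys.exit({producer_exit_code})"],
        stdout=subprocess.PIPE,
    )
    consumer = subprocess.Popen(
        [sys.executable, "-c", "import sys; sys.stdin.read()"],
        stdin=producer.stdout,
        stdout=subprocess.DEVNULL,
    )
    # Close the parent's copy so the consumer sees EOF when the producer exits.
    producer.stdout.close()
    consumer.wait()
    producer.wait()
    return producer.returncode, consumer.returncode


def status_without_pipefail(codes):
    # Default shell rule: the pipeline's status is the last command's status.
    return codes[-1]


def status_with_pipefail(codes):
    # pipefail rule: the rightmost non-zero status wins, otherwise zero.
    for code in reversed(codes):
        if code != 0:
            return code
    return 0
```

With a producer that exits non-zero, `status_without_pipefail` still yields 0 because the consumer succeeded, whereas `status_with_pipefail` surfaces the producer's exit code.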

Cert-blocker Test Case

  • This issue is about a test case that has the "blocker" certification status

To Reproduce

Run suspend/suspend_advanced_auto.

Environment

  • OS: 24.04
  • checkbox version: 4.2.0dev99ubuntu24.04.1
  • checkbox type: debian

Relevant log output

No response

Additional context

submission

@hanhsuan added the "bug" (Something isn't working) label on Nov 5, 2024

Thank you for reporting your feedback to us!

The internal ticket has been created: https://warthogs.atlassian.net/browse/CHECKBOX-1642.

This message was autogenerated

@zongminl changed the title from "suspend/suspend_advanced_auto is always return pass" to "suspend/suspend_advanced_auto always returns passed result" on Nov 7, 2024
@hanhsuan
Contributor Author

From the kernel documentation:

In the /sys/power/suspend_stats directory, you’ll find several files that provide information about the system’s suspend and resume behavior. Two of these files are fail and failed_suspend. While they may seem similar, they actually track different aspects of the suspend process.

fail file: The fail file contains a counter that increments whenever a suspend operation fails due to a hardware or driver issue. This can happen when a device is not properly suspended or resumed, causing the system to fail to enter or exit suspend mode. The fail counter is reset to 0 when the system is rebooted.

failed_suspend file: The failed_suspend file contains a counter that increments whenever a suspend operation fails due to a system-wide issue, such as a kernel panic or an unhandled interrupt. This can happen when the system encounters an unexpected error during the suspend process, preventing it from completing successfully. Like the fail counter, the failed_suspend counter is also reset to 0 when the system is rebooted.

Key differences:

  • Scope: The fail file focuses on hardware or driver-specific issues, while the failed_suspend file tracks system-wide issues that prevent suspend from completing.
  • Error type: The fail file typically indicates errors related to device suspension or resume, such as a device not being properly suspended or resumed. In contrast, the failed_suspend file indicates more severe errors, like kernel panics or unhandled interrupts, that prevent the system from suspending or resuming correctly.
  • Impact: A non-zero value in the fail file may indicate a problem with a specific device or driver, while a non-zero value in the failed_suspend file suggests a more fundamental issue with the system’s suspend and resume mechanism.

To illustrate the difference, consider the following scenarios:

  • If a USB device is not properly suspended, the fail counter might increment, indicating a hardware or driver issue.
  • If the system encounters a kernel panic during suspend due to an unhandled interrupt, the failed_suspend counter would increment, indicating a more severe system-wide issue.

By monitoring both files, you can gain insights into the reliability and stability of your system’s suspend and resume behavior.

Tested on 202411-35985.

After the suspend attempt, fail is 2 and failed_suspend is 1:

ubuntu@ubuntu:~$ sudo rtcwake -d /dev/rtc0 -m no -s 30 && systemctl suspend
rtcwake: assuming RTC uses UTC ...
rtcwake: wakeup using /dev/rtc0 at Thu Dec 12 01:19:09 2024
==== AUTHENTICATING FOR org.freedesktop.login1.suspend ====
Authentication is required to suspend the system.
Authenticating as: ubuntu,,, (ubuntu)
Password: 
==== AUTHENTICATION COMPLETE ====
ubuntu@ubuntu:~$ cat /sys/power/suspend_stats/fail
2
ubuntu@ubuntu:~$ cat /sys/power/suspend_stats/failed_suspend
1

The journal log shows that the first suspend failed due to [PM] MD suspend error: -110, which triggered another suspend attempt. The second suspend was stopped by systemd-sleep[2795]: Failed to put system to sleep. System resumed again: Connection timed out.

The two device failures set fail to 2, and the single failure from systemd-sleep sets failed_suspend to 1. Therefore failed_suspend might be the better counter to use for failing the suspend_advanced_auto test case.
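
One way this suggestion could be sketched is to snapshot a counter before and after the suspend attempt and report a failure if it grew. The directory and file names come from the transcript above. The helper names `read_counter` and `run_checked_suspend`, and the `trigger_suspend` callback, are hypothetical. This is a minimal sketch under those assumptions, not how Checkbox implements the job.

```python
from pathlib import Path

# Path taken from the transcript above; pass another directory when testing.
SUSPEND_STATS_DIR = Path("/sys/power/suspend_stats")


def read_counter(name, stats_dir=SUSPEND_STATS_DIR):
    """Return the integer stored in one suspend_stats file, e.g. 'failed_suspend'."""
    return int((Path(stats_dir) / name).read_text().strip())


def run_checked_suspend(trigger_suspend, counter="failed_suspend",
                        stats_dir=SUSPEND_STATS_DIR):
    """Call trigger_suspend() and return 1 if the counter grew, else 0.

    trigger_suspend is a caller-supplied callable (hypothetical here) that
    suspends the system and returns once it has resumed.
    """
    before = read_counter(counter, stats_dir)
    trigger_suspend()
    after = read_counter(counter, stats_dir)
    # A larger value after resume means a new failure was recorded.
    return 1 if after > before else 0
```

Comparing snapshots rather than testing for a non-zero value keeps failures from earlier suspend cycles in the same boot from failing the current run. The return value can be passed to `sys.exit()` so the job's exit status reflects the outcome.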
