HDD IO check not working for disks in ZFS pools #132

Open

Fate6174 opened this issue Jun 20, 2024 · 1 comment

Fate6174 commented Jun 20, 2024

The function _check_hddio does not take disks in ZFS pools into account. In my case, the iostat command in lines 973-975

done < <(LC_ALL=C iostat -kdyNz -o JSON |
          jq -r '.sysstat.hosts[].statistics[].disk[] |
                 "\(.disk_device) \(.kB_read) \(.kB_wrtn)"');

gives IO information for the disks sda to sdf. These are then looked up in the output of mount -l in line 905, but they are not found, so the check skips them with the log message "DEBUG: Skipping as no mount point". This is because disks that are part of ZFS pools do not show up in the output of mount -l; only the ZFS pool names do. As a result, autoshutdown suspends my server even while ZFS resilvers or scrubs are running (which generate more than enough disk IO).

For now, I have just commented out the mount point check in lines 905-907, and that works correctly in my setup. But I am not sure what the best general solution would be.

  • Is this check necessary at all? If there are disks that are not mounted, does this lead to some error or undesired behavior further down the line?
  • Is there a nice way to properly detect whether a disk is assigned to a ZFS pool that is mounted? E.g., the command zpool status lists all available pools with their respective disks, but the disks may not be represented by the identifier used by iostat; they may instead appear by their UUID or something similar. (One possible sketch is below.)
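
As a quick illustration of the second point, one possible way to test whether a disk backs an imported pool could be a sketch like the following (untested; it assumes zpool status supports the -L and -P flags to print resolved full vdev paths, so that e.g. /dev/sda1 shows up for disk sda):

hdd="sda"   # hypothetical example; a disk name as reported by iostat
# Does any imported pool use a vdev on this disk?
if command -v zpool >/dev/null 2>&1 &&
   zpool status -LP 2>/dev/null | grep -q "/dev/${hdd}"; then
    echo "${hdd} is part of an imported ZFS pool"
fi
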
@Fate6174 (Author)

Answering my second bullet point, I think one could extend the check in lines 905-907 by changing it from

! mount -l | grep -q "${hdd}" && {
        _log "DEBUG: Skipping as no mount point"
        continue; }

to

! mount -l | grep -q "${hdd}" &&
    command -v zpool >/dev/null 2>&1 &&
    ! ZPOOL_SCRIPTS_AS_ROOT=1 zpool status -c upath | grep "${hdd}" | grep -q -e "ONLINE" -e "DEGRADED" && {
        _log "DEBUG: Skipping as no mount point"
        continue; }

Line 1: Test if ${hdd} is mounted normally (same as before). If NOT, go to line 2.
Line 2: Test if the zpool command is available. If YES, go to line 3.
Line 3: Test if ${hdd} appears as a disk with status "ONLINE" or "DEGRADED" in the output of zpool status -c upath. If NOT, go to line 4. The ZPOOL_SCRIPTS_AS_ROOT=1 variable is needed if autoshutdown runs with root privileges (I don't know whether that is the case); if it does not, the variable can be omitted.
Line 4: Log that the disk is not mounted (same as before).
Line 5: Continue the loop (same as before).
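
To verify line 3 outside of autoshutdown, a quick manual test could look roughly like this (a sketch only; it assumes the upath script shipped in ZFS's zpool.d directory is installed, and it uses a hypothetical disk name):

hdd="sda"   # hypothetical example; substitute a disk name as reported by iostat
if ZPOOL_SCRIPTS_AS_ROOT=1 zpool status -c upath | grep "${hdd}" | grep -q -e "ONLINE" -e "DEGRADED"; then
    echo "${hdd} is in an imported ZFS pool (ONLINE or DEGRADED)"
else
    echo "${hdd} is not in any imported ZFS pool"
fi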

Thoughts?
