Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separating Ext2/3/4 from filesys and using lsblk -rpo instead /proc/mounts #3502

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

lherbolt
Copy link
Contributor

@lherbolt lherbolt commented Feb 1, 2024

[xfs,filesys,ext] Get list of devices by lsblk -lpo NAME,FSTYPE.

It can happen the FS of interest is not actually mounted and then we have no info about it. This is revorks of how we should get list of devices. It adds a new init.py fuction to decrease code duplication get_dev_by_fstype(self,fstype). It is using lsblk instead of /proc/mounts. Last change is separation of Ext2/3/4 info into separate module from filesys.


Please place an 'X' inside each '[]' to confirm you adhere to our Contributor Guidelines

  • Is the commit message split over multiple lines and hard-wrapped at 72 characters?
  • Is the subject and message clear and concise?
  • Does the subject start with [plugin_name] if submitting a plugin patch or a [section_name] if part of the core sosreport code?
  • Does the commit contain a Signed-off-by: First Lastname [email protected]?
  • Are any related Issues or existing PRs properly referenced via a Closes (Issue) or Resolved (PR) line?

Copy link

Congratulations! One of the builds has completed. 🍾

You can install the built RPMs by following these steps:

  • sudo yum install -y dnf-plugins-core on RHEL 8
  • sudo dnf install -y dnf-plugins-core on Fedora
  • dnf copr enable packit/sosreport-sos-3502
  • And now you can install the packages.

Please note that the RPMs should be used only in a testing environment.

Copy link
Member

@arif-ali arif-ali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this refactor, we are now collecting potentially the same set of items from 2 separate plugins, and we are typically not keen on that.

We'll also need to double check to see if any automation that our downstream have that interrogates files collected in filesys in order for them to be removed from there.

I am sure other core reviewers may have other things on this

sos/report/plugins/ext.py Outdated Show resolved Hide resolved
sos/report/plugins/ext.py Show resolved Hide resolved
@jcastill
Copy link
Member

jcastill commented Feb 1, 2024

Last change is separation of Ext2/3/4 info into separate module from filesys.

I'm not sure if this should go in the same commit or a separated one. My feeling is that it should be a second commit, but lets hear other's opinions.

We are still using /proc/mounts in filesys.py:

       dumpe2fs_opts = '-h'
        if self.get_option('dumpe2fs'):
            dumpe2fs_opts = ''
        mounts = '/proc/mounts'
        ext_fs_regex = r"^(/dev/\S+).+ext[234]\s+"
        for dev in self.do_regex_find_all(ext_fs_regex, mounts):
            self.add_cmd_output("dumpe2fs %s %s" % (dumpe2fs_opts, dev),
                                tags="dumpe2fs_h")

So it may be worth removing that code completely from filesys now, since you'll be getting it via this new plugin.

@lherbolt
Copy link
Contributor Author

lherbolt commented Feb 1, 2024

With this refactor, we are now collecting potentially the same set of items from 2 separate plugins, and we are typically
not keen on that.

Yeah sorry about that forgot to push filesys where the Ext2/3/4 are removed. I will update the pull request.

@jcastill
Copy link
Member

jcastill commented Feb 1, 2024

We'll also need to double check to see if any automation that our downstream have that interrogates files collected in filesys in order for them to be removed from there.

Same on our side, but I see a lot of benefits on these changes, specially separating ext from filesys like xfs is -that is, with my storage/filesystem engineer hat, I can find it really really useful.

We need to find a way to make sure these changes are properly propagated to users using automation. Some kind of policy for coexistence and maybe duplication for some time, in terms of minor releases. Other options, but I imagine these are less agreeable, are:

  • Create a link in filesys to the dumpe2fs outputs in ext.
  • If filesys is called with dumpe2fs, make sure the option is enabled in the ext module.

I don't think we've ever done these kind of things in sos, I guess to avoid interlinking plugins.

Copy link
Member

@arif-ali arif-ali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some other changes that came into mind, based on contribution guidelines

sos/report/plugins/ext.py Outdated Show resolved Hide resolved
sos/report/plugins/ext.py Outdated Show resolved Hide resolved
sos/report/plugins/ext.py Outdated Show resolved Hide resolved
sos/report/plugins/ext.py Outdated Show resolved Hide resolved
sos/report/plugins/xfs.py Outdated Show resolved Hide resolved
sos/report/plugins/xfs.py Outdated Show resolved Hide resolved
sos/report/plugins/xfs.py Outdated Show resolved Hide resolved
sos/report/plugins/xfs.py Outdated Show resolved Hide resolved
It can happen the FS of interest is not actually mounted and then
we have no info about it. This is revorks of how we should get
list of devices. It adds a new __init__.py fuction to decrease
code duplication get_dev_by_fstype(self,fstype). It is using lsblk
instead of /proc/mounts. Last change is separation of Ext2/3/4
info into separate module from filesys.

Signed-off-by: Lukas Herbolt <[email protected]>
@lherbolt
Copy link
Contributor Author

lherbolt commented Feb 2, 2024

I hope I add all the changes. About the duplicate info, we can append to the filesys dumpe2fs some note about being deprecated in next X.Y releases.

Copy link
Member

@TurboTurtle TurboTurtle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly a couple minor points below, however I think it warrants fleshing out moving the discovery of filesystems to SoSReport, as part of our _get_hardware_devices() bits, so that we generate this information once, and have plugins (and potentially any other part of sos) reference that collection.

Comment on lines +1619 to +1630
def get_devices_by_fstype(self, fstype):
"""Get list of devices based on fstype and returns empty array in case
of failure.
"""
dev = []
all_devs = self.exec_cmd("lsblk -rpo NAME,FSTYPE")
if (all_devs['status'] == 0 and not all_devs['truncated']):
if all_devs['status'] == 0:
for line in all_devs['output'].splitlines():
if (fstype in line) and (line.split()[0] not in dev):
dev.append(line.split()[0])
return dev
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm torn about having this move from Plugin to SoSReport, and have it run alongside our device introspection, like for block devs. Granted, these aren't hardware devices, but it's in the same vein as what we do for storage and network interfaces.

That way we run this once, when sos first initializes, and then in Plugin this function would instead simply filter through an already-made list/dict/whatever for those matching the requested fstype.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I will into it, the main idea here is to reuse the info, not just for FS but also md.py can benefit from it as the FSTYPE in limited to FS as per blkid -k

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm torn about having this move from Plugin to SoSReport, and have it run alongside our device introspection, like for block devs. Granted, these aren't hardware devices, but it's in the same vein as what we do for storage and network interfaces.

That way we run this once, when sos first initializes, and then in Plugin this function would instead simply filter through an already-made list/dict/whatever for those matching the requested fstype.

So I have the work almost ready to have this merged into _get_hardware_devices(), but there is thing. If I create nested dict like:

 {'storage': {'block': ['data0', 'data1'], 'fstype': {'unknown': ['block0','block1'],
                                 'vfat': ['block2']}}}

The function recursive_dict_values_by_key() will not correctly return the device list, only if I put it into same level as storage and network dict are. So the function does not seems to recursive properly. Any pointers what could be there wrong?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I should have been clearer - my intent was to suggest that this be at the same level as storage, network, or similar. So plugins would access it via something like for fs in self.devices['filesystems']['xfs']: ...blah....

Again, this is a bit of a misnomer placing it under devices, so if you prefer we can also make a separate attribute to hold these, so that plugins could do something like for fs in self.filesystems['xfs']:

Comment on lines +37 to +38
if allfs:
for fs in allfs:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conditional is superfluous, if allfs is an empty list, python just continues on, skipping whatever is in the for-loop.

Copy link
Contributor Author

@lherbolt lherbolt Feb 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting something like:

    def setup(self):
        allfs = self.get_devices_by_fstype('xfs')
        for fs in allfs:
            self.add_cmd_output(f"xfs_info {fs}", tags="xfs_info")
            self.add_cmd_output(f"xfs_admin -l -u {fs}")

        if not allfs:
            mounts = '/proc/mounts'
            ext_fs_regex = r"^(/dev/.+).+xfs\s+"
            for dev in zip(self.do_regex_find_all(ext_fs_regex, mounts)):
                for e in dev:
                    parts = e.split()
                    self.add_cmd_output(f"xfs_info {parts[1]}",
                                        tags="xfs_info")
                    self.add_cmd_output(f"xfs_admin -l -u {parts[0]}")

        self.add_copy_spec(self.files)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that's correct. We could even button it up a little more by for fs in self.get_devices_by_fstype('xfs'): provided there is nothing else using allfs later on.

Comment on lines +86 to +87
if allfs:
for fs in allfs:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, we can drop the conditional and unnest the entire block by one level here.

Comment on lines +29 to +30
if allfs:
for fs in allfs:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here.

@TurboTurtle
Copy link
Member

We'll also need to double check to see if any automation that our downstream have that interrogates files collected in filesys in order for them to be removed from there.

Same on our side, but I see a lot of benefits on these changes, specially separating ext from filesys like xfs is -that is, with my storage/filesystem engineer hat, I can find it really really useful.

We need to find a way to make sure these changes are properly propagated to users using automation. Some kind of policy for coexistence and maybe duplication for some time, in terms of minor releases. Other options, but I imagine these are less agreeable, are:

  • Create a link in filesys to the dumpe2fs outputs in ext.
  • If filesys is called with dumpe2fs, make sure the option is enabled in the ext module.

I don't think we've ever done these kind of things in sos, I guess to avoid interlinking plugins.

Automation should be using sos_reports/manifest.json to find the files they're after. It's the entire point of the manifest - that things move and automation needs a source of truth for finding where stuff is written. Even better, would be the explicit use of the tags for the collections - every command gets an automated set of tags but we can add specific ones as well that can be tailored to what an automation pipeline wants to leverage.

This isn't really that different from other plugins we've broken up in the past, like general. Stuff moves, this just happens to be a particularly old plugin that is getting re-worked. Automation uses manifest.json, users get messaging. We'll include this in the change notes for 4.7.0 and from there it's up to downstreams to inform their own users.

Comment on lines +75 to +81
option_list = [
PluginOpt('dumpe2fs', default=False, desc='dump filesystem info'),
PluginOpt('frag', default=False,
desc='collect filesystem fragmentation status')
]

def setup(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, missed this on my first review. This seems like a copy/paste error. The options_list shouldn't go within setup(), and we've got a duplicate definition of the setup() method, which would break several things. If I'm reading everything right on the github webui, I think you just need to remove these lines entirely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah copy&paste error

@lherbolt
Copy link
Contributor Author

I will finalize this one after ACK/NACK/CHANGE of #3538

@arif-ali
Copy link
Member

@lherbolt any updated on this one, would be cool to finalise this, now that #3538 has been merged

@arif-ali arif-ali added the Status/Need More Info Feedback is required to reproduce issue or to continue work label Nov 20, 2024
@jcastill jcastill added the Status/RedHat Eng RedHat Engineering has been requested to review label Nov 21, 2024
@lherbolt
Copy link
Contributor Author

uups went out of my radar, will look at it asap

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status/Need More Info Feedback is required to reproduce issue or to continue work Status/RedHat Eng RedHat Engineering has been requested to review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants