v560tu seems slow on IO, needs confirmation #1894
I assume it would be useful to document disk/memory in both setups as well
Modified OP @macpijan
What are the models of SSDs in both laptops? V560TU could have a DRAM-less disk, which is much slower in random I/O
I think it's reasonable to assume the disks are what @wessel-novacustom is offering in the configurator. V56 offers only one option: https://novacustom.com/product/v56-series/ While NV41 used to offer Samsung disks: https://novacustom.com/product/nv41-series/ Can we please summarize the exact hw configurations being compared here?
NV41 can come with a 980 Pro (with DRAM cache) or a 980 (DRAM-less). Apparently the PX700 disk in V560TU does not have a DRAM cache but goodram sells it as
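As an aside, the exact disk model and memory size being compared could be captured with standard tools; a minimal sketch (the NVMe device path is an example, and smartmontools/nvme-cli may need to be installed first):

```bash
# Disk model, capacity and firmware (device path is an example)
lsblk -d -o NAME,MODEL,SIZE /dev/nvme0n1
sudo smartctl -i /dev/nvme0n1   # smartmontools: model, firmware, capacity
sudo nvme list                  # nvme-cli: all NVMe drives in the system

# Installed memory
free -h
sudo dmidecode -t memory | grep -E 'Size|Speed'
```

Note that none of these report whether the drive has a DRAM cache; that still has to be checked against the model's datasheet.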
I'm not sure if that's actually the issue. If I compare I/O write and especially read operations (especially with small files), the -TU models are performing badly compared to the NVIDIA variants. Now that's for UEFI of course, but it might still make sense to compare, to see if there is a difference and why.
@mkopec @macpijan @wessel-novacustom: updated OP and opened issue QubesOS/qubes-issues#9723. Want me to run some tests? Please detail. Also note that, as pointed out at #1889 (comment): when we attempted that coreboot version bump, performance worsened to the point of systemd services timing out during the first stage of the QubesOS installation, and templates installation took more than an hour.
Some additional notes from testing, will edit (re-testing https://github.com/tlaurion/heads/tree/perf_comparison_with_reverted_coreboot_version_bump_v560tu)
The insight is not only that IO is slower, but that CPU speed seems to be limited to a crawl as well.
full log:
Excerpt:
So similar to (but still slower than) @mkopec's dump at #1894 (comment):
I agree this should be checked. Do we already have an issue tracking this problem? Also, do we want to do it right now? Ideally, we build the Heads release from the same (or similar) coreboot base as the previous UEFI release.
Thanks for this test, this will be a useful testing point for the future. We should focus on testing what we are aiming to release right now, however. We propose that we both confirm once again that the binary from this commit is "OK" in terms of performance, as reported by us here previously: #1894 (comment) Where "OK" means "comparable to the existing UEFI release on the same hardware specification", not "strictly better than the previous laptop model in this specific benchmark", especially since devices with different hardware configurations are being compared here. If confirmed, we propose that we use this commit as a release for Dasharo+heads @tlaurion @wessel-novacustom to not postpone it any longer. We can continue investigating the performance concerns, such as those raised by @wessel-novacustom here #1894 (comment), in individual dasharo issues.
Ideally, it would be fixed, despite the coreboot base being slightly different in that case. If a coreboot patch can be made to fix this issue, I'm interested in getting a link to that patch, so I can assist customers with UEFI firmware who are complaining about this. My suggestion is to make a
I will move this off-channel to discuss more quickly and come back here with conclusions.
...isn't
Here's mine, all with default settings. Of course, with LUKS, LVM and Xen between the disk and the benchmark this isn't measuring just I/O, but it should be more objective. @tlaurion can you share yours for comparison?
Sorry 64gb, will edit prior posts :( |
Thanks for testing! So you are seeing significantly worse performance than me... I think next we will test on the same 4TB disk model as in your V560TU
It's really hard to compare things, intra-group and inter-group. For example, nv41 doesn't require being installed with kernel-latest. Also, my nv41 setup is btrfs-based with heavy optimizations, as opposed to the default LVM setup, with no revisions to keep (leaving wyng to do the one revision to keep, being the snapshot corresponding to the last backup). Still, here are some stats that could be eye-opening for considering what needs to be improved next.

nv41 stats

snap: let's not use that.

debian-template, stable kernel (6.6.68)

fedora-40-dvm, stable kernel (6.6.68):

fedora-40-dvm, latest kernel (after dom0
How come this is a Heads issue, as opposed to Dasharo-UEFI (or simply coreboot), the m2 drive, or the qubesos latest kernel? I'm not sure I follow. As depicted in the 3rd picture at #1894 (comment), switching nv41 to the qubesos latest kernel showed perf issues similar to v560tu, which requires installation and usage of kernel-latest at install, as opposed to nv41. No? Then there is the 4TB drive, which has no DRAM cache and is offered as an option, as compared to the tests of @mkopec. Sorry, but this cannot be Heads specific, nor Heads' fault. I'm ok putting this under known issues if sub-issues are opened and referred to in downstream releases. This issue will stay open and be referred to in the sub-issues created. From gut feeling here, it's either an important perf issue caused by the latest kernel on which v560tu depends, the 4TB drive lacking a DRAM cache, or coreboot doing something funky, but Heads has nothing to do with what is observed here. Heads' job is long done and irrelevant to the observed issue. Unless the coreboot commit used differs between the two versions and is the cause, for the same HCL.
So I guess what you mean is that the same notes should be present for both Dasharo-UEFI and Dasharo-Heads, since it's not Heads specific, and if it is, it's because of a coreboot commit difference.
Issue has been raised separately for UEFI: Dasharo/dasharo-issues#1216
@tlaurion do you have any CLI version of a benchmark that shows this issue? Something that would produce text output that I can then parse with a script, compare with plain diff etc. I would like to add a disk performance test to our CI, but manually comparing graphical screenshots is not going to fly.
Maybe some specific configs to the
I don't have insights on this matter, unfortunately.
@marmarek any command line command you propose I can reuse? kdiskmark is what end users on the forum use to show things visually. Replicating what kdiskmark does, from a command line perspective, could help here, yes.
Untested, please advise @marmarek

Random Read Test
```
fio --name=randread-test --filename=/path/to/testfile --rw=randread --bs=4k \
    --size=1G --numjobs=4 --time_based --runtime=60 --iodepth=32 \
    --ioengine=libaio --direct=1 --group_reporting
```

Random Write Test
```
fio --name=randwrite-test --filename=/path/to/testfile --rw=randwrite --bs=4k \
    --size=1G --numjobs=4 --time_based --runtime=60 --iodepth=32 \
    --ioengine=libaio --direct=1 --group_reporting
```

Mixed Random Read/Write Test
```
fio --name=randrw-test --filename=/path/to/testfile --rw=randrw --bs=4k \
    --size=1G --numjobs=4 --time_based --runtime=60 --iodepth=32 \
    --ioengine=libaio --direct=1 --group_reporting
```

Sequential Read Test
```
fio --name=seqread-test --filename=/path/to/testfile --rw=read \
    --bs=256k --size=1G --numjobs=4 --time_based --runtime=60 \
    --iodepth=32 --ioengine=libaio --direct=1 --group_reporting
```

Sequential Write Test
```
fio --name=seqwrite-test --filename=/path/to/testfile --rw=write \
    --bs=256k --size=1G --numjobs=4 --time_based --runtime=60 \
    --iodepth=32 --ioengine=libaio --direct=1 --group_reporting
```

Explanation of parameters:
- `--rw`: access pattern (randread, randwrite, randrw, read, write)
- `--bs`: block size (4k for the random tests, 256k for the sequential ones)
- `--size`: size of the test file per job
- `--numjobs`: number of parallel jobs
- `--iodepth`: queue depth per job
- `--ioengine=libaio`: asynchronous I/O engine
- `--direct=1`: bypass the page cache (O_DIRECT)
- `--time_based --runtime=60`: run each test for a fixed 60 seconds
- `--group_reporting`: aggregate the statistics of all jobs

These commands output metrics such as IOPS, bandwidth (throughput), average latency, and I/O depth distribution, allowing you to replicate the performance insights kdiskmark provides.
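Since the ask above was for text output that can be parsed with a script and compared with plain diff, here is a minimal, untested sketch wrapping the commands above; the test file path is a placeholder and the per-kernel output directory is just one possible convention:

```bash
#!/bin/bash
# Run the fio tests above and keep terse (semicolon-separated) results,
# one directory per running kernel version, for later diffing/parsing.
set -euo pipefail

TESTFILE=/path/to/testfile              # placeholder: file on the disk under test
OUTDIR="fio-results-$(uname -r)"
mkdir -p "$OUTDIR"

run_test() {
    local name=$1 rw=$2 bs=$3
    fio --name="$name" --filename="$TESTFILE" --rw="$rw" --bs="$bs" \
        --size=1G --numjobs=4 --time_based --runtime=60 --iodepth=32 \
        --ioengine=libaio --direct=1 --group_reporting \
        --output-format=terse --output="$OUTDIR/$name.txt"
}

run_test randread-test  randread  4k
run_test randwrite-test randwrite 4k
run_test randrw-test    randrw    4k
run_test seqread-test   read      256k
run_test seqwrite-test  write     256k
```

Two runs on different kernels could then be compared with, for example, `diff fio-results-6.6.68*/ fio-results-6.12*/`, or by extracting specific fields from the terse output.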
I prepared something similar based on example configs provided with the fio tool: QubesOS/qubes-core-admin#649 This is running on a "hw1" runner (some HP laptop), and I'm just repeating the test with kernel-latest. So far, results are pretty much the same (no regression detected). I'll repeat the tests on NV41 or v56. If the difference is confirmed with this tool, I can for example run an automated git bisect on the kernel between the good and bad versions to identify the specific change causing the regression.
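For reference, such an automated bisect could look roughly like the sketch below; it is untested, and `check-io-perf.sh` is a hypothetical helper that would build and boot the candidate kernel, run the fio test, and exit non-zero when the regression is present:

```bash
# Hypothetical automated kernel bisect between an assumed-good and assumed-bad tag.
git clone https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
cd linux
git bisect start
git bisect good v6.6    # assumption: 6.6.x behaves well
git bisect bad v6.12    # assumption: 6.12.x shows the regression
# check-io-perf.sh is hypothetical: build/boot this revision, run the fio
# benchmark, exit 0 if bandwidth is above a chosen threshold, 1 otherwise.
git bisect run ../check-io-perf.sh
```

Each step would still require a kernel build, reboot, and benchmark cycle, so this is only a sketch of the workflow rather than something that runs unattended as-is.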
@marmarek note that my results for nv41 were on btrfs, as noted above the screenshots in the comment. It was only a reminder that, if at cause, v560tu depending on kernel-latest might partly explain the performance degradation; unfortunately, one cannot test kernel-stable on v560tu, which requires kernel-latest at initial install for Meteor Lake support, as opposed to nv41.
I did a test with fio on NV41, using a very similar config to the "Mixed Random Read/Write Test" above (differences: I let it run for 90s, not 60s; and iodepth 16, not 32) with different kernel versions. I do see a small slowdown with the 6.12 kernel in dom0, but nowhere near the effect you see. I'm looking at the "read_bandwidth_kb" and "write_bandwidth_kb" columns (columns 7 and 48 respectively): fio in dom0: @tlaurion can you check if you see the drastic difference with fio on your side? I did the test on LVM (very default installation of R4.2); if fio reports a larger difference for you, then maybe it's btrfs-specific? But if you don't see the difference either, then we need another test. In that case, can you check the other examples you provided above? Does any of them show the 2x+ difference between kernel versions for you?
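For anyone reproducing this, those column numbers appear to refer to fio's semicolon-separated terse/minimal output (an assumption based on the field positions quoted above); a one-liner like this could pull them out for quick comparison (`fio-output.txt` is a placeholder):

```bash
# Print jobname plus read/write bandwidth (KB/s) from fio terse output;
# fields 7 and 48 are the read/write bandwidth columns referenced above.
awk -F';' '{ printf "%s read=%s KB/s write=%s KB/s\n", $3, $7, $48 }' fio-output.txt
```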
dom0 comparatives and the script used, @marmarek. btrfs, no revisions to keep, 2TB drive with DRAM cache, 64 GB RAM, on USB-C power with all qubes shut down: dom0_6.6.68-1.qubes.fc37.x86_64_additional-seq-read_output.txt tar.gz with the script used inside: Let me know if I should go deeper with this and test fio from within qubes
So, I did the test on NV41 with LVM using kdiskmark in a fedora-40-xfce VM to have the same testing methodology. In my case I see a slight performance improvement (at least in read speeds) with the 6.12 kernel in dom0... The laptop has Dasharo (coreboot+heads) 0.9.0. Is there any relevant change in 0.9.1 that could affect results?
For the sake of automated testing, here are the fio commands called under the hood by kdiskmark with the default profile, @marmarek
qubes-public matrix channel thread. So we should abort this and simply consider that this is a BTRFS perf regression in 6.12.9 compared to 6.6.68.
I do not know what to do with nv41 vs v560tu performance, where nv41 performs better.
Ok, so I reinstalled the NV41 with BTRFS, and I got different results. In the order of running: So, first of all, most are a lot slower than on LVM. Same hardware, also a fresh install (almost empty disk etc). I added also
Next tests I need to do with a script (updated with the new fio commands etc.), to be sure I haven't missed any step. But that will need to wait for next week, unfortunately.
Only thing I can do is compare across devices. This is `systemd-analyze blame` output, for reference. My nv41 has bigger templates and is my main driver.
nv41: 64 GB RAM, drive: Samsung SSD 980 PRO 2TB
v560tu: 64 GB RAM, drive: SSDPR-PX700-04T-80
cc @macpijan