block clone and bulk deletions with regards to files generated by Veeam. #16680
Comments
Just on terminology: a deadlock is called dead because it never recovers. You say your system recovers after ~20 minutes, so it is not a deadlock. The question, though, is what the system is doing all that time.

The backtrace you quoted shows a thread waiting for a new transaction to open. That likely means the current one is already full and waiting to be synced, so the question becomes what the sync thread is doing. Please take a look at what your system is doing (kernel/ZFS processes, disks, etc.). The sync threads and the free requests they produce should be visible in [...]. The wait for a new transaction that we see likely stops all writes to the pool, since they all get stuck waiting for it.

In addition, the wait you quoted seems to come from the kernel's inode reclaim, which makes me wonder whether it can block even more of the system. This code is OS-specific, and I don't know the Linux VFS layer well enough to say what it means. I wonder if your system has many other open files besides the ones you are deleting here, which might overflow some limit or memory and trigger eviction, leaving something (or everything) else waiting for recycling.

I've tried to reproduce this on FreeBSD with the latest ZFS 2.3 (just what I have on my test system) and so far I can't, even turning it up to the max. I create a similar HDD RAIDZ pool, write six 10GB files that are clones of each other plus ten regular unique 1GB files, and then delete them after exporting and importing the pool to wipe the ARC caches. Even after I reduced recordsize to an insane 4KB to blow up the BRT tables, the deletion still completes within maybe 10 seconds. The profiler shows some lock contention related to cloning, which I might look at, but nothing even close to the minutes you are talking about. So it would help to see more of what your system is doing during the process.
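One way to look at what the sync thread is doing on Linux is the per-pool transaction group history that OpenZFS exposes under `/proc/spl/kstat/zfs/<pool>/txgs`. The sketch below is only an illustration, not something from the original comment: it assumes that file is present and populated (history may need to be enabled through the `zfs_txg_history` module parameter), that the column header row starts with `txg`, and that the `stime` column is the sync time in nanoseconds. The pool name `tank` in the usage note is hypothetical.

```python
from pathlib import Path


def parse_txgs(text: str) -> list[dict[str, str]]:
    """Parse the text of a txgs kstat file into one dict per transaction group."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    # Skip any leading kstat metadata and find the column header row,
    # which is assumed to begin with "txg".
    for i, cols in enumerate(lines):
        if cols[0] == "txg":
            header, rows = cols, lines[i + 1:]
            break
    else:
        return []
    # Keep only rows that have exactly one value per column.
    return [dict(zip(header, row)) for row in rows if len(row) == len(header)]


def slowest_syncs(records: list[dict[str, str]], top: int = 5) -> list[dict[str, str]]:
    """Return the records with the largest sync time (stime, assumed nanoseconds)."""
    timed = [r for r in records if r.get("stime", "").isdigit()]
    return sorted(timed, key=lambda r: int(r["stime"]), reverse=True)[:top]


def read_pool_txgs(pool: str) -> list[dict[str, str]]:
    """Read and parse the txg history for one pool (Linux only)."""
    return parse_txgs(Path(f"/proc/spl/kstat/zfs/{pool}/txgs").read_text())
```

Calling `slowest_syncs(read_pool_txgs("tank"))` while the `rm` is running would show whether individual transaction groups are spending a long time in sync, which is the thing the comment above asks about.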
While running some of my own tests with the equivalent of ~30TB of cloned data, I was able to reproduce some slowdowns, especially when the BRT table is not in RAM, so I'd like to post some updates here.

While deleting cloned files, ZFS has to update per-block reference counters in the BRT table. Roughly, that table takes about 72+ bytes per block in RAM (plus overheads) and twice that on disk. From [...]

To reduce or avoid those reads you might want to give ARC more RAM, which was improved in ZFS 2.3. Maybe some ARC tuning could help keep the BRTs in ARC. To speed up the BRT reads you could use some good SSD(s) as a special vdev; just consider that it will need to handle all the BRT writes, not only the reads. ZFS 2.3 got a command to manually preload the DDT tables (dedup has similar problems, just several times worse), which should be more efficient than loading them on demand, but that still waits to be implemented for BRT tables. I also created a couple of PRs to reduce per-block overheads, which might slightly help BRT memory usage with its 4KB blocks: #16684 and #16694.
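To put the "72+ bytes per block" figure into perspective, here is a rough back-of-the-envelope sketch. It treats 72 bytes as a lower bound, ignores the unspecified overheads, and takes the "twice that on disk" ratio at face value. The function name and the record sizes used in the example are illustrative assumptions, not values from this thread.

```python
BRT_BYTES_PER_BLOCK_RAM = 72  # lower bound quoted above ("72+ bytes", plus overheads)


def estimate_brt_bytes(cloned_bytes: int, recordsize: int) -> tuple[int, int]:
    """Return a rough (ram_bytes, disk_bytes) lower bound for the BRT.

    cloned_bytes: total size of cloned data tracked by the BRT
    recordsize:   block size of the dataset holding the clones, in bytes
    """
    # One BRT entry per cloned block; round partial blocks up.
    blocks = (cloned_bytes + recordsize - 1) // recordsize
    ram = blocks * BRT_BYTES_PER_BLOCK_RAM
    disk = 2 * ram  # "twice that on disk"
    return ram, disk


def to_gib(n: int) -> float:
    """Convert bytes to GiB for readable output."""
    return n / 2**30
```

Under these assumptions, `estimate_brt_bytes(30 * 2**40, 128 * 2**10)` (30 TiB of clones at a 128 KiB record size) gives about 16.9 GiB in RAM and 33.75 GiB on disk, while the same data at a 4 KiB record size comes to roughly 540 GiB in RAM. Smaller records mean many more entries to update on deletion.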
Thanks Alex, great work. We've changed our infrastructure a little, so we are now running on tin with 192GB RAM on Rocky9 (previously 64GB and running as a VM). This will certainly help to improve the overall efficiency. I've also switched to 2.3rc2 to help eliminate potential issues.
@ashleyw-gh In #16708 (comment), @robn is looking at the problem from a different side.
#16740 should improve the CPU-bound side of block cloning performance.
#16773 should do even more for block cloning performance.
#16814 should improve performance of large block cloning requests on HDD pools.
Problem:
We are testing ZFS as backend storage to Veeam.
We are running Rocky9 on x64 on a Dell PowerEdge server with 23x16TB spindles striped across 4x raid1 vdevs, with 192GB RAM.
We notice that the system seems to deadlock/hang when deleting Veeam backup files where Veeam has block cloning enabled.
For example, deleting the files below with a simple rm command hangs and deadlocks all subsequent transactions.
System information
view of subset of file system.
In the /var/log/messages file and on the console we see repeated messages such as:
Describe how to reproduce the problem
The deadlock can easily be reproduced by deleting, at the OS level, many large files that contain a high proportion of cloned blocks.
When a deadlock occurs, any subsequent activity from another shell session hangs.
The transactions appear to complete eventually, after about 20 minutes or longer (depending on how large the files are and how many are in the batch).
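A minimal sketch of a reproduction along these lines, for anyone who wants to time it outside Veeam. This is not the exact procedure used in this report: the file names, counts and sizes are placeholders, and whether `copy_file_range` actually clones blocks (rather than copying them) depends on the ZFS version and its block cloning settings. The timer only covers the unlink calls themselves, not any background freeing that follows.

```python
import os
import shutil
import time


def clone_file(src: str, dst: str) -> None:
    """Copy src to dst, preferring copy_file_range so the filesystem may clone blocks."""
    try:
        with open(src, "rb") as fin, open(dst, "wb") as fout:
            remaining = os.fstat(fin.fileno()).st_size
            while remaining > 0:
                copied = os.copy_file_range(fin.fileno(), fout.fileno(), remaining)
                if copied == 0:
                    break
                remaining -= copied
    except (AttributeError, OSError):
        # copy_file_range is missing or unsupported here: fall back to a plain copy.
        shutil.copyfile(src, dst)


def time_bulk_delete(directory: str, clones: int = 6,
                     size_bytes: int = 1 << 30, chunk: int = 1 << 20) -> float:
    """Create one base file plus `clones` copies of it, then time deleting them all."""
    base = os.path.join(directory, "base.bin")  # hypothetical file name
    with open(base, "wb") as f:
        written = 0
        while written < size_bytes:
            n = min(chunk, size_bytes - written)
            f.write(os.urandom(n))  # random data so compression cannot shrink it
            written += n
    # Flush first: freshly written data may not be clonable until it reaches disk.
    os.sync()

    paths = [base]
    for i in range(clones):
        dst = os.path.join(directory, f"clone{i}.bin")
        clone_file(base, dst)
        paths.append(dst)
    os.sync()

    start = time.monotonic()
    for path in paths:
        os.remove(path)
    return time.monotonic() - start
```

Running `time_bulk_delete("/path/on/the/pool")` with sizes closer to the real backup files, while watching the system from a second shell, should show whether the stall can be triggered without Veeam in the picture.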
I suspect similar deadlocking issues are causing various Veeam errors I've been seeing (due to the threading architecture within Veeam).
We have sync disabled for performance.
Is file deletion an extremely slow process when block cloning is in the mix, and should it block all other activity on the zvol?
Is there anything we can do to improve reliability and throughput?
and the zpool options for completeness.