Let cache lock fail after 24h and warn after 2s #498
Reviewer's Guide by Sourcery
This pull request introduces a timeout mechanism for acquiring folder locks in the cache, preventing indefinite waiting. It sets a default timeout of 24 hours and warns the user after 2 seconds if the lock cannot be acquired, suggesting manual deletion of the lock file if necessary.
Sequence diagram for acquiring a folder lock with timeout and warning
sequenceDiagram
participant User
participant FolderLock
participant FileLock
User->>FolderLock: FolderLock(folders, timeout, warning_timeout)
FolderLock->>FileLock: SoftFileLock(lock_file)
User->>FolderLock: __enter__()
alt warning_timeout < timeout
FolderLock->>FileLock: acquire(timeout=warning_timeout)
alt Lock acquired within warning_timeout
FileLock-->>FolderLock: returns
FolderLock-->>User: returns
else Lock not acquired within warning_timeout
FileLock-->>FolderLock: Timeout
FolderLock->>User: issues warning
FolderLock->>FileLock: acquire(timeout=remaining_time)
FileLock-->>FolderLock: returns
FolderLock-->>User: returns
end
else timeout < warning_timeout
FolderLock->>FileLock: acquire(timeout=timeout)
FileLock-->>FolderLock: returns
FolderLock-->>User: returns
end
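The two-phase acquisition in the diagram can be sketched with the standard library alone. This is only an illustration under stated assumptions: `SimpleSoftLock` is a hypothetical stand-in for a soft file lock (it is not the `filelock` API), `acquire_with_warning` is a hypothetical helper name, and the warning text is invented for the example.

```python
import os
import time
import warnings


class SimpleSoftLock:
    """Hypothetical stand-in for a soft file lock (not the filelock API).

    The lock counts as held while the lock file exists.
    """

    def __init__(self, lock_file, poll_interval=0.01):
        self.lock_file = lock_file
        self.poll_interval = poll_interval

    def acquire(self, timeout):
        deadline = time.monotonic() + timeout
        while True:
            try:
                # O_EXCL makes creation fail if the file already exists
                fd = os.open(self.lock_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
                os.close(fd)
                return
            except FileExistsError:
                if time.monotonic() >= deadline:
                    raise TimeoutError(self.lock_file)
                time.sleep(self.poll_interval)

    def release(self):
        os.remove(self.lock_file)


def acquire_with_warning(lock, timeout, warning_timeout):
    """Acquire ``lock``, warning once if it takes longer than ``warning_timeout``."""
    if warning_timeout < timeout:
        try:
            # Phase 1: short wait without bothering the user
            lock.acquire(timeout=warning_timeout)
            return
        except TimeoutError:
            warnings.warn(
                f"Lock could not be acquired within {warning_timeout} s; "
                f"waiting up to {timeout} s in total. "
                f"If no other process is running, delete '{lock.lock_file}'.",
                UserWarning,
            )
        # Phase 2: keep waiting for the remaining time, may raise TimeoutError
        lock.acquire(timeout=timeout - warning_timeout)
    else:
        # Timeout is not longer than the warning threshold: single attempt
        lock.acquire(timeout=timeout)
```

With `timeout=10` and `warning_timeout=2`, a call against a held lock would warn after about 2 s and raise `TimeoutError` after about 8 s more.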
Updated class diagram for FolderLock
classDiagram
class FolderLock {
-folders: str | Sequence[str]
-timeout: float
-warning_timeout: float
-lock_files: list
-locks: list
+__init__(folders: str | Sequence[str], timeout: float, warning_timeout: float)
+__enter__() : FolderLock
+__exit__()
}
Hey @hagenw - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider adding a brief explanation in the description why the default timeout was reduced to 24 hours.
- The warning message in FolderLock is very long; consider shortening it or breaking it into multiple lines for better readability.
Here's what I looked at during the review
- 🟢 General issues: all looks good
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
audb/core/load.py
Outdated
accessing the database. If timeout is reached, ``None`` is
returned. If timeout < 0 the method will block until the
database can be accessed
timeout: maximum wait time in seconds
Would you mention what is being waited for? My understanding is that you want to acquire the lock yourself and are willing to wait for timeout before you give up.
The full text is:
timeout: maximum wait time in seconds
if another thread or process is already
accessing the database.
If timeout is reached,
``None`` is returned
So we are waiting until we can get access to the database. Access needs to be blocked while another user is accessing it, as this means the cache might not be fully filled yet.
What would be your suggestion as additional text?
"So we are waiting until we can get access to the database."
Yes, and you get access to the database when the lock is removed. When reading the first line "maximum wait time in seconds" I expect something like "before" - for example before giving up my attempt to acquire the lock (and hence giving up the attempt to get database access). So for me the "if" is a little confusing.
The "if" indicates that we only need to wait if another thread or process is accessing the database. I guess the original idea was to avoid terms like lock or lock file. But as we have the problem of lock files that are not deleted, it might indeed be better to change the text. How about:
timeout: maximum time in seconds
before giving up acquiring a lock to the database cache folder.
``None`` is returned in this case
For me it becomes clearer like this.
I updated all affected docstrings (load_media(), load(), stream(), FolderLock).
I understand that the locking itself relies on the filelock package, so complications depending on OS issues or multiprocessing approaches, threading, async etc. do not have to be considered.
The main work happens in the context manager implemented in the FolderLock class.
The warning is issued directly if the immediate acquisition of the lock fails - giving details about how long lock acquisition will be attempted.
Apart from allowing the timeout to be configured using the package's settings mechanism in define, the test coverage is warranted:
- by testing for the presence of the warning message
- the old way of specifying the timeout using negative values is deprecated
One test case covering the old behavior is also removed.
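As a rough illustration of the two checks above (a warning is emitted, and negative timeouts count as deprecated), a test could look like the following sketch. It uses only the standard library; `normalize_timeout`, `DEFAULT_TIMEOUT`, the warning text, and the fallback from a negative value to the 24 h default are assumptions made for this example, not the code in this PR.

```python
import warnings

DEFAULT_TIMEOUT = 24 * 60 * 60  # 24 h in seconds, the new default described in this PR


def normalize_timeout(timeout):
    """Return a non-negative timeout (hypothetical helper).

    Mapping negative values to the default is an assumption of this sketch.
    """
    if timeout < 0:
        warnings.warn(
            "Negative timeout values are deprecated; "
            f"using {DEFAULT_TIMEOUT} s instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return DEFAULT_TIMEOUT
    return timeout


def test_negative_timeout_is_deprecated():
    # Record warnings instead of printing them, so they can be inspected
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        assert normalize_timeout(-1) == DEFAULT_TIMEOUT
    assert len(caught) == 1
    assert issubclass(caught[0].category, DeprecationWarning)
    assert "deprecated" in str(caught[0].message)
```

Checking the message text as well as the category keeps the test sensitive to an accidental change of the wording shown to users.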
The only point that I am not fully sure of: are all combinations of warning timeouts and lock acquisition timeouts covered?
And with respect to the warning timeout: when making the timeout configurable via define, should one also make the warning timeout configurable at the same time?
Apart from that I think that this is safe to approve right now - there is only one cosmetic change that is already taken care of.
I decided against adding an entry to |
Closes #497
This reduces the default waiting time for acquiring the folder lock in cache from infinity to 24 h (handled by the timeout argument in audb.load(), audb.load_media(), and audb.stream()). This seems to me a better approach as we have the problem of leftover lock files in shared cache, for which audb.load() was stuck forever before.
It also displays a warning if the lock cannot be acquired after 2 s, mentioning that the user might need to delete the lock file manually.
To test it locally, you can try:
This first returns after 2 s:
and after additional 8 s it fails with
Summary by Sourcery
Bug Fixes: