-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AP_Filesystem: ROMFS: fix open race conditions #29201
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very good find!!
I'd prefer just to put the mutex at the start of open(). We only call open in places where timing is not critical, so I think simplicity is more important than efficiency, and it would be a more obvious patch for 4.6.x
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested on CubeOrange and it is working. 80 reloads with no issue. Rolling back I can reproduce every 5 attempts or so.
0fc4ad7
to
665ddf4
Compare
Lua opens scripts to load them into memory, then the logger opens them after to stream them into the dataflash log. When loading multiple large Lua scripts from ROMFS, decompression takes a significant amount of time. This creates the opportunity for the Lua interpreter and logging threads to both be inside `AP_Filesystem_ROMFS::open()` decompressing a file. If this happens, the function can return the same `fd` for two different calls as the `fd` is chosen before decompression starts, but only marked as being used after that finishes. The read pointers then stomp on each other, so Lua loads garbled scripts (usually resulting in a syntax error) and the logger dumps garbled data. Fix the issue by locking before searching for a free record (or marking a record as free). Apply the same fix to directories as well. This doesn't protect against using the same `fd`/`dirp` from multiple threads, but that behavior is to be discouraged anyway and is not the root cause here.
665ddf4
to
92d7456
Compare
Lua opens scripts to load them into memory, then the logger opens them after to stream them into the dataflash log. When loading multiple large Lua scripts from ROMFS, decompression takes a significant amount of time. This creates the opportunity for the Lua interpreter and logging threads to both be inside
AP_Filesystem_ROMFS::open()
decompressing a file.If this happens, the function can return the same
fd
for two different calls as thefd
is chosen before decompression starts, but only marked as being used after that finishes. The read pointers then stomp on each other, so Lua loads garbled scripts (usually resulting in a syntax error) and the logger dumps garbled data.Fix the issue by locking before searching for a free record (or marking a record as free). Apply the same fix to directories as well. This doesn't protect against using the same
fd
/dirp
from multiple threads, but that behavior is to be discouraged anyway and is not the root cause here.Patch to view the evidence:
The evidence (
fd
0 is opened twice):