fix: Correct mutex scope in execute_engine() #3310
Merged
Description
The //tests/cpp:test_runtime_thread_safety test fails intermittently.
Currently the mutex protects only enqueueV3(). It should also protect setTensorAddress(), because both calls are part of the same operation on the shared execution context.
The logging below confirms the race condition.
execute_engine() {
compiled_engine->exec_ctx->setTensorAddress()
+std::cout << gettid() << " setTensorAddress " << outputs[pyt_idx].data_ptr() << std::endl;
std::unique_lock<std::mutex> lock()
compiled_engine->exec_ctx->enqueueV3()
+std::cout << gettid() << " enqueueV3" << std::endl;
}
759630 setTensorAddress 0xaeeffaa00
759626 setTensorAddress 0xaeeffba00
759624 setTensorAddress 0xaeeffca00
759631 setTensorAddress 0xaeeffda00
759629 setTensorAddress 0xaeeffea00
759627 setTensorAddress 0xaef5f5a00
759633 setTensorAddress 0xaef5f6a00
759628 setTensorAddress 0xaef5f7a00
759632 setTensorAddress 0xaef5f8a00
759625 setTensorAddress 0xaef5f9a00
759630 enqueueV3 // expects 0xaeeffaa00, but other threads (most recently 759625) called setTensorAddress in between, so the wrong output tensor is bound
759626 enqueueV3
759624 enqueueV3
759631 enqueueV3
759629 enqueueV3
759627 enqueueV3
759633 enqueueV3
759628 enqueueV3
759632 enqueueV3
759625 enqueueV3
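The widened lock scope described above can be sketched in isolation as follows. This is a minimal illustration, not the actual patch: MockContext, MockEngine, execute, and count_mismatches are hypothetical names, and the mock only models "bind an address, then launch with it" rather than the real TensorRT calls.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a shared execution context. It is NOT the
// TensorRT API; it only models "bind an address, then launch with it".
struct MockContext {
  const void* bound_output = nullptr;

  void set_tensor_address(const void* ptr) { bound_output = ptr; }

  // Returns the address that the launch would actually write to.
  const void* enqueue() const { return bound_output; }
};

// Hypothetical engine: one context shared by all threads, plus its mutex.
struct MockEngine {
  MockContext ctx;
  std::mutex mu;
};

// Binds `output` and launches under a single lock, so no other thread can
// rebind the shared context between the two steps. Returns the address used.
const void* execute(MockEngine& engine, const void* output) {
  std::unique_lock<std::mutex> lock(engine.mu);
  engine.ctx.set_tensor_address(output);
  return engine.ctx.enqueue();
}

// Runs `n` threads against one engine and counts how many launches used an
// address other than the one their own thread bound.
std::size_t count_mismatches(std::size_t n) {
  MockEngine engine;
  std::vector<int> outputs(n);                 // one distinct slot per thread
  std::vector<const void*> used(n, nullptr);   // address each launch saw
  std::vector<std::thread> workers;
  workers.reserve(n);
  for (std::size_t i = 0; i < n; ++i) {
    workers.emplace_back([&, i] { used[i] = execute(engine, &outputs[i]); });
  }
  for (auto& t : workers) t.join();

  std::size_t mismatches = 0;
  for (std::size_t i = 0; i < n; ++i) {
    if (used[i] != &outputs[i]) ++mismatches;
  }
  return mismatches;
}
```

With the lock held across both steps, count_mismatches(10) returns 0. If the lock in execute were moved down so that it covered only enqueue(), another thread could rebind the context in the gap, which is the interleaving the log above shows.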
Fixes # (issue)