-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix race in rtx timer test #134
Conversation
Generated by renovateBot
Codecov Report
@@ Coverage Diff @@
## master #134 +/- ##
=======================================
Coverage 78.94% 78.94%
=======================================
Files 43 43
Lines 2798 2798
=======================================
Hits 2209 2209
Misses 454 454
Partials 135 135
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to use atomic, but I think logical race condition is not fixed.
(frequency of failure might be reduced, though)
@@ -310,7 +311,7 @@ func TestRtxTimer(t *testing.T) { | |||
<-doneCh | |||
|
|||
assert.True(t, rt.isRunning(), "should still be running") | |||
assert.Equal(t, 6, nCbs, "must have been called 6 times") | |||
assert.Equal(t, int32(6), atomic.LoadInt32(&nCbs), "should be called 6 times") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the old code caused data race with nCbs++
, I think using atomic just replace data race error by 6 not equals to nCbs
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Btw, could you please point me to the problem? I only saw nCbs error and did not notice other problems you are talking about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the race detector detected a data race, nCbs++
and if nCbs != expectedNCbs
in assert.Equal
were executed concurrently. (These two memory access operations were reachable at same time from two routines.)
It should mean that nCbs
potentially take two different values at assert.Equal
according to the CPU load and task switch timing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah.. I see. Yes, those two events (1: nCbs++ and 2: assert.Equal) occur in two different routines but from the same cause, onRTO
tho. From what I can see in this particular test when onRTO callback is made, nCbs++ happens first, followed immediately by assert.Equal. When nRtos = 6, the timeout interval would be 320msec. If there was a delay of more than 320 msec between nCbs++ and assert.Equal, then yes, nCbs could be 7 but I think it won't happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah... I guess you were talking about L.206, not L.313. Let me take a closer look at the test case.
Description
Race conditions were detected with the variable
nCbs
used in rtx_timer_tests.go.By using sync/atomic methods on the variable, the race was gone in my local environment after updating testify to v1.6.1.
Reference issue
Fixes #129