Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race in rtx timer test #134

Merged
merged 2 commits into from
Jul 6, 2020
Merged

Conversation

enobufs
Copy link
Member

@enobufs enobufs commented Jul 6, 2020

Description

Race conditions were detected with the variable nCbs used in rtx_timer_tests.go.
By using sync/atomic methods on the variable, the race was gone in my local environment after updating testify to v1.6.1.

Reference issue

Fixes #129

@codecov
Copy link

codecov bot commented Jul 6, 2020

Codecov Report

Merging #134 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #134   +/-   ##
=======================================
  Coverage   78.94%   78.94%           
=======================================
  Files          43       43           
  Lines        2798     2798           
=======================================
  Hits         2209     2209           
  Misses        454      454           
  Partials      135      135           
Flag Coverage Δ
#go 78.94% <ø> (ø)
#wasm 69.90% <ø> (ø)
Impacted Files Coverage Δ
association.go 83.09% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 82cfade...2aac6a0. Read the comment docs.

@enobufs enobufs self-assigned this Jul 6, 2020
@enobufs enobufs requested review from at-wat and Sean-Der July 6, 2020 09:09
Copy link
Member

@at-wat at-wat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to use atomic, but I think logical race condition is not fixed.

(frequency of failure might be reduced, though)

@@ -310,7 +311,7 @@ func TestRtxTimer(t *testing.T) {
<-doneCh

assert.True(t, rt.isRunning(), "should still be running")
assert.Equal(t, 6, nCbs, "must have been called 6 times")
assert.Equal(t, int32(6), atomic.LoadInt32(&nCbs), "should be called 6 times")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the old code caused data race with nCbs++, I think using atomic just replace data race error by 6 not equals to nCbs.

Copy link
Member Author

@enobufs enobufs Jul 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!
Btw, could you please point me to the problem? I only saw nCbs error and did not notice other problems you are talking about.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the race detector detected a data race, nCbs++ and if nCbs != expectedNCbs in assert.Equal were executed concurrently. (These two memory access operations were reachable at same time from two routines.)
It should mean that nCbs potentially take two different values at assert.Equal according to the CPU load and task switch timing.

Copy link
Member Author

@enobufs enobufs Jul 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah.. I see. Yes, those two events (1: nCbs++ and 2: assert.Equal) occur in two different routines but from the same cause, onRTO tho. From what I can see in this particular test when onRTO callback is made, nCbs++ happens first, followed immediately by assert.Equal. When nRtos = 6, the timeout interval would be 320msec. If there was a delay of more than 320 msec between nCbs++ and assert.Equal, then yes, nCbs could be 7 but I think it won't happen.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah... I guess you were talking about L.206, not L.313. Let me take a closer look at the test case.

@enobufs enobufs merged commit dbc92ec into master Jul 6, 2020
@enobufs enobufs deleted the 129-fix-race-in-rtx-timer-test branch July 6, 2020 09:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Race found in travis-ci
3 participants