
Do not use ros2cli daemon in generate_policy tests. #212

Closed
hidmic wants to merge 1 commit into master from hidmic/less-flakey-tests

Conversation

@hidmic hidmic (Contributor) commented May 13, 2020

Precisely what the title says. As it stands, test_generate_policy.py may fail because of discovery latencies, CLI daemon state that spans multiple tests, and even a non-matching RMW implementation. This patch circumvents those problems by not using the daemon at all and allowing 1 second for discovery to take place (which should be enough for Fast-RTPS at least).
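
For context, a minimal sketch of what a daemon-free test with a fixed discovery window could look like (the node name, test name and assertions are illustrative, not taken from the actual test_generate_policy.py):

```python
# Hedged sketch only: module layout, node name and assertions are illustrative.
import time

import rclpy


def test_generate_policy_without_daemon():
    rclpy.init()
    try:
        # Node whose interfaces the generated policy should cover.
        node = rclpy.create_node('test_generate_policy_node')
        # Fixed discovery window, analogous to the 1 second mentioned above;
        # generous enough for Fast-RTPS, possibly not for slower RMWs.
        time.sleep(1.0)
        # ... invoke 'ros2 security generate_policy <file>' here and assert
        # that the generated policy mentions the node ...
        node.destroy_node()
    finally:
        rclpy.shutdown()
```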

@hidmic hidmic requested review from kyrofa and dirk-thomas May 13, 2020 19:43
@mikaelarguedas (Member) commented:

Yeah, it's a trade-off. I considered removing it when it caused problems a few PRs ago, but didn't do it for two reasons:

  • The wait time would need to be greater to account for discovery across various RMW implementations (e.g. Connext is typically slower); otherwise we may end up trading one source of flakiness for another.
  • Using the daemon in the sros2 tests lets us find issues with the daemon that we wouldn't find through other packages, for example "add support for get_node_names_and_namespaces_with_enclaves" (ros2cli#501).

It's not clear to me that this change is a net win overall. Maybe it's worth having some tests use the daemon and some not, so we have coverage for both scenarios.

codecov bot commented May 13, 2020

Codecov Report

Merging #212 into master will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##           master     #212   +/-   ##
=======================================
  Coverage   77.95%   77.95%           
=======================================
  Files          16       16           
  Lines         576      576           
  Branches       52       52           
=======================================
  Hits          449      449           
  Misses        107      107           
  Partials       20       20           
Flag Coverage Δ
#unittests 77.95% <ø> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@dirk-thomas dirk-thomas (Member) left a comment


LGTM.

The alternative would be to explicitly stop an already running daemon at the beginning of the test (which we do in some other tests).
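
A hedged sketch of that alternative, assuming a pytest-style test module (the fixture name and scope are illustrative, not taken from the sros2 test suite):

```python
# Sketch only: stop any already running ros2cli daemon before each test so it
# cannot carry stale graph state across tests. Fixture name is hypothetical.
import subprocess

import pytest


@pytest.fixture(autouse=True)
def stopped_ros2_daemon():
    # 'ros2 daemon stop' exits cleanly whether or not a daemon is running.
    subprocess.run(['ros2', 'daemon', 'stop'], check=False)
    yield
```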

@hidmic hidmic (Contributor, Author) commented May 13, 2020

> The wait time would need to be greater to account for discovery across various RMW implementations (e.g. Connext is typically slower); otherwise we may end up trading one source of flakiness for another.

True, but the daemon doesn't really make things any better in this case, i.e. where each test sets up a ROS graph of its own.

> Using the daemon in the sros2 tests lets us find issues with the daemon that we wouldn't find through other packages, for example ros2/ros2cli#501

True, and I agree we can change this test to cover both cases. But I do see this patch as an improvement, as it avoids cross-talk between tests and ensures they run in a consistent manner.

> The alternative would be to explicitly stop an already running daemon at the beginning of the test (which we do in some other tests).

Right, plus an explicit wait for that daemon to discover nodes instantiated within tests.
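
An explicit wait along those lines might look like the following polling helper (the name, timeout and polling period are hypothetical, not part of this PR):

```python
# Sketch only: poll the CLI-visible graph (which goes through the daemon when
# one is running) until a node shows up or a timeout expires, instead of
# relying on a fixed sleep.
import subprocess
import time


def wait_for_node(fully_qualified_name, timeout=10.0, poll_period=0.5):
    # fully_qualified_name is e.g. '/test_generate_policy_node'.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        output = subprocess.run(
            ['ros2', 'node', 'list'], capture_output=True, text=True
        ).stdout
        if fully_qualified_name in output.splitlines():
            return True
        time.sleep(poll_period)
    return False
```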

@mikaelarguedas (Member) commented:

> True, but the daemon doesn't really make things any better in this case, i.e. where each test sets up a ROS graph of its own.

For sure.
Could we run the no-daemon tests on CI with other RMW implementations to assess how much flakiness this may introduce, and tune the wait time accordingly? (cf. #168)

> True, and I agree we can change this test to cover both cases.

> The alternative would be to explicitly stop an already running daemon at the beginning of the test (which we do in some other tests).

> Right, plus an explicit wait for that daemon to discover nodes instantiated within tests.

I think this is what we should do here; otherwise we remove test coverage for things we know broke the CLI, trading it for a potential reduction in CI flakiness.

@hidmic hidmic (Contributor, Author) commented May 14, 2020

Fair enough. I'll mix and match #168 and this PR to test all RMW implementations, with and without a daemon.

@hidmic hidmic (Contributor, Author) commented May 14, 2020

Alright, this has been superseded by #214. Let's continue the discussion there.

@hidmic hidmic closed this May 14, 2020
@hidmic hidmic deleted the hidmic/less-flakey-tests branch May 14, 2020 21:04