Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use snprintf instead of stringstream to increase performance of lookupTransform() in error cases. #470

Closed
wants to merge 9 commits into from

Conversation

peci1
Copy link
Contributor

@peci1 peci1 commented Jun 18, 2020

For some reason, the stringstream construction of tf2 exception messages is much slower than snprintf().

In some cases, there may be lots of these messages generated, thus I think it makes sense to save time here.

A simple substitution of the stringstream approach with snprintf brings a two- to three-fold speedup for the error cases.

Speed test with the additions from #469 (the updated speed_test is not included in this PR, but I used it to measure the performance gain):

noetic-devel:

Info:    9 to 4
Info:    Doing 1000000 10-level 1.000000-interval tests
Info:    lookupTransform at Time(0) took 1.588220 for an average of 0.000001588
Info:    lookupTransform at Time(1) took 1.118618 for an average of 0.000001119
Info:    lookupTransform at Time(1.5) took 2.125916 for an average of 0.000002126
Info:    lookupTransform at Time(2) took 1.009223 for an average of 0.000001009
Info:    canTransform at Time(0) took 1.105921 for an average of 0.000001106
Info:    canTransform at Time(1) took 0.658616 for an average of 0.000000659
Info:    canTransform at Time(1.5) took 0.879653 for an average of 0.000000880
Info:    canTransform at Time(2) took 0.598742 for an average of 0.000000599
Info:    canTransform at Time(3) without error string took 1.912571 for an average of 0.000001913
Info:    canTransform at Time(3) with error string took 4.456371 for an average of 0.000004456

This patch:

Info:    9 to 4
Info:    Doing 1000000 10-level 1.000000-interval tests
Info:    lookupTransform at Time(0) took 1.548510 for an average of 0.000001549
Info:    lookupTransform at Time(1) took 1.044787 for an average of 0.000001045
Info:    lookupTransform at Time(1.5) took 2.035262 for an average of 0.000002035
Info:    lookupTransform at Time(2) took 0.976278 for an average of 0.000000976
Info:    canTransform at Time(0) took 1.073062 for an average of 0.000001073
Info:    canTransform at Time(1) took 0.632388 for an average of 0.000000632
Info:    canTransform at Time(1.5) took 0.828700 for an average of 0.000000829
Info:    canTransform at Time(2) took 0.603433 for an average of 0.000000603
Info:    canTransform at Time(3) without error string took 0.855472 for an average of 0.000000855
Info:    canTransform at Time(3) with error string took 1.685970 for an average of 0.000001686

There's one change that might be reverted if needed - the removal if <sstream> include from time_cache.h. It was just no longer needed for tf2, so I removed it, but if you think it'd be an API break, it can be put back.

There should also be discussion on the string length limit in buffer_core. I chose 1000. Whatever else can be used. I also tried a solution with dynamic allocation of memory for the string, but it looked ugly (but the performance was nearly the same).

@peci1 peci1 requested a review from tfoote as a code owner June 18, 2020 16:11
@peci1
Copy link
Contributor Author

peci1 commented Jun 18, 2020

Here's my rationale from a node that processes lots of laserscans which are quite fresh so they often need some time before their transform gets into the cache.

Snímek obrazovky pořízený 2020-06-18 18-14-05

This is a flamegraph where I circled all calls to createExtrapolationException*. Roughly estimated, these calls take about 20% run time of my code. With this PR, it could be less than 10%.

Merging #469 could help a bit with this as most of the internal canTransform() calls will no longer create the strings, but it still seems to me that people would profit from this further optimization.

jspricke and others added 8 commits July 20, 2020 01:01
* Revert "rework Eigen functions namespace hack" (ros#436)

* Fixed warnings in message_filter.h (ros#434)

the variables are not used in function body and caused -Wunused-parameter to trigger with -Wall

* Fix ambiguous call for tf2::convert on MSVC (ros#444)

* rework ambiguous call on MSVC.

Co-authored-by: Tully Foote <[email protected]>
Co-authored-by: moooeeeep <[email protected]>
Co-authored-by: Sean Yen <[email protected]>
Replaces the deprecated names
{tf_frames, view_frames} -> tf2_frames
Copy link
Member

@tfoote tfoote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for filling this in. I have merged it with the upstream changes to make the message a little more verbose. Just waiting for CI

@tfoote
Copy link
Member

tfoote commented Aug 25, 2020

Manually rebased and merged in 52bff88

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants