Use the codeobject pointer to identify regions #105

AndreasGocht · 2020-06-24T13:15:54Z

As discussed in #88, we now use the pointer to the Python code object to identify a region. This makes it possible to differentiate regions in different Python namespaces.

Moreover, this should decrease the runtime, as the region_name is only created when needed. This patch might also affect #67.
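A minimal Python sketch of the idea, under stated assumptions: `RegionTable`, `enter` and `names_built` are hypothetical names, `id()` of the code object stands in for the C pointer, and the name format is only illustrative. It shows two methods with the same name in different classes mapping to different regions, with the name string built once per code object.

```python
class RegionTable:
    """Toy stand-in for a code-pointer-keyed region map (hypothetical API)."""

    def __init__(self):
        self._regions = {}     # id(code object) -> region name
        self._keepalive = []   # keep code objects alive so ids stay unique
        self.names_built = 0   # how often a region name had to be built

    def enter(self, func):
        code = func.__code__
        key = id(code)                 # plays the role of the code pointer
        name = self._regions.get(key)
        if name is None:               # first visit only: build the name
            name = f"{func.__module__}:{func.__qualname__}"
            self._regions[key] = name
            self._keepalive.append(code)
            self.names_built += 1
        return name


class A:
    def run(self):
        pass


class B:
    def run(self):
        pass


table = RegionTable()
for _ in range(3):
    table.enter(A.run)
    table.enter(B.run)
```

After the loop, the table holds two regions even though both functions are called `run`, and later visits reuse the cached entry.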

Special care is needed for user regions, as here the line number is important for identifying a region. I chose to add the line number only there. I am not sure whether it is worth also adding the line number to the region name.

Moreover, we simply cannot support importlib.reload(). If all information of two regions, including file name and line number, is the same, Score-P merges them into one region.
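The reload case can be illustrated with a small Python sketch. It assumes that executing the same source twice is a fair stand-in for a reload; `load` and `describe` are hypothetical helpers. The two loads produce distinct code objects, yet name, file name and line number are identical, which is the information the merge described above is based on.

```python
SOURCE = "def work():\n    return 1\n"


def load(source, filename="mod.py"):
    """Execute the source in a fresh namespace, roughly like a reload."""
    namespace = {}
    exec(compile(source, filename, "exec"), namespace)
    return namespace["work"]


def describe(func):
    """Name, file name and first line number of a function's code object."""
    code = func.__code__
    return (code.co_name, code.co_filename, code.co_firstlineno)


first = load(SOURCE)
second = load(SOURCE)

same_object = first.__code__ is second.__code__        # distinct objects
same_description = describe(first) == describe(second)  # identical metadata
```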

Flamefire

I'm having trouble understanding the meaning of the 3 maps; maybe a couple of comments describing the intent would help here.

What I get:

"user region": Manually annotated region, i.e. not created by instrumenter
Any user region is identified by the string <module-name>::<function-name>, let's call it region_id
For any such region there is an entry in user_regions with region_id keys
For each "instrumented region" (created by instrumenter) there is an entry in regions identified by its code pointer
If an instrumented region is entered the first time a user region of a matching region_id is searched for and used if found
a translation of (region_id, lineNo)-tuples to regions is added for instrumented regions
this translation is used before creating a new user region

As a result there can only be 1 user region with a given region_id.

I feel this can cause an issue and order-dependent results. Example (same region_id with different line numbers):

auto region 1 -> New Region ID 1
user region 1 -> Reuse Region ID 1
auto region 2 -> Reuse Region ID 1 from user region

OR:

auto region 1 -> new region id 1
auto region 2 -> new region id 2
user region 1 -> reuse region id 1
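The two orderings above can be replayed with a toy Python model. This is only one reading of the lookup rules listed in this review, not the actual C++ code: `ThreeMapModel`, the string code pointers and the region id `"mod:foo"` are all hypothetical.

```python
class ThreeMapModel:
    """Toy model of the three maps as described in this review."""

    def __init__(self):
        self.regions = {}              # code pointer -> handle
        self.user_regions = {}         # region_id -> handle
        self.region_translations = {}  # (region_id, line) -> handle
        self._next_handle = 1

    def _new_handle(self):
        handle = self._next_handle
        self._next_handle += 1
        return handle

    def enter_instrumented(self, code_ptr, region_id, line):
        if code_ptr not in self.regions:
            # First entry: prefer a user region with a matching region_id.
            handle = self.user_regions.get(region_id)
            if handle is None:
                handle = self._new_handle()
            self.regions[code_ptr] = handle
            self.region_translations[(region_id, line)] = handle
        return self.regions[code_ptr]

    def enter_user(self, region_id, line):
        if region_id not in self.user_regions:
            # The translation is consulted before a new user region is created.
            handle = self.region_translations.get((region_id, line))
            if handle is None:
                handle = self._new_handle()
            self.user_regions[region_id] = handle
        return self.user_regions[region_id]


RID = "mod:foo"

a = ThreeMapModel()
order_a = [
    a.enter_instrumented("code1", RID, 10),  # auto region 1
    a.enter_user(RID, 10),                   # user region 1
    a.enter_instrumented("code2", RID, 20),  # auto region 2
]

b = ThreeMapModel()
order_b = [
    b.enter_instrumented("code1", RID, 10),  # auto region 1
    b.enter_instrumented("code2", RID, 20),  # auto region 2
    b.enter_user(RID, 10),                   # user region 1
]
```

In this model, auto region 2 ends up in region 1 in the first ordering and in its own region 2 in the second, which is the order dependence in question.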

I feel like we can do a bit better. What (likely) cannot be done is differentiating user region 1 from user region 2; that would need a change to the region_end event to include the line number. But I think we can at least always have the auto/instrumented regions refer to correct/unique regions.
Idea: Write to region_translations in the user-region-enter and use that instead of the user_regions map in the instrumented-region-enter. I think this solves the problem, as it merges instrumented and user regions only when the line numbers match, no matter the order.

2 final change requests:

have actual region handles in the region_translations map. This saves an extra map lookup
how about making the region_translations key a pair of (string, int)? IIRC std::pair is already hashable and this saves an int-to-string conversion on lookup and makes the key format clear.
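A hedged Python sketch of the proposal, with a tuple standing in for the (string, int) pair key. `TranslationModel`, `run` and the region id `"mod:foo"` are hypothetical. Python tuples are hashable out of the box; in C++, a `std::unordered_map` keyed by `std::pair` would need a hash function to be supplied, since the standard library does not provide `std::hash` for pairs.

```python
class TranslationModel:
    """Toy model of the proposed lookup; not the actual C++ implementation."""

    def __init__(self):
        self.regions = {}              # code pointer -> handle
        self.user_regions = {}         # region_id -> handle
        self.region_translations = {}  # (region_id, line) -> handle
        self._next_handle = 1

    def _lookup_or_create(self, key):
        # Shared step: reuse the handle stored for (region_id, line), else create.
        handle = self.region_translations.get(key)
        if handle is None:
            handle = self._next_handle
            self._next_handle += 1
            self.region_translations[key] = handle
        return handle

    def enter_user(self, region_id, line):
        if region_id not in self.user_regions:
            handle = self._lookup_or_create((region_id, line))
            self.user_regions[region_id] = handle
        return self.user_regions[region_id]

    def enter_instrumented(self, code_ptr, region_id, line):
        if code_ptr not in self.regions:
            handle = self._lookup_or_create((region_id, line))
            self.regions[code_ptr] = handle
        return self.regions[code_ptr]


def run(order):
    """Replay the events in the given order and report which regions coincide."""
    model = TranslationModel()
    events = {
        "auto1": lambda: model.enter_instrumented("code1", "mod:foo", 10),
        "auto2": lambda: model.enter_instrumented("code2", "mod:foo", 20),
        "user1": lambda: model.enter_user("mod:foo", 10),
    }
    handles = {name: events[name]() for name in order}
    return (handles["auto1"] == handles["user1"],
            handles["auto1"] == handles["auto2"])
```

In this model, `run` returns the same pair for every ordering of the three events: auto region 1 and user region 1 (same line) share a region, while auto region 2 gets its own.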

src/methods.cpp

src/scorepy/cInstrumenter.cpp

src/scorepy/events.cpp

AndreasGocht · 2020-07-06T16:38:52Z

The main point here is to deal with decorators. Annotating a function:

@scorep.user.region()
def foo():
    pass

will instrument the region. If the instrumenter is enabled, the user instrumentation will do nothing. If the instrumenter is disabled, the user instrumentation will take care of it. Calling

foo()
with scorep.instrumenter.disable():
    foo()
    with scorep.instrumenter.enable():
         foo()

is supposed to refer to the same region, no matter if the instrumenter was initially enabled or not.
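The intended behaviour can be sketched with a toy model. Everything here is hypothetical and does not use the real bindings: `state`, `instrumenter`, `region` and `events` are illustrative names, the decorator is applied without parentheses for brevity, and the wrapper logs on behalf of the instrumenter, which in reality would record the call through its own hook.

```python
import contextlib
import functools

state = {"instrumenter_on": True}  # hypothetical switch
events = []                        # (who recorded the call, region name)


@contextlib.contextmanager
def instrumenter(enabled):
    """Toy counterpart of an enable()/disable() context manager."""
    previous = state["instrumenter_on"]
    state["instrumenter_on"] = enabled
    try:
        yield
    finally:
        state["instrumenter_on"] = previous


def region(func):
    """Toy decorator: user instrumentation only acts if the instrumenter is off."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        who = "instrumenter" if state["instrumenter_on"] else "user"
        events.append((who, func.__name__))
        return func(*args, **kwargs)
    return wrapper


@region
def foo():
    pass


foo()
with instrumenter(False):
    foo()
    with instrumenter(True):
        foo()
```

All three calls end up under the same region name; only the component that records them changes.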

At the same time, instrumenting a region using

scorep.user.region_begin("bla")
# some code
scorep.user.region_end("bla")

is supposed to work.

I feel like we can do a bit better. What (likely) cannot be done is differentiating user region 1 from user region 2; that would need a change to the region_end event to include the line number. But I think we can at least always have the auto/instrumented regions refer to correct/unique regions.

We can do that for the decorator and the with statement. We cannot do that for the classical scorep.user.region_begin() and scorep.user.region_end() calls. But it might simply be time to deprecate or remove them. Otherwise, we could simply force the user to provide a "line number". (Both would be a great reason for having a 4.0 release 😆 )

Idea: Write to region_translations in the user-region-enter and use that instead of the user_regions map in the instrumented-region-enter. I think this solves the problem, as it merges instrumented and user regions only when the line numbers match, no matter the order.

I think if we address the above this should also be fine.

how about making the region_translations key a pair of (string, int)? IIRC std::pair is already hashable and this saves an int-to-string conversion on lookup and makes the key format clear.

Seems to be a good idea 😉.

a few more `const`

Flamefire · 2020-07-06T17:14:13Z

I think if we address the above this should also be fine.

Ok, so the only problem is that in your example scorep.user.region_begin("bla") doesn't contain a line number, is this correct?

In this case I assume the line number passed into the C++ code is 0, isn't it? So my proposed solution would work: no "instrumented region" would be at line number 0, so it would never be retrieved from the region_translations map. This is the correct behavior.

The only issue I see is if the user uses something like:

scorep.user.region_begin("bla")
# some code
scorep.user.region_end("bla")

@scorep.user.region()
def bla():
    pass

What happens is: The first begin creates a user region "bla" and a mapping (bla, 0); the second reuses the user region but does not create a mapping. This will be a problem when the instrumenter gets called for bla: it will attribute calls to that function to a new region, whereas with the instrumenter disabled the calls are attributed to the custom region created before.

I'd call this suboptimal but as already written: User regions with the same region_id cannot be distinguished anyway so users should simply not do this.

src/scorepy/events.cpp

AndreasGocht · 2020-07-07T07:41:39Z

I'd call this suboptimal but as already written: User regions with the same region_id cannot be distinguished anyway so users should simply not do this.

They can, as long as they use decorators and context managers, as these do have a line number. I'll come up with another code change once we have agreed on the other issues.

Best,

Andreas

Flamefire · 2020-07-07T10:48:22Z

They can, as long as they use decorators and context managers, as these do have a line number. I'll come up with another code change

Not in the current implementation, or am I missing anything? The decorator calls region_begin(self.module_name, self.region_name, full_file_name, line_number), so there is no function object (pointer), and the context manager and the manual scorep.user.region_begin("bla") will resolve to the same region; the line number is passed but not used for identification. For the latter this is expected and correct behavior anyway: Fold all user regions with the same name.

The decorator can be changed to include the wrapped function's code object, which solves this case:

scorep.user.region_begin("bla")
# some code
scorep.user.region_end("bla")

@scorep.user.region()
def bla():
    pass

The first will create a user region, the second will include a function code pointer and resolve to a different region.

Given that: Why do we need the region_translations map? We actually only have 2 cases: either a manually or automatically decorated function, which includes the code pointer, or a named user region identified by the name only and without a code pointer (created by a context manager or manual calls). I don't see any use case where those should be folded.

This also gets rid of the cross-checks (checking for a user-region in the code-pointer-using function and vice-versa) and even better: We will create new regions for decorated or automatically instrumented functions on reload as the pointer changed and there is no name-matching fallback.
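A minimal Python sketch of this two-case design, under stated assumptions: `Regions`, `begin`, `region` and `load` are hypothetical names, `id()` of the code object stands in for the code pointer, and executing the same source twice stands in for a reload.

```python
import functools


class Regions:
    """Toy model: one map keyed by code pointer, one keyed by name."""

    def __init__(self):
        self.by_code = {}     # id(code object) -> handle (decorated/instrumented)
        self.by_name = {}     # region name -> handle (named user regions)
        self._keepalive = []  # keep code objects alive so ids stay unique
        self._next_handle = 1

    def _new_handle(self):
        handle = self._next_handle
        self._next_handle += 1
        return handle

    def begin(self, name, code=None):
        if code is not None:              # function with a code pointer
            key = id(code)
            if key not in self.by_code:
                self.by_code[key] = self._new_handle()
                self._keepalive.append(code)
            return self.by_code[key]
        if name not in self.by_name:      # named user region, no code pointer
            self.by_name[name] = self._new_handle()
        return self.by_name[name]


regions = Regions()
entered = []  # handles recorded by the decorator, in call order


def region(func):
    """Decorator that forwards the wrapped function's code object."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        entered.append(regions.begin(func.__name__, code=func.__code__))
        return func(*args, **kwargs)
    return wrapper


SOURCE = "@region\ndef bla():\n    pass\n"


def load():
    """Execute SOURCE in a fresh namespace, roughly what a reload does."""
    namespace = {"region": region}
    exec(compile(SOURCE, "mod.py", "exec"), namespace)
    return namespace["bla"]


manual = regions.begin("bla")  # like a manual region_begin("bla")
load()()                       # decorated bla, first load
load()()                       # decorated bla after a simulated reload
```

The manual region and the decorated function never fold together here, and the simulated reload yields a new region because the code object changed and there is no name-matching fallback.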

AndreasGocht · 2020-07-15T09:53:51Z

Given that: Why do we need the region_translations map? We actually only have 2 cases: either a manually or automatically decorated function, which includes the code pointer, or a named user region identified by the name only and without a code pointer (created by a context manager or manual calls). I don't see any use case where those should be folded.

done

AndreasGocht · 2020-07-15T10:08:45Z

I think I addressed all open issues. If there are no objections, I'll merge this next week.

Best,

Andreas

Flamefire

Almost ready, just a few trivial changes

scorep/user.py

src/methods.cpp

src/scorepy/events.cpp

src/scorepy/events.hpp

test/test_scorep.py

Co-authored-by: Alexander Grund <[email protected]>

AndreasGocht added 13 commits June 12, 2020 17:30

a first approach

df41342

special case, using the object pointer

35daf29

adopt testcase

17fec66

working state

30f6965

Merge remote-tracking branch 'origin/master' into issue-88

f991814

working with code pointer on profile

d92c5d6

address user regions with line numbers#

df09265

update tests

b9e4887

only create name when needed

ecb1fdd

fix region and module order

882d8c2

extend the other instrumenters

c01aa1f

update README.md

ec407c9

autopep8

a396c80

AndreasGocht mentioned this pull request Jun 24, 2020

Avoid memory allocations #67

Open

Flamefire suggested changes Jul 6, 2020

Flamefire mentioned this pull request Jul 6, 2020

Make benchmark output (stdout) more readable #102

Merged

Apply suggestions from code review

fa506c1

a few more `const`

Flamefire reviewed Jul 6, 2020

AndreasGocht added 2 commits July 15, 2020 11:51

sperate user regions and instrumented or decorated regions

fbdcffa

fix style

09ff745

AndreasGocht added 3 commits July 15, 2020 12:00

back to ref

5eb191d

region --> function_name

ac703fd

&SCOREP_User_LastFileHandle --> NULL

e2ac3e8

Flamefire suggested changes Jul 15, 2020

AndreasGocht and others added 10 commits July 15, 2020 15:05

Update scorep/user.py

3c708ab

Update src/scorepy/events.cpp

8afb236

Co-authored-by: Alexander Grund <[email protected]>

Update src/scorepy/events.cpp

6cafede

Co-authored-by: Alexander Grund <[email protected]>

Update src/scorepy/events.hpp

a609531

Co-authored-by: Alexander Grund <[email protected]>

Update src/scorepy/events.cpp

9274ba9

Co-authored-by: Alexander Grund <[email protected]>

remove optional arguments

646767b

restructure code, to make things more clear.

4793e03

codestyle

0996309

some doc

05a4fe9

Merge remote-tracking branch 'origin/master' into issue-88

e7f01a9

AndreasGocht merged commit 39006a2 into master Jul 15, 2020

AndreasGocht deleted the issue-88 branch July 15, 2020 14:13

This was referenced Jul 15, 2020

Include class in region name #88

Closed

Implement CString as a wrapper around a C-String to avoid using std::string #107

Open
