Use the codeobject pointer to identify regions #105
I'm having trouble understanding the meaning of the 3 maps; maybe a couple of comments describing the intent would help here.
What I get:
- "user region": manually annotated region, i.e. not created by the instrumenter
- Any user region is identified by the string `<module-name>::<function-name>`, let's call it region_id
- For any such region there is an entry in `user_regions` with region_id keys
- For each "instrumented region" (created by the instrumenter) there is an entry in `regions` identified by its `code` pointer
- If an instrumented region is entered for the first time, a user region with a matching region_id is searched for and used if found
- a translation of `(region_id, lineNo)`-tuples to regions is added for instrumented regions
- this translation is used before creating a new user region
As a result there can only be 1 user region with a given region_id.
I feel this can cause an issue and order-dependent results. Example (same region_id with different line numbers):
- auto region 1 -> New Region ID 1
- user region 1 -> Reuse Region ID 1
- auto region 2 -> Reuse Region ID 1 from user region
OR:
- auto region 1 -> new region id 1
- auto region 2 -> new region id 2
- user region 1 -> reuse region id 1
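The order dependence can be reproduced with a small model of the scheme as it is summarized in this comment. This is only a sketch: `CurrentSchemeModel`, the integer handles, and the stand-in code pointers are hypothetical and not taken from the actual bindings code.

```cpp
#include <map>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical model of the three maps as described in the review comment.
struct CurrentSchemeModel {
    int next_handle = 1;
    std::unordered_map<std::string, int> user_regions;               // region_id -> handle
    std::unordered_map<const void*, int> regions;                    // code pointer -> handle
    std::map<std::pair<std::string, int>, int> region_translations;  // (region_id, line) -> handle

    int enter_instrumented(const void* code, const std::string& region_id, int line) {
        auto known = regions.find(code);
        if (known != regions.end()) return known->second;
        // Name-only match: the line number is ignored here.
        auto user = user_regions.find(region_id);
        const int handle = (user != user_regions.end()) ? user->second : next_handle++;
        regions.emplace(code, handle);
        region_translations.emplace(std::make_pair(region_id, line), handle);
        return handle;
    }

    int enter_user(const std::string& region_id, int line) {
        auto known = user_regions.find(region_id);
        if (known != user_regions.end()) return known->second;
        // The translation is consulted before a new user region is created.
        auto translated = region_translations.find(std::make_pair(region_id, line));
        const int handle =
            (translated != region_translations.end()) ? translated->second : next_handle++;
        user_regions.emplace(region_id, handle);
        return handle;
    }
};

// Both helpers return the handle that "auto region 2" (line 20) ends up with.
int auto2_when_user_region_comes_second() {
    CurrentSchemeModel m;
    int code1 = 0, code2 = 0;  // distinct addresses stand in for code objects
    m.enter_instrumented(&code1, "mod::func", 10);
    m.enter_user("mod::func", 10);
    return m.enter_instrumented(&code2, "mod::func", 20);
}

int auto2_when_user_region_comes_last() {
    CurrentSchemeModel m;
    int code1 = 0, code2 = 0;
    m.enter_instrumented(&code1, "mod::func", 10);
    const int auto2 = m.enter_instrumented(&code2, "mod::func", 20);
    m.enter_user("mod::func", 10);
    return auto2;
}
```

In this model the first helper yields the handle of region 1 for the second instrumented region, while the second helper yields a fresh handle, which matches the two orderings listed above.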
I feel like we can do a bit better. What (likely) cannot be done is differentiating `user region 1` and `user region 2`; that would need a change to the region_end event to include the line number. But I think we can at least always have the auto/instrumented regions refer to correct/unique regions.
Idea: Write to `region_translations` in the user-region-enter and use that instead of the `user_regions` map in the instrumented-region-enter. I think this solves the problem, as it merges instrumented and user regions only when the line numbers match, no matter the order.
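A minimal sketch of this idea, under the assumption that the maps behave as summarized earlier in this comment. `ProposedSchemeModel`, `Handles`, and `run_in_order` are hypothetical names used only for illustration; the real code will differ.

```cpp
#include <map>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical model: user-region-enter writes region_translations, and
// instrumented-region-enter consults it instead of user_regions.
struct ProposedSchemeModel {
    int next_handle = 1;
    std::unordered_map<std::string, int> user_regions;               // region_id -> handle
    std::unordered_map<const void*, int> regions;                    // code pointer -> handle
    std::map<std::pair<std::string, int>, int> region_translations;  // (region_id, line) -> handle

    int enter_instrumented(const void* code, const std::string& region_id, int line) {
        auto known = regions.find(code);
        if (known != regions.end()) return known->second;
        // Merge with an existing region only if name AND line number match.
        auto translated = region_translations.find(std::make_pair(region_id, line));
        const int handle =
            (translated != region_translations.end()) ? translated->second : next_handle++;
        regions.emplace(code, handle);
        region_translations.emplace(std::make_pair(region_id, line), handle);
        return handle;
    }

    int enter_user(const std::string& region_id, int line) {
        auto known = user_regions.find(region_id);
        if (known != user_regions.end()) return known->second;
        auto translated = region_translations.find(std::make_pair(region_id, line));
        const int handle =
            (translated != region_translations.end()) ? translated->second : next_handle++;
        user_regions.emplace(region_id, handle);
        region_translations.emplace(std::make_pair(region_id, line), handle);
        return handle;
    }
};

struct Handles { int auto1; int auto2; int user; };

// 'a' = auto region 1 (line 10), 'b' = auto region 2 (line 20), 'u' = user region (line 10).
Handles run_in_order(const std::string& order) {
    ProposedSchemeModel m;
    int code1 = 0, code2 = 0;  // distinct addresses stand in for code objects
    Handles h{0, 0, 0};
    for (char step : order) {
        if (step == 'a') h.auto1 = m.enter_instrumented(&code1, "mod::func", 10);
        if (step == 'b') h.auto2 = m.enter_instrumented(&code2, "mod::func", 20);
        if (step == 'u') h.user = m.enter_user("mod::func", 10);
    }
    return h;
}
```

For the orders `"aub"`, `"abu"`, and `"uab"` the model merges the user region with the instrumented region on the same line and keeps the instrumented region on the other line separate, independent of the order of the calls.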
2 final change requests:
- have actual region handles in the `region_translations` map. This saves an extra map lookup
- how about making the `region_translations` key a pair of `(string, int)`? IIRC `std::pair` is already hashable, and this saves an int-to-string conversion on lookup and makes the key format clear.
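A note on the second request: the C++ standard library does not provide a `std::hash` specialization for `std::pair`, so an `std::unordered_map` with such a key needs a small hasher (an ordered `std::map` would work without one, because `std::pair` has `operator<`). The sketch below shows one way to do both requests at once; `RegionKey`, `RegionKeyHash`, `RegionHandle`, and `find_region` are hypothetical names, and `RegionHandle` is only a stand-in for the real handle type.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

using RegionKey = std::pair<std::string, int>;  // (region_id, line number)

// Hasher for the pair key; needed because std::hash<std::pair<...>> does not exist.
struct RegionKeyHash {
    std::size_t operator()(const RegionKey& key) const {
        const std::size_t h1 = std::hash<std::string>{}(key.first);
        const std::size_t h2 = std::hash<int>{}(key.second);
        // Simple combine step in the style of boost::hash_combine.
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
};

// Hypothetical stand-in for the real region handle type.
struct RegionHandle {
    int value;
};

// The map stores the handle directly, so a hit needs no second lookup.
using RegionTranslations = std::unordered_map<RegionKey, RegionHandle, RegionKeyHash>;

// Returns a pointer to the stored handle, or nullptr if there is no translation.
const RegionHandle* find_region(const RegionTranslations& translations,
                                const std::string& region_id, int line) {
    auto it = translations.find(RegionKey{region_id, line});
    return it == translations.end() ? nullptr : &it->second;
}
```

With this layout no int-to-string conversion is needed to build the key, and the key type documents the `(region_id, line)` format.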
The main point here is to deal with decorators. Annotating a function:
will instrument the region. If the instrumenter is enabled, the user instrumentation will do nothing. If the instrumenter is disabled, the user instrumentation will take care. Calling
is supposed to refer to the same region, no matter if the instrumenter was initially enabled or not. At the same time, instrumenting a region using
is supposed to work.
We can do that for the decorator and the
I think if we address the above this should also be fine.
Seems to be a good idea 😉.
a few more `const`
Ok so the only problem is that in in your example In this case I assume the line number passed into the C++ code is The only issue I see is if the user uses something like:
What happens is: First begin creates a user-region "bla" and a mapping I'd call this suboptimal but as already written: User regions with the same region_id cannot be distinguished anyway so users should simply not do this.
They can, as long as they use decorators and context managers, as they do have a line number. I'll come up with another code change once we have agreed on the other issues. Best, Andreas
Not in the current implementation or am I missing anything? The decorator calls The decorator can be changed to include the wrapped function's code object, which solves this case:
The first will create a user region, the second will include a function code pointer and resolve to a different region. Given that: Why do we need the This also gets rid of the cross-checks (checking for a user-region in the code-pointer-using function and vice-versa) and even better: We will create new regions for decorated or automatically instrumented functions on reload as the pointer changed and there is no name-matching fallback.
done
I think I addressed all open issues. If there are no objections, I'll merge this next week. Best, Andreas
Almost ready, just a few trivial changes
Co-authored-by: Alexander Grund <[email protected]>
As discussed in #88, we now use the pointer to the Python code object to identify a region. This allows differentiating regions in different Python namespaces.
Moreover, this should decrease the runtime, as the region_name is only created when needed. This patch might also have an effect on #67.
Special care is needed for user regions, as the line number is important for identifying a region there. I chose to add the line number only there. I am not sure whether it is worth also adding the line number to a region name.
Moreover, we simply cannot support `importlib.reload()`. If all information of two regions, including file name and line number, is the same, Score-P merges them into one region.