-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat[next][dace]: lowering of scan to SDFG #1776
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks generally good but I would strongly suggest to split the translate_scan
function into several smaller functions that each handle one particular aspect of the translation.
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_sdfg.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have addressed all comments except splitting the lowering of the scan into multiple functions / separate module. I will push another commit on this PR for that, so it is easier to review.
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_sdfg.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
8fb0c09
to
bb458cd
Compare
Thanks for the review comments. I have moved the scan translator to a separate module and split it into small functions. At some point I stopped, in order to avoid too much fragmentation. I agree, it looks much better now! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some smaller issues, but also one that is a bit bigger.
I think my main confusion comes from how it is translated.
As far as I can remember a lowered scan should look something like:
for(int i = scan_start; i != scan_end; ++i)
{
map_over_horizonal(); //Parallel
}
But is it the other way around?
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_builtin_translators.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_scan_translator.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_scan_translator.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_scan_translator.py
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_scan_translator.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_scan_translator.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_scan_translator.py
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_scan_translator.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_scan_translator.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_scan_translator.py
Show resolved
Hide resolved
It is the other way around. It has always been like that, even in the ITIR backend. It is not like it should or should not, both ways will produce the same result. I am using the external map because it follows the GTIR representation, where |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that everything that is needed is here, but it is organized in a way that I not fully understand.
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_scan_translator.py
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_scan_translator.py
Outdated
Show resolved
Hide resolved
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_scan_translator.py
Show resolved
Hide resolved
)(lambda_output) | ||
|
||
# we call a helper method to create a map scope that will compute the entire field | ||
return _create_scan_field_operator( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this function knows where the body of the scan is.
Furthermore, for me it looks like everything is done at this place.
I mean the nested SDFG containing the body has been created, the outputs are connected, everything that is left is the connection of the inputs, which _create_scan_field_operator()
does, but then that function also calls _create_scan_field_operator_impl()
which does much more.
This confuses me.
lambda_output_tree = gtx_utils.tree_map( | ||
lambda lambda_output_data: _connect_nested_sdfg_output_to_temporaries( | ||
sdfg, nsdfg, nsdfg_node, state, lambda_output_data | ||
) | ||
)(lambda_output) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an asymmetry between inputs and outputs.
You connect the outputs here but the inputs are on the return statement.
I know it is quite late, but I would think it is more logical if the function would do:
- Creating/setting up the inputs,
- Creating the body,
- Setting up the outputs.
Is there a particular reason why currently it is handled as:
- Creating body,
- Handle output,
- Handle inputs.
To me the function appears a bit unstructured.
src/gt4py/next/program_processors/runners/dace_fieldview/gtir_scan_translator.py
Outdated
Show resolved
Hide resolved
return [scan_column_size, dace.symbolic.SymExpr(list_size)] | ||
|
||
if isinstance(init_data, tuple): | ||
lambda_result_shape = gtx_utils.tree_map(get_scan_output_shape)(init_data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I get that part of the output shape can be inferred from the shape of the initial data.
But what is with the horizontal isn't it missing?
Furthermore, this value, lambda_result_shape
, is defined here and then used 100 lines later.
Because of this I have the feeling that its definition and its usage is fully disconnected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed it could probably be improved, but we should postpone that until it is more clear if and how scan will change in GTIR.
However, at some point we need to refactor this code.
This PR contains the lowering of the scan builtin function.