feat[next][dace]: support for field origin in lowering to SDFG #1818

edopao · 2025-01-22T21:42:21Z

This PR adds support for GT4Py field arguments with non-zero start index, for example:

inp = constructors.empty(common.domain({IDim: (1, 9)}), ...)

which was supported in baseline only for temporary fields, by means of a data structure called field_offsets. This data structure is removed for two reasons:

- the name "offset" is a leftover from a previous design based on the dace array offset
- offset has a different meaning in GT4Py

We introduce the GT4Py concept of field origin and use it for both temporary fields and program arguments. The field origin corresponds to the start of the field domain range.
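As a rough illustration of the origin concept, the sketch below translates a domain index into a buffer position by subtracting the origin. The helper `to_array_index` is hypothetical and is not part of GT4Py or of this PR; it only mirrors the statement that the origin is the start of the domain range.

```python
# Sketch only: `to_array_index` is a hypothetical helper, not GT4Py API.
def to_array_index(domain_index: int, origin: int) -> int:
    """Translate an index of the field domain into a position in the buffer."""
    return domain_index - origin


# A field defined on IDim: (1, 9), as in the example above.
start, stop = 1, 9
origin = start          # the field origin is the start of the domain range
extent = stop - start   # number of elements along this dimension

first = to_array_index(start, origin)      # first domain point
last = to_array_index(stop - 1, origin)    # last domain point (stop is exclusive)
```

With a zero origin the domain index and the buffer position coincide, which is the only case the baseline handled for program arguments.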

This PR also changes the symbolic definition of array shape. Before, the array shape was defined as [data_size_0, data_size_1, ...]. Now each size corresponds to the range extent stop - start, so the shape is [(data_0_range_1 - data_0_range_0), (data_1_range_1 - data_1_range_0), ...].
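The new shape definition can be sketched with plain strings. The symbol naming below follows the pattern written in this description (`data_0_range_0`, `data_0_range_1`); the helper names are hypothetical, and the actual lowering builds DaCe symbolic expressions rather than strings.

```python
# Sketch only: helper names are hypothetical; the naming pattern is taken
# from the PR description, not from the GT4Py sources.
def range_symbol_names(name: str, axis: int) -> tuple[str, str]:
    """Return the (start, stop) symbol names for one dimension of a field."""
    return (f"{name}_{axis}_range_0", f"{name}_{axis}_range_1")


def symbolic_shape(name: str, ndims: int) -> list[str]:
    """Build the shape as a list of 'stop - start' expressions, one per dimension."""
    shape = []
    for axis in range(ndims):
        start, stop = range_symbol_names(name, axis)
        shape.append(f"({stop} - {start})")
    return shape


shape_2d = symbolic_shape("data", 2)
```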

The translation stage of the dace workflow is extended with an option disable_field_origin_on_program_arguments to set the field range start symbols to constant value zero. This is needed for the dace orchestration, because the signature of a dace-orchestrated program does not provide the domain origin.
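One way to picture the effect of this option is a mapping that pins every range start symbol of the program arguments to zero. This is only a sketch under assumed names: `zero_origin_mapping` is hypothetical, the symbol naming follows the pattern used in this description, and the real implementation operates on the SDFG rather than on a dictionary.

```python
# Sketch only: hypothetical helpers, not the GT4Py implementation.
def range_start_symbols(name: str, ndims: int) -> list[str]:
    """Names of the range start symbols of a field, one per dimension."""
    return [f"{name}_{axis}_range_0" for axis in range(ndims)]


def zero_origin_mapping(program_args: dict[str, int]) -> dict[str, int]:
    """Map the range start symbols of all program arguments to constant zero.

    `program_args` maps each field argument name to its number of dimensions.
    """
    mapping: dict[str, int] = {}
    for name, ndims in program_args.items():
        for symbol in range_start_symbols(name, ndims):
            mapping[symbol] = 0
    return mapping


mapping = zero_origin_mapping({"inp": 2, "out": 1})
```

With all start symbols fixed to zero, each shape entry `stop - start` reduces to `stop`, so the caller does not have to supply the domain origin.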

philip-paul-mueller

It looks generally good, but there are a few things.

The most important thing is that an undefined origin is sometimes converted to all zeros and sometimes kept as None.
I made several comments suggesting that using one form, i.e. implicitly converting to all zeros, would be more consistent and could simplify the code.
It would then no longer be possible to distinguish between "explicitly set to zero" and "not defined", but I am not sure whether this would be an issue.

src/gt4py/next/program_processors/runners/dace/gtir_builtin_translators.py

philip-paul-mueller · 2025-01-24T10:17:48Z

src/gt4py/next/program_processors/runners/dace/gtir_builtin_translators.py

+        outer_sdfg_state: dace.SDFGState,
+        symbol_mapping: dict[str, dace.symbolic.SymbolicType],
+    ) -> FieldopData:
+        """Helper method to map a field data container from a nested SDFG to the parent SDFG."""


I think the docstring is a bit incomplete.
How about something along the lines of:
"Make the data descriptor that self refers to, which is located inside a nested SDFG, available in its parent SDFG.
Thus it will turn self into a non-transient and create a new data descriptor inside the parent."

Furthermore, something that comes to my mind: why was this function not needed before?
I mean, in the end you replaced offset with origin, so the change is not that big?

This functionality existed before, but it was located in gtir_sdfg inside construct_output_for_nested_sdfg().
I decided to remove the method make_copy(), because it didn't mean much, and instead created this one.

philip-paul-mueller · 2025-01-24T10:31:29Z

src/gt4py/next/program_processors/runners/dace/gtir_builtin_translators.py

+        outer_desc = self.dc_node.desc(sdfg)
+        assert isinstance(outer_desc, dace.data.Array)
+        if self.origin is None:
+            outer_field_origin = [0] * ndims


At other places you set such values to None.
So why do you not do it here as well, or why do you return None at other places?
Okay, "not defined" is conceptually different from "having value 0", but there are still a lot of if statements that you could remove if you followed this approach.

I agree, origin can be None only inside FieldopData, in case of ScalarType data. For FieldType data, it cannot be undefined.

This is right, I did not consider this case.

src/gt4py/next/program_processors/runners/dace/gtir_builtin_translators.py

philip-paul-mueller · 2025-01-24T12:41:29Z

src/gt4py/next/program_processors/runners/dace/gtir_sdfg.py

+                    dace.symbolic.pystr_to_symbolic(gtx_dace_utils.field_size_symbol_name(name, i))
+                )
+            else:
+                # the size of global dimensions for a regular field is the symbolic


What are you doing with transients, i.e. where do their symbols come from?

_make_array_shape_and_strides() is only called for non-transient arrays. I will add this comment:

This method is only called for non-transient arrays, which require symbolic memory layout. The memory layout of transient arrays, used for temporary fields, is assigned by DaCe during lowering to SDFG.

You should add "[...] used for temporary fields, the DaCe default, which is row major, is used and might be changed during optimization".
I think this makes it a bit clearer that the lowering does not really care about the strides, apart from them being correct (but not necessarily optimal).
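For reference, a row-major layout can be derived from the shape alone. The function below is a generic sketch of that computation with concrete integer sizes; `row_major_strides` is a hypothetical name and this is not the DaCe code that assigns strides to transients.

```python
# Sketch only: generic row-major stride computation, not DaCe internals.
def row_major_strides(shape: list[int]) -> list[int]:
    """Return strides (in elements) for a row-major, i.e. C-ordered, array."""
    strides = [1] * len(shape)
    # The last dimension is contiguous; each earlier stride is the product
    # of all sizes that follow it.
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides


strides = row_major_strides([4, 3, 2])
```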

philip-paul-mueller · 2025-01-24T12:46:52Z

src/gt4py/next/program_processors/runners/dace/gtir_sdfg.py

@@ -201,6 +199,24 @@ def _collect_symbols_in_domain_expressions(
    )


+def _get_field_doman_subset(


This is just my impression, but I am not sure if "subset" is the right term to use here.
For me a subset is more like a range, such as 3:6.
This here is more like a concrete access, such as a[i], so I would suggest naming the function something like _make_access_index_for_field().

If you keep this name, then there is a typo in "domain" (it is spelled "doman").

src/gt4py/next/program_processors/runners/dace/gtir_sdfg.py

philip-paul-mueller · 2025-01-24T12:56:48Z

src/gt4py/next/program_processors/runners/dace/gtir_sdfg.py

@@ -941,10 +901,47 @@ def visit_SymRef(
        return gtir_builtin_translators.translate_symbol_ref(node, sdfg, head_state, self)


+def _remove_field_origin_symbols(ir: gtir.Program, sdfg: dace.SDFG) -> None:


This function confused me.
At first it was not clear that this function is essentially undoing all the work of this PR,
until I realized that it is only used by build_sdfg_from_gtir() when disable_field_origin_on_program_arguments is set to True.
I would add this comment to the docstring.

The second thing I do not understand is why you only do it for transients.
I guess it is because they are never set, but I am not fully sure about that.

Comment added. I do not understand the second comment. I actually collect the range start symbols of the program parameters, that is, the non-transient arrays in the top-level SDFG.

The second part of my comment is also wrong.
What I wanted to write was "why do you not do it for transients".
However, in another comment you wrote that only non-transient data descriptors have these symbolic sizes.

src/gt4py/next/program_processors/runners/dace/sdfg_callable.py

edopao

Thanks for the review.

src/gt4py/next/program_processors/runners/dace/gtir_builtin_translators.py

edopao · 2025-01-24T15:27:52Z

src/gt4py/next/program_processors/runners/dace/gtir_builtin_translators.py

+        outer_sdfg_state: dace.SDFGState,
+        symbol_mapping: dict[str, dace.symbolic.SymbolicType],
+    ) -> FieldopData:
+        """Helper method to map a field data container from a nested SDFG to the parent SDFG."""


This functionality existed before, but it was located in gtir_sdfg inside construct_output_for_nested_sdfg().
I decided to remove the method make_copy(), because it didn't mean much, and instead created this one.

edopao · 2025-01-24T15:29:58Z

src/gt4py/next/program_processors/runners/dace/gtir_builtin_translators.py

+        outer_desc = self.dc_node.desc(sdfg)
+        assert isinstance(outer_desc, dace.data.Array)
+        if self.origin is None:
+            outer_field_origin = [0] * ndims


I agree, origin can be None only inside FieldopData, in case of ScalarType data. For FieldType data, it cannot be undefined.

src/gt4py/next/program_processors/runners/dace/gtir_builtin_translators.py

edopao · 2025-01-24T15:57:14Z

src/gt4py/next/program_processors/runners/dace/gtir_sdfg.py

+                    dace.symbolic.pystr_to_symbolic(gtx_dace_utils.field_size_symbol_name(name, i))
+                )
+            else:
+                # the size of global dimensions for a regular field is the symbolic


_make_array_shape_and_strides() is only called for non-transient arrays. I will add this comment:

This method is only called for non-transient arrays, which require symbolic memory layout. The memory layout of transient arrays, used for temporary fields, is assigned by DaCe during lowering to SDFG.

edopao · 2025-01-24T16:05:30Z

src/gt4py/next/program_processors/runners/dace/gtir_sdfg.py

@@ -279,7 +291,8 @@ def make_field(
            raise NotImplementedError(
                "Fields with more than one local dimension are not supported."
            )
-        return gtir_builtin_translators.FieldopData(data_node, field_type, domain_offset)
+        field_origin = gtx_dace_utils.get_symbolic_origin(data_node.data, field_type)


make_field is actually called for SymRefs to global symbols, that is non-transient data.

src/gt4py/next/program_processors/runners/dace/gtir_sdfg.py

src/gt4py/next/program_processors/runners/dace/utils.py

philip-paul-mueller

There are some points that need some further work, but nothing serious.

philip-paul-mueller · 2025-01-27T10:49:25Z

src/gt4py/next/program_processors/runners/dace/gtir_builtin_translators.py

@@ -78,8 +75,8 @@ class FieldopData:
    Args:
        dc_node: DaCe access node to the data storage.
        gt_type: GT4Py type definition, which includes the field domain information.
-        origin: List of start indices, in each dimension, when the dimension range


I would add a check to ensure that origin is None when you construct a FieldopData for a scalar, i.e. add a __post_init__ that does this.

philip-paul-mueller · 2025-01-27T10:50:06Z

src/gt4py/next/program_processors/runners/dace/gtir_builtin_translators.py

-        origin: List of start indices, in each dimension, when the dimension range
-            does not start from zero; assume zero, if origin is not set.
+        origin: List of start indices, in each dimension, for `FieldType` data.
+            Set to `None` only for `ScalarType` data.


Suggested change

Set to `None` only for `ScalarType` data.

Has to be `None` only for `ScalarType` data. For fields it is assumed to be all zero if not given.

I am also thinking that you should enforce that origin is set correctly during construction.
Since this is a dataclass, you have to implement __post_init__().
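A minimal sketch of such a construction-time check is shown below. The classes `ScalarType`, `FieldType` and `FieldopDataSketch` are simplified stand-ins introduced only for illustration; the real `FieldopData` has more members and uses the GT4Py type system.

```python
from dataclasses import dataclass
from typing import Optional, Union


class ScalarType:
    """Stand-in for the GT4Py scalar type (assumption for this sketch)."""


class FieldType:
    """Stand-in for the GT4Py field type, reduced to its rank."""

    def __init__(self, ndims: int) -> None:
        self.ndims = ndims


@dataclass(frozen=True)
class FieldopDataSketch:
    gt_type: Union[ScalarType, FieldType]
    origin: Optional[tuple[int, ...]]

    def __post_init__(self) -> None:
        # Scalars must not carry an origin; fields must have one per dimension.
        if isinstance(self.gt_type, ScalarType):
            if self.origin is not None:
                raise ValueError("origin must be None for ScalarType data")
        else:
            if self.origin is None:
                raise ValueError("origin must be set for FieldType data")
            if len(self.origin) != self.gt_type.ndims:
                raise ValueError("origin must have one entry per dimension")


scalar = FieldopDataSketch(ScalarType(), None)
field = FieldopDataSketch(FieldType(2), (1, 0))
```

Checking once in `__post_init__` would let later code rely on the invariant instead of repeating `assert self.origin is not None` at each use site.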

philip-paul-mueller · 2025-01-27T11:33:40Z

src/gt4py/next/program_processors/runners/dace/gtir_builtin_translators.py

+                lambda m: dace.sdfg.replace_properties_dict(outer_desc, m),
+            )
+            # Same applies to the symbols used as field origin (the domain range start)
+            assert self.origin is not None


You should do such a check at construction time.

I agree, done. However I will have to ignore a type-checking warning ([union-attr]).

philip-paul-mueller · 2025-01-27T11:35:25Z

src/gt4py/next/program_processors/runners/dace/gtir_builtin_translators.py

@@ -161,24 +168,18 @@ def get_symbol_mapping(
        """
        if isinstance(self.gt_type, ts.ScalarType):
            return {}
+        assert self.origin is not None


You should enforce such constraints in the constructor.

philip-paul-mueller · 2025-01-27T11:59:09Z

src/gt4py/next/program_processors/runners/dace/gtir_sdfg.py


-        In case of `ScalarType` data, the descriptor is constructed with `offset=None`.
+        In case of `ScalarType` data, the descriptor is constructed with `origin=None`.


Suggested change

In case of `ScalarType` data, the descriptor is constructed with `origin=None`.

In case of `ScalarType` data, the `FieldopData` is constructed with `origin=None`.

Could you also add a test that data_node is a transient?

I was wrong, the transient property does not always hold. The lowering creates a FieldopData also for access nodes to global arrays. I will modify the code comment.
A refactoring PR could move the declaration of this method to a type module, close to the FieldopData type declaration.

philip-paul-mueller · 2025-01-27T12:22:59Z

src/gt4py/next/program_processors/runners/dace/gtir_dataflow.py

@@ -835,12 +837,19 @@ def write_output_of_nested_sdfg_to_temporary(inner_value: ValueExpr) -> ValueExp

        outputs = {outval.dc_node.data for outval in gtx_utils.flatten_nested_tuple((result,))}

+        if nsdfg_symbols_mapping is None:
+            # `None` means that all free symbols are mapped to the symbols available in parent SDFG


Suggested change

# `None` means that all free symbols are mapped to the symbols available in parent SDFG

# `None` means that all free symbols are mapped to the symbols available in parent SDFG by the `add_nested_sdfg()` function.

This case also means that we never need to do a remapping where the names inside and outside are different.

I would actually remove this if, because as far as I can see nsdfg_symbols_mapping is either None or {"__cond": ...}.

philip-paul-mueller · 2025-01-27T12:24:14Z

src/gt4py/next/program_processors/runners/dace/gtir_dataflow.py

+        if nsdfg_symbols_mapping is None:
+            # `None` means that all free symbols are mapped to the symbols available in parent SDFG
+            pass
+        else:


I do not really understand this case, is it really needed?

philip-paul-mueller · 2025-01-27T12:29:12Z

src/gt4py/next/program_processors/runners/dace/gtir_scan_translator.py

+    for psym, arg in lambda_args_mapping:
+        nsdfg_symbols_mapping |= gtir_translators.get_arg_symbol_mapping(psym.id, arg, sdfg)


This is just my paranoia, but to me there is a potentially severe error here.
Could it be that one argument needs the mapping {'x': 'y'} while another argument needs {'x': 'z'}? Or is this not possible, or is implementing the check not worth it?
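If such a check were considered worthwhile, it could look roughly like the sketch below, which merges per-argument mappings and fails when the same inner symbol is bound to two different outer symbols. `merge_symbol_mappings` is a hypothetical helper working on plain strings; it is not part of the PR.

```python
# Sketch only: hypothetical helper, using strings instead of DaCe symbols.
def merge_symbol_mappings(mappings: list[dict[str, str]]) -> dict[str, str]:
    """Merge symbol mappings, rejecting conflicting bindings of one symbol."""
    merged: dict[str, str] = {}
    for mapping in mappings:
        for inner, outer in mapping.items():
            if inner in merged and merged[inner] != outer:
                raise ValueError(
                    f"Conflicting mapping for '{inner}': "
                    f"'{merged[inner]}' vs '{outer}'"
                )
            merged[inner] = outer
    return merged


merged = merge_symbol_mappings([{"x": "y"}, {"w": "z"}, {"x": "y"}])
```

In contrast, the in-place `|=` update silently keeps the last binding when two arguments disagree.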

philip-paul-mueller · 2025-01-27T12:32:29Z

src/gt4py/next/program_processors/runners/dace/gtir_sdfg.py

@@ -321,28 +335,46 @@ def unique_tasklet_name(self, name: str) -> str:

    def _make_array_shape_and_strides(
        self, name: str, dims: Sequence[gtx_common.Dimension]
-    ) -> tuple[list[dace.symbol], list[dace.symbol]]:
+    ) -> tuple[list[dace.symbolic.SymExpr], list[dace.symbolic.SymExpr]]:


Suggested change

) -> tuple[list[dace.symbolic.SymExpr], list[dace.symbolic.SymExpr]]:

) -> tuple[list[dace.symbolic.SymbolicType], list[dace.symbolic.SymbolicType]]:

philip-paul-mueller · 2025-01-27T12:41:55Z

src/gt4py/next/program_processors/runners/dace/gtir_sdfg.py

-                outer_node = head_state.add_access(inner_data.dc_node.data)
-                outer_data = inner_data.make_copy(outer_node)
+                # This must be a symbol captured from the lambda parent scope.
+                outer_node = head_state.add_access(inner_dataname)


If I am not mistaken, this implicitly assumes that the data container on the inside and the outside always have the same name.
If the above is true, then I am not sure this is correct all the time.
If the above statement is false, then what does it mean?

Furthermore, what do you mean by the symbol?
How can a symbol be an output?
I am sure I am missing something here.

edopao and others added 30 commits December 9, 2024 12:09

temporarily disable one optimize transformation

792a8eb

Revert "temporarily disable one optimize transformation"

61985f7

This reverts commit 792a8eb.

fix for scan output stride

aa236a2

fix previous commit

9bdc75b

converto scalar to array on nsdfg output

746f9d8

Support for calling a program with field arguments whose domain does …

aed4d1e

…not start at zero

Revert "converto scalar to array on nsdfg output"

0d894ff

This reverts commit 746f9d8.

Add test for input arg with different domain

f722c14

Fix format

c5a61e9

Split handling of let-statement lambdas from stencil body

440a474

Merge branch 'main' into field_arg_with_non_zero_domain_start

9e09c86

minor edit

500590b

update dace backend

9deb814

Merge remote-tracking branch 'origin/dace-refact-lambda' into dace-gt…

c56e062

…ir-scan

use dace auto-optimize on gpu

5d5992a

Merge remote-tracking branch 'origin/dace-gtir-scan' into dace-gtir-scan

c167def

Revert "use dace auto-optimize on gpu"

eb17345

This reverts commit 5d5992a.

make map_strides recursive

8b163da

rename module alias

d15213a

review comments

55811dc

Merge remote-tracking branch 'origin/dace-refact-lambda' into dace-gt…

8f0e515

…ir-scan

add test case for sdfg transformation

f01d291

review comments (1)

62e1648

review comments (2)

72e8830

Merge branch 'dace-refact-lambda' into dace-gtir-scan

39aeb20

review comments (2)

de4a80e

Merge remote-tracking branch 'origin/main' into dace-refact-lambda

45f9927

Merge remote-tracking branch 'origin/dace-refact-lambda' into dace-gt…

3fe538b

…ir-scan

Merge remote-tracking branch 'origin/main' into dace-gtir-scan

ee62266

Propagate strides to nested SDFG when changing transient strides

4b0ac60

edopao added 13 commits January 20, 2025 15:11

Merge remote-tracking branch 'origin/dace-gtir-scan' into dace-gtir-r…

f66a899

…efact

fix previous commit

970ed97

Merge remote-tracking branch 'origin/dace-gtir-scan' into dace-gtir-r…

b3139ed

…efact

Merge remote-tracking branch 'origin/main' into dace-gtir-refact

801e5e6

review comments

eafdf12

Merge remote-tracking branch 'origin/dace-gtir-refact' into field_arg…

45787a5

…_with_non_zero_domain_start

Merge remote-tracking branch 'origin/main' into field_arg_with_non_ze…

474f28e

…ro_domain_start

Merge remote-tracking branch 'origin/main' into field_arg_with_non_ze…

6e1ab45

…ro_domain_start

make get_field_symbols shared

14d307e

working draft

761a502

edit todo comment

050c1ae

minor edit

dbf653f

fix previous commit

e320f39

edopao changed the title ~~feat[next][dace]: support for filed origin in lowering to SDFG~~ feat[next][dace]: support for field origin in lowering to SDFG Jan 23, 2025

edopao added 7 commits January 23, 2025 11:45

minor edit

e0dfd0d

added todo comments for missing symbols issue

f452f45

Merge remote-tracking branch 'origin/main' into field_arg_with_non_ze…

7a3a56a

…ro_domain_start

fix previous commit

b563b4a

fix bug with symbol mapping

205ebd5

fix another bug with symbol mapping

86baffb

misc improvements

a96ff70

edopao marked this pull request as ready for review January 23, 2025 23:13

edopao requested a review from philip-paul-mueller January 24, 2025 06:46

philip-paul-mueller reviewed Jan 24, 2025

edopao commented Jan 24, 2025

review comments

bfc20a2

edopao requested a review from philip-paul-mueller January 24, 2025 17:05

use Range instead of Indices

c763363

philip-paul-mueller reviewed Jan 27, 2025

review comments (1)

e366281
