-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Resolve Annotated
types (PEP 593) on dataflow visualizations
#1276
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Reviewed everything up to ace24a3 in 1 minute and 10 seconds
More details
- Looked at
15
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
4
drafted comments based on config settings.
1. hamilton/htypes.py:102
- Draft comment:
Consider using _get_origin (already imported from typing or typing_extensions) instead of typing.get_origin for consistency across the file. - Reason this comment was not posted:
Confidence changes required:50%
<= threshold50%
None
2. hamilton/htypes.py:102
- Draft comment:
Add a Sphinx documentation note under docs/ explaining how Annotated types are handled in visualizations. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
3. hamilton/htypes.py:102
- Draft comment:
To ensure compatibility with Python versions below 3.9, consider using the existing alias (e.g.column
) instead of directly referencingAnnotated
. In the current conditional,Annotated
might be undefined when running on older Python versions. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%
The comment suggests using column alias instead of Annotated for compatibility, but the code already handles this through version-specific imports. The direct use of Annotated here is inside a function that's used after the proper imports are set up. The code is actually well-structured for version compatibility.
I could be missing some edge case where the imports fail. Maybe there's a reason they specifically created the column alias that I'm not seeing.
The code explicitly handles Python version compatibility through proper imports and version checks. The use of Annotated here is safe because it will be properly defined regardless of Python version.
Delete the comment. The code already properly handles Python version compatibility through explicit version checks and appropriate imports.
4. hamilton/htypes.py:114
- Draft comment:
Add Sphinx documentation to explain how Annotated types are resolved (e.g. in a new docs/type_annotations.rst section) to help users understand the visualization and metadata behavior. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
Workflow ID: wflow_3xrEuMQOW72G2x0l
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good. @elijahbenizzy any thoughts?
Tests seem to be failing due to dask releasing a new version -- will try to see if I can make a fix on a separate branch for them.
The changes seem very clean. It's reasonable to think that Regarding |
This looks good! Technically this could be construed as backwards incompatible (E.G. if you were using the node -> dict representation), but I'm not particularly concerned about that. Specifically, annotated is metadata that may not be serialized, so I'm OK with it. We also don't make guarantees on the stability of the node to dict function. But if you wanted to be a stickleer about backwards compatibility, you could add a parameter (hide_annotated) that defaults to false. Up to you -- happy to merge either way! |
@elijahbenizzy Ah, wow, I didn't even think about that incompatibility. Thinking out loud here... I see that the result of I completely understand that it is a breaking change that someone out there could be relying on, perhaps it's better to introduce something like as temporary provision: def get_type_as_string(type_: Type, resolve_annotated: bool = True) -> Optional[str]:
"""Get a string representation of a type.
The logic supports the evolution of the type system between 3.8 and 3.10.
:param type_: Any Type object. Typically the node type found at Node.type.
:param resolve_annotated: Determines if Annotated types should be resolved to their underlying type.
:return: string representation of the type. An empty string if everything fails.
"""
if resolve_annotated and _is_annotated_type(type_):
type_string = get_type_as_string(typing.get_args(type_)[0])
elif getattr(type_, "__name__", None):
type_string = type_.__name__
elif typing_inspect.get_origin(type_):
base_type = typing_inspect.get_origin(type_)
type_string = get_type_as_string(base_type)
elif getattr(type_, "__repr__", None):
type_string = type_.__repr__()
else:
type_string = None
return type_string With all that said, if you think that mlflow or openlineage would benefit from having the actual type (vice |
Yeah, so thinking about it, |
@elijahbenizzy That sounds good to me! I also had an idea (that I mention only for the sake of posterity), that we could separate the |
okay if you rebase from main then dask issues should go away. |
a48b6c1
to
023a384
Compare
@@ -188,6 +194,12 @@ def test_get_type_as_string(type_): | |||
pytest.fail(f"test get_type_as_string raised: {e}") | |||
|
|||
|
|||
def test_type_as_string_with_annotated_type(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you feel like doing more testing work we should really test the string output of these, not just that they fail. But that's out of scope.
Recently, I was attempting to use
Annotated
types (PEP 593) for the output of some nodes (seeA
):The execution works as excepted, but when I run
dr.display_all_functions("dag.png")
the type ofA
is masked asAnnotated
on the resulting graphic:Ideally, IMHO, we would want to see the underlying type such as:
Changes
I updated
hamilton.htypes.get_type_as_string
so that it will considerAnnotated
types. Note that it appears thattyping.get_origin
andtyping_inspect.get_origin
do not produce the same result forAnnotated
types (see ilevkivskyi/typing_inspect#109).How I tested this
I updated the current test for
get_type_as_string
intest_type_utils.py
and added an additional check focusing onAnnotated
types.Checklist