-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow get_entity_subset_from_asset_graph_subset to stil return EntitySubsets for serialized AssetGraphSubsets when the underlying partitions definition has gotten larger #27470
Conversation
if ( | ||
not isinstance(self.value, TimeWindowPartitionsSubset) | ||
or not isinstance(partitions_def, TimeWindowPartitionsDefinition) | ||
or current_partitions_def.timezone != partitions_def.timezone | ||
or current_partitions_def.fmt != partitions_def.fmt | ||
or current_partitions_def.cron_schedule != current_partitions_def.cron_schedule | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The isinstance(self.value, TimeWindowPartitionsSubset)
check can be removed since it's redundant - this code is already inside an elif
block that confirms self.value
is a TimeWindowPartitionsSubset
. The redundant check will always evaluate to False
, making the condition harder to understand.
Spotted by Graphite Reviewer
Is this helpful? React 👍 or 👎 to let us know.
or not isinstance(partitions_def, TimeWindowPartitionsDefinition) | ||
or current_partitions_def.timezone != partitions_def.timezone | ||
or current_partitions_def.fmt != partitions_def.fmt | ||
or current_partitions_def.cron_schedule != current_partitions_def.cron_schedule |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cron_schedule
comparison is comparing the schedule with itself (current_partitions_def.cron_schedule != current_partitions_def.cron_schedule
). This should be comparing current_partitions_def.cron_schedule
with partitions_def.cron_schedule
instead.
Spotted by Graphite Reviewer
Is this helpful? React 👍 or 👎 to let us know.
3412c3b
to
14de39c
Compare
14de39c
to
d49a31e
Compare
@@ -66,6 +72,60 @@ def is_empty(self) -> bool: | |||
else: | |||
return not self.bool_value | |||
|
|||
def with_asset_graph_partitions_def( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i did not yet change get_entity_subset_from_asset_graph_subset because it has existing callsites in DA that seem to expect it to return None if its invalid - wanted to run the possible gameplan by you there first @OwenKephart since you had mentioned making is_compatible_with_partitions_def more of a DA-specific thing...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's reasonable!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify - you think leaving the method as it is today is reasonable? Or bringing it in line with the changes I made in this PR?
Still planning to flesh out the test cases more but I think this is ready for a first look based on our conversation earlier owen (also curious for your thoughts on the inline question). |
@@ -66,6 +72,60 @@ def is_empty(self) -> bool: | |||
else: | |||
return not self.bool_value | |||
|
|||
def with_asset_graph_partitions_def( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love the name, maybe just with_current_partitions_def
?
I also think this would be a bit cleaner if we passed in the AssetGraphView
object (which has access to all of the parameters we're passing in)
Alternatively, we could just make this a private method on AssetGraphView itself, which I might bias towards personally
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I think the logic makes sense
It's possible that we'll want to refactor this in the future (or create a parallel version of this) that does just return None on incompatibilities instead of erroring, so that we can use this same logic in cases where those incompatibilities can be handled gracefully. However, I can't think of an easy way of doing that while preserving these error messages so this is ok for now.
Had some comments on structure in-line
d49a31e
to
68e2d51
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice
68e2d51
to
cebec3e
Compare
…ubsets for serialized AssetGraphSubsets when the underlying partitions definition has gotten larger > Insert changelog entry or delete this section.
cebec3e
to
b240e98
Compare
Summary & Motivation
Fixes an issue where a relatively benign change to a partitions definition like extending the range of the partition keys could cause a backfill to fail due to the partitions not exactly matching. Instead of rejecting the subset as invalid, it returns a corrected subset object with the latest partitions definition in it, as long as it is valid.
How I Tested These Changes
New test case
Changelog
NOCHANGELOG