-
Notifications
You must be signed in to change notification settings - Fork 27.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Updated BridgeTower Image processor #32384
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
@Sai-Suraj-27 This affects the image processing tests for bridgetower, which will need to be updated to reflect this fix |
…to fix_assignment1
Thanks for the review. Yes, I am looking into it. Will try to fix and update the PR 👍🏻 |
Hi @Sai-Suraj-27, thanks for iterating on this! Looking at the changes in the test, I realized that this fix (which is addressing a bug in the code) would be introducing a change in the default behaviour of the image processor, and therefore would produce different outputs for the user. Previously, because What we should do is:
|
Thanks for the review and explanation @amyeroberts. I've made the required changes. Please look into it and let me know if I missed anything. |
…to fix_assignment1
…to fix_assignment1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for tackling this! Just a few places we need to be careful to safely deprecate to avoid surprises for the user
@@ -119,6 +122,8 @@ def get_expected_values(self, image_inputs, batched=False): | |||
return expected_height, expected_width | |||
|
|||
def expected_output_image_shape(self, images): | |||
if self.do_center_crop: | |||
return self.num_channels, self.crop_size["height"], self.crop_size["width"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To reflect the crop behaviour of the image processor
return self.num_channels, self.crop_size["height"], self.crop_size["width"] | |
return self.num_channels, self.size["shortest_edge"], self.size["shortest_edge"] |
@@ -58,6 +59,7 @@ def __init__( | |||
self.rescale_factor = rescale_factor | |||
self.do_normalize = do_normalize | |||
self.do_center_crop = do_center_crop | |||
self.crop_size = crop_size if crop_size is not None else {"height": 288, "width": 288} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.crop_size = crop_size if crop_size is not None else {"height": 288, "width": 288} |
@@ -73,6 +75,7 @@ def prepare_image_processor_dict(self): | |||
"do_normalize": self.do_normalize, | |||
"do_resize": self.do_resize, | |||
"size": self.size, | |||
"crop_size": self.crop_size, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"crop_size": self.crop_size, |
@@ -296,10 +296,9 @@ def center_crop( | |||
The channel dimension format of the input image. If not provided, it will be inferred from the input | |||
image. | |||
""" | |||
output_size = size["shortest_edge"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, what we want to do is update the docstring to reflect the true behaviour of the method, but still take the same size["shortest_edge"]
argument. Otherwise this would be a breaking change and previous configs for the model would have a different behaviour. In fact - I think existing configs would break as they don't have "height"
or "width"
keys for size
@@ -42,6 +42,7 @@ def __init__( | |||
rescale_factor: Union[int, float] = 1 / 255, | |||
do_normalize: bool = True, | |||
do_center_crop: bool = True, | |||
crop_size: Dict[str, int] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
crop_size: Dict[str, int] = None, |
@@ -277,6 +268,7 @@ def center_crop( | |||
The channel dimension format of the input image. If not provided, it will be inferred from the input | |||
image. | |||
""" | |||
warnings.warn("The center_crop method is deprecated and will be removed in two releases time.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should specify the exact release version.
do_center_crop: Optional[bool] = None, | ||
crop_size: Dict[str, int] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't just remove these like this as this will break things for users. We need to still accept them and emit a warning if they are set
@@ -126,7 +126,6 @@ class PaliGemmaPreTrainedModel(PreTrainedModel): | |||
_no_split_modules = ["PaliGemmaMultiModalProjector"] | |||
_skip_keys_device_placement = "past_key_values" | |||
_supports_flash_attn_2 = False | |||
_supports_cache_class = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't remove this here
cc @qubvel can you take a look at this while Amy is out? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, btw I am not sure I understand from reading the interactions what this fixes? It's also kinda breaking as you remove something but more importantly could you explain the motivations?
Thanks for looking into the PR. Line 458, here is the initial reason I made the PR. transformers/src/transformers/models/bridgetower/image_processing_bridgetower.py Lines 455 to 458 in 2c3a44f
Initially i just changed the line to: do_center_crop = do_center_crop if do_center_crop is not None else self.do_center_crop But tht affected the image processing tests for bridgetower, which lead to these changes. |
What does this PR do?
Updated BridgeTower Image processor.
Fixes # (issue)
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
@amyeroberts @ArthurZucker