Hi, thanks for sharing.
I have a question that I hope you can answer.
Faster R-CNN usually gives 36 detected objects per image, but I see that `det_sequences` in `coco_entities_release.json` usually contains only a few objects. Which mechanism in the model filters the 36 objects down to a few?
Is my understanding correct that the sorting network does this? It ranks the more important regions first, and only the first few regions of the sorted sequence are used to generate the caption, so the many remaining regions are effectively filtered out.
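To make this guess concrete, here is a minimal sketch of the "rank, then keep the first few" behaviour I have in mind. It is only an illustration of my hypothesis, not code from this repository: the function `keep_top_regions`, the random scores, and the cutoff `k=5` are all made up for the example.

```python
import random

def keep_top_regions(scores, k):
    """Rank region indices by descending score and keep the first k (hypothetical helper)."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]

random.seed(0)
# Made-up importance scores for 36 detected regions (assumption, not model output).
scores = [random.random() for _ in range(36)]
kept = keep_top_regions(scores, k=5)
print(len(scores), "regions ->", len(kept), "kept:", kept)
```

If the model works like this, the remaining 31 regions would simply never be attended to during caption generation.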
Or is it done by the adaptive attention with visual sentinel?
Thank you.