The problem is with the Video Camouflaged Object Detection #20

Alex14101987 · 2025-01-21T03:37:23Z

Hello!
Following the training procedure described in the README, I first run this command:
python main_for_image.py --config configs/icod_pretrain.py --info pretrain --model-name PvtV2B5_ZoomNeXt --pretrained

Then I run the command:
python main_for_video.py --config configs/vcod_finetune.py --info finetune --model-name videoPvtV2B5_ZoomNeXt --load-from outputs\PvtV2B5_ZoomNeXt_BS4_LR0.0001_E10_H384_W384_OPMadam_OPGMfinetune_SCstep_AMP_INFOpretrain\exp_0\pth\state_final.pth

And I get the error:

RuntimeError: Error(s) in loading state_dict for videoPvtV2B5_ZoomNeXt:
        size mismatch for hmu_5.fuse.0.temperal_proj_kv.weight: copying a param with shape torch.Size([2, 1]) from checkpoint, the shape in current model is torch.Size([10, 5]).
        size mismatch for hmu_5.fuse.0.temperal_proj.0.weight: copying a param with shape torch.Size([1, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([5, 5, 3, 3]).
        size mismatch for hmu_5.fuse.0.temperal_proj.2.weight: copying a param with shape torch.Size([1, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([5, 5, 3, 3]).
        size mismatch for hmu_4.fuse.0.temperal_proj_kv.weight: copying a param with shape torch.Size([2, 1]) from checkpoint, the shape in current model is torch.Size([10, 5]).
        size mismatch for hmu_4.fuse.0.temperal_proj.0.weight: copying a param with shape torch.Size([1, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([5, 5, 3, 3]).
        size mismatch for hmu_4.fuse.0.temperal_proj.2.weight: copying a param with shape torch.Size([1, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([5, 5, 3, 3]).
        size mismatch for hmu_3.fuse.0.temperal_proj_kv.weight: copying a param with shape torch.Size([2, 1]) from checkpoint, the shape in current model is torch.Size([10, 5]).
        size mismatch for hmu_3.fuse.0.temperal_proj.0.weight: copying a param with shape torch.Size([1, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([5, 5, 3, 3]).
        size mismatch for hmu_3.fuse.0.temperal_proj.2.weight: copying a param with shape torch.Size([1, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([5, 5, 3, 3]).
        size mismatch for hmu_2.fuse.0.temperal_proj_kv.weight: copying a param with shape torch.Size([2, 1]) from checkpoint, the shape in current model is torch.Size([10, 5]).
        size mismatch for hmu_2.fuse.0.temperal_proj.0.weight: copying a param with shape torch.Size([1, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([5, 5, 3, 3]).
        size mismatch for hmu_2.fuse.0.temperal_proj.2.weight: copying a param with shape torch.Size([1, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([5, 5, 3, 3]).
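
Every mismatch above involves a "temperal_proj" weight. The reported shapes are consistent with (2*T, T) and (T, T, 3, 3) for T frames, with T=1 in the checkpoint and T=5 in the video model. This pattern is inferred from the error message, not confirmed from the ZoomNeXt code. A minimal sketch of how the incompatible entries could be listed before loading follows. The helper `split_by_shape` and the key "example.shared.weight" are hypothetical and exist only for illustration; the other keys and shapes are copied from the error.

```python
def split_by_shape(ckpt_shapes, model_shapes):
    """Partition checkpoint keys into those whose shape matches the model and those that do not."""
    loadable, mismatched = [], []
    for name, shape in ckpt_shapes.items():
        if name in model_shapes and tuple(model_shapes[name]) == tuple(shape):
            loadable.append(name)
        else:
            mismatched.append(name)
    return loadable, mismatched


# Shapes taken from the error message; "example.shared.weight" is a made-up entry.
ckpt_shapes = {
    "hmu_5.fuse.0.temperal_proj_kv.weight": (2, 1),
    "hmu_5.fuse.0.temperal_proj.0.weight": (1, 1, 3, 3),
    "example.shared.weight": (64, 64),
}
model_shapes = {
    "hmu_5.fuse.0.temperal_proj_kv.weight": (10, 5),
    "hmu_5.fuse.0.temperal_proj.0.weight": (5, 5, 3, 3),
    "example.shared.weight": (64, 64),
}

loadable, mismatched = split_by_shape(ckpt_shapes, model_shapes)
print("loadable:", loadable)
print("mismatched:", mismatched)
```

With real PyTorch objects, the two dictionaries could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`. Whether it is appropriate to skip the mismatched weights when fine-tuning is not something this sketch answers.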

I think the problem is that line 386 of "main_for_image.py" sets "num_frames=1". However, if I change it to "num_frames=5", I get a different error:

File "C:\Users\ZabockiyAE\ZoomNeXt\methods\zoomnext\layers.py", line 61, in forward
    unshifted_x_tmp = rearrange(x, "(b t) c h w -> b c h w t", t=self.num_frames)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ZabockiyAE\AppData\Roaming\Python\Python312\site-packages\einops\einops.py", line 591, in rearrange
    return reduce(tensor, pattern, reduction="rearrange", **axes_lengths)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ZabockiyAE\AppData\Roaming\Python\Python312\site-packages\einops\einops.py", line 533, in reduce
    raise EinopsError(message + "\n {}".format(e))
einops.EinopsError:  Error while processing rearrange-reduction pattern "(b t) c h w -> b c h w t".
 Input tensor shape: torch.Size([4, 192, 12, 12]). Additional info: {'t': 5}.
 Shape mismatch, can't divide axis of length 4 in chunks of 5
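
The pattern "(b t) c h w -> b c h w t" splits the leading axis into b groups of t frames, so the leading axis must be a multiple of t. Here it has length 4 and t is 5, which cannot be split evenly. A minimal sketch of that constraint in plain Python (the function name `clips_in_batch` is hypothetical and not part of einops or ZoomNeXt):

```python
def clips_in_batch(leading_axis, num_frames):
    """Return b for a "(b t) ..." split, or raise if the axis cannot be divided evenly."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if leading_axis % num_frames != 0:
        raise ValueError(
            f"can't divide axis of length {leading_axis} in chunks of {num_frames}"
        )
    return leading_axis // num_frames


# Leading axis of 4, as in the tensor shape [4, 192, 12, 12] from the traceback.
print(clips_in_batch(4, 1))  # every image is its own single-frame clip
print(clips_in_batch(4, 4))  # the whole batch forms one 4-frame clip

try:
    clips_in_batch(4, 5)
except ValueError as err:
    print("error:", err)
```

This matches the observation that a frame count dividing the image batch size (such as 1 or 4 for a batch of 4) passes this step, while 5 does not.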

P.S.
If I set "num_frames=4" in "main_for_image.py" and also set "num_frames=4" in the "configs\vcod_finetune.py" file, everything works, but this is probably not the originally intended setup.

P.P.S.
There was no such problem with the pre-trained weights from the repository, but they are currently unavailable for download.


erobernLi · 2025-01-22T07:54:50Z

You can refer to #6.

Alex14101987 closed this as completed Jan 22, 2025
