Mismatch Decoder when Training a Network #173

Open
matteo-collina opened this issue Dec 26, 2024 · 2 comments

@matteo-collina

Hi,

I am using the latest version of TagLab and I get this error while training a network:

RuntimeError: Error(s) in loading state_dict for DeepLab: size mismatch for decoder.last_conv.8.weight: copying a param with shape torch.Size([41, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([40, 256, 1, 1]). size mismatch for decoder.last_conv.8.bias: copying a param with shape torch.Size([41]) from checkpoint, the shape in current model is torch.Size([40]).

I made a hotfix by adding +1 at line 567 of training.py, like this:

net = DeepLab(backbone='resnet', output_stride=16, num_classes=(output_classes+1))

but I am not sure this is a good approach. The model trains, but the result does not work: the graph produced at the end shows no values, and during training I get "nan" starting from epoch 3. I get the same error if I try to auto-segment, but in that case it is raised from MapClassifier.py.

Moreover, there is also another problem with line 32 of coral_dataset.py:

#PixelDropout(always_apply=False, p=0.2, dropout_prob=0.02, per_channel=0, drop_value=(0, 0, 0), #mask_drop_value=None)

According to the documentation, drop_value should be a float or a sequence of floats (https://albumentations.ai/docs/api_reference/augmentations/transforms/). I fixed it by using None, but again I am not sure this is the right way to proceed.

How can I solve these issues?

Cheers,
Matteo

@matteo-collina changed the title from "Mismatch Decoder when Train a Network" to "Mismatch Decoder when Training a Network" on Dec 26, 2024
@maxcorsini
Member

Hi, it seems that the number of classes set does not match the number of classes used during training. This is not a problem with TagLab but with the input data. Please send me by email some information about the dictionary you are using, the classes selected to build the classifier, and what you have done to set up the dataset for training.
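
As a rough way to confirm the class-count mismatch before changing any code, the number of classes stored in the checkpoint can be read from the first dimension of the final decoder weight and compared with the value passed as num_classes. This is only a minimal sketch: the helper names are hypothetical (they are not part of TagLab), and it assumes the checkpoint is a plain state dict that contains the key decoder.last_conv.8.weight shown in the error message.

```python
# Hypothetical helpers, not part of TagLab.
# Assumption: the checkpoint is a plain state dict whose final decoder layer
# is stored under the key reported in the error message.
FINAL_LAYER_KEY = "decoder.last_conv.8.weight"


def checkpoint_num_classes(state_dict, key=FINAL_LAYER_KEY):
    """Return the number of output classes stored in the checkpoint.

    The final 1x1 convolution has shape (num_classes, in_channels, 1, 1),
    so the first dimension is the class count.
    """
    return int(state_dict[key].shape[0])


def check_class_count(state_dict, output_classes, key=FINAL_LAYER_KEY):
    """Return None if the counts agree, otherwise a short description."""
    saved = checkpoint_num_classes(state_dict, key)
    if saved != output_classes:
        return (
            f"checkpoint has {saved} classes, "
            f"but the model is being built with {output_classes}"
        )
    return None
```

With PyTorch available, the state dict could be obtained with torch.load(path, map_location="cpu"), where path points to the checkpoint file. For the error reported above, the check would report 41 classes in the checkpoint against 40 in the model, which suggests looking at how the classes were selected rather than patching num_classes.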

Regarding the second problem, if you are using the latest version of albumentations, you can replace drop_value = (0,0,0) with drop_value = [0,0,0].

Best

@matteo-collina
Author

Thanks for your help. I just sent you an email.

Best
