After several months away, I have forgotten some of what I knew about deep learning and can't even write the code from memory.
I will finish and polish it slowly.
(This project was completed on Nov. 17th, 2024; the final core code is in main.py.)
The Stanford Dogs dataset contains images of 120 breeds of dogs from around the world. This dataset was built using images and annotations from ImageNet for the task of fine-grained image categorization.
(Before: I chose 10 categories to reduce training difficulty. After: the project covers all 120 categories.)
Download dataset at: http://vision.stanford.edu/aditya86/ImageNetDogs/
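If you want to try it yourself, a minimal sketch of loading the extracted dataset with torchvision's ImageFolder might look like this (the `Images/` folder name matches the official tarball layout, but the 80/20 split and batch size are my assumptions, not necessarily what main.py does):

```python
# Minimal sketch: load the extracted Stanford Dogs images with ImageFolder.
# The tarball extracts to Images/, with one sub-folder per breed,
# e.g. Images/n02085620-Chihuahua/.
import torch
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

dataset = datasets.ImageFolder("Images", transform=transform)

# 80/20 train/test split (the ratio here is a placeholder).
train_size = int(0.8 * len(dataset))
test_size = len(dataset) - train_size
train_set, test_set = torch.utils.data.random_split(dataset, [train_size, test_size])

train_loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=32)
```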
Here are some pictures of the cute dogs to be categorized:
Chihuahua | Pekinese | Papillon
I tried some simple models at first. See them in models, including a plain CNN (CNN) and VGG11 (VGG). However, they do not perform well.
Accuracy on the training set can reach 100% but stays low on the test set: the models overfit, because the dataset does not contain enough images.
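For reference, a rough sketch of a small CNN of this kind is below (this is not the exact architecture in models; the layer sizes are illustrative and assume 224x224 inputs):

```python
# Rough sketch of a small CNN baseline (illustrative, not the repo's exact model).
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self, num_classes=120):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 28 * 28, 256), nn.ReLU(),  # 224x224 input -> 28x28 feature map
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```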
I did some data augmentation, for example random rotation, normalization, and so on. See the transform module in main.py.
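A sketch of this kind of augmentation pipeline with torchvision transforms is shown below (the exact parameters in main.py may differ):

```python
# Sketch of the augmentation used for training vs. plain preprocessing for testing.
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(15),                       # random rotation up to +/- 15 degrees
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],     # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

# Test images only get deterministic resizing + normalization.
test_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```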
I'm happy to see the gap between test and train accuracy become small, but both only reach about 60%.
Maybe my model is too simple?
I tried ResNet18 trained from scratch: resnet18.
No improvement!
Transfer learning may be more effective for small sample scenarios.
I fine-tuned a pretrained ResNet18 model, and the improvement was obvious: results
Accuracy on both the test and train sets reaches 93%!
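A minimal sketch of this fine-tuning setup is below: load an ImageNet-pretrained ResNet18 and replace the final layer with a 120-way classifier. The hyperparameters are illustrative, not the exact ones in main.py; training from scratch (the experiment that did not improve) would simply pass `weights=None`.

```python
# Sketch of transfer learning: fine-tune an ImageNet-pretrained ResNet18.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 120)   # new head for 120 breeds

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

# One training pass over the data (train_loader as in the loading sketch above).
for images, labels in train_loader:
    images, labels = images.to(device), labels.to(device)
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```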
Finally, I visualized the loss and accuracy during training for the different methods (10 categories):
Pretrained ResNet18 | ResNet18 | VGG
For 120 categories with pretrained ResNet18:
Pretrained ResNet18 (120 categories)
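The curves above can be drawn with matplotlib along these lines, assuming per-epoch loss and accuracy values were collected during training (function and argument names here are hypothetical):

```python
# Sketch of plotting the training history collected per epoch.
import matplotlib.pyplot as plt

def plot_history(train_loss, train_acc, test_acc, title):
    epochs = range(1, len(train_acc) + 1)
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(epochs, train_loss, label="train loss")
    ax1.set_xlabel("epoch"); ax1.set_ylabel("loss"); ax1.legend()
    ax2.plot(epochs, train_acc, label="train acc")
    ax2.plot(epochs, test_acc, label="test acc")
    ax2.set_xlabel("epoch"); ax2.set_ylabel("accuracy"); ax2.legend()
    fig.suptitle(title)
    fig.savefig(title + ".png")
```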
After this practice, here are some useful things I have learned:
Transfer learning is effective for training on small samples and is also less difficult; we only need to fine-tune a pretrained model to get good results.
It's especially useful for Stanford Dogs dataset.
The pretrained weights make it easier to obtain useful features and reduce training difficulty.