Caltech101 #255
…uch for some categories.
from fuel.converters.base import fill_hdf5_file, MissingInputFiles

CATEGORIES = (
Spending 101 lines on the CATEGORIES tuple alone is quite long. Could we condense that?
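One possible way to condense it, as a minimal sketch: derive the category names from the extracted dataset instead of listing them by hand. This assumes the images are unpacked into one sub-directory per category; `list_categories` and its `root` argument are hypothetical names, not part of this PR or of Fuel.

```python
import os


def list_categories(root):
    """Return the sorted names of the sub-directories of ``root``.

    Hypothetical helper: assumes each category has its own directory.
    """
    return tuple(sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))))
```

Sorting keeps the label order stable across file systems. The trade-off is that the label mapping then depends on what is on disk, whereas a hard-coded tuple fixes it in the source.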
I think having a downloader would be very helpful for the dataset.
@abergeron I'm doing a cleanup of the open PRs. Would you still like to go ahead with this one?
Is there much to do?
Aside from my inline comments, we would need a downloader for the dataset, and we would also need to make this a variable-length dataset. I think you could reuse a lot of code from the Imagenet converter if you encode images as a variable-length vector of raw bytes.
You might find this helpful: https://github.com/MartinThoma/algorithms/blob/master/ML/datasets/caltech101.py#L22-L37
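To make the variable-length suggestion concrete, here is a rough sketch of storing each encoded image as a variable-length vector of `uint8` in an HDF5 file with h5py. It is an illustration, not the Imagenet converter's actual code: the function name `write_encoded_images` and the dataset names `encoded_images` and `targets` are assumptions.

```python
import h5py
import numpy


def write_encoded_images(h5_path, encoded_images, labels):
    """Write raw image bytes as variable-length uint8 vectors.

    ``encoded_images`` is a sequence of ``bytes`` objects (for example
    the contents of JPEG files) and ``labels`` a matching sequence of
    integer class indices. All names here are hypothetical.
    """
    # A dtype whose elements are 1-D uint8 arrays of arbitrary length.
    vlen_uint8 = h5py.special_dtype(vlen=numpy.dtype('uint8'))
    with h5py.File(h5_path, 'w') as h5file:
        images = h5file.create_dataset(
            'encoded_images', (len(encoded_images),), dtype=vlen_uint8)
        targets = h5file.create_dataset(
            'targets', (len(labels), 1), dtype='uint8')
        for i, raw in enumerate(encoded_images):
            # Store the undecoded bytes; decoding happens at read time.
            images[i] = numpy.frombuffer(raw, dtype='uint8')
        targets[...] = numpy.asarray(labels, dtype='uint8').reshape(-1, 1)
```

Because the images stay encoded, they do not need to share a shape at conversion time, and any resizing or cropping can be deferred to whatever reads the file.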
This is a converter and a loader for Caltech101, which could be useful to other people.
It might lack some of the polish that other datasets have, but it could be useful, at least as a starting point.
There is no downloader, and I couldn't figure out how to get scipy to read from an open file.
If you have ideas for improvements that don't require a large time investment, I can probably do them. Otherwise, this is mostly to avoid duplication of effort in case someone else is working on this.
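On the open-file problem, one common workaround is to read each archive member fully into memory and wrap the bytes in `io.BytesIO`, which gives a seekable file-like object. The sketch below uses only the standard library and assumes the dataset is a tar archive of `.jpg` files; `iter_member_buffers` is a hypothetical helper, and whether a given decoder accepts file-like objects has to be checked for that decoder (Pillow's `Image.open` does).

```python
import io
import tarfile


def iter_member_buffers(archive_path, suffix='.jpg'):
    """Yield (member name, in-memory buffer) for matching archive files.

    Hypothetical helper: each buffer is a seekable ``io.BytesIO`` that
    holds the member's raw, still-encoded bytes.
    """
    # Mode 'r' detects gzip/bz2/xz compression transparently.
    with tarfile.open(archive_path) as archive:
        for member in archive:
            if member.isfile() and member.name.lower().endswith(suffix):
                raw = archive.extractfile(member).read()
                yield member.name, io.BytesIO(raw)
```

The same raw bytes could also be stored undecoded, which fits the variable-length approach suggested in the review.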