Each dataset consists of two files: dataset-gen.lua and dataset.lua. The dataset-gen.lua file is responsible for one-time setup, while the dataset.lua file handles the actual data loading.
The dataset-gen.lua file performs any necessary one-time setup. For example, the cifar10-gen.lua file downloads the CIFAR-10 dataset.
The module should have a single function exec(opt, cacheFile).
- opt: the command line options
- cacheFile: path to output
local M = {}
function M.exec(opt, cacheFile)
   local imageInfo = {}
   -- preprocess dataset, store results in imageInfo, save to cacheFile
   torch.save(cacheFile, imageInfo)
end
return M
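Because exec writes its results to cacheFile, a caller only needs to run it when that file does not exist yet. The sketch below shows one way this could look. The helper name loadImageInfo and the toy generator are hypothetical and are not part of the module described above; the sketch assumes the Torch7 torch and paths packages are installed.

```lua
require 'torch'
require 'paths'

-- Hypothetical helper (not from the original text): run the generator's
-- exec only when the cache file is missing, then load the cached table.
local function loadImageInfo(gen, opt, cacheFile)
   if not paths.filep(cacheFile) then
      gen.exec(opt, cacheFile)   -- one-time setup writes cacheFile
   end
   return torch.load(cacheFile)
end

-- Toy generator following the exec(opt, cacheFile) contract.
local toyGen = {}
function toyGen.exec(opt, cacheFile)
   local imageInfo = { classes = { 'cat', 'dog' } }   -- illustrative content
   torch.save(cacheFile, imageInfo)
end

local cacheFile = os.tmpname()
os.remove(cacheFile)   -- os.tmpname may create the file; start without a cache
local imageInfo = loadImageInfo(toyGen, {}, cacheFile)   -- runs exec
local again = loadImageInfo(toyGen, {}, cacheFile)       -- reads the cache only
os.remove(cacheFile)
```

The second call skips exec entirely, which is the point of keeping the expensive preprocessing in the -gen file.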
The dataset.lua should return a class that implements three functions:
- get(i): returns a table containing two entries, input and target
  - input: the training or validation image as a Torch tensor
  - target: the image category as a number 1-N
- size(): returns the number of entries in the dataset
- preprocess(): returns a function that transforms the input for data augmentation or input normalization
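As a rough illustration of this interface, here is a minimal sketch of a class with the three functions. The class name ToyDataset, the layout of imageInfo (a data tensor plus a labels tensor), and the normalization constants are all assumptions made for the example. It uses plain Lua metatables rather than any particular class helper, and it assumes Torch7 is installed.

```lua
require 'torch'

-- Hypothetical class; the name and the imageInfo layout are assumptions.
local ToyDataset = {}
ToyDataset.__index = ToyDataset

-- imageInfo.data:   N x C x H x W tensor of images
-- imageInfo.labels: 1D tensor of N class indices in 1..K
function ToyDataset.new(imageInfo)
   return setmetatable({ imageInfo = imageInfo }, ToyDataset)
end

-- Returns a table with the two entries, input and target.
function ToyDataset:get(i)
   return {
      input = self.imageInfo.data[i],      -- C x H x W tensor
      target = self.imageInfo.labels[i],   -- plain Lua number
   }
end

-- Returns the number of entries in the dataset.
function ToyDataset:size()
   return self.imageInfo.data:size(1)
end

-- Returns a function that normalizes an input without modifying it.
function ToyDataset:preprocess()
   local mean, std = 0.5, 0.25   -- illustrative constants only
   return function(input)
      return input:clone():add(-mean):div(std)
   end
end

-- Usage with synthetic data; a real dataset.lua would end with `return ToyDataset`.
local dataset = ToyDataset.new({
   data = torch.FloatTensor(4, 3, 8, 8):fill(0.5),
   labels = torch.Tensor({ 1, 2, 1, 3 }),
})
local sample = dataset:get(2)
local transform = dataset:preprocess()
local normalized = transform(sample.input)
```

Returning the transform from preprocess(), instead of applying it inside get(i), lets the caller decide when and where the augmentation or normalization runs.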