Deconvolutions were first introduced by Zeiler et al. (2010) and are also known as transposed convolutions and fractionally strided convolutions. This repository explores why these names make sense for certain combinations of input/output/kernel sizes and strides. It also shows that every deconvolution can be implemented by padding/upsampling the input and then applying a direct convolution.
For more details see our blog post. The examples and images are inspired by Dumoulin and Visin (2018).
- Input size: (4,4)
- Output size: (2,2)
- Kernel size: (3,3)
- Stride: (1,1)
This example shows how the direct convolution from shape (4,4) -> (2,2) can be reversed by transposing the matrix that describes this direct convolution. The matrix describing the convolution is generated by flattening the input and the filter at each filter position, here 4 positions. The resulting transposed convolution is then described by the following shapes.
- Input size: (2,2)
- Output size: (4,4)
- Kernel size: (3,3)
- Stride: (1,1)
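The matrix view above can be made concrete with a small NumPy sketch (the helper name `conv_matrix` and the example kernel are illustrative, not taken from the repository): each row of the matrix holds the flattened kernel at one of the 4 filter positions, so multiplying by the flattened (4,4) input yields the flattened (2,2) output, and the transpose maps a flattened (2,2) back to a flattened (4,4).

```python
import numpy as np

def conv_matrix(kernel, input_size):
    """Build the matrix C so that C @ x.flatten() equals the valid
    stride-1 cross-correlation of the (n,n) input x with `kernel`."""
    k = kernel.shape[0]            # square (k,k) kernel
    n = input_size                 # square (n,n) input
    out = n - k + 1                # output is (out,out) for stride 1
    C = np.zeros((out * out, n * n))
    for i in range(out):           # one row per filter position
        for j in range(out):
            for a in range(k):
                for b in range(k):
                    C[i * out + j, (i + a) * n + (j + b)] = kernel[a, b]
    return C

kernel = np.arange(1.0, 10.0).reshape(3, 3)   # illustrative 3x3 kernel
C = conv_matrix(kernel, 4)                    # (4,4) -> (2,2): C is (4,16)

x = np.arange(16.0)           # flattened (4,4) input
y = C @ x                     # flattened (2,2) output of the convolution

# Transposing C reverses the shapes: (2,2) -> (4,4).
x_back = C.T @ y              # flattened (4,4); not the original x in general
print(C.shape, C.T.shape)     # (4, 16) (16, 4)
```

Note that `C.T` only reverses the *shapes*, not the values: a transposed convolution is not the inverse of the convolution.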
- Input size: (2,2), padded to (6,6)
- Output size: (4,4)
- Kernel size: (3,3)
- Stride: (1,1)
This example illustrates how the deconvolution (2,2) -> (4,4) can be expressed as a convolution on a padded (upsampled) input.
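A minimal NumPy sketch of this equivalence (the helper `cross_correlate` and the example values are illustrative): zero-padding the (2,2) input by k-1 = 2 on every side gives (6,6), and a plain stride-1 convolution with the *flipped* kernel then produces the (4,4) transposed-convolution output.

```python
import numpy as np

def cross_correlate(x, k):
    """Valid stride-1 cross-correlation of x with kernel k."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

kernel = np.arange(1.0, 10.0).reshape(3, 3)   # illustrative 3x3 kernel
y = np.array([[1.0, 2.0], [3.0, 4.0]])        # (2,2) deconvolution input

y_pad = np.pad(y, 2)                          # zero-pad by k-1: (2,2) -> (6,6)

# Flipping the kernel turns the cross-correlation into the true
# convolution that the transposed matrix implements.
out = cross_correlate(y_pad, kernel[::-1, ::-1])
print(out.shape)                              # (4, 4)
```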
- Input size: (2,2), padded to (7,7)
- Output size: (5,5)
- Kernel size: (3,3)
- Stride: (1,1)
To transform an input of shape (2,2) into an output of shape (5,5), you need a deconvolution with stride (2,2): the transposed-convolution output size is (i-1)s + k = (2-1)*2 + 3 = 5. This is why such deconvolutions are called fractionally strided: the stride-2 deconvolution is implemented by inserting s-1 = 1 zero between the input elements ((2,2) -> (3,3)), padding by k-1 = 2 on every side ((3,3) -> (7,7)), and then applying a direct convolution with stride (1,1).
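The fractionally strided construction can be sketched in NumPy as follows (the helper `fractionally_strided` and the all-ones kernel are illustrative, assuming square inputs and kernels):

```python
import numpy as np

def fractionally_strided(y, kernel, stride):
    """Stride-s deconvolution as a stride-1 convolution on an
    upsampled, zero-padded input (square inputs/kernels only)."""
    s, k, n = stride, kernel.shape[0], y.shape[0]
    # Insert s-1 zeros between input elements: (n,n) -> (n + (n-1)(s-1), ...)
    up = np.zeros((n + (n - 1) * (s - 1),) * 2)
    up[::s, ::s] = y
    up = np.pad(up, k - 1)            # pad by k-1 zeros on every side
    kf = kernel[::-1, ::-1]           # flip for the true convolution
    oh = up.shape[0] - k + 1
    out = np.zeros((oh, oh))
    for i in range(oh):               # plain valid stride-1 convolution
        for j in range(oh):
            out[i, j] = np.sum(up[i:i + k, j:j + k] * kf)
    return out

kernel = np.ones((3, 3))              # illustrative kernel
y = np.array([[1.0, 2.0], [3.0, 4.0]])
out = fractionally_strided(y, kernel, 2)
print(out.shape)                      # (2,2) -> upsampled+padded (7,7) -> (5,5)
```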
Zeiler, Matthew D., Dilip Krishnan, Graham W. Taylor, and Rob Fergus. “Deconvolutional Networks.” In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2528–35. San Francisco, CA, USA: IEEE, 2010. https://doi.org/10.1109/CVPR.2010.5539957.
Dumoulin, Vincent, and Francesco Visin. “A Guide to Convolution Arithmetic for Deep Learning.” arXiv:1603.07285 [cs, stat], January 11, 2018. http://arxiv.org/abs/1603.07285.