Spatial Pyramid Pooling as an additional layer in Caffe, the Deep Learning Framework


Abstract

This project implements the concept proposed in He, Kaiming, et al., “Spatial pyramid pooling in deep convolutional networks for visual recognition”.
It adds a custom spatial pyramid pooling layer to the AlexNet architecture, which removes the network’s need for fixed-size input images.
This work was part of my bachelor’s thesis, which I wrote at the Multimedia Computing and Computer Vision Lab, University of Augsburg, under the supervision of Christian Eggert.
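
To illustrate why spatial pyramid pooling removes the fixed-size requirement, here is a minimal pure-Python sketch of the pooling step. It is not the Caffe layer from this repository. The function name `spp_max_pool`, the `levels` parameter, the pyramid levels (1, 2, 4) and the floor/ceil rule for bin boundaries are all assumptions chosen for illustration.

```python
def spp_max_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a C x H x W feature map (nested lists) into a fixed-length list.

    For each pyramid level n, every channel is divided into an n x n grid of
    bins and the maximum of each bin is kept. The output length is
    C * sum(n * n for n in levels), independent of H and W.
    """
    pooled = []
    for n in levels:
        for channel in feature_map:
            height, width = len(channel), len(channel[0])
            for i in range(n):
                # Assumed bin rule: floor for the start, ceil for the end,
                # so bins cover the whole map and are never empty.
                row_start = (i * height) // n
                row_end = ((i + 1) * height + n - 1) // n
                for j in range(n):
                    col_start = (j * width) // n
                    col_end = ((j + 1) * width + n - 1) // n
                    pooled.append(max(
                        channel[r][c]
                        for r in range(row_start, row_end)
                        for c in range(col_start, col_end)
                    ))
    return pooled


def make_map(channels, height, width):
    """Build a hypothetical feature map filled with increasing integers."""
    return [
        [[ch * 100 + r * width + c for c in range(width)] for r in range(height)]
        for ch in range(channels)
    ]


# Two inputs with different spatial sizes yield vectors of equal length.
small = spp_max_pool(make_map(2, 4, 6))
large = spp_max_pool(make_map(2, 13, 9))
print(len(small), len(large))
```

Because the number of bins depends only on the pyramid levels and not on the input size, the fully connected layers that follow always receive a vector of the same length.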

Thesis

Thesis