Please use this identifier to cite or link to this item:
https://hdl.handle.net/2440/107960
Type: | Conference paper |
Title: | The treasure beneath convolutional layers: cross-convolutional-layer pooling for image classification |
Author: | Liu, L.; Shen, C.; van den Hengel, A. |
Citation: | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2015, vol. 07-12-June-2015, pp. 4749-4757 |
Publisher: | IEEE |
Issue Date: | 2015 |
Series/Report no.: | IEEE Conference on Computer Vision and Pattern Recognition |
ISBN: | 9781467369640 |
ISSN: | 1063-6919 |
Conference Name: | 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015) (7 Jun 2015 - 12 Jun 2015 : Boston, MA) |
Statement of Responsibility: | Lingqiao Liu, Chunhua Shen, Anton van den Hengel |
Abstract: | A number of recent studies have shown that a Deep Convolutional Neural Network (DCNN) pretrained on a large dataset can be adopted as a universal image descriptor, and that doing so leads to impressive performance at a range of image classification tasks. Most of these studies, if not all, adopt activations of the fully-connected layer of a DCNN as the image or region representation, and it is believed that convolutional layer activations are less discriminative. This paper, however, advocates that if used appropriately, convolutional layer activations constitute a powerful image representation. This is achieved by adopting a new technique proposed in this paper called cross-convolutional-layer pooling. More specifically, it extracts subarrays of feature maps of one convolutional layer as local features, and pools the extracted features with the guidance of the feature maps of the successive convolutional layer. Compared with existing methods that apply DCNNs in a similar local feature setting, the proposed method avoids the input image style mismatching issue which is usually encountered when applying fully-connected layer activations to describe local regions. Also, the proposed method is easier to implement since it is codebook free and does not have any tuning parameters. By applying our method to four popular visual classification tasks, it is demonstrated that the proposed method can achieve comparable or in some cases significantly better performance than existing fully-connected layer based image representations. |
Keywords: | Principal component analysis |
Rights: | © 2015 IEEE |
DOI: | 10.1109/CVPR.2015.7299107 |
Grant ID: | http://purl.org/au-research/grants/arc/FT120100969 http://purl.org/au-research/grants/arc/LP130100156 |
Published version: | http://dx.doi.org/10.1109/cvpr.2015.7299107 |
Appears in Collections: | Aurora harvest 3 Computer Science publications |
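The pooling scheme described in the abstract, where one convolutional layer supplies local features and the next layer's feature maps supply the pooling weights, can be sketched in a few lines of NumPy. This is a simplified illustration, not the authors' code: it assumes the two layers' activations have been resized to a common H × W grid and treats each spatial position's activation vector as one local feature (the paper more generally extracts subarrays of feature maps). The function name `cross_layer_pooling` is a label chosen here for illustration.

```python
import numpy as np

def cross_layer_pooling(feat_t, feat_t1):
    """Sketch of cross-convolutional-layer pooling.

    feat_t  : (H, W, D1) activations of convolutional layer t,
              read as one D1-dim local feature per spatial position.
    feat_t1 : (H, W, D2) activations of layer t+1, assumed resized to
              the same H x W grid, used as pooling guidance.
    Returns a (D1 * D2,) image representation: for each of the D2
    guidance channels, a weighted sum of the layer-t local features.
    """
    H, W, D1 = feat_t.shape
    D2 = feat_t1.shape[2]
    X = feat_t.reshape(H * W, D1)   # local features, one row per position
    A = feat_t1.reshape(H * W, D2)  # pooling weights from the next layer
    # Column k of P is sum_i A[i, k] * X[i]: one pooled vector per
    # guidance channel; concatenating them gives the final descriptor.
    P = X.T @ A                     # shape (D1, D2)
    return P.reshape(-1)
```

The resulting descriptor grows as D1 × D2, which is why the paper pairs it with dimensionality reduction (hence the "Principal component analysis" keyword above).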
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
RA_hdl_107960.pdf | Restricted Access | 1.24 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.