Mid-Level Features

The bag-of-words (BoW) framework and spatial pyramid matching (SPM) are two popular examples of mid-level features. The BoW framework aims to embed low-level descriptors in a representative codebook space. This section introduces the key techniques of the BoW framework, including SPM. First, low-level descriptors are extracted at interest points or on dense grids. Then, a pre-defined codebook B is applied to encode each descriptor using a specific coding scheme. The resulting code is normally a vector with binary or continuous elements, depending on the coding scheme, and can be referred to as a mid-level descriptor. Next, the image is divided into increasingly finer spatial subregions. The codes from each subregion are pooled into a histogram by averaging or normalizing. Finally, the image representation is generated by concatenating the histograms from all subregions. The framework therefore has two modules: coding and pooling.

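• Coding: The local features of each image are transformed into a collection of feature codes using a specific coding method. Given a codebook $B \in \mathbb{R}^{D \times M}$ with $M$ codewords, we compute a set of codes $C = \{c_1, c_2, \ldots, c_N\} \in \mathbb{R}^{M \times N}$ to represent the input local features $X = \{x_1, x_2, \ldots, x_N\} \in \mathbb{R}^{D \times N}$ by

$$\min_{C} \sum_{i=1}^{N} \| x_i - B c_i \|_2^2 + \lambda \, \mathcal{R}(c_i)$$

where the first term measures the approximation error and the second serves as a regularization term, with $\mathcal{R}$ denoting the regularizer and $\lambda$ its weight. Information loss is minimized mainly by adjusting the regularization term.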

• Spatial Pooling: The pooling procedure transforms the mid-level features of an image into a final image representation. SPM is a crucial component of pooling: it captures spatial layout information by expressing spatial relations at multiple levels of quantization. The codes within each spatial subregion are summarized with a specific statistic, such as their average or their maximum.
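
The two modules above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation of any particular system: the function names `hard_codes`, `ridge_codes`, and `spm_pool` are hypothetical, the squared L2 regularizer in `ridge_codes` is chosen only because it has a closed-form solution (the text does not fix a regularizer), descriptor positions are assumed to be normalized to [0, 1), and pyramid level l is assumed to split the image into a 2^l by 2^l grid with no per-level weighting.

```python
import numpy as np


def hard_codes(X, B):
    """Vector quantization: a one-hot (binary) code per descriptor.

    X: (D, N) descriptors, B: (D, M) codebook. Returns C with shape (M, N).
    """
    # Squared distance between every codeword (rows) and descriptor (columns).
    d2 = ((B[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)  # (M, N)
    C = np.zeros_like(d2)
    C[d2.argmin(axis=0), np.arange(X.shape[1])] = 1.0
    return C


def ridge_codes(X, B, lam=0.1):
    """Continuous codes: min_c ||x - B c||^2 + lam * ||c||^2 per descriptor.

    The squared L2 regularizer is an assumption made for illustration.
    """
    M = B.shape[1]
    gram = B.T @ B + lam * np.eye(M)       # (M, M), positive definite for lam > 0
    return np.linalg.solve(gram, B.T @ X)  # (M, N)


def spm_pool(C, xy, levels=3, pool="max"):
    """Pool codes over a spatial pyramid and concatenate the cell summaries.

    C: (M, N) codes; xy: (N, 2) descriptor positions in [0, 1).
    """
    M = C.shape[0]
    reduce_fn = np.max if pool == "max" else np.mean
    pooled = []
    for level in range(levels):
        cells = 2 ** level
        # Grid cell of each descriptor along x and y at this level.
        idx = np.minimum((xy * cells).astype(int), cells - 1)
        for cy in range(cells):
            for cx in range(cells):
                mask = (idx[:, 0] == cx) & (idx[:, 1] == cy)
                if mask.any():
                    pooled.append(reduce_fn(C[:, mask], axis=1))
                else:
                    pooled.append(np.zeros(M))  # empty cell contributes zeros
    return np.concatenate(pooled)


# Usage on random data (illustrative only).
rng = np.random.default_rng(0)
D, M, N = 8, 16, 200
X = rng.normal(size=(D, N))   # stand-in for low-level descriptors
B = rng.normal(size=(D, M))   # stand-in for a pre-defined codebook
xy = rng.random(size=(N, 2))  # descriptor positions
feature = spm_pool(ridge_codes(X, B), xy, levels=3, pool="max")
print(feature.shape)  # (336,), i.e. 16 * (1 + 4 + 16)
```

With `hard_codes` and average pooling at a single level, the output reduces to the normalized codeword histogram of the classical BoW model. Swapping in another regularizer, such as an L1 penalty for sparse codes, would change only the coding function and leave the pooling step untouched.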
