HEp-2 Cell Image Representation in the Adaptive CoDT Feature Space

In the previous section, we modeled the CoDT feature space of HEp-2 cell images as a GMM and learned its adaptive parameters $\lambda = \{w_t, \mu_t, \Sigma_t,\ t = 1, 2, \ldots, T\}$. The samples $X = \{x_n,\ n = 1, 2, \ldots, N\}$ can be described by the following gradient vector, a.k.a. score function:

$$G_\lambda^X = \nabla_\lambda \log p(X \mid \lambda)$$

The gradient describes how the parameters $\lambda$ should be modified to best fit the input $X$. To measure the similarity between two HEp-2 cell images $X$ and $Y$, a Fisher Kernel (FK) [15] is calculated as

$$K_F(X, Y) = {G_\lambda^X}^{T} F_\lambda^{-1} G_\lambda^Y$$
where $F_\lambda$ is the Fisher Information Matrix (FIM), formulated as

$$F_\lambda = E_X\left[ G_\lambda^X \, {G_\lambda^X}^{T} \right]$$
The superscript $T$ denotes the transpose of $G_\lambda^X$. Fisher information measures the amount of information that $X$ carries with respect to the parameters $\lambda$. As $F_\lambda$ is symmetric and positive semidefinite, $F_\lambda^{-1}$ can be decomposed as $F_\lambda^{-1} = L_\lambda^{T} L_\lambda$, and the FK can be rewritten as

$$K_F(X, Y) = {\mathcal{G}_\lambda^X}^{T} \mathcal{G}_\lambda^Y$$
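where

$$\mathcal{G}_\lambda^X = L_\lambda G_\lambda^X = L_\lambda \nabla_\lambda \log p(X \mid \lambda)$$

The normalized gradients with respect to the weights $w_t$, the means $\mu_t$ and the covariances $\Sigma_t$ correspond respectively to 0th-order, 1st-order and 2nd-order statistics. Let $\gamma_n(t)$ denote the occupancy probability of the CoDT feature $x_n$ for the $t$-th Gaussian:

$$\gamma_n(t) = \frac{w_t \, p_t(x_n \mid \mu_t, \Sigma_t)}{\sum_{k=1}^{T} w_k \, p_k(x_n \mid \mu_k, \Sigma_k)}$$

It can also be regarded as the soft assignment of $x_n$ to the $t$-th Gaussian. To avoid explicitly enforcing the constraints in (7.6), we use a parameter $\epsilon_t$ to reparameterize the weight parameter $w_t$ following the softmax formalism, defined as:

$$w_t = \frac{\exp(\epsilon_t)}{\sum_{k=1}^{T} \exp(\epsilon_k)}$$

The gradients of a single CoDT feature $x_n$ w.r.t. the parameters $\epsilon_t$, $\mu_t$ and $\sigma_t$ of the GMM (with $\sigma_t^d$ the standard deviation of the $t$-th Gaussian along dimension $d$) can be formulated as

$$\nabla_{\epsilon_t} \log p(x_n \mid \lambda) = \gamma_n(t) - w_t$$

$$\nabla_{\mu_t^d} \log p(x_n \mid \lambda) = \gamma_n(t) \, \frac{x_n^d - \mu_t^d}{(\sigma_t^d)^2}$$

$$\nabla_{\sigma_t^d} \log p(x_n \mid \lambda) = \gamma_n(t) \left[ \frac{(x_n^d - \mu_t^d)^2}{(\sigma_t^d)^3} - \frac{1}{\sigma_t^d} \right]$$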
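where the superscript $d$ denotes the $d$-th dimension of the input vector. The normalized gradients are then computed by multiplying by the square-root inverse of the diagonal FIM. Let $f_{\epsilon_t}$, $f_{\mu_t^d}$ and $f_{\sigma_t^d}$ be the entries on the diagonal of $F_\lambda$ corresponding to $\nabla_{\epsilon_t} \log p(x_n \mid \lambda)$, $\nabla_{\mu_t^d} \log p(x_n \mid \lambda)$ and $\nabla_{\sigma_t^d} \log p(x_n \mid \lambda)$ respectively, approximated as $f_{\epsilon_t} = w_t$, $f_{\mu_t^d} = w_t / (\sigma_t^d)^2$ and $f_{\sigma_t^d} = 2 w_t / (\sigma_t^d)^2$. The corresponding normalized gradients are therefore:

$$\mathcal{G}_{\epsilon_t}(X) = \frac{1}{\sqrt{w_t}} \sum_{n=1}^{N} \left[ \gamma_n(t) - w_t \right]$$

$$\mathcal{G}_{\mu_t^d}(X) = \frac{1}{\sqrt{w_t}} \sum_{n=1}^{N} \gamma_n(t) \, \frac{x_n^d - \mu_t^d}{\sigma_t^d}$$

$$\mathcal{G}_{\sigma_t^d}(X) = \frac{1}{\sqrt{2 w_t}} \sum_{n=1}^{N} \gamma_n(t) \left[ \frac{(x_n^d - \mu_t^d)^2}{(\sigma_t^d)^2} - 1 \right]$$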
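The soft assignment and the normalized mean and standard-deviation gradients described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `occupancy` and `fisher_gradients` are hypothetical, the GMM is assumed to have diagonal covariances passed as per-dimension standard deviations, and the toy parameter values are invented for the example.

```python
import math

def gaussian_log_density(x, mu, sigma):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * math.log(2.0 * math.pi) - math.log(s) - 0.5 * ((xi - m) / s) ** 2
        for xi, m, s in zip(x, mu, sigma)
    )

def occupancy(x, weights, means, sigmas):
    """Soft assignment gamma_n(t) of one feature x to each of the T Gaussians."""
    logs = [math.log(w) + gaussian_log_density(x, m, s)
            for w, m, s in zip(weights, means, sigmas)]
    peak = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logs]
    total = sum(exps)
    return [e / total for e in exps]

def fisher_gradients(features, weights, means, sigmas):
    """Return (g_mu, g_sigma), each a T x D nested list summed over all features."""
    T, D = len(weights), len(means[0])
    g_mu = [[0.0] * D for _ in range(T)]
    g_sigma = [[0.0] * D for _ in range(T)]
    for x in features:
        gamma = occupancy(x, weights, means, sigmas)
        for t in range(T):
            for d in range(D):
                u = (x[d] - means[t][d]) / sigmas[t][d]  # standardized residual
                g_mu[t][d] += gamma[t] * u / math.sqrt(weights[t])
                g_sigma[t][d] += gamma[t] * (u * u - 1.0) / math.sqrt(2.0 * weights[t])
    return g_mu, g_sigma

# Toy GMM with T = 2 Gaussians in D = 2 dimensions (values are illustrative only).
weights = [0.5, 0.5]
means = [[0.0, 0.0], [3.0, 3.0]]
sigmas = [[1.0, 1.0], [1.0, 1.0]]
features = [[0.1, -0.2], [2.9, 3.2], [0.0, 0.0]]
g_mu, g_sigma = fisher_gradients(features, weights, means, sigmas)
```

The occupancies are computed in the log domain so that features far from every Gaussian do not underflow to a zero denominator.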
The Fisher representation is the concatenation of all the gradients over the $d = 1, 2, \ldots, D$ dimensions of the CoDT feature and over the $T$ Gaussians. In our case, we only consider the gradients with respect to the mean and covariance, i.e., $\mathcal{G}_{\mu_t^d}(X)$ and $\mathcal{G}_{\sigma_t^d}(X)$, since the gradient with respect to the weights has been shown to bring little additional information [13]. The dimension of the resulting representation is therefore $2DT$. The CoDT features are thus embedded in a higher-dimensional feature space that is more suitable for linear classification. To avoid dependence on the sample size, we normalize the final image representation by the number of CoDT features extracted from the HEp-2 cell image, $N$, i.e., $\mathcal{G}_\lambda^X = \frac{1}{N} \mathcal{G}_\lambda^X$. After that, two additional normalization steps [23] are applied to improve the results: power normalization and $\ell_2$-normalization. Power normalization is performed in each dimension as:

$$f(z) = \mathrm{sign}(z) \, |z|^{\tau}$$
In this study, we choose the power coefficient $\tau = 1/2$. The motivation for power normalization is to “unsparsify” the Fisher representation, which becomes sparser as the number of Gaussian components of the GMM increases. $\ell_2$-normalization is defined as:

$$f(z) = \frac{z}{\sqrt{z^{T} z}}$$
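The three normalization steps (by sample size, power, and $\ell_2$) can be chained as in the sketch below. It is an illustration only: the function name `normalize_fisher_vector` is hypothetical, the inputs are assumed to be $T \times D$ nested lists of mean and standard-deviation gradients, and the default `tau=0.5` is an assumption of this example that can be overridden.

```python
import math

def normalize_fisher_vector(g_mu, g_sigma, n_features, tau=0.5):
    """Flatten the gradients, then apply sample-size, power, and L2 normalization."""
    # Concatenate mean and standard-deviation gradients: length 2 * D * T.
    flat = [v for row in g_mu for v in row] + [v for row in g_sigma for v in row]
    # Divide by N, the number of CoDT features extracted from the image.
    flat = [v / n_features for v in flat]
    # Power normalization: sign(z) * |z| ** tau, applied to each dimension.
    flat = [math.copysign(abs(v) ** tau, v) for v in flat]
    # L2 normalization; an all-zero vector is returned unchanged.
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0.0 else flat

# Toy gradients for T = 1 Gaussian and D = 2 dimensions (illustrative values).
fv = normalize_fisher_vector([[4.0, -4.0]], [[0.0, 0.0]], n_features=1)
```

Because the last step rescales the vector to unit length, the earlier division by $N$ changes the result only through its interaction with the power step; it is kept here to mirror the order of operations in the text.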
Our proposed AdaCoDT method has several advantages over the BoW framework [13, 23]. First, it is a generalization of the BoW framework: the resulting representation is not limited to the occurrences of each visual word but also includes information about the distribution of the CoDT features, which overcomes the information loss caused by the quantization step of the BoW framework. Second, it defines a kernel from a generative model of the data. Third, it can be generated from a much smaller codebook and therefore reduces the computational cost compared with the BoW framework. Lastly, for the same vocabulary size, it is much larger than the BoW representation ($2DT$ dimensions rather than $T$), which allows it to perform well with a simple linear classifier.