Experiments and Analyses
To evaluate the performance of the proposed LLDC method for staining pattern classification, we use the two publicly available HEp-2 cell datasets described in Sect. 1.3.
We first extract dense SIFT features as the local features. SIFT features are invariant to scaling and rotation, and partially invariant to changes in illumination and viewpoint and to noise. These properties are advantageous for staining pattern classification because cell images are unaligned and have high within-class variability. In our experiments, SIFT features are extracted at a single scale from densely located patches of the gray-level images. The patches are centered every 6 pixels and have a fixed size of 18 × 18 pixels.
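The dense sampling grid described above can be sketched as follows. This is a minimal illustration in plain Python, not the implementation used in the experiments: the function names are hypothetical, and the rule that only patches lying fully inside the image are kept is an assumption, since the boundary handling is not stated.

```python
def dense_patch_centers(height, width, step=6, patch_size=18):
    """Return (row, col) centers of a dense sampling grid.

    Assumption: only patches that fit entirely inside the image are kept.
    """
    half = patch_size // 2
    # A patch centered at r covers rows [r - half, r + half).
    rows = range(half, height - half + 1, step)
    cols = range(half, width - half + 1, step)
    return [(r, c) for r in rows for c in cols]


def patch_bounds(center, patch_size=18):
    """Return (top, left, bottom, right) of a patch, bottom/right exclusive."""
    half = patch_size // 2
    r, c = center
    return (r - half, c - half, r + half, c + half)


if __name__ == "__main__":
    centers = dense_patch_centers(30, 30)
    print(len(centers), centers[0], centers[-1])
```

Each returned center would then be passed to a SIFT descriptor routine to obtain one 128-dimensional feature per patch.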
To obtain the local distance vectors, the number of anchor points (m^j) for each class manifold M_c is fixed to 1024, so the merged M used in our proposed local distance vector transformation has size 6144 × 128 (1024 anchor points from each of the six classes, each a 128-dimensional SIFT descriptor). For both the SIFT features and the corresponding local distance vectors, every codebook used in the coding process contains 1024 visual words learned from the training samples by k-means clustering. One of the most important parameters of the proposed LLDC method is k_LDV, which defines the neighborhood of a local feature in the local distance vector transformation. In the subsequent coding process, the number of neighbors in the LLC method (i.e., k_LLC) is another parameter that can influence the classification performance. We also adopt the LSC method to encode the local distance vectors; therefore, the impact of its neighborhood size (i.e., k_LSC) is also discussed, while the smoothing factor β is fixed at 10. We study the influence of these parameters on staining pattern classification in Sect. 5.3.4.
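To make the role of the neighborhood parameter concrete, the sketch below merges per-class anchor sets into one list and looks up the k nearest anchors of a local feature. It is only an illustration under stated assumptions: the function names are hypothetical, Euclidean distance is assumed, and the step that turns the selected neighbors into a local distance vector is deliberately left out because it is not specified in this section.

```python
import math


def merge_anchor_sets(anchors_per_class):
    """Concatenate per-class anchor lists into one merged list.

    With 6 classes and 1024 anchors each, the result has 6144 rows.
    """
    return [anchor for anchors in anchors_per_class for anchor in anchors]


def nearest_anchors(feature, anchors, k):
    """Return the indices of the k anchors closest to `feature`.

    Assumption: closeness is measured by Euclidean distance.
    """
    distances = [math.dist(feature, anchor) for anchor in anchors]
    order = sorted(range(len(anchors)), key=lambda i: distances[i])
    return order[:k]


if __name__ == "__main__":
    toy_anchors = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
    print(nearest_anchors((0.9, 0.1), toy_anchors, k=2))
```

A larger k widens the neighborhood considered for each local feature, which is why the parameter is examined separately.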
After the coding process, we partition each cell image into increasingly finer subregions at three levels: 1 × 1, 2 × 2 and 4 × 4. We apply max-pooling to pool the codes within each spatial subregion. The features obtained from all subregions are concatenated to generate the final image representation. We then employ a linear SVM classifier for classification. In our experiments, we use the LIBLINEAR package because of its efficiency. The linear SVM is trained on the training set with a 10-fold cross-validation strategy and tested on the test set. The training set is randomly partitioned into 10 equal-sized subsets. In each round, a single subset is held out as the validation data for the linear SVM and the remaining nine subsets are used for training. The procedure is repeated 10 times so that each subset serves as the validation data exactly once. The penalty parameter of the linear SVM is set to C = 10. In practice, the classification performance is nearly stable across different values of the penalty parameter.
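The pooling and cross-validation steps can be sketched as below. This is an illustrative sketch in plain Python, not the code used for the reported results. The function names are hypothetical, filling empty subregions with zeros is an assumption, and when the number of samples is not divisible by the number of folds the fold sizes differ by at most one. With three levels there are 1 + 4 + 16 = 21 subregions, so a 1024-word codebook yields a 21 × 1024 = 21504-dimensional representation.

```python
import random


def spatial_pyramid_max_pool(codes, positions, height, width, levels=(1, 2, 4)):
    """Max-pool code vectors over 1x1, 2x2 and 4x4 grids and concatenate.

    codes: list of equal-length code vectors, one per local feature.
    positions: list of (row, col) patch centers matching `codes`.
    Assumption: a subregion containing no feature contributes zeros.
    """
    dim = len(codes[0])
    pooled = []
    for cells in levels:
        regions = [None] * (cells * cells)
        for code, (r, c) in zip(codes, positions):
            grid_r = min(int(r * cells / height), cells - 1)
            grid_c = min(int(c * cells / width), cells - 1)
            idx = grid_r * cells + grid_c
            if regions[idx] is None:
                regions[idx] = list(code)
            else:
                regions[idx] = [max(a, b) for a, b in zip(regions[idx], code)]
        for vec in regions:
            pooled.extend(vec if vec is not None else [0.0] * dim)
    return pooled


def k_fold_splits(n_samples, n_folds=10, seed=0):
    """Randomly partition sample indices into folds.

    Returns a list of (train_indices, validation_indices) pairs in which
    every fold is used as the validation set exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    splits = []
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, validation))
    return splits
```

Each pooled vector would be passed to the linear SVM; in every cross-validation round the classifier is fitted on the training indices and scored on the validation indices.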
The experimental results are reported at the cell level and at the image level. At the cell level, let tp_i, tn_i, fp_i and fn_i denote the true positives, true negatives, false positives and false negatives, respectively, for an individual staining pattern class C_i. In our experiments, we use accuracy and sensitivity as the performance measures at the cell level, which are formulated as
where S is the number of staining pattern classes.
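As an illustration of how such measures can be computed from the per-class counts, the sketch below uses one common formulation: accuracy as the sum of (tp_i + tn_i) over all classes divided by the sum of (tp_i + tn_i + fp_i + fn_i), and sensitivity as the average of tp_i / (tp_i + fn_i) over the S classes. These formulas are an assumption made for the example and may differ in detail from the definitions used for the reported results; the function names are hypothetical.

```python
def per_class_counts(y_true, y_pred, classes):
    """Return {class: (tp, tn, fp, fn)} in a one-vs-rest sense."""
    counts = {}
    n = len(y_true)
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        tn = n - tp - fp - fn
        counts[c] = (tp, tn, fp, fn)
    return counts


def cell_level_metrics(counts):
    """Return (accuracy, sensitivity) under the assumed formulation."""
    num_classes = len(counts)  # S in the text
    numerator = sum(tp + tn for tp, tn, fp, fn in counts.values())
    denominator = sum(tp + tn + fp + fn for tp, tn, fp, fn in counts.values())
    accuracy = numerator / denominator
    # Classes without any true sample contribute 0 to the average (assumption).
    recalls = [tp / (tp + fn) if (tp + fn) > 0 else 0.0
               for tp, tn, fp, fn in counts.values()]
    sensitivity = sum(recalls) / num_classes
    return accuracy, sensitivity
```

Averaging sensitivity over classes gives every staining pattern the same weight regardless of how many cells it contains.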
At the image level, the staining pattern predicted for each image is the pattern most frequently assigned to the cells within that image. In our experiments, we use accuracy = (#correctly classified images) / (#images) as the classification accuracy at the image level, where # means “the number of”.
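The image-level decision rule can be sketched as a majority vote over the cell-level predictions. The function and variable names below are hypothetical, and tie-breaking is not specified in the text: this sketch simply keeps whichever tied pattern appears first in the list of cell predictions.

```python
from collections import Counter


def image_level_prediction(cell_predictions):
    """Return the pattern most frequently assigned to the cells of an image.

    Assumption: ties are resolved in favor of the pattern seen first.
    """
    return Counter(cell_predictions).most_common(1)[0][0]


def image_level_accuracy(cells_by_image, true_pattern_by_image):
    """Fraction of images whose majority-vote pattern matches the true one."""
    correct = sum(
        1
        for image_id, predictions in cells_by_image.items()
        if image_level_prediction(predictions) == true_pattern_by_image[image_id]
    )
    return correct / len(cells_by_image)


if __name__ == "__main__":
    cells = {
        "img1": ["homogeneous", "homogeneous", "nucleolar"],
        "img2": ["centromere", "nucleolar", "nucleolar"],
    }
    truth = {"img1": "homogeneous", "img2": "centromere"}
    print(image_level_accuracy(cells, truth))
```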