
Experimental Datasets in the Book

To evaluate the performance of the methods proposed in this thesis, we use two publicly available HEp-2 cell datasets: the ICPR2012 dataset from the ICPR’12 HEp-2 cell classification contest, and the ICIP2013 training dataset from the ICIP’13 Competition on cells classification by fluorescent image analysis. Some examples from the datasets are shown in Fig. 1.6.

The ICPR2012 Dataset

The ICPR2012 dataset consists of 1455 HEp-2 cells segmented from 28 slide images. The images were acquired with a fluorescence microscope at 40-fold magnification, equipped with a 50 W mercury vapor lamp and a digital camera whose CCD has square pixels of 6.45 μm. The resolution of the images is 1388 x 1038 pixels and the color depth is 24 bits. Each image is categorized into one of six staining patterns, namely centromere (ce), coarse speckled (cs), cytoplasmic (cy), fine speckled (fs), homogeneous (ho) and nucleolar (nu). Each image is also assigned a fluorescent intensity level (positive or intermediate). The cells in the images are manually segmented and annotated by specialists. Then, each cell image and slide image is verified by a medical doctor specialized in immunology with 11 years’ experience. According to the experimental protocol of the ICPR’12 contest, the ICPR2012 dataset is divided into a training set of 721 cells from half of the slide images and a test set of 734 cells from the rest of the slide images. The composition of the dataset is reported in Table 1.1.

Fig. 1.6 Samples of the ICPR2012 dataset and the ICIP2013 training dataset with different staining patterns of HEp-2 cells

It is worth noting that cells in the same slide image are more similar to one another than cells from different slide images. Therefore, to evaluate the generalization ability of the algorithms, the cells of one slide image may be used only for training or only for testing, never both. If the training and test cells are selected at random from the database, cells from the same slide image can end up in both sets. The classification accuracy obtained with such a split is much higher than that obtained under the contest protocol, so comparing the two would be unfair. In our experiments, we strictly follow the experimental protocol of the contest.
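The slide-level constraint described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the record layout `(cell_id, slide_id, label)`, the slide names, and the helper names `split_by_slide` and `shared_slides` are hypothetical and are not part of the contest material. The contest fixes which slide images belong to the training set, so the sketch takes that assignment as an input instead of choosing it.

```python
def split_by_slide(cells, train_slides):
    """Assign every cell to train or test according to its slide image.

    cells        -- iterable of (cell_id, slide_id, label) tuples (assumed layout)
    train_slides -- set of slide ids reserved for training
    """
    train, test = [], []
    for cell in cells:
        slide_id = cell[1]
        if slide_id in train_slides:
            train.append(cell)
        else:
            test.append(cell)
    return train, test


def shared_slides(train, test):
    """Return the slide ids that occur in both sets (empty set means no leakage)."""
    return {c[1] for c in train} & {c[1] for c in test}


# Toy records; the ids and slide names are made up for illustration.
cells = [
    ("c1", "slide01", "ce"),
    ("c2", "slide01", "ce"),
    ("c3", "slide02", "ho"),
    ("c4", "slide03", "nu"),
]

# Slide-level split: all cells of a slide stay together.
train, test = split_by_slide(cells, train_slides={"slide01", "slide03"})
leak_slide_level = shared_slides(train, test)

# Cell-level split for contrast: cells c1 and c2 of slide01 are separated.
leak_cell_level = shared_slides(cells[:1] + cells[3:], cells[1:3])
```

In the toy data, the slide-level split leaves `leak_slide_level` empty, while the cell-level split puts `slide01` on both sides, which is the situation the contest protocol rules out.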

Table 1.1 Composition of the ICPR2012 dataset. Each table item represents the number of cells, with the number of images in parentheses

Type               Training set    Test set     Total
Centromere         208 (3)         149 (3)      357 (6)
Homogeneous        150 (3)         180 (2)      330 (5)
Nucleolar          102 (2)         139 (2)      241 (4)
Coarse speckled    109 (2)         101 (3)      210 (5)
Fine speckled      94 (2)          114 (2)      208 (4)
Cytoplasmic        58 (2)          51 (2)       109 (4)
Total              721 (14)        734 (14)     1455 (28)
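As a quick consistency check on Table 1.1, the per-class counts can be summed and compared with the reported totals. The sketch below copies the cell counts from the table; the dictionary and variable names are hypothetical and chosen only for this illustration. It also computes each class's share of the whole dataset, which shows how unevenly the six patterns are represented.

```python
# Per-class cell counts (training set, test set), copied from Table 1.1.
CELLS = {
    "centromere": (208, 149),
    "homogeneous": (150, 180),
    "nucleolar": (102, 139),
    "coarse speckled": (109, 101),
    "fine speckled": (94, 114),
    "cytoplasmic": (58, 51),
}

# Column totals, to be compared with the "Total" row of the table.
train_total = sum(train for train, _ in CELLS.values())
test_total = sum(test for _, test in CELLS.values())
all_cells = train_total + test_total

# Fraction of the whole dataset that belongs to each staining pattern.
class_share = {
    name: (train + test) / all_cells for name, (train, test) in CELLS.items()
}

largest = max(class_share, key=class_share.get)
smallest = min(class_share, key=class_share.get)
```

With these counts, the sums reproduce the table's totals of 721 training cells and 734 test cells. Centromere is the largest class and cytoplasmic the smallest, at roughly a quarter and under a tenth of the 1455 cells respectively.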

 