# Encoding Rotation Invariant Features of Images

In this section, we focus on the key techniques of our proposed classification framework for identifying staining patterns of HEp-2 cells.

## Pairwise LTPs with Spatial Rotation Invariant

Histogram-based features describe an image as an orderless collection of pattern occurrence frequencies, ignoring spatial layout information. This seriously limits their descriptive ability, especially for the shapes of objects in the image. Inspired by the SPM [11], we propose to construct a spatial pyramid structure over the feature space of the HEp-2 cell image.

Firstly, histogram-based features are extracted from small overlapping patches within an image. Then, the image is partitioned into increasingly finer spatial subregions over the feature space. Let $t = 1, 2, \ldots, L$ denote the level of subpartition, such that there are $2^{t-1} \times 2^{t-1}$ subregions at level $t$. At level $t$, the features within each subregion are combined together as

$$H_i^t = F\bigl(\{h_j \mid h_j \in I_i^t\}\bigr),$$

where $I_i^t$ is the $i$-th subregion at level $t$ and $H_i^t$ is the corresponding image feature vector in $I_i^t$. $h = [h_1, h_2, \ldots, h_N]^T \in \mathbb{R}^{N \times Q}$ is the matrix of patch-level features, and $h_j \in I_i^t$ denotes the features within $I_i^t$. $F(\cdot)$ is a specific statistical method aggregating the occurrences of histogram-based features. In this thesis we adopt the *max-pooling* strategy:

$$H_i^t = \max\{h_j \mid h_j \in I_i^t\},$$

where the "max" function operates in a row-wise manner: $H_{ik}^t$ is the $k$-th element of $H_i^t$ and $h_{jk}$ is the $k$-th element of $h_j$, i.e., $H_{ik}^t = \max_j h_{jk}$.
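As a minimal illustration of the max-pooling step (a sketch in NumPy; the tiny feature matrix is made up), the pooled vector is simply the element-wise maximum over the patch features falling in one subregion:

```python
import numpy as np

# Hypothetical patch-level features h in R^{N x Q}: N = 2 patches (rows),
# Q = 3 histogram dimensions (columns), all inside one subregion I_i^t.
h = np.array([[1., 5., 2.],
              [4., 0., 3.]])

# Max-pooling: H_ik = max_j h_jk, i.e. the element-wise maximum over patches.
H = h.max(axis=0)
print(H)  # [4. 5. 3.]
```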

Within the spatial pyramid structure, we extract a new rotation invariant textural feature. As aforementioned, the LBP is a simple yet effective textural feature. However, the LBP tends to be sensitive to noise, especially in smooth regions with weak intensity gradients, because it thresholds at the gray value of the central pixel [6]. Therefore, the LBP is extended to the LTP, defined as

$$s'\bigl(I(x_i, y_i), I(x, y)\bigr) = \begin{cases} 1, & I(x_i, y_i) \ge I(x, y) + th, \\ 0, & |I(x_i, y_i) - I(x, y)| < th, \\ -1, & I(x_i, y_i) \le I(x, y) - th, \end{cases}$$

where $I(x_i, y_i)$ is the gray value of the $P$ equally spaced pixels on a circle of radius $R$ around $(x, y)$, $(x_i, y_i) = (x + R\cos(2\pi i/P),\, y + R\sin(2\pi i/P))$ is the location of the $i$-th neighbor, and $th$ is a user-specified threshold value. Usually, each ternary pattern is split into a positive pattern and a negative pattern as

$$t_p(i) = \begin{cases} 1, & s'(\cdot) = 1, \\ 0, & \text{otherwise}, \end{cases} \qquad t_n(i) = \begin{cases} 1, & s'(\cdot) = -1, \\ 0, & \text{otherwise}. \end{cases}$$
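A small sketch of the LTP encoding and its positive/negative split (the function name, nearest-pixel sampling instead of interpolation, and the default parameters are assumptions for illustration, not the authors' implementation):

```python
import numpy as np

def ltp_patterns(image, x, y, P=8, R=1, th=5):
    """Ternary code at center (x, y), split into positive and negative
    LBP-style binary codes. Neighbor coordinates are rounded to the
    nearest pixel for brevity."""
    c = float(image[y, x])
    t_p, t_n = 0, 0
    for i in range(P):
        xi = int(round(x + R * np.cos(2 * np.pi * i / P)))
        yi = int(round(y + R * np.sin(2 * np.pi * i / P)))
        d = float(image[yi, xi]) - c
        if d >= th:          # state +1 -> bit i of the positive pattern
            t_p |= 1 << i
        elif d <= -th:       # state -1 -> bit i of the negative pattern
            t_n |= 1 << i
        # |d| < th -> state 0 contributes to neither pattern
    return t_p, t_n
```

For example, a single bright neighbor sets one bit of the positive pattern, while a single dark neighbor sets the corresponding bit of the negative pattern.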

The difference between the LBP and LTP encoding procedures is shown in Fig. 6.2. The computational complexity of a LBP mainly depends on the number of neighboring pixels $P$; the complexity of a LTP is almost double that of a LBP. Furthermore, the computation times of the LBP and LTP grow proportionally to the pixel count of the image. The LTP partially solves the aforementioned problems of the LBP by encoding small pixel differences into a separate state [12] through the threshold value, and combining the positive and negative halves makes it more discriminative. The following operation is implemented on the positive and negative patterns respectively.

**Fig. 6.2** Difference between LBP and LTP encoding procedures

To achieve rotation invariance, we assign a rotation invariant value to each LTP pair, which is defined by

$$P_\theta(x, \Delta x_\theta) = \bigl(LTP_\theta(x),\; LTP_\theta(x + \Delta x_\theta)\bigr),$$

where $x = (x, y)$ is the position vector in image $I$ and $\Delta x_\theta = (d\cos\theta,\, d\sin\theta)$ is a displacement vector between a LTP pair based on the rotation angle $\theta$. It is noted that one LTP has two patterns, i.e., $t_p$ and $t_n$; therefore, rotation invariant values are assigned to $t_p$ and $t_n$ respectively. $LTP_\theta(x)$ is the LTP at position $x$ with rotation angle $\theta$, which can be rewritten as

$$LTP_\theta(x) = \bigl\{ s'\bigl(I(x + \Delta r_{i,\theta}),\, I(x)\bigr) \bigr\}_{i=0}^{P-1},$$

where $I(x + \Delta r_{i,\theta})$ is the gray value of the $P$ neighboring pixels around the center pixel with respect to $\theta$, and $\Delta r_{i,\theta} = (R\cos(2\pi i/P + \theta),\, R\sin(2\pi i/P + \theta))$ is a displacement vector from the center pixel to the neighboring pixels in a LTP.
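One concrete way to assign a rotation invariant value to each half-pattern is the standard rotation-invariant LBP mapping, the minimum over all $P$ circular bit shifts. This is an assumed realization for illustration; the thesis may use a different concrete mapping:

```python
def rotation_invariant_code(code, P=8):
    """Map a P-bit pattern to the minimum value over its circular bit shifts.
    Applied to t_p and t_n separately, rotated versions of the same local
    structure receive the same value."""
    best = code
    for _ in range(P - 1):
        # rotate right by one bit within a P-bit word
        code = ((code >> 1) | ((code & 1) << (P - 1))) & ((1 << P) - 1)
        best = min(best, code)
    return best

# A pattern and its rotated version (circular shift of its bits) map to
# the same invariant value:
assert rotation_invariant_code(0b00000110) == rotation_invariant_code(0b01100000)
```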

Then, the same value is obtained by $P_\theta(x, \Delta x_\theta)$ for $\theta = 0, \pi/4, \pi/2, 3\pi/4, \pi$, since their LTP pairs are rotationally equivalent. We show that the pairwise LTPs can achieve rotation invariance in Fig. 6.3. For the rotation equivalence class 'A', all the

**Fig. 6.3** An example of the rotation equivalence class. *Black* and *white* circles correspond to '0' and '1' respectively. $s(\theta)$ is the start point of the binary sequence, where $s(\theta) = (x + R\cos\theta,\, y + R\sin\theta)$

**Fig. 6.4** Framework of pairwise LTPs with spatial rotation invariant

LTP pairs obtain the same value, as each of them is equal to the others up to rotation; the same holds for class 'B'. In particular, the pairwise LTPs of 'B' can be obtained from those of 'A' by rotating 180°. Therefore, we define that the pairwise LTPs in Fig. 6.3 have the same rotation invariant value, that is, $P_\theta = P_{\theta+\pi}$.
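Since rotating a pair by $\pi$ swaps its two endpoints, the identity $P_\theta = P_{\theta+\pi}$ holds automatically if the two rotation-invariant codes are combined as an unordered pair. A sketch of one such encoding (an assumption consistent with this property, not the thesis' exact formula):

```python
def pair_value(code_a, code_b, P=4):
    """Rotation invariant value of a LTP pair. Rotation by pi maps the
    displacement +dx to -dx, i.e. it swaps the two codes, so sorting them
    first makes the value invariant under that swap."""
    a, b = sorted((code_a, code_b))
    return a * (1 << P) + b

assert pair_value(3, 9) == pair_value(9, 3)  # theta and theta + pi agree
```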

We calculate the histogram $h_{R,d}$ of rotation invariant values for every $\theta$ and $(R, d)$ from every patch within an image. Based on the experiments, we choose $P = 4$, which gives enough accuracy at an affordable cost in computation and memory. The variation of the computational cost and memory with the choice of the parameters $R$ and $d$ is minor. To improve discrimination, the patch-level rotation invariant textural feature $h = \{h_{R,d}\}$ is obtained by combining $h_{R,d}$ over various $(R, d)$. This framework is illustrated in Fig. 6.4. Firstly, the image is converted to a grayscale image. Then, the grayscale image is partitioned into equal-sized patches, and the pairwise LTPs with rotation invariance are extracted from each patch. Next, the grayscale image is divided into a sequence of increasingly finer grids over the feature space. Within each grid, the extracted features are integrated using the max-pooling strategy. Finally, all the pooled features from the grids are concatenated together for the final classification. Our proposed PLTP-SRI feature is rotation invariant, while retaining strong descriptive and discriminative power.