The location, scale and orientation have now been assigned to each keypoint of an image. The last stage of the SIFT calculation is to create the descriptor, which should be highly distinctive and partially invariant to changes in illumination and viewpoint.
Fig. 2.14 Keypoints detected in an image. The start point of each arrow is the keypoint's location, its direction indicates the orientation of the local gradient at the keypoint, and its length denotes the magnitude of the local gradient
Fig. 2.15 Keypoint descriptor
First, the coordinates of the descriptor and the orientations of the local gradients are rotated relative to the orientation of the keypoint to achieve orientation invariance. The gradient magnitude and orientation are sampled in a region of 16 × 16 pixels around the keypoint. The magnitudes are weighted by a Gaussian window with a σ that is 1.5 times that of the circular descriptor window. Then, orientation histograms over 4 × 4 sample regions are calculated by accumulating the weighted magnitudes of samples with nearly the same direction. Figure 2.15 shows a 4 × 4 keypoint descriptor array with 8 orientation bins covering 360° in each cell. The length of each arrow is the sum of the gradient magnitudes of the samples near that orientation in the corresponding region. Since there are 4 × 4 histograms with 8 orientation bins each, a configuration that has been verified to give the best result, the dimension of the feature vector is 4 × 4 × 8 = 128.
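The accumulation step can be illustrated with a minimal sketch. It assumes that the 16 × 16 patch of gradient magnitudes and orientations has already been sampled and rotated relative to the keypoint orientation. The function name `sift_like_descriptor`, its parameters and the default `sigma` of 8 pixels (half the patch width) are illustrative assumptions, and the sketch omits the interpolation between neighbouring bins that full implementations typically apply.

```python
import numpy as np

def sift_like_descriptor(magnitude, orientation, sigma=8.0, n_cells=4, n_bins=8):
    """Build a 4 x 4 x 8 histogram descriptor from a 16 x 16 gradient patch.

    magnitude, orientation: square arrays of equal shape; orientation is in
    radians and is assumed to be relative to the keypoint orientation.
    sigma: width of the Gaussian weighting window in pixels (assumed value).
    """
    size = magnitude.shape[0]   # 16 in the standard layout
    cell = size // n_cells      # 4 pixels per cell side

    # Gaussian weight centred on the patch: samples far from the keypoint count less.
    coords = np.arange(size) - (size - 1) / 2.0
    yy, xx = np.meshgrid(coords, coords, indexing="ij")
    weighted = magnitude * np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))

    # Assign each sample to one of n_bins orientation bins covering 360 degrees.
    frac = (orientation % (2 * np.pi)) / (2 * np.pi)
    bin_idx = np.floor(frac * n_bins).astype(int) % n_bins

    # Accumulate the weighted magnitudes into one histogram per 4 x 4 pixel cell.
    hist = np.zeros((n_cells, n_cells, n_bins))
    for r in range(size):
        for c in range(size):
            hist[r // cell, c // cell, bin_idx[r, c]] += weighted[r, c]

    return hist.ravel()         # 4 * 4 * 8 = 128 values
```

Called on a 16 × 16 patch, the function returns a flat vector of 128 non-negative values, one per orientation bin of each cell.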
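Finally, the feature vector is normalized to unit length, and any value in the unit vector that is larger than 0.2 is clipped to 0.2. The modified vector is then normalized again. Normalization cancels a uniform change in image contrast, a uniform brightness offset does not affect the gradients, and clipping reduces the influence of large gradient magnitudes caused by non-linear illumination changes. The final feature vector is therefore invariant to affine changes in illumination.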
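The normalize, clip and renormalize sequence can be sketched in a few lines. The function name `normalize_descriptor` and the small constant `eps`, which guards against division by zero for an all-zero vector, are assumptions made for illustration. The threshold of 0.2 is the value given in the text.

```python
import numpy as np

def normalize_descriptor(vec, clip=0.2, eps=1e-12):
    """Normalize to unit length, clip large entries, then normalize again."""
    vec = np.asarray(vec, dtype=float)
    vec = vec / (np.linalg.norm(vec) + eps)   # first normalization to unit length
    vec = np.minimum(vec, clip)               # entries above 0.2 are set to 0.2
    return vec / (np.linalg.norm(vec) + eps)  # second normalization
```

Because of the first normalization, multiplying every entry of the input by a positive constant (a uniform contrast change) leaves the output essentially unchanged. After the second normalization, individual entries may again exceed 0.2, since the clipped vector is rescaled to unit length.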
Therefore, the final SIFT features are invariant to orientation and scale, partially invariant to illumination, and stable when noise is added to the image.