In the Histogram of Oriented Gradient (HOG) feature, the gradient orientation is described as the angle between the gradient and the horizontal or vertical direction. Like other gradient-based descriptors, HOG concentrates on the gradient information of an image, but it differs in that it employs a dense grid of uniformly spaced cells and overlapping local contrast normalization to strengthen robustness to illumination and shadow.
The extraction procedure of HOG is outlined in the following steps.
Step 1: Calculate image gradient
First, compute the gradient components, from which the gradient magnitudes and orientations are obtained, by applying the masks [−1, 0, 1] and [−1, 0, 1]^T to the sub-image covered by the detection window, as given in Eqs. (1) and (2). For color images, the gradient is calculated separately for each color channel, and the one with the largest norm is taken as the pixel's gradient vector, as shown in Figure 1.
For the x-direction of the gradient: $G_X = \frac{\partial I}{\partial x} = f(x+1, y) - f(x-1, y)$ (1)
For the y-direction of the gradient: $G_Y = \frac{\partial I}{\partial y} = f(x, y+1) - f(x, y-1)$ (2)
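As an illustration of Eqs. (1) and (2), the following minimal NumPy sketch applies the [−1, 0, 1] mask along the columns (x) and rows (y) of an image. The function name `centered_gradients` and the choice to leave the border pixels at zero are assumptions made for this example; the text does not specify how borders are handled.

```python
import numpy as np

def centered_gradients(image):
    """Apply the [-1, 0, 1] mask along x (columns) and y (rows)."""
    img = np.asarray(image, dtype=np.float64)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    # Eq. (1): G_X(x, y) = f(x+1, y) - f(x-1, y).
    # Border columns are left at zero (an assumption of this sketch).
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    # Eq. (2): G_Y(x, y) = f(x, y+1) - f(x, y-1).
    # Border rows are left at zero (an assumption of this sketch).
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    return gx, gy

# A horizontal ramp: intensity grows by 1 per column, constant along rows.
ramp = np.tile(np.arange(5.0), (4, 1))
gx, gy = centered_gradients(ramp)
```

On this ramp the interior values of `gx` equal 2 (the difference between pixels two columns apart) and `gy` is zero everywhere. Because only the first two axes are sliced, the same function also accepts a color image of shape (height, width, channels) and returns per-channel gradients.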
Step 2: Calculate gradient magnitude and orientation
Second, the magnitude and orientation of the gradient are calculated using Eqs. (3) and (4), respectively, so that every pixel in the image has a gradient magnitude and direction. For color images, the gradient is evaluated in each of the three channels; the pixel's gradient magnitude is the maximum of the three channel magnitudes, and its direction is the angle of the gradient in that same channel.
For gradient magnitude: $G_M = \left( G_X^2 + G_Y^2 \right)^{1/2}$ (3)
For gradient orientation: $\theta(x, y) = \tan^{-1}\left( G_Y / G_X \right)$ (4)
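Eqs. (3) and (4), together with the color-channel rule, can be sketched as follows. This is an illustrative sketch rather than a reference implementation: the function name `magnitude_orientation` is hypothetical, and `np.arctan2` folded into [0°, 180°) is used in place of a literal tan⁻¹(G_Y/G_X) so that pixels with G_X = 0 do not cause a division by zero.

```python
import numpy as np

def magnitude_orientation(gx, gy):
    """Return gradient magnitude and unsigned orientation in degrees."""
    gx = np.asarray(gx, dtype=np.float64)
    gy = np.asarray(gy, dtype=np.float64)
    # Eq. (3): G_M = sqrt(G_X^2 + G_Y^2).
    mag = np.hypot(gx, gy)
    # Eq. (4), computed with arctan2 and folded to the unsigned range [0, 180).
    theta = np.degrees(np.arctan2(gy, gx)) % 180.0
    if mag.ndim == 3:
        # Color image: keep, per pixel, the channel with the largest magnitude
        # and the orientation that belongs to that same channel.
        best = np.argmax(mag, axis=2)[..., None]
        mag = np.take_along_axis(mag, best, axis=2)[..., 0]
        theta = np.take_along_axis(theta, best, axis=2)[..., 0]
    return mag, theta

# Single-channel example: one pixel with G_X = 3 and G_Y = 4.
mag, theta = magnitude_orientation(np.array([[3.0]]), np.array([[4.0]]))

# Color example: one pixel whose three channels have magnitudes 1, 3, and 2.
gx_rgb = np.array([[[1.0, 0.0, 2.0]]])
gy_rgb = np.array([[[0.0, 3.0, 0.0]]])
mag_rgb, theta_rgb = magnitude_orientation(gx_rgb, gy_rgb)
```

In the single-channel example the magnitude is 5. In the color example the second channel wins with magnitude 3, and its purely vertical gradient gives an orientation of 90°.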
Step 3: Accumulate weighted votes into the bins
Third, each pixel in a cell of the block casts a vote, weighted by its gradient magnitude, into an orientation histogram whose bins are evenly spaced over 0° to 180°. Each vote is bilinearly interpolated between the neighboring bins. The feature vectors of the cells are accumulated using trilinear interpolation, which distributes each gradient's contribution smoothly over the horizontal axis, the vertical axis, and the orientation axis. Each cell of a block thus contributes one histogram built from its gradients.
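The voting step can be illustrated for a single cell. The sketch below is deliberately simplified: it interpolates only along the orientation axis, splitting each magnitude-weighted vote between the two nearest bin centres, and omits the spatial part of the trilinear scheme. The function name `cell_histogram`, the default of 9 bins, and the placement of bin centres at 10°, 30°, ..., 170° are assumptions of this example.

```python
import numpy as np

def cell_histogram(mag, theta, n_bins=9):
    """Orientation histogram of one cell with interpolated votes."""
    mag = np.asarray(mag, dtype=np.float64).ravel()
    theta = np.asarray(theta, dtype=np.float64).ravel()
    width = 180.0 / n_bins
    # Position of each angle measured in bins, relative to the bin centres.
    pos = theta / width - 0.5
    lower = np.floor(pos).astype(int)
    frac = pos - lower
    hist = np.zeros(n_bins)
    # np.add.at accumulates correctly when several pixels hit the same bin.
    # The modulo wraps votes near 0 and 180 degrees around to the other end.
    np.add.at(hist, lower % n_bins, mag * (1.0 - frac))
    np.add.at(hist, (lower + 1) % n_bins, mag * frac)
    return hist

# A vote at 20 degrees lies halfway between the centres at 10 and 30 degrees.
h_mid = cell_histogram([1.0], [20.0])
# A vote at 5 degrees is shared between the first bin and the last bin.
h_wrap = cell_histogram([1.0], [5.0])
```

With these inputs, `h_mid` places 0.5 in each of the first two bins, and `h_wrap` places 0.75 in the first bin and 0.25 in the last one. In both cases the total of the histogram equals the vote's magnitude, so interpolation redistributes votes without changing their sum.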
Step 4: Normalize the contrasts over the block of cells
Then, the cells are grouped into blocks, and the contrast is normalized within each block before the histograms of all regions are concatenated. The dimension of each block's feature vector is determined by the number of cells in the block and the number of orientation bins per cell. The normalization within overlapping blocks of cells is illustrated in Figure 3.
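A minimal sketch of block normalization is given below. The text does not state which normalization scheme is used, so this example assumes an L2 norm with a small constant `eps`, blocks of 2 × 2 cells, and a stride of one cell so that neighboring blocks overlap. The function name `normalize_blocks` is hypothetical.

```python
import numpy as np

def normalize_blocks(cell_hists, block=2, eps=1e-6):
    """L2-normalize overlapping blocks of cell histograms.

    cell_hists has shape (cell_rows, cell_cols, n_bins).
    Returns an array of shape (n_blocks, block * block * n_bins).
    """
    cell_hists = np.asarray(cell_hists, dtype=np.float64)
    n_rows, n_cols, _ = cell_hists.shape
    blocks = []
    # Slide the block one cell at a time, so adjacent blocks share cells.
    for r in range(n_rows - block + 1):
        for c in range(n_cols - block + 1):
            v = cell_hists[r:r + block, c:c + block].ravel()
            # eps keeps the division stable for blocks with no gradient.
            blocks.append(v / np.sqrt(np.sum(v ** 2) + eps ** 2))
    return np.array(blocks)

# A 3 x 3 grid of cells with 9 bins each yields 2 x 2 overlapping blocks.
cells = np.ones((3, 3, 9))
normalized = normalize_blocks(cells)
```

Here `normalized` has shape (4, 36): four blocks, each holding 2 × 2 cells × 9 bins, and each row has approximately unit length. Because the blocks overlap, the centre cell appears in all four blocks, each time normalized against a different neighborhood.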
Step 5: Calculate the HOG feature vector
Finally, the descriptor is formed from all the components of the normalized cell responses in all of the blocks of the detection window. Figure 4 summarizes the steps for calculating the HOG feature descriptor. For example, with a 64 × 128 detection window, 7 × 15 blocks, and 9 orientation bins, the final feature descriptor has 3780 dimensions.
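The 3780 figure can be checked with a short calculation. The sketch below assumes 8 × 8 pixel cells, blocks of 2 × 2 cells, and a block stride of one cell; these values are not stated in the text, but they reproduce the 7 × 15 block layout quoted above. The function name `hog_descriptor_length` is hypothetical.

```python
def hog_descriptor_length(win_w=64, win_h=128, cell=8, block=2, stride=1, n_bins=9):
    """Return (blocks_x, blocks_y, descriptor_length) for a detection window.

    cell is the cell size in pixels; block and stride are measured in cells.
    """
    cells_x = win_w // cell
    cells_y = win_h // cell
    # Number of block positions along each axis for the given stride.
    blocks_x = (cells_x - block) // stride + 1
    blocks_y = (cells_y - block) // stride + 1
    # Each block contributes block * block cell histograms of n_bins values.
    length = blocks_x * blocks_y * block * block * n_bins
    return blocks_x, blocks_y, length

print(hog_descriptor_length())  # (7, 15, 3780)
```

Under these assumptions the window holds 8 × 16 cells, which gives 7 × 15 = 105 block positions, each contributing 4 cells × 9 bins = 36 values, for 105 × 36 = 3780 dimensions in total.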