Feature extraction of hyperspectral images for detecting immature green citrus fruit

At an early immature growth stage of citrus, a hyperspectral camera of 369–1042 nm was employed to acquire 30 hyperspectral images in order to detect immature green fruit within citrus trees under natural illumination conditions. First, successive projections algorithm (SPA) were implemented to select 677, 804, 563, 962, and 405 nm wavebands and to construct multispectral images from the original hyperspectral images for further processing. Then, histogram threshold segmentation using NDVI of 804 and 677 nm was implemented to remove image backgrounds. Three slope parameters, calculated from the pairs 405 and 563 nm, 563 and 677 nm, and 804 and 962 nm were used to construct a classifier to identify the potential citrus fruit. Then, a marker-controlled watershed segmentation based on wavelet transform was applied to obtain potential fruit areas. Finally, a green fruit detection model was constructed according to Grey Level Co-occurrence Matrix (GLCM) texture features of the independent areas. Three supervised classifiers, logistic regression, random forest and support vector machine (SVM) were developed using texture features. The detection accuracies were 79%, 75%, and 86% for the logistic regression, random forest, and SVM models, respectively. The developed algorithm showed a great potential for identifying immature green citrus for an early yield estimation.


Introduction
Efficient detection and estimation of the number of fruit within tree canopies in natural environments is one of the main applications of computer vision in agriculture. In recent years, several studies on the detection of mature citrus fruit on trees were carried out by using computer vision. Bulanon et al. [1,2] applied multispectral imaging using six optical bands of 550, 600, 650, 700, 750, and 800 nm to enhance citrus fruit detection in the field under natural daylight condition. They presented a potential approach for improving fruit detection using thermal images when there was the largest temperature difference between fruit and non-fruit objects. Xu et al. [3] used red (R) and blue (B) components to identify citrus fruit , while Cai et al. [4] used 2R-G-B where G was the green component of a color image.
Early estimation of citrus yield can provide many advantages to citrus growers, such as irrigation adjustment, pest control, fertilization, weed control and labor assignment. There have been several studies on green citrus fruit detection based on spectral characteristics and shape analysis. Kane and Lee [5] found that the optimal wavelengths for separating green citrus fruit and leaves were 881, 781, and 1383 nm under laboratory conditions. To accommodate varying outdoors illumination conditions, researchers carried out some further studies. Kurtulmus et al. [6] used sub-windows to scan entire images and the green fruit were detected according to color, circular Gabor texture and eigenfruit in each sub-window. Sengupta and Lee [7] conducted shape analysis to detect as many fruit as possible. Three detection methods were used to remove false positives, a support vector machine (SVM), Canny edge detection combined with a graphbased connected component algorithm and Hough line detection. However, shape and texture analysis requires extensive computer resources. Zhao et al. [8] developed a block-matching method, the sum of absolute transformed difference, to detect potential fruit pixels. Li et al. [9] proposed a fast normalized cross correlation algorithm for detecting and counting immature green citrus fruit using outdoor color images. Okamoto and Lee [10] conducted stepwise waveband selection and a linear discriminant analysis with hyperspectral images, and correctly identified 80%-89% of the fruit. This study showed that hyperspectral imaging technique has the potential for detecting immature green citrus and constructing an early estimation yield system based on selected multispectral bands.
In summary, there are difficulties in detecting immature green citrus fruit due to the similar color between the leaves and fruit, occlusion of fruit, and varying outdoor illumination conditions. Though high dimensionality of data may increase the computational burden, it has been proved that multispectral imaging with selected bands from hyperspectral images is an efficient way to create an early yield estimation system. According to class separability, various methods can be used to select important bands, such as transformed divergence, Bhattacharyya distance, Jeffries-Matusita (JM) distance, spectral angle mapper, and orthogonal projection divergence. It is also likely that unsupervised band selection can be completed by applying variance or signal to noise ratio criteria. Further processing may also encounter many difficulties under outdoor illumination conditions. Furthermore, different spectral reflectance (Fig. 1) would be obtained under different illumination conditions even though the sample spots come from the same fruit or the same leaf.
In view of the above difficulties, this paper aimed to propose a new detection method for green citrus fruit. Specific objectives were: (1) To select sensitive wavebands from hyperspectral images based on successive projections algorithm; (2) To develop a marker-controlled watershed algorithm based on wavelet analysis; (3) To identify green citrus through texture analysis and three different classifiers (logistic regression, random forests and support vector machines). Figure 2 is the basic flowchart for algorithm development for this research. The algorithm is divided into four parts: (1) the acquisition of hyperspectral image and flat field correction, (2) identification of potential citrus, which involves band selection, background removal, feature extraction and preliminary identification of citrus, (3) precision identification of citrus, including marker discovery based on the two-dimensional wavelet transform, marker-controlled watershed segmentation, texture feature extraction and classifier construction, and (4) accuracy analysis of citrus identification.

Hyperspectral image acquisition
The hyperspectral imaging system used in this study contained a digital CCD camera (MV-D1312, Photonfocus AG, Lachen SZ, Switzerland) and a line scanning spectrometer (V10E, Specim, Oulu, Finland). The experiments were carried out at an experimental citrus grove at the University of Florida in Gainesville, Florida, USA on September 14, October 5 and November 4, 2015. A total of 30 hyperspectral images were collected. Each acquired image consisted of 388 wavebands with a spectral resolution of 1.59 nm and the spectral range was 398-1010 nm. In the process of image acquisition, three reference Lambertian reflectance panels of 50%, 75%, and 99% reflectance were used for flat field calibration.

Flat field correction
Flat field correction is a technique used to improve quality in digital imaging. The goal is to reduce hyperspectral data to relative reflectance. It can normalize each image spectrum to the flat field spectrum, which could remove most of the atmospheric and solar irradiance effects. Equation (1) below was used for the flat field correction to achieve the relative reflectance at different wavelengths for the original hyperspectral image [11] .
where R is relative reflectance image at different wavelengths, R O is the original raw image, R W is the image of the reflectance calibration board, R D is the dark image taken by covering the camera with a cap, and l is the wavelength.

Successive projections algorithm
The original hyperspectral image was composed of 388 different wavebands. Since redundant information of the whole spectrum can have significant adverse effect on detection performance, it is necessary to eliminate redundant and other useless information. As a forward selection method, successive projections algorithm (SPA) can be used to select a set of wavelengths including maximal information content and minimal redundancy. In the procedure, SPA begins with one wavelength specified manually by the researcher, and then incorporates a new wavelength at each operation of projection until a specified number (N) of wavelengths is reached. SPA steps are described below with a given initial wavelength k(0). The total number of wavelengths in the spectrum is J and the desired number of variables is N [12] .
Step 1: Let S be the set of wavelengths which have not been selected yet. That is, S = {j such that 1£j£J and Step 2: Calculate the projection of x j on the subspace orthogonal to x k(n -1) as Step 3: Let k(n) = arg(max ||Px j || j 2 S).
The above steps are exemplified in Fig. 3, which illustrates the first iteration of SPA.  To improve the speed and accuracy for discriminating green fruit and green leaves, the background, consisting of sky, shade, branches, reflectance calibration board and its fixed board, should be removed from the original image.
To develop the background removal method, the NDVI ij in Eq.
In the equation, NDVI ij is the normalized difference vegetation index and R i is the reflectance at wavelength i. With regard to green plants, there is much more reflected radiation in the near-infrared (NIR) range than in the visible range. NDVI is considered as one of the most successful indices used to identify the live green plant in multispectral remote sensing data. Since NDVI is directly related to the photosynthetic capacity and energy absorption of plant canopies, the histogram of NDVI image illustrates a bimodal distribution, with a peak of background pixels and a peak of green vegetation pixels and a trough. The Otsu's thresholding technique, which is used to select the most suitable gray scale as the threshold value, can utilize a discriminant analysis to find the maximum separability of green vegetation and background based on the histogram of the NDVI image.

Definition of features
For the mutual occlusion of leaves or fruit in the hyperspectral images taken in the natural environment, there were varying complex illumination conditions among different regions of the images. As shown in Fig. 4, different parts of the same fruit or leaf showed different reflectance values. Therefore, the direct use of reflectance spectroscopy makes it very difficult to classify green fruit and green leaves. To deal with this problem, the slope of reflectance curve between different wavebands can be calculated because it not only implies the change of reflectance but also reduces the negative effect of varying complex illumination conditions.
The slope between wavebands ðn,nþ1Þ can be calculated by the formula below: where the Δref is the reflectance difference value between waveband n þ 1 and waveband n, Δl is the wavelength difference value between waveband n þ 1 and waveband n, l # n is a normalized and dimensionless parameter about the waveband n. l # n can be calculated as follows: l # n ¼ l n -minðlÞ maxðlÞ -minðlÞ (4) in which l n is the wavelength of waveband n, minðlÞ is the minimum wavelength among all of the wavebands in a multispectral image after SPA algorithm, and maxðlÞ is the maximum wavelength. In this study, three relative spectral slope parameters, K (405,563) , K (563,677) and K (804,962) , were defined according to the results of SPA. As shown in Fig. 4, the value of K (405,563) is positive and the value of K (563,677) is negative because the chlorophyll absorbs in the blue and red bands of the electromagnetic spectrum in photosynthesis. As a result, for the spectrum of green vegetation, there will be two absorption valleys in the blue and red band regions and a reflection peak in the green band. Green leaves contain more chlorophyll, so the reflectance is lower than for green fruit in blue and red regions. As to the K (804,962) of green fruit, it is negative and has a relatively higher absolute value. In contrast, the K (804,962) of green leaves is positive or negative and its absolute value is relatively lower. In other words, these three relative spectral slope parameters can be used for the identification of potential green fruit.

Two-dimensional discrete wavelet transform
The discrete wavelet transform (DWT) is a linear transformation that separates data into different frequency components, and then studies each component with a resolution matched to its scale. To use the wavelet transform for image processing, a two-dimensional discrete wavelet transform (2D-DWT), known as a natural extension from the single dimension case, must be implemented. The 2D-DWT is used to decompose the image generally with the application of a high-pass filter (HPF) and low-pass filter (LPF) in both the horizontal and the vertical directions of the image.
With the generation of an approximation coefficient (cA1), horizontal details coefficient (cH1), vertical details coefficient (cV1) and diagonal details coefficient (cD1) in this process, three details coefficients (cH1, cV1 and cD1) can be stored for reconstruction of the original image later. If it is necessary to decompose the image further, the approximation coefficient (cA1) will be the input of HPF and LPF, thus producing four sets of coefficients (cA2, cH2, cV2 and cD2) as the result of the second decomposition, and so forth [13] .
As is shown in Fig. 5, an image would be decomposed into four parts by using the 2D-DWT with the size of each part as big as 1/4 of the original. Figure 5(a) is the result of using LPF in both the horizontal and the vertical directions, which can be decomposed further. Figure 5(b) is the result of using LPF in the horizontal direction and HPF in the vertical direction. The result of using HPF in both the horizontal and the vertical direction can be seen in Fig. 5(c), while the result of using HPF in the horizontal direction and LPF in the vertical direction can be found in Fig. 5(d).

Marker-controlled watershed segmentation
Separation of touching objects in an image is one of the difficult image processing operations. For this problem, the watershed transform is usually applied to find catchments basins and watershed ridge lines in an image. However, the watershed segmentation algorithm may lead to severe over-segmentation due to the great number of minima and noise interference within an image or its gradient. Because the noise problem can be minimized on low-pass filtered images, a robust image segmentation method is based on multi-resolution analysis and a marker-controlled watershed segmentation algorithm [14,15] .

Texture analysis
Grey Level Co-occurrence Matrices (GLCM), as main image texture analysis tool, is a statistical method to demonstrate image texture structure by statistical sampling of the pattern of the gray levels that occur in relation to other gray levels. GLCM can be generated by calculating how often a pixel with gray level value, i, occurs adjacent to a pixel with the value, j. That is to say, a co-occurrence matrix is specified by the relative frequencies P(i, j, d, θ) in which two pixels, separated by distance d, occur in a direction specified by the angle θ, one with gray level i and the other with gray level j [16] . Table 1 shows the difference in texture features between fruit and leaves under varying illumination conditions. To detect green fruit using texture features, six most relevant features, including autocorrelation, cluster prominence, cluster shade, sum of squares,

Supervised classifier
It is necessary to use a supervised classifier to obtain the potential fruit and the results of precision identification. The relative spectral slopes parameters, K ð405,563Þ , K ð563,677Þ and K ð804,962Þ , could be the input parameters for the classifier to obtain potential fruit regions from an image. The texture feature of each part after marker-controlled watershed segmentation could be the input parameters for a classifier to obtain the results of precision identification. Three supervised classifiers were applied in this research, which were logistic regression, random forest and support vector machine (SVM) [17][18][19][20] .

Successive projections algorithm
The original hyperspectral image consisted of 388 different wavebands. SPA, a forward selection method, was used to select a set of wavelengths. A set of five wavelengths was utilized in this experiment with the goal of developing a multispectral imaging system. Figure 6 shows the detection result of green fruit based on SPA and Mahalanobis distance classification. It can be clearly seen from Fig. 6 that the five relatively greatest differences are at 754, 677, 606, 489 and 549 nm in the spectral reflectance between green fruit and the green leaves. SPA was performed five times and the different starting vector was selected from five wavelengths above for each time.  Fig. 6(a), five subsets of different regions were selected as regions of interest (ROIs). As shown in Fig. 6(b), Karhunen-Loeve transformation was applied to form the principal component analysis (PCA) bands. Subsequently, three PCA bands were selected including the majority information for original hyperspectral image. Figure 6(b) shows the classification result using three PCA bands and ROIs of Fig. 6(a). Figure 6(c-g) show the classification result obtained using the potential five sets of wavelengths and ROIs of Fig. 6(a). Since there is high accuracy in the result

Background removal
To increase the speed and accuracy of discrimination between green fruit and green leaves, it is necessary to remove the background from the original image. For green plants, there is much more reflected radiation in the nearinfrared range than in the visible range. The NDVI (804,677) was calculated after SPA algorithm for background removal.
As is shown in Fig. 7(a) the RGB image is composed of 405, 563 and 677 nm as B, G and R components. Figure 7(b) shows the gray image of NDVI (804,677) and Fig. 7(c) shows its histogram. In the histogram, the peak around gray level zero indicates it is the primary value for three Lambertian reflectance panels used for field calibration. The second peak appears around level 0.3, which is the primary value for the background of the image. The third peak around level 0.8 represents the primary value for the green vegetation of the image. The valley at level 0.55 between the second and the third peaks could be chosen as a segmentation threshold to remove the background. It is demonstrated from the experimental results that the different images have different segmentation thresholds ranging from 0.5-0.8 because of the varying illumination conditions. Figure 7(d) shows the processing result of Fig. 7(b) with a 0.55 segmentation threshold.

Identification of potential fruit
A total of 2825 samples were finally selected from 30 multispectral images, of which 1659 samples came from fruit and 1166 samples came from leaves. Figure 8 shows the distribution of all samples in the feature space structured by three slope parameters, which indicates there is obvious separability between leaves and fruit.
A total of 1952 samples were selected randomly as a training data set and the remaining 873 samples were used as a test data set. Three slope parameters were used as input variables to construct logistic regression, random forests and SVM classifiers. These classifiers could be employed to identify the potential fruit. Given a classifier and an instance, there are four possible outcomes. If the instance is fruit and it is classified as fruit, it is counted as a true positive; if it is classified as leaf, it is counted as a false negative. If the instance is leaf and it is classified as leaf, it is counted as a true negative; if it is classified as fruit, it is counted as a false positive. Table 2 shows the performance of the three classification models on the test data set. The purpose of this process is to remove leaves and to retain as many potential fruit as possible. The best classifier is one that has the minimum false positive. Because the false positive of SVM was just 19, the SVM model was chosen to identify the potential fruit. In this case, 516 leaves and 19 fruit were removed from images and 143 leaves and 195 fruit were retained. The retained parts were seen as potential fruit.

Wavelet watershed
To solve the problem of occlusion and prepare for further texture analysis, the marker-controlled watershed segmentation based on two-dimensional discrete wavelet transform was executed. Figure 9 shows the execution process. Figure 9(a) demonstrates the identification of potential fruit using three slope parameters, which include some parts of leaves. First, the low frequency sub-band of potential fruit images was obtained through two-dimensional discrete wavelet transform with Daubechies db4 function at the level four decomposition, which is shown in Fig. 9(b). Then the regional maxima, as shown in Fig. 9(c), were identified in low frequency sub-band images. Finally, Fig. 9(d) shows that using the regional maxima as a marker, the marker-controlled watershed segmentation was applied and Fig. 9(e) shows that the problem of occlusion was effectively resolved.

Texture features and classification
Based on the results of the marker-controlled watershed segmentation, 227 potential fruit areas were separated from 30 multispectral images. These images were manually counted, and it was found that there were 154 green leaves and 73 green fruit among 227 areas. Then the GLCM were calculated for each area, which included autocorrelation, cluster prominence, cluster shade, sum of squares, sum average and sum variance. A total of 150 areas were selected randomly as a set of training data and the remaining 77 areas as a test set. GLCM texture features were used as input variables to construct three classifiers of logistic regression, random forests and SVM. In addition to true positive (TP), false positive (FP), false negative (FN) and true negative (TN), a variety of evaluation measures can be calculated from confusion matrix. Accuracy is the rate of overall correct classification of samples. Sensitivity is the probability that the model will correctly classify fruit. Specificity is the probability that the model will correctly classify leaves.
The aim of a classifier is to minimize the FP and FN or to maximize the TP, TN, accuracy, sensitivity and specificity. Table 3 illustrates the performance of the three classification models on the test data set. It could be concluded that the SVM classifier had the best classification ability. Among 27 fruit samples and 50 leaves samples, eight fruit were misclassified as leaves and three leaves were misclassified as fruit. The accuracy, sensitivity and specificity score were 0.86, 0.94 and 0.70, respectively.   Hyperspectral imaging technology was employed in this study to detect green fruit within citrus trees under natural illumination conditions. The early yield information could be used for site-specific crop management to increase yield and profit. Based on the experimental results in this research, the following conclusions could be drawn: (1) The SPA algorithm was a feasible method to select wavebands, and can be used to construct multispectral images instead of the original hyperspectral images for further processing. This study selected 677, 804, 563, 962 and 405 nm wavebands according to the different spectral reflectances between green fruit and green leaves.
(2) The NDVI (804,677) threshold histogram segmentation was an effective and automatic method to remove the background from images. The experimental results demonstrated that the different image had different segmentation thresholds distributed in 0.5-0.8 due to the varying illumination conditions.
(3) Three slope parameters, K (405,563) , K (563,677) and K (804,962) , could be used to construct classifiers to identify the potential citrus fruit. It was shown that the SVM classifier had the best classification ability compared with logistic regression classifier and random forests classifier.
(4) Two-dimensional discrete wavelet transform with Daubechies db4 function at the level four decomposition could be used to find markers in potential fruit images and marker-controlled watershed segmentation could be used to resolve the problem of occlusion.
(5) The GLCM texture features could be input variables to construct the detection model for green fruit. The accuracy, sensitivity and specificity of SVM model were 0.86, 0.94 and 0.70, respectively.