Classifying cotton bark and grass extraneous matter using image analysis

Abstract

Cotton extraneous matter (EM) and special conditions are the only cotton quality attributes still determined manually by US Department of Agriculture Agricultural Marketing Service (USDA-AMS) classers. To develop a machine EM classing system, a better understanding of what triggers a classer EM call is needed. The goal of this work was to develop new information about cotton EM (such as bark and grass) and leaf particles using machine measurements, to aid in the development of instrumentation for cotton quality measurements. AMS classers were tasked with identifying and denoting bark/grass in large-area color images of cotton samples. Image segmentation analysis was applied to detect non-cotton items, such as leaf particles, and the classer-denoted bark/grass objects were segmented manually. Further image analysis was used to measure shape and color parameters of these bark/grass objects and leaf particles in the sample images. These measurements of the bark/grass objects and leaf particles were compared, and logistic regression analyses were conducted to evaluate classification. For every shape and color parameter, there were significant differences between the bark/grass objects and the detected leaf particles in the images. The differences were greater for the shape parameters than for the color parameters. A classification model with shape, color, and log-transformed shape parameters consistently classified the bark/grass objects and leaf particles most accurately, with 99.5% and 97.6% correct classification rates, respectively. However, classification models that were 99% correct in classifying manually segmented bark/grass objects were only about 77% correct when applied to the machine-detected bark/grass particles.

Keywords

cotton fiber classification, extraneous matter, image analysis, color, shape

Almost all of the cotton grown in the USA is classed by the US Department of Agriculture Agricultural Marketing Service (USDA-AMS) using standardized procedures for measuring physical attributes that are related to the cotton’s quality and manufacturing performance.¹ Cotton quality attributes currently measured as part of AMS classification are fiber length, length uniformity, fiber strength, micronaire, color, trash, leaf, and extraneous matter (EM). Cotton classification relies heavily on instrumentation for cotton quality measurements. In fact, all attributes are determined using the high volume instrument (HVI, Uster Technologies, Knoxville, TN) with the exception of EM and special conditions (e.g. mixture of Upland and Pima, fire or water damaged, and reginned or repacked), which are determined manually by a human classer. These few quality attributes that are still determined manually suffer from the limitations of the human classer. Manual classing is subjective and can be influenced by fatigue and differences in skill level, visual acuity, and lighting.²

Cotton foreign matter refers to non-lint materials. Classing designations for non-lint content are leaf grade, trash, and EM.¹ EM is any substance in cotton other than the fiber or leaf and includes various types of materials and conditions such as bark, grass, seed coat fragments, dust, oil, and spindle twist. EM, when present, is noted by the classer along with the level as a coded number. For example, light and heavy bark are coded 11 and 12, respectively, and light grass is coded 21.

About 10% of the 2014 US Upland cotton crop of 15.3 million bales had EM calls.³ Bark and grass EM calls are common, with light bark accounting for 97% of all the Upland cotton EM calls from the 2014 crop. According to Bragg et al.⁴ and Bargeron et al.,⁵ bark and grass significantly reduce spinning efficiency by causing yarn breaks. Thus, the presence of bark and grass results not only in a price discount to the producer, ranging from −6.5 to −15.9 US cents per kg depending on level and location in the USA,⁶ but also in an additional cost to the textile mill.

Current instruments to determine the physical properties of cotton fall under two categories: gravimetric and imaging. Gravimetric foreign matter instruments currently used include the Shirley Analyzer⁷ and the Micro Dust and Trash Analyzer III (Uster Technologies, Knoxville, TN). These devices employ mechanical and pneumatic principles to separate foreign matter from a known mass of cotton lint for subsequent weighing and calculation of percentage of original sample mass. The HVI currently used by the AMS Classing Offices and the Advanced Fiber Information System (AFIS, Uster Technologies, Knoxville, TN) utilize optical sensors to determine fiber properties, including trash content. None of these methods differentiate EM.

Several methods have been investigated for classifying cotton foreign matter. Xu et al.⁸,⁹ developed a system that utilized a charge-coupled device (CCD) camera and clustering analyses to categorize cotton foreign matter based on chromatic and geometric features. They concluded that color attributes were more reliable than geometric attributes in categorizing foreign matter. A neural network-based Cotton Trash Identification System, developed by Siddaiah et al.,¹⁰⁻¹² identified and categorized cotton foreign matter, including bark and grass, using a camera and a scanner, and small (58 cm²) and large (181 cm²) area images. The system categorized a much higher number of objects as bark/grass, mainly due to misclassification of buried objects and very small objects (normally ignored by the classer). Himmelsbach et al.¹³ utilized attenuated total reflectance/Fourier transform infrared measurements to identify cotton foreign matter based on chemical composition. The spectral method consistently identified the foreign matter, regardless of the particle size.

The US cotton classification system has been in place for almost 100 years and, for most of that time, has relied on human senses to classify cotton samples.¹ Current cotton grading, which is almost entirely determined by HVI measurements, is based on methods and standards that were developed from the historical manual classing system. Thus, new instrument measurements of cotton attributes like EM must be representative of the current standard, the manual classer. The goal of this work was to develop new information about cotton EM (such as bark and grass) and leaf particles using machine measurements, to aid in the development of instrumentation for cotton quality measurements. The objectives were (1) to use image analysis to evaluate the color and shape characteristics of manually identified bark and grass objects and of leaf material detected by imaging techniques in cotton samples, and (2) to assess the importance of these characteristics in differentiating bark and grass from leaf material.

Methods

In an effort to categorize bark/grass EM, cotton lint samples with varying levels of bark/grass were selected from AMS Checklot samples from the 2007/2008 crop year. Checklot samples are classing samples randomly selected from AMS cotton classification facilities across the USA and retested at the AMS Quality Assurance Division in Memphis, TN, for quality assurance assessment.¹ Each cotton lint sample was split in two, lengthwise, and red, green, blue (RGB) color images were acquired from each of the four sample faces at 15.7 pixels (px) per mm (400 px/in.) resolution with an EPSON Perfection 3170 photo scanner (Epson America, Inc., Long Beach, CA) and saved in uncompressed tagged image file format (TIF). The scanner imaging window was fitted with a template that provided a 10.2 cm × 17.8 cm (4 in. × 7 in.) cotton image for analysis (Figure 1). Two AMS classers with the Standardization & Engineering Branch, Memphis, TN viewed each image on a PC monitor and by consensus denoted objects on the printed sample image that were either bark or grass (Figure 1), but did not differentiate between the two. The classers then assigned a call to the sample image for bark/grass: either “No EM” for no bark/grass, 11 or 12 for light or heavy bark, respectively, or 21 or 22 for light or heavy grass, respectively.

Figure 1.

Scanned image of cotton sample face (a) and image with bark/grass objects denoted by human classers (b).

Each of the scanned images of the cotton samples was then analyzed using ImageJ image processing and analysis software (v. 1.49j; National Institutes of Health, US Department of Health and Human Services, Bethesda, MD).¹⁴ The RGB color images were first copied, and the copy was then transformed to the L*a*b* color space using the ImageJ plugin Color Transformer 2 (v. 2.0; Maria E. Barilla-Perez, Birmingham University, UK).¹⁵ The L*a*b* color space was designed for defining color differences and to model human visual perception.¹⁶,¹⁷ To describe an object’s color in the L*a*b* color space, the value of L* indicates the level of brightness (low values [0–50] indicate dark and high values [51–100] indicate light), the value of a* indicates the redness (positive value) or greenness (negative value), and the value of b* indicates the yellowness (positive value) or blueness (negative value). The transformation resulted in individual images for each L*, a*, and b* channel. A preliminary investigation showed that satisfactory item segmentation could be achieved by thresholding only the L* image. Thus, each L* image was then thresholded to segment the image into non-cotton particles and background using the maximum entropy automatic thresholding method in ImageJ with automatically set upper and lower threshold levels. These detected particles were also numbered and labeled automatically by the ImageJ software (Figure 2). To reduce the number of particles in the analysis and the computational time, particles smaller than 12 px² (0.048 mm²) were ignored. These particles were smaller than the criteria that define cotton trash particles as having an equivalent diameter greater than about 0.25–0.5 mm.¹⁸,¹⁹
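The maximum entropy thresholding step can be illustrated with a minimal sketch. It assumes a Kapur-style criterion (choose the gray level that maximizes the summed Shannon entropies of the two resulting classes); the exact ImageJ implementation details are not given in the text, and the histogram below is hypothetical, not data from this study.

```python
import math

def max_entropy_threshold(hist):
    """Return the bin index t that maximizes the summed Shannon entropies
    of the two classes hist[0..t] and hist[t+1..] (Kapur-style criterion)."""
    total = sum(hist)
    best_t, best_entropy = None, -math.inf
    low_count = 0
    for t in range(len(hist) - 1):
        low_count += hist[t]
        high_count = total - low_count
        if low_count == 0 or high_count == 0:
            continue  # one class would be empty, so this is not a valid split
        h_low = -sum((c / low_count) * math.log(c / low_count)
                     for c in hist[:t + 1] if c > 0)
        h_high = -sum((c / high_count) * math.log(c / high_count)
                      for c in hist[t + 1:] if c > 0)
        if h_low + h_high > best_entropy:
            best_t, best_entropy = t, h_low + h_high
    return best_t

# Hypothetical L* histogram (bins 0..100): a few dark particle pixels
# around L* = 20-29 and many bright cotton pixels around L* = 80-89.
hist = [0] * 101
for level in range(20, 30):
    hist[level] = 10
for level in range(80, 90):
    hist[level] = 1000

threshold = max_entropy_threshold(hist)
# Pixels with L* <= threshold would be treated as non-cotton particles.
is_particle = [level <= threshold for level in (25, 85)]
```

In this toy case the selected threshold falls between the dark and bright modes, so a pixel at L* = 25 is treated as particle and a pixel at L* = 85 as cotton background.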

Figure 2.

L* image with detected and labeled particles.
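Particle detection and the 12 px² size cutoff can be sketched as connected-component labeling on the thresholded image followed by an area filter. This is only an illustration under stated assumptions: 8-connectivity is assumed (the text does not state the connectivity used), and `label_particles` and the small mask are hypothetical.

```python
from collections import deque

MIN_AREA_PX = 12  # particles smaller than 12 px^2 were ignored in the text

def label_particles(mask, min_area=MIN_AREA_PX):
    """Return a list of particles, each a list of (row, col) pixels, found by
    8-connected flood fill on a binary mask (True = non-cotton pixel)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    particles = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(pixels) >= min_area:  # keep only particles of at least min_area
                particles.append(pixels)
    return particles

# Hypothetical 8 x 10 mask with one 15 px particle and one 4 px speck.
mask = [[False] * 10 for _ in range(8)]
for r in range(1, 4):
    for c in range(1, 6):
        mask[r][c] = True   # 3 x 5 block, area 15 px^2 (kept)
for r in range(5, 7):
    for c in range(7, 9):
        mask[r][c] = True   # 2 x 2 block, area 4 px^2 (ignored)

particles = label_particles(mask)
areas = [len(p) for p in particles]
```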

In each of the analyzed images, detected particles that belonged to each bark/grass object denoted by the classers were identified (Figure 3). Also, the bark/grass objects denoted by the classers were outlined or segmented manually using a graphics pen and tablet (Intuos CTH680, Wacom Technology Corp., Vancouver, WA) with the ImageJ software and added to the list of particles in each image (Figure 3). These manually segmented objects were considered as the best representation of the bark/grass objects that the classers observed in the samples. After these operations, each image included three types of items: manually segmented bark/grass objects, detected particles belonging to denoted bark/grass objects, and detected leaf particles (all other particles not associated with the denoted bark/grass).

Figure 3.

Magnified image of a classer identified bark/grass object (a), detected particles belonging to the bark/grass object (b), and the manually segmented bark/grass object (c).

Using ImageJ, characteristics of all items (manually segmented bark/grass objects, detected bark/grass particles, and detected leaf particles) in the images were measured. These characteristics included the following.

Shape parameters (measurements in pixels [1 px length = 0.0635 mm; 1 px² = 0.004 mm²]):

area – area of items;

perimeter – length of the outside boundary of the items;

height, width, and BoxAR – height, width, and aspect ratio (height/width) of smallest bounding rectangle with sides parallel to the image axes enclosing the item (Figure 4);

major, minor, and EllipseAR – primary and secondary axes and aspect ratio (primary/secondary) of the ellipse that best fits (same area, orientation and centroid) the items (Figure 4);

MaxFeret, MinFeret, and FeretAR – the maximum and minimum distance between two parallel lines enclosing the item (Feret diameter or caliper diameter) and aspect ratio (MaxFeret/MinFeret) (Figure 4);

circularity – 4π × area/perimeter², a value approaching “0” indicates an elongated item;

roundness – 4 × area/(π × major²).

Figure 4.

Illustration of shape parameters – bounding rectangle height and width, best fit ellipse major and minor axes, and maximum and minimum Feret diameter.
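The dimensionless shape parameters follow directly from the definitions listed above. The sketch below evaluates them for an ideal circle and for a hypothetical elongated strip; the function names are illustrative and are not ImageJ identifiers.

```python
import math

PX_TO_MM = 25.4 / 400  # 400 px/in. scan resolution, i.e. 0.0635 mm per pixel

def circularity(area, perimeter):
    """4*pi*area / perimeter^2; approaches 0 for elongated items."""
    return 4 * math.pi * area / perimeter ** 2

def roundness(area, major):
    """4*area / (pi * major^2), with major the best-fit ellipse primary axis."""
    return 4 * area / (math.pi * major ** 2)

def aspect_ratio(numerator, denominator):
    """Generic ratio used for BoxAR, EllipseAR, and FeretAR."""
    return numerator / denominator

# Ideal circle of radius 10 px: both descriptors equal 1.
radius = 10.0
circle_area = math.pi * radius ** 2
circle_perimeter = 2 * math.pi * radius
circle_circularity = circularity(circle_area, circle_perimeter)
circle_roundness = roundness(circle_area, 2 * radius)

# Hypothetical elongated item: a strip 100 px high and 4 px wide.
strip_circularity = circularity(100 * 4, 2 * (100 + 4))
strip_box_ar = aspect_ratio(100, 4)  # BoxAR = height / width

# Pixel-to-millimeter conversion for a 189 px Feret diameter.
max_feret_mm = 189 * PX_TO_MM
```

The circle returns 1 for both circularity and roundness, while the strip has a circularity of roughly 0.12, consistent with values near 0 indicating elongated items.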

Color parameters (measurements made for each channel in the RGB and L*a*b* color spaces):

mean and median of the color values of all the pixels in the item;

integrated density (IntDen) – sum of the color values of all the pixels in the item.
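As a small illustration of the color parameters, the sketch below computes the mean, median, and integrated density for one channel of one item. The pixel values are hypothetical. Because IntDen equals the mean multiplied by the number of pixels, it carries item size as well as color.

```python
from statistics import mean, median

def color_stats(pixel_values):
    """Mean, median, and integrated density (sum) of one channel of an item."""
    return {
        "mean": mean(pixel_values),
        "median": median(pixel_values),
        "IntDen": sum(pixel_values),
    }

# Hypothetical red-channel values for a 5-pixel item.
red_values = [170, 176, 180, 174, 180]
stats = color_stats(red_values)

# IntDen scales with area: doubling the pixel count at the same mean doubles it.
stats_double = color_stats(red_values * 2)
```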

Due to the skewness of the shape parameter distributions, shape parameter data were also log-transformed and included in the subsequent statistical analyses.

Statistical analyses were conducted using JMP statistical software (v. 11.2.1, SAS Institute, Inc., Cary, NC) to explore differences in measured parameters between the bark/grass objects and leaf particles in 200 of the analyzed images from 50 of the checklot bale samples. Response screening was conducted to identify significant shape and/or color parameters.

To investigate how the measured shape and color parameters could be used to classify items in the images as bark/grass objects or leaf particles, classification models were constructed. Firstly, training and validation datasets were formed using the images utilized in the previous statistical analyses. From the entire set of 200 images, 615 out of 769 (80%) bark/grass objects and 615 leaf particles were randomly sampled for training data. The remaining bark/grass objects (154) and the remaining leaf particles (72,532 out of 73,147) from the 200 images were set aside for validation.
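The sampling scheme can be sketched as follows, using only the counts stated above. The study used JMP; the helper name `split_indices` and the fixed random seed are assumptions made so that this illustration is repeatable.

```python
import random

def split_indices(n_items, n_train, rng):
    """Randomly pick n_train item indices for training; the rest go to validation."""
    train = set(rng.sample(range(n_items), n_train))
    validation = [i for i in range(n_items) if i not in train]
    return sorted(train), validation

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

# 769 bark/grass objects and 73,147 leaf particles; 615 of each for training.
bark_train, bark_valid = split_indices(769, 615, rng)
leaf_train, leaf_valid = split_indices(73_147, 615, rng)

train_fraction = len(bark_train) / 769  # about 0.80
```

Sampling equal numbers of bark/grass objects and leaf particles gives a balanced training set, while the validation set keeps the strong imbalance of the original images (154 bark/grass objects against 72,532 leaf particles).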

Using the training dataset, nominal logistic regression analyses were performed to construct the models in the JMP Fit Model platform. The logistic function estimates the probability of a response based on a set of independent factors. In this analysis, it described the probability that an object was bark/grass. The logistic function was defined by

P (y) = \frac{1}{1 + e^{- y}}

(1)

where y is a linear combination of the n measured parameters x_{1}, \dots, x_{n}

y = β_{0} + β_{1} x_{1} + \dots + β_{n} x_{n}

(2)

An object was classified as bark/grass when P(y) > 0.5. The stepwise model selection method with three effects selection criteria (minimum Bayesian Information Criterion [BIC], minimum Corrected Akaike Information Criterion [AICc], and P-value) incorporating forward, backward, and mixed selection directions was used to develop candidate models. The values of alpha used for parameters entering and leaving were 0.05 and 0.01, respectively. The best fitting model from the stepwise methods was selected based on the number of correctly classified objects (classification rate), AICc, BIC, and R² (U) (uncertainty coefficient). This model building approach was used to construct best fitting models derived from each type of parameter (shape, log-transformed shape, and color) individually and their combinations for classifying items as bark/grass objects or leaf particles. Six models resulted: (1) shape parameter model; (2) color parameter model; (3) log-transformed shape parameter (log) model; (4) shape and color parameters (shape|color) model; (5) log-transformed shape and color parameters (log|color) model; and (6) shape, log-transformed shape, and color parameters (shape|log|color) model.

The six nominal logistic models were then applied to the remaining validation dataset and compared based on the number of correctly classified objects. To further evaluate the types of constructed models, the models were then used to classify bark/grass objects and leaf particles in a test dataset formed from 55 additional images from 15 checklot bale samples that was independent of the training and validation datasets.

Results

Measured parameters

In the 200 images analyzed statistically, there were more than 76,000 total detected particles (Table 1). There were 769 objects identified by the human classers as bark/grass with an average of about four classer bark/grass objects per sample (note: some samples did not have any bark/grass objects denoted by the classers). In contrast, there were 73,147 leaf particles detected in sample images and sample images averaged 368 leaf particles. Also, 3184 of the particles detected in the images were associated with the classer denoted bark/grass objects. Sample images that were given a bark/grass call by the classers averaged about six bark/grass objects with as few as two and as many as 11. Those sample images receiving a No EM call averaged about 1.7 bark/grass objects, with the number ranging from 0 to 5.

Table 1.

Classer identified bark/grass objects, detected particles associated with the bark/grass objects, and detected leaf particles in 200 sample images with assigned bark/grass and No extraneous matter (EM) calls.

Sample images	Stats	Classer bark/ grass objects	Particles associated with bark/ grass	Detected leaf particles
All	Total Cnt	769	3184	73,147
	Mean	3.86	16	368
	Min – Max	0–11	0–58	13–705
With bark/ grass call	Mean	6.2	26.3	314
	Min – Max	2–11	4–58	97–663
With No EM call	Mean	1.7	6.5	422
	Min – Max	0–5	0–37	13–705

Response screening in JMP showed that there were differences in the measured shape and color parameters between classer bark/grass objects and detected leaf particles. Figure 5 shows the transformed p-value, LogWorth (–log₁₀[p-value]), plotted against the effect size (the extent to which response values differ between bark/grass objects and leaf particles) for the shape and color parameters. LogWorth gives a clearer representation of significance level when p-values are small. A LogWorth value greater than 2 corresponds to a p-value less than 0.01. The difference between bark/grass and leaf was significant at the 0.01 level (LogWorth > 2.0) for all measured parameters. Also, it is apparent in Figure 5 that the shape parameters had greater LogWorth values (greater significance) than nearly all color parameters. Only the color parameters that were related to object or particle area (–IntDen) had significance levels of the same magnitude as the shape parameters. Shape parameters would therefore likely play a more significant role than color parameters in classifying objects as bark/grass.
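The LogWorth transformation is simple to compute, and a short sketch shows why it is convenient for very small p-values. The function names are illustrative. A LogWorth in the thousands corresponds to a p-value far below the smallest positive double-precision number, so converting it back to a p-value underflows to zero.

```python
import math

def logworth(p_value):
    """LogWorth = -log10(p-value)."""
    return -math.log10(p_value)

def p_from_logworth(lw):
    """Inverse transform; underflows to 0.0 for very large LogWorth values."""
    return 10.0 ** (-lw)

threshold_logworth = logworth(0.01)    # the 0.01 significance level
recovered_p = p_from_logworth(2.0)     # back to 0.01
tiny_p = p_from_logworth(10351)        # not representable as a double
```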

Figure 5.

Significance (LogWorth = –log₁₀[p-value]) of difference in measured shape and color parameters between bark/grass objects and leaf particles. LogWorth > 2 corresponds to p-value < 0.01.

As seen in Table 2, shape parameters that described overall size (MaxFeret to Minor) were more significantly different between bark/grass objects and leaf particles (greater LogWorth) than those that described elongation or roundness (FeretAR to BoxAR). MaxFeret was the most significant shape parameter (LogWorth = 10,351) with bark/grass object mean (189 px) more than 10 times the mean for leaf particles (15.1 px). Histograms of MaxFeret illustrate the prominent difference between the bark/grass objects and leaf particles (Figure 6). Less than 1% of the bark/grass objects had MaxFeret less than 40 px, while MaxFeret was less than 40 px for more than 95% of the leaf particles.

Figure 6.

Distributions of MaxFeret for bark/grass objects and leaf particles.

Table 2.

Mean shape parameter values and analysis of variance observed significance probabilities of equal means for manually segmented bark/grass objects and detected leaf particles.

Parameter	Overall	Bark/grass objects	Leaf particles	Observed significance, LogWorth^a
MaxFeret, px	16.9	189	15.1	10,351
Major, px	14.1	124	12.9	8817
Width, px	13.2	135	11.9	8495
Height, px	13.0	137	11.7	8119
Perimeter, px	52.6	595	46.9	6315
Area, px²	150	3623	114	5519
MinFeret, px	9.13	62.6	8.57	4936
Minor, px	7.20	31.6	6.94	2402
FeretAR	1.85	3.96	1.83	1642
EllipseAR	2.02	5.08	1.99	1634
Roundness	0.57	0.28	0.57	452
Circularity	0.55	0.18	0.56	422
BoxAR	1.45	2.08	1.44	283

Bark/grass and leaf means were all significantly different at α = 0.01.

LogWorth = –log₁₀ (p-value) ≥ 2.0 corresponds to p-value ≤ 0.01.

Integrated color density parameters (sum of the color values of all the pixels in an item, Green-IntDen to a-IntDen) that reflect item size were more significant (LogWorth > 500) than the raw color measures (Red-mean to Blue-median, LogWorth ≤ 110) (Table 3). The difference in mean Green-IntDen values between bark/grass objects (579,993) and leaf particles (16,475) was the most significant (LogWorth = 5859) among color parameters. Red-mean was the most significant raw color measure (LogWorth = 110) with average values for bark/grass and leaf equal to 176 and 164, respectively. Due to the integration of item size, the histograms for Green-IntDen were similar to those of MaxFeret (Figure 7). Green-IntDen was greater than 35,000 for almost 98% of bark/grass objects and less than 35,000 for more than 90% of leaf particles. On the other hand, the distributions of Red-mean values for bark/grass objects and leaf particles did not show obvious differences and overlapped considerably (Figure 8).

Figure 7.

Distributions of Green-IntDen for bark/grass objects and leaf particles.

Figure 8.

Distributions of Red-mean for bark/grass objects and leaf particles.

Table 3.

Mean color parameter values and analysis of variance observed significance probabilities of equal means for manually segmented bark/grass objects and detected leaf particles.

Parameter	Overall	Bark/grass objects	Leaf particles	Observed significance, LogWorth^a
Green-IntDen	22,338	579,993	16,475	5859
Blue-IntDen	19,198	484,470	14,306	5841
L-IntDen	9319	240,149	6893	5831
Red-IntDen	24,244	630,560	17,869	5823
b-IntDen	1929	56,838	1351	5148
a-IntDen	204	3530	169	559
Red-mean	164	176	164	110
Red-median	164	176	164	96.9
Green-mean	151	162	151	78.0
L-Mean	63.0	66.9	63.0	76.8
b-Mean	11.9	15.0	11.8	68.5
L-Median	63.2	67.2	63.2	64.9
Green-median	152	162	152	61.3
b-Median	11.8	14.8	11.8	58.5
a-Median	1.59	0.97	1.60	11.1
Blue-mean	132	137	132	10.4
a-Mean	1.62	1.09	1.62	8.47
Blue-median	132	136	132	7.01

Bark/grass and leaf means were all significantly different at α = 0.01. LogWorth = −log₁₀(p-value) ≥ 2.0 corresponds to p-value ≤ 0.01.

Log-transformation reduced the overall spread of the shape parameter data without compromising the differences between the bark/grass objects and leaf particles. This is illustrated in Figure 9, which shows the distributions for the log-transformed MaxFeret.

Figure 9.

Distributions of log-transformed MaxFeret for bark/grass objects and leaf particles.

Model building

The model building procedures resulted in six logistic classification models constructed using parameters from each measured parameter type: shape, color, and log-transformed shape (log), and their combinations shape|color, log|color, and shape|log|color. Parameters included in the models with coefficients for Equation (2) are shown in Table 4. There were several parameters that were consistently in the models to classify the items in the images. Minor and MinFeret were in all three models that were derived from the shape parameters (shape, shape|color, and shape|log|color). Log-circularity was in all three models that were derived from the log parameters (log, log|color, and shape|log|color). Red-mean was included in three of the models that were derived from the color parameters (color, log|color, shape|log|color) with Red-IntDen included in a fourth (shape|color).
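As an illustration of how Equation (2) and the logistic function turn measured parameters into a class label, the sketch below applies the shape-model coefficients reported in Table 4. It assumes the standard logistic form P(y) = 1/(1 + e^(−y)) and that P(y) is the probability of bark/grass. The two input items are built from the mean values in Table 2, which is only a convenience: a vector of means is not an actual item from the images.

```python
import math

# Shape-model coefficients as listed in Table 4.
SHAPE_MODEL = {
    "Intercept": -27.063,
    "Area": -0.00444,
    "Circularity": 18.121,
    "EllipseAR": 1.982,
    "Major": 0.317,
    "Minor": -1.245,
    "MinFeret": 1.120,
}

def linear_predictor(item, model=SHAPE_MODEL):
    """Equation (2): intercept plus the weighted sum of the measured parameters."""
    return model["Intercept"] + sum(
        coef * item[name] for name, coef in model.items() if name != "Intercept"
    )

def bark_grass_probability(item, model=SHAPE_MODEL):
    """Logistic function evaluated in a numerically stable way."""
    y = linear_predictor(item, model)
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    e = math.exp(y)
    return e / (1.0 + e)

# Illustrative inputs assembled from the Table 2 mean values (pixel units).
mean_bark = {"Area": 3623, "Circularity": 0.18, "EllipseAR": 5.08,
             "Major": 124, "Minor": 31.6, "MinFeret": 62.6}
mean_leaf = {"Area": 114, "Circularity": 0.56, "EllipseAR": 1.99,
             "Major": 12.9, "Minor": 6.94, "MinFeret": 8.57}

p_bark = bark_grass_probability(mean_bark)
p_leaf = bark_grass_probability(mean_leaf)
labels = ["bark/grass" if p > 0.5 else "leaf" for p in (p_bark, p_leaf)]
```

With these inputs the linear predictor is strongly positive for the bark/grass means and negative for the leaf means, so the two items fall on opposite sides of the 0.5 decision threshold.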

Table 4.

Parameters and coefficients for Equation (2) included in constructed models for classification of bark/grass objects and leaf particles.

	Parameter model
Type parameter	Shape	Shape|Color	Log	Log|Color	Color	Shape|Log|Color
Shape
Area	−0.00444	−0.0261
Circularity	18.121	15.532
EllipseAR	1.982	1.951
Major	0.317	0.362
Minor	−1.245	−1.413				−1.423
MinFeret	1.120	1.290				1.198
Perimeter		−0.0507
Log
Log-circularity			241.660	4.534		21.474
Log-EllipseAR			253.553
Log-major			−490.633	−10.518
Log-MaxFeret			14.607	19.636
Log-MinFeret			12.014
Log-minor						−20.604
Log-perimeter			473.187			28.545
Log-round				−6.080
Color
L-mean					−0.225
a-IntDen					0.00248
b-IntDen					−0.00577
b-Median					−0.954
Red-mean				0.0858	0.0973	0.0789
Red-IntDen		0.000179
Green-mean					0.934
Green-IntDen					0.00360
Green-median					−0.375
Blue-Mean					−0.542
Blue-IntDen					−0.00358
Intercept	−27.063	−24.971	−580.149	−51.538	−7.150	−75.505

Comparing the classification rate and fit statistics (AICc, BIC, and R²(U)), the best of the six models for classifying objects in the training dataset were the shape parameter model, shape|color parameter model, and shape|log|color parameter model (Table 5). In the training dataset, the shape|log|color model correctly classified the most (611 of 615) bark/grass objects and the shape and shape|color parameter models correctly classified the most (612 of 615) detected leaf particles. The shape, shape|color, and shape|log|color parameter models correctly classified 99.4% of the bark/grass and leaf combined (1222 out of 1230). Also, the model derived from shape|log|color parameters had the lowest AICc and BIC, and the second highest R²(U) values. The model with only color variables resulted in the poorest prediction results, but still correctly classified 96% of bark/grass and leaf combined.

Table 5.

Classification rates and model fit statistics from models constructed using shape, log-transformed shape (log), and color parameters for manually segmented bark/grass objects and detected leaf particles in the training dataset.

	Classification rate						Fit statistics^b
	Bark/grass objects		Leaf particles		Combined		AICc	BIC	R²(U)
Parameter model	Cnt^a	%	Cnt^a	%	Cnt^a	%
Shape	610	99.2	612	99.5	1222	99.4	64.5	100.21	0.970
Log	609	99.0	608	98.9	1217	98.9	79.1	114.79	0.962
Color	583	94.8	598	97.2	1181	96.0	332	388.01	0.818
Shape|Color	610	99.2	612	99.5	1222	99.4	58.2	104.04	0.977
Log|Color	609	99.0	610	99.2	1219	99.1	69.5	100.15	0.966
Shape|Log|Color	611	99.4	611	99.4	1222	99.4	56.4	92.11	0.975

^a Total count: bark/grass objects = 615, leaf particles = 615, and combined = 1230.

^b AICc = Corrected Akaike Information Criterion, BIC = Bayesian Information Criterion, R² (U) = uncertainty coefficient.
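The fit statistics in Table 5 can be related to one another with the usual textbook definitions, which are assumed here to match those used by JMP: AICc = −2 ln L + 2k + 2k(k + 1)/(n − k − 1), BIC = −2 ln L + k ln n, and R²(U) = 1 − ln L/ln L₀, where L₀ is the likelihood of the intercept-only model. The sketch below checks the shape model under the further assumption that k = 7 (six shape terms in Table 4 plus the intercept) and n = 1230 training items split evenly between the two classes.

```python
import math

def neg2_loglik_from_aicc(aicc, k, n):
    """Invert AICc = -2lnL + 2k + 2k(k+1)/(n-k-1) to recover -2lnL."""
    return aicc - 2 * k - (2 * k * (k + 1)) / (n - k - 1)

def bic(neg2_loglik, k, n):
    """BIC = -2lnL + k*ln(n)."""
    return neg2_loglik + k * math.log(n)

def r2_u(neg2_loglik, neg2_loglik_null):
    """Uncertainty coefficient: 1 - lnL / lnL0."""
    return 1 - neg2_loglik / neg2_loglik_null

n, k = 1230, 7          # k is inferred from Table 4, not stated in the text
aicc_shape = 64.5       # Table 5, shape model

neg2_ll = neg2_loglik_from_aicc(aicc_shape, k, n)
bic_shape = bic(neg2_ll, k, n)

# Intercept-only model with 615 items per class: each item has P = 0.5.
neg2_ll_null = 2 * n * math.log(2)
r2u_shape = r2_u(neg2_ll, neg2_ll_null)
```

Under these assumptions the reconstructed BIC and R²(U) agree with the tabulated shape-model values (100.21 and 0.970) to within rounding.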

Model validation

The six classification models were used to classify the remaining validation subset with 154 classer-denoted bark/grass objects and 72,532 detected leaf particles (Table 6). The shape, shape|color, and shape|log|color parameter models, which had a 99.4% combined classification rate on the training data, all correctly classified 98.7% of the bark/grass objects in the validation dataset, but the log parameter model was slightly better with 99.4% (153 out of 154). For the leaf particles, the model constructed from log|color parameters correctly classified the most (98.3%). The shape|log|color model classified a similar number of leaf particles (98.3%). Overall, the log model correctly predicted a larger percentage of bark/grass and leaf combined (98.7%) than the other models. The shape|log|color model was the closest with 98.5%. Even the model with only color parameters correctly classified more than 96% of bark/grass and leaf.

Table 6.

Classification rates from models constructed using shape, log-transformed shape (log), and color parameters for manually segmented bark/grass objects and detected leaf particles in the validation dataset.

Parameter	Bark/grass objects		Leaf particles		Combined
model	Cnt^a	%	Cnt^a	%	Cnt^a	%^b
Shape	152	98.7	71,168	98.1	71,320	98.4
Log	153	99.4	71,045	98.0	71,198	98.7
Color	148	96.1	70,178	96.8	70,326	96.4
Shape|Color	152	98.7	71,225	98.2	71,377	98.4
Log|Color	151	98.1	71,314	98.3	71,465	98.2
Shape|Log|Color	152	98.7	71,284	98.3	71,436	98.5

^a Total count: bark/grass objects = 154, leaf particles = 72,532, and combined = 72,686.

^b Calculated as the average of the bark/grass object and the leaf particle percentages.
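The combined percentage in Table 6 is the average of the two class percentages rather than a pooled count ratio. The sketch below contrasts the two for the shape model row of Table 6; the function names are illustrative. With leaf particles outnumbering bark/grass objects by roughly 470 to 1, a pooled rate is almost entirely determined by the leaf class.

```python
def class_averaged_rate(correct_a, total_a, correct_b, total_b):
    """Average of the two per-class percentages (each class weighted equally)."""
    return 100 * (correct_a / total_a + correct_b / total_b) / 2

def pooled_rate(correct_a, total_a, correct_b, total_b):
    """Percentage of all items classified correctly (classes weighted by count)."""
    return 100 * (correct_a + correct_b) / (total_a + total_b)

# Shape model, validation dataset (Table 6).
bark_correct, bark_total = 152, 154
leaf_correct, leaf_total = 71_168, 72_532

averaged = class_averaged_rate(bark_correct, bark_total, leaf_correct, leaf_total)
pooled = pooled_rate(bark_correct, bark_total, leaf_correct, leaf_total)
```

The class-averaged value rounds to the tabulated 98.4%, whereas the pooled value rounds to 98.1%, essentially the leaf particle rate.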

Model testing

The six classification models were applied to a test dataset of 55 images that were independent of the training and validation datasets and had 200 classer identified bark/grass objects and more than 11,000 detected leaf particles. The results, shown in Table 7, were similar to those of the validation data. The shape|log|color model correctly classified the most bark/grass objects and leaf particles at 99.5% and 97.6%, respectively. Log and shape|color models also had 99.5% bark/grass and higher than 97% leaf classification rates. Similar to earlier results, the model constructed from only color parameters had the worst classification rates of all the models, but still correctly classified on average 96% of bark/grass objects and leaf particles.

Table 7.

Classification rates from models constructed using shape, log-transformed shape (log), and color parameters for manually segmented bark/grass objects and detected leaf particles in the test dataset.

	Bark/grass objects			Leaf particles			Combined
Parameter model	Cnt^a	%	Cnt per image^b	Cnt^a	%	Cnt per image^b	Cnt^a	%^c
Shape	197	98.5	5.05	10,759	97.6	195.62	10,956	98.1
Log	199	99.5	5.10	10,736	97.4	195.20	10,935	98.4
Color	193	96.5	4.95	10,620	96.3	193.09	10,813	96.4
Shape|Color	199	99.5	5.10	10,749	97.5	195.44	10,948	98.5
Log|Color	198	99.0	5.08	10,759	97.6	195.62	10,957	98.3
Shape|Log|Color	199	99.5	5.10	10,760	97.6	195.64	10,959	98.6

^a Total count: bark/grass objects = 200, leaf particles = 11,024, and combined = 11,224.

^b Average total count per image: bark/grass objects = 5.13 (for images with bark/grass indicated by classer), leaf particles = 200.44.

^c Calculated as the average of the bark/grass object and the leaf particle percentages.

Analysis of only the images in the test dataset that had bark/grass objects indicated by the classers showed that the models derived from log, shape|color, and shape|log|color parameters had the highest average number of correctly classified bark/grass objects per image (5.10 of an average 5.13 per image; Table 7). The average number of correctly classified leaf particles per image was greater than 195 for all of the models except the color parameter model, which averaged 193 per image.

These results show that there are clear differences between bark/grass objects that the human classer sees in a cotton sample and leaf particles, and those differences can be detected with imaging analysis. Also, the classification model that included shape, log-transformed shape, and color parameters (shape|log|color parameter model) consistently had the highest classification rates of bark/grass objects and leaf particles in the training, validation, and test datasets.

Conclusions

Bark and grass objects were identified in and denoted on images of cotton samples by AMS classers. The shape and color characteristics of these bark/grass objects were then compared to the leaf particles detected in the images using image analysis techniques. There were significant differences in all the characteristics between the classer-identified bark/grass objects and the detected leaf particles. These differences were greater for parameters describing shape than for color parameters. A nominal logistic classification model with shape, log-transformed shape, and color parameters (best fit ellipse minor axis; minimum Feret diameter; log-transformed circularity; log-transformed best fit ellipse minor axis; log-transformed perimeter; and Red-mean color value) consistently classified the bark/grass objects and leaf particles most accurately, with 99.5% and 97.6% correct classification rates, respectively, for the test dataset.

Future research

Another issue still remains to be explored. The bark/grass objects identified by the classers in the images analyzed for this study were segmented by hand to represent the objects that the human classer sees. Human vision is excellent at recognizing objects, connecting hidden parts of objects and averaging across the area of an object. Machine vision systems analyze individual pixels or groups according to predetermined algorithms and see individual particles instead of objects. An example from this study is shown in Table 8. There were 634 individual particles detected by the imaging software in the test dataset that belonged to the bark/grass objects denoted by the classers (see Figure 3). The models that were shown in Table 7 to be 99% correct at classifying manually segmented bark/grass objects in the test dataset were only about 77% correct when applied to these 634 detected bark/grass particles. Interestingly, the model derived from only color parameters that had the lowest classification rate for the training, validation, and test datasets, instead had the highest classification rate (80.6%) for these bark/grass particles. Work is needed to relate these instrument-detected particles back to classer identified objects and then develop criteria for making an EM call for the cotton sample based on the instrument measurements.

Table 8. Classification rates from models constructed using shape, log-transformed shape (log), and color parameters for detected particles associated with classer-identified bark/grass objects in the test dataset.

Parameter model	Bark/grass particles, count^a	Bark/grass particles, %
Shape	489	77.1
Log	492	77.6
Color	511	80.6
Shape|Color	492	77.6
Log|Color	485	76.5
Shape|Log|Color	490	77.3

^a Total bark/grass particle count = 634.
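The percentages in Table 8 follow directly from the counts and the 634-particle total. A minimal Python sketch that reproduces them is given here; the dictionary and function names are illustrative and do not come from the study.

```python
# Bark/grass particle counts per parameter model, copied from Table 8.
TABLE8_COUNTS = {
    "Shape": 489,
    "Log": 492,
    "Color": 511,
    "Shape|Color": 492,
    "Log|Color": 485,
    "Shape|Log|Color": 490,
}
TOTAL_PARTICLES = 634  # total detected bark/grass particles in the test dataset


def classification_rate(count, total=TOTAL_PARTICLES):
    """Return the classification rate in percent, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


rates = {model: classification_rate(n) for model, n in TABLE8_COUNTS.items()}
best_model = max(rates, key=rates.get)  # model with the highest rate
```

The computed rates match the table, and the color-only model comes out highest, as noted in the text.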

Disclaimer

Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the USDA. USDA is an equal opportunity provider and employer.

Acknowledgements

The authors would like to thank the technical staff and classers of the USDA-AMS, Cotton and Tobacco Programs, Standardization & Engineering Branch for their collaboration on this project.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

References

1. Cotton Incorporated. The classification of cotton, www.cottoninc.com/fiber/quality/Classification-Of-Cotton/Classing-booklet.pdf (2013, accessed 24 August 2015).

2. Dunavant B III. Advantages of the High Volume Instrument system USDA, AMS, Cotton Program. In: proceedings of the 61st plenary meeting of the International Cotton Advisory Committee - statements to plenary and open sessions, Cairo, Egypt, 20–25 October 2002, pp. 38–40. Washington, DC: International Cotton Advisory Committee.

3. U.S. Department of Agriculture Agricultural Marketing Service. Cotton quality – United States 2014 crop, https://www.marketnews.usda.gov/mnp/cn-report?startIndex=1&rowDisplayMax=25&runReport=true&category=Upland&frequency=Season&locType=National&_loc=1&metricDaily=All&_metricDaily=1&metric=ExtraneousMatter&display=All&_display=1&repDate=&endDate=&repDateWeekly=&endDateWeekly=&repYear=2014&endYear=2014&run=Run (2015, accessed 3 December 2015).

4. Bragg, Simpson, Brashears. Bark effect on spinning efficiency of cotton. Trans ASABE 1995; 38: 57–64.

5. Bargeron JD III, Rayburn, Griffith. Effects of grass contamination on yarn manufacturing. Trans ASABE 1988; 31: 2–4.

6. U.S. Department of Agriculture Farm Service Agency. 2015-Crop upland cotton schedule of premiums and discounts, https://www.fsa.usda.gov/Assets/USDA-FSA-Public/usdafiles/Price-Support/pdf/2015/2015ameruplcotnpd3.pdf (2015, accessed 16 September 2015).

7. ASTM International D2812 – 07:2007. Standard test method for non-lint content of cotton.

8. Fang, Huang. Chromatic image analysis for cotton trash and color measurements. Text Res J 1997; 67: 881–890.

9. Fang, Watson. Clustering for cotton trash classification. Text Res J 1999; 69: 656–662.

10. Siddaiah M, Hughs SE and Lieberman M. Comparison of small trash measurements between imaging techniques and AFIS. In: proceedings of the beltwide cotton conferences, New Orleans, LA, 4–7 January 2005, pp. 2362–2372. Memphis, TN: National Cotton Council.

11. Siddaiah M, Whitelock DP, Lieberman MA, et al. Categorization of extraneous matter in cotton using machine vision systems. In: proceedings of the beltwide cotton conferences, San Antonio, TX, 5–8 January 2009, pp. 1211–1216. Cordova, TN: National Cotton Council.

12. Siddaiah M, Whitelock DP, Hughs SE, et al. Evaluation and implementation of a machine vision system to categorize extraneous matter in cotton. In: proceedings of the beltwide cotton conferences, Atlanta, GA, 4–7 January 2011, pp. 1304–1312. Cordova, TN: National Cotton Council.

13. Himmelsbach, Hellgeth, McAllister. Development and use of an attenuated total reflectance/Fourier transform infrared (ATR/FT-IR) spectral database to identify foreign matter in cotton. J Agric Food Chem 2006; 54: 7405–7412.

14. National Institutes of Health. ImageJ image processing and analysis in Java, http://imagej.nih.gov/ij (2015, accessed 1 March 2016).

15. Barilla-Perez ME. Color transformer 2, www.russellcottrell.com/photo/colorTransformer2.htm (2014, accessed 1 March 2016).

16. Fairchild. Color appearance models, 2nd ed. Chichester, West Sussex: John Wiley & Sons, 2005.

17. Baldevbhai, Anand. Color image segmentation for medical images using L* a* b* color space. IOSR J Electron Commun Eng 2012; 1: 24–45.

18. Baker RV, Brashears AD and Lalor WF. Effects of lint cleaning on pepper trash. In: proceedings of the beltwide cotton conferences, Nashville, TN, 6–10 January 1992, pp. 1417–1419. Memphis, TN: National Cotton Council.

19. Furter, Schneiter. Methods of determining trash and dust content in cotton fibre. Text Technol Int 1993, pp. 196–201.