Abstract
Foreign body detection is an important factor affecting the quality of tobacco production. This paper describes a direct foreign body detection scheme based on machine vision, in which three cameras arranged around a tobacco bale record its multiple surfaces so that foreign bodies can be identified directly from the images. The color sorting table method (CSTM) is first used to identify and remove color-sensitive foreign bodies; the gray threshold method and the double threshold method are then used to identify and remove foreign bodies whose colors are similar to that of tobacco. The experimental results indicate that the proposed multi-step hybrid identification method can effectively identify and remove the various foreign bodies that occur in the production process of tobacco packs, with an accuracy rate of 97.8%, which meets the industrial requirements for foreign body detection. Compared with existing devices and methods, it offers high detection efficiency and low cost.
Introduction
In the tobacco industry, foreign substances mixed into tobacco can substantially affect its purity and quality. Figure 1 illustrates the tobacco unpacking process. During processing, raw tobacco material arrives wrapped in paperboard, ribbons and plastic bags. A mechanical arm unpacks the tobacco, and a conveyor belt transports the raw tobacco for further processing. However, owing to factors such as improper unpacking and packaging damage, paperboard, ribbons, plastic bags and other foreign materials may be left on the conveyor belt. If these foreign bodies are mixed with raw tobacco, the quality of the tobacco is significantly affected. Therefore, in order to improve the quality of tobacco, these foreign bodies must be removed with effective methods. Figure 2 illustrates three types of foreign bodies produced during processing: paperboard, plastic bags, and pieces of ribbon.

Unpacking a raw tobacco package. (a) Before unpacking; (b) After unpacking.

Foreign bodies that were not removed from tobacco during unpacking.
Previously, foreign bodies were removed manually: workers stationed along the production line inspected the tobacco with the naked eye and picked out foreign materials by hand. Although this method was simple, it required substantial manpower and material resources, was time consuming, and relied heavily on the subjective judgment of the workers involved; therefore, its accuracy was low. Currently, intelligent tobacco foreign body rejection systems are widely used. Most of them adopt X-ray photography (Yao, 2011) or infrared detection (Zhou et al., 2010), and a few adopt hyperspectral imaging (Zhou et al., 2013). X-ray detection irradiates the tobacco packet with X-rays. Because tobacco and foreign bodies absorb X-rays differently, the ray intensity after passing through the packet varies, and this intensity is analyzed to look for the presence of a foreign body. Owing to the poor X-ray imaging quality and the low contrast of the collected ray intensity, the sensitivity of this method is not high. Infrared detection uses the absorption of infrared light by the tobacco packet and by foreign bodies to determine whether a foreign body is present. The range of foreign bodies that can be detected with this method is limited: when the color of a foreign body is similar to that of the inspected material, the foreign body is extremely difficult to detect. The most important characteristic of a hyperspectral image is that it combines spatial and spectral information, which can be used to detect both external and internal features of an object. However, hyperspectral imaging is expensive. Furthermore, it produces a large amount of 3D data that takes a long time to process, and only a few spectral bands meet the requirements of rapid detection. Therefore, this method needs more in-depth research before it can be applied in practical production.
With the development of computer technology and machine vision (Xue et al., 2019), researchers have begun to apply these techniques to detect foreign bodies. The machine vision-based detection step first obtains an image through the device, then analyzes and measures the image, and finally identifies foreign bodies. An important feature of using machine vision for detection is its ability to quickly and efficiently identify foreign bodies.
Machine vision is used to detect foreign bodies and defects in many industries, including the textile and food industries, which face problems similar to those of the tobacco industry. In the textile industry, Zhao et al. (2019) extracted color and texture features and realized real-time evaluation of fabric quality using two quantitative parameters, namely peak signal-to-noise ratio and structural similarity. This method requires less manpower, is easy to operate, and can improve efficiency. Mei et al. (2018) proposed an unsupervised learning-based automated approach to detect and localize fabric defects without any manual intervention. The proposed model is robust and yields good overall performance, with high precision and acceptable recall rates. The application of machine vision has thus raised the level of automation and intelligence of textile production equipment, replaced manual labor, and improved the quality of textile products. In the food industry, Wang et al. (2018) applied image analysis to quantify the quality characteristics of common white mushrooms, so as to realize automatic detection and classification of mushrooms. The characteristics considered were the color, shape and opening of the cap, and the classification error ranged from 8% to 56%. Xie et al. (2019) extracted the key parameters of surface defects such as green shoulder, bending, fibrous roots, surface cracks and breakage by binarizing the HVS, so as to classify carrots. Khojastehnazhand (2020) combined different texture feature algorithms with different modeling methods, and investigated the quality of bulk raisins with a support vector machine (SVM) classifier using the gray level run length matrix (GLRM). This approach yielded more accurate classification results, with an accuracy of 85.55%. Overall, besides achieving low false detection rates, most of the reported visual detection methods use only the color and shape of the inspected objects to detect surface defects.
In summary, detection of foreign bodies by machine vision not only achieves a high detection rate and a low false detection rate, but is also simple and efficient. However, the foreign body areas encountered in the textile and food industries are generally small and varied, so these methods are not directly applicable to the problem studied in this paper.
In the tobacco industry, with the increasing market demand and product iterations, the elimination rate by intelligent tobacco foreign material rejection equipment (Di et al., 2009) has also been increasing. In recent years, domestic scholars have also proposed a variety of foreign body recognition algorithms. At present, the main methods of tobacco foreign body detection based on machine vision fall into two categories: methods based on color sorting tables (Tang et al., 2004) and methods based on pattern recognition (Quan et al., 2011).
Color-based recognition algorithms are widely used because of their high speed, and establishing an accurate color model of tobacco leaves is the key to their use in the tobacco industry. By comparing the RGB values of tobacco and foreign bodies with a standard color database, Zhang (2009) statistically obtained a color sorting table of tobacco and foreign bodies. Each pixel of a tobacco image was then tested against the sorting table to detect foreign bodies. Zhang et al. (2007) established a typical foreign body database based on the RGB values of foreign body colors, and used it to filter the color sorting table, which further improved the accuracy of foreign body detection. However, foreign bodies whose color is similar to that of tobacco cannot be reliably detected with color sorting tables; hence, the generality of these methods is low.
Yao and Wang (2012) proposed a tobacco foreign body detection method based on pattern recognition. They used a 20×20 pixel block as the unit of a tobacco image, and manually divided the blocks into two groups according to whether they contained a foreign body. Subsequently, the adaptive iterative self-organizing data analysis technique (ISODATA) clustering algorithm was used to cluster the blocks without foreign bodies in RGB space to obtain the cluster centers. Finally, the cluster centers were converted to the HSI color space, and all the blocks in the tobacco image under test were traversed to detect whether foreign bodies were present. Liu et al. (2012) preprocessed the tobacco image with the Laws operator to obtain its texture and edge regions. A support vector machine (SVM) was then used to subdivide the texture and edge regions to realize simultaneous detection of multiple foreign bodies. However, as a tobacco image consists only of background, tobacco and foreign bodies, pattern recognition methods are too cumbersome and less stable for such a simple image.
In this paper, we present a machine vision setup for detecting foreign bodies. The algorithm combines the color sorting table method (CSTM) with a gray level double threshold method, and detection proceeds from coarse to fine. We first use the color sorting table to detect foreign bodies that differ significantly from tobacco in color, such as plastic bags and ribbons. We then use the gray threshold method to detect foreign bodies that are similar to tobacco in color, such as paperboard. In this way, all three kinds of foreign bodies can be detected. In addition, in terms of detection time, the algorithm adopted in this study is not significantly different from the method based on the color sorting table alone; however, it has a high detection rate for foreign bodies with similar colors.
Overview of foreign body detection in tobacco
This section first describes the workflow for unpacking raw tobacco packets and the tobacco foreign body detection device. It then analyzes the algorithms used for image processing, including image preprocessing, sorting table detection, gray threshold detection, and double threshold detection. Subsequently, the experimental section presents the detection results and detection rates. Finally, the algorithm is summarized and suggestions for improvement are proposed.
Tobacco foreign body detection device
The proposed tobacco foreign body detection device is an automatic detection system that integrates lighting, machinery, and power. The device uses three HD color cameras (model Aca 1660-20gc) and a millisecond photoelectric switch; the layout is shown in Figure 3.

General layout of foreign body detection device.
The device is installed on the production line. Each camera is mounted higher than the tobacco packet so that the upper surface of the packet can be inspected. Camera 0 captures sides A, D and E; Camera 1 captures sides B, C and E; and Camera 2 captures sides A, B and E of the tobacco packet. The upstream mechanical arm removes the packaging from a tobacco packet before it reaches the detection device. When the unpacked tobacco arrives at the end of the conveyor belt, it triggers a photoelectric switch, as shown in Figure 4.

Photoelectric switch layout.
Once the photoelectric switch is triggered, the three cameras around the packet take pictures. The cameras are at fixed positions, and because the photoelectric switch is highly sensitive and the cameras take photos immediately after a tobacco packet triggers it, the tobacco packet appears at the same position in every image. Therefore, the detection areas can be defined as fixed regions for the subsequent processing of the image. The tobacco pack is shown in Figure 5.

Tobacco packet captured by cameras.
Furthermore, as can be seen from the photos in Figure 5, the positions and shooting angles of the cameras were precisely adjusted so as to cover all areas around the tobacco packet. Once a foreign body is detected in a photograph, the detection device immediately alerts the staff through the alarm device so that the foreign body can be removed.
Methods and algorithms for foreign body detection
Flow chart
In this paper, the method based on the color sorting table is referred to as CSTM, and a region of interest is referred to as an ROI. Figure 6 shows the image processing flow chart.

Foreign matter detection flow chart.
Preprocessing
Preprocessing includes two steps: clipping the ROIs (Liu et al., 2009; Yang et al., 2004) and image filtering (Yan et al., 2001; Ya et al., 2015).
Figure 7 shows the areas where foreign bodies may appear in photographs taken by the three cameras. In order to improve the detection efficiency, we masked the background area, and the areas to be tested were displayed. These ROI areas for photographs taken by Camera 0 are shown in Figure 8. The ROIs for Camera 1 and Camera 2 are selected in a similar way.
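The background masking described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: it assumes that each ROI is an axis-aligned rectangle, and the function name `mask_outside_rois`, the frame size, and the ROI coordinates are all hypothetical.

```python
import numpy as np

def mask_outside_rois(image, rois):
    """Return a copy of `image` with every pixel outside the ROIs set to 0.

    `rois` is a list of (top, left, height, width) rectangles; rectangular
    ROIs are an assumption made for this sketch.
    """
    keep = np.zeros(image.shape[:2], dtype=bool)
    for top, left, height, width in rois:
        keep[top:top + height, left:left + width] = True
    masked = image.copy()
    masked[~keep] = 0  # black out the background
    return masked

# Hypothetical 6x8 RGB frame and two hypothetical ROIs.
frame = np.full((6, 8, 3), 200, dtype=np.uint8)
masked = mask_outside_rois(frame, [(0, 0, 2, 3), (4, 5, 2, 3)])
```

Because the packet appears at a fixed position in every photograph, such a mask would only need to be defined once per camera.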

The areas prone to foreign bodies.

Five regions of interest for Camera 0.
The purpose of image filtering is to eliminate interference points from images. During the process of transporting a tobacco packet on the conveyor belt, the camera holding rod undergoes some vibration, resulting in formation of an extremely small number of white noise points in photographs taken by the camera, which, if not removed, increase the risk of misjudgment. In order to remove these interference points, we use mean filtering. As shown in Figure 9, after filtering the picture as a whole is much smoother.
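As an illustration of how mean filtering suppresses isolated noise points, the sketch below applies a box filter to a single-channel patch. The 3×3 kernel size and the edge-replication padding are assumptions, since the text does not state the filter parameters; for a color image the same function could be applied to each channel.

```python
import numpy as np

def mean_filter(gray, k=3):
    """Apply a k x k mean (box) filter to a 2-D array, replicating edge pixels."""
    pad = k // 2
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    out = np.zeros(gray.shape, dtype=np.float64)
    # Sum the k*k shifted copies of the image, then divide by the kernel area.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + gray.shape[0], dx:dx + gray.shape[1]]
    return out / (k * k)

# A dark 5x5 patch with one isolated white noise point in the centre.
patch = np.full((5, 5), 50.0)
patch[2, 2] = 255.0
smoothed = mean_filter(patch)
```

In this example the noise point of 255 is pulled down to roughly 73, much closer to the surrounding value of 50, at the price of slightly brightening its eight neighbours.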

Image filtering.
Color-based sorting table method
To the naked eye, both ribbons and plastic bags are white and thus differ significantly from the color of tobacco. Therefore, we use the color sorting table method to locate the feature points corresponding to a ribbon or a plastic bag. There are 256×256×256 possible RGB colors. In order to speed up the computation, we cluster each channel into 8 levels, which reduces the number of RGB colors to 8×8×8 = 512, as shown in Figure 10.

Clustering on one of RGB channels.
The pixel values between 0 and 31 are merged to 0, and the pixel values between 32 and 63 are merged to 32, and so on.
The specific RGB values of ribbons and plastic bags were found with the method of picking points and recording RGB values, as shown in Table 1. When running the detection program, each pixel is filtered according to the RGB value in the sorting table. If a value in the sorting table is matched, the pixel is designated as a foreign body.
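The quantization and table lookup described above can be sketched as follows. The entries in `HYPOTHETICAL_TABLE` are placeholders chosen for illustration and are not the values of Table 1; the function names are likewise assumptions, and the image is assumed to be in R, G, B channel order.

```python
import numpy as np

BIN_WIDTH = 32  # 256 values per channel / 8 levels

def quantize(image):
    """Merge each channel value down to the lower edge of its 32-wide bin."""
    return (image // BIN_WIDTH) * BIN_WIDTH

def build_lookup(sorting_table):
    """Build an 8x8x8 boolean cube marking the bins listed in the sorting table."""
    cube = np.zeros((8, 8, 8), dtype=bool)
    for r, g, b in sorting_table:
        cube[r // BIN_WIDTH, g // BIN_WIDTH, b // BIN_WIDTH] = True
    return cube

def foreign_mask(image, cube):
    """Return a mask that is True where a pixel's color bin is in the table."""
    idx = image // BIN_WIDTH
    return cube[idx[..., 0], idx[..., 1], idx[..., 2]]

# Hypothetical table entries; the real values come from Table 1.
HYPOTHETICAL_TABLE = [(224, 224, 224), (192, 224, 224)]
cube = build_lookup(HYPOTHETICAL_TABLE)

# One near-white pixel and one brownish pixel (both hypothetical).
pixels = np.array([[[250, 240, 230], [120, 90, 40]]], dtype=np.uint8)
mask = foreign_mask(pixels, cube)
```

Storing the table as a boolean cube turns the per-pixel test into a single array lookup, which is one plausible reason for the speed of sorting-table methods.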
Foreign body sorting table.
However, we find that two pixel values close to the dividing lines of a bin, such as (33,0,0) and (63,0,0), are classified into the same class after clustering (both become (32,0,0)) even though they differ considerably, which may cause misjudgment or missed detection. Therefore, the method does not apply to the case where the foreign body and tobacco are similar in color.
Gray threshold method
The colors of paperboard and tobacco are highly similar. If the sorting table algorithm were still adopted, the RGB values of tobacco and paperboard would cluster together, leading to a large number of misjudgments. Therefore, the sorting table is not useful for detecting paperboard, and we instead adopt a gray threshold method based on the gray mean and standard deviation. The gray value is computed with equation (1), the commonly used gray calculation formula.
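Equation (1) is not reproduced in this text. Assuming that the "commonly used" formula is the standard luminance weighting of the three channels (an assumption, since the original coefficients are not shown here), it would read:

```latex
\mathrm{Gray} = 0.299\,R + 0.587\,G + 0.114\,B \tag{1}
```

Under this assumption the G channel carries the largest weight, which is the channel that separates tobacco from paperboard least well.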
In order to determine optimal parameters to use in equation (1) for our study, we need to make a histogram of RGB values for tobacco packets and paperboard, shown in Figures 11 and 12.

RGB histogram for tobacco.

RGB histogram for paperboard foreign body.
The frequency distributions of R and B differ markedly between tobacco and paperboard, whereas the difference for G is less pronounced. We make use of this by changing the parameters multiplying R and B in equation (1), in order to amplify the difference in the calculated gray value between tobacco and paperboard. The parameters multiplying R and B are both set to 0.45, which leaves a smaller value, namely 0.1, for the parameter multiplying G, so that the three parameters sum to 1. The resulting gray level formula is given by equation (2)

$$\mathrm{Gray} = 0.45\,R + 0.1\,G + 0.45\,B \qquad (2)$$
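A minimal sketch of the modified gray conversion, using the weights 0.45, 0.1 and 0.45 stated above for R, G and B. The function name and the sample pixel values are hypothetical, and the array is assumed to be in R, G, B channel order (some image libraries load images as B, G, R instead).

```python
import numpy as np

# Weights for (R, G, B) as stated for equation (2).
WEIGHTS = np.array([0.45, 0.10, 0.45])

def weighted_gray(image):
    """Convert an H x W x 3 RGB array to gray values using the weights above."""
    return image.astype(np.float64) @ WEIGHTS

# Two hypothetical pixels: a darker brownish one and a lighter one.
pixels = np.array([[[120, 90, 40], [180, 160, 130]]], dtype=np.uint8)
gray = weighted_gray(pixels)
```

For these two pixels the weighted gray values are 81 and 155.5, respectively.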
Here, a represents the total number of pixels of the image, and
After calculating the gray values, we use equations (3) and (4) to calculate the gray mean and standard deviation of a block area. As is usual in image processing, the gray mean represents the brightness and the standard deviation represents the contrast. Under sufficient light, tobacco and paperboard reflect light to different degrees, so the difference in brightness between them can be used to detect paperboard. Figures 13 and 14 show the calculated gray mean and standard deviation for 100 photos of tobacco and 100 photos of paperboard, respectively. For paperboard, the gray mean and the standard deviation both range from 80 to 100. For tobacco, the mean is between 40 and 60, and the standard deviation is between 60 and 90.
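Equations (3) and (4) do not appear in this text. Assuming that they are the usual mean and (population) standard deviation over the a pixels of a block, and writing g_i for the gray value of pixel i (a symbol introduced here only for illustration), they can be written as:

```latex
\mu = \frac{1}{a}\sum_{i=1}^{a} g_i \tag{3}

\sigma = \sqrt{\frac{1}{a}\sum_{i=1}^{a}\left(g_i - \mu\right)^{2}} \tag{4}
```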

The mean and standard deviation of the tobacco.

The mean and standard deviation of the paperboard.
In addition, under actual lighting conditions, because the tobacco pack is a rectangular parallelepiped, each of its sides is illuminated to a different degree. As a result, the threshold used during detection differs from area to area, as shown in Tables 2 and 3.
Gray mean and standard deviation for paperboard.
Gray mean and standard deviation for tobacco.
The above tables show average values calculated by region. We use the product of the gray mean and the standard deviation as the quantity to threshold when judging whether a foreign body is present. For Camera 0, we set the threshold of area 1 to 6000, area 2 to 5000, area 3 to 5500, area 4 to 6000 and area 5 to 5500. For Camera 1, we set the threshold of area 1 to 7200, area 2 to 6500, area 3 to 7000, area 4 to 5500 and area 5 to 6500.
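The per-area thresholds can be held in a small lookup structure, as in the sketch below. The dictionary layout and the function name `is_paperboard` are assumptions; the threshold values are those listed above, and Camera 2 is omitted because its thresholds are not given in the text. The last two lines simply multiply the ends of the ranges reported for Figures 13 and 14, which shows why thresholds of roughly 5000 to 7200 separate the two materials.

```python
# Thresholds on (gray mean x gray standard deviation), keyed by camera and
# area number, as listed in the text. Camera 2 is not listed and is omitted.
THRESHOLDS = {
    0: {1: 6000, 2: 5000, 3: 5500, 4: 6000, 5: 5500},
    1: {1: 7200, 2: 6500, 3: 7000, 4: 5500, 5: 6500},
}

def is_paperboard(mean, std, camera, area):
    """Return True when mean * std exceeds the threshold for that area."""
    return mean * std > THRESHOLDS[camera][area]

# Products implied by the overall ranges reported for Figures 13 and 14.
tobacco_range = (40 * 60, 60 * 90)        # 2400 to 5400
paperboard_range = (80 * 80, 100 * 100)   # 6400 to 10000
```

The overall ranges pool all areas, so for an individual area the per-region values of Tables 2 and 3 are the relevant reference.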
Sliding window
When testing each region, we determine if a paperboard foreign body is present and its specific position and size by using a sliding 40×40 pixel window, as shown in Figure 15.

40×40 sliding window traverses an ROI.
First, we calculate the product of the gray mean and gray standard deviation for each window, and compare the calculated value with the threshold set for that area. If it is above the threshold, we determine that the object in the sliding window is a paperboard foreign body. Otherwise, we judge it to be tobacco. After traversing the whole ROI with the sliding window, we obtain the specific position of any paperboard present.
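The single-threshold sliding window scan can be sketched as follows. The text does not state the stride of the window, so this sketch assumes non-overlapping 40×40 windows; the synthetic ROI, the threshold of 6000, and the function name are hypothetical.

```python
import numpy as np

def scan_windows(gray, threshold, win=40):
    """Slide a win x win window over `gray` without overlap and flag windows
    whose gray mean x gray standard deviation exceeds `threshold`.

    Returns a list of (top, left) corners of the flagged windows.
    """
    hits = []
    rows, cols = gray.shape
    for top in range(0, rows - win + 1, win):
        for left in range(0, cols - win + 1, win):
            block = gray[top:top + win, left:left + win]
            if block.mean() * block.std() > threshold:
                hits.append((top, left))
    return hits

# Synthetic 80x120 ROI: a checkerboard of 30/70 (mean 50, std 20) stands in
# for tobacco, and one 40x40 block of 20/180 (mean 100, std 80) for paperboard.
yy, xx = np.indices((80, 120))
checker = (yy + xx) % 2
roi = np.where(checker == 0, 30.0, 70.0)
roi[40:80, 40:80] = np.where(checker[40:80, 40:80] == 0, 20.0, 180.0)
hits = scan_windows(roi, threshold=6000)
```

Here only the window at row 40, column 40 is flagged, since its product is 8000 while every other window has a product of 1000.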
Double threshold detection
The method described above can be improved further. While sliding the window over paperboard, as shown in Figure 16, we noticed that the gray mean and variance vary across a piece of paperboard: the color intensity is stronger around its edge and weaker at its center.

Single threshold detection problem.
The outer edge of a piece of paperboard was detected, but the middle part was not; therefore, the detection of the paperboard was incomplete. In order to avoid this problem, we use a double threshold to detect foreign body.
We use a high threshold and a low threshold, denoted H-T and L-T, respectively
The calculation formulas are given by equations (5) and (6). When the sliding window traverses the image, a window whose product of gray mean and gray standard deviation is greater than the high threshold is classified as a foreign body. A window whose product is less than the low threshold is classified as tobacco. A window whose product lies between the low threshold and the high threshold is classified as a foreign body only if it is connected to a window that has already been identified as a foreign body. In Figure 17, A and C are foreign bodies, and B is tobacco.

Double threshold detection principle.
The pseudo-code of the grayscale double threshold method is as follows. GP represents the product value of the gray mean and gray standard deviation in each window.
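The original pseudo-code is not reproduced in this text. The following Python sketch shows one way the double threshold rule described above could be realized; it is an illustration, not the authors' code. It assumes that the GP values have already been computed on a grid of windows, that "connected" means sharing an edge (4-connectivity), and that a promoted window can in turn promote its neighbours. The threshold values in the example are hypothetical, since equations (5) and (6) are not shown.

```python
from collections import deque
import numpy as np

def double_threshold(gp, low, high):
    """Classify windows from their GP values (gray mean x gray std).

    gp > high  -> foreign body
    gp < low   -> tobacco
    otherwise  -> foreign body only if 4-connected, directly or through
                  other promoted windows, to a foreign-body window
    Returns a boolean array that is True for foreign-body windows.
    """
    strong = gp > high
    weak = (gp >= low) & ~strong
    foreign = strong.copy()
    queue = deque(zip(*np.nonzero(strong)))
    rows, cols = gp.shape
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and weak[nr, nc] and not foreign[nr, nc]:
                foreign[nr, nc] = True   # promote the in-between window
                queue.append((nr, nc))
    return foreign

# Hypothetical grid of GP values for a 2x4 arrangement of windows.
gp = np.array([
    [7000, 5200, 3000, 5200],
    [5100, 5300, 2500, 2800],
])
result = double_threshold(gp, low=5000, high=6000)
```

In this example the three in-between windows adjacent to the 7000 window are promoted to foreign body, while the isolated in-between window in the top-right corner remains tobacco, which mirrors the way the weaker center of a paperboard piece is recovered from its stronger edge.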
Results and discussions
In order to verify the detection effectiveness, we selected specific pictures of tobacco packets containing foreign bodies; the detection results are shown in Figures 18 to 25. Since there is no foreign body in the first area, the results for the first area are omitted here.

Testing results of the second area for Camera 0. (a), (d) The test sample; (b), (e) the result of our method; and (c), (f) the result of color sorting table method.

Testing results of the third area for Camera 0. (a), (d), (g) The test sample; (b), (e), (h) the result of our method and (c), (f), (i) the result using CSTM.

Testing results of the fourth area for Camera 0. (a), (d) The test sample; (b), (e) the result of our method and (c), (f) the result of CSTM.

Testing results of the fifth area for Camera 0. (a) The test sample; (b) the result of our method and (c) the result of CSTM.

Testing results of the second area for Camera 1. (a), (d) The test sample; (b), (e) the result of our method and (c), (f) the result of CSTM.

Testing results of the third area for Camera 1. (a), (d), (g) The test sample; (b), (e), (h) the result of our method; and (c), (f), (i) the result using CSTM.

Testing results of the fourth area for Camera 1. (a), (d) The test sample; (b), (e) the result of our method; and (c), (f) the result using CSTM.

Testing results of the fifth area for Camera 1. (a) The test sample; (b) the result of our method; and (c) the result using CSTM.
Figure 18 shows examples of foreign body detection in the second area. Our method accurately detected both paperboard and plastic bags. CSTM accurately detected plastic bags; however, it did not work properly for paperboard detection.
For the third area, all three kinds of foreign bodies appeared in the photos, and the test results are shown in Figure 19. Our method detected ribbon, paperboard, and plastic bags well. CSTM detected ribbon and plastic bags, but did not detect paperboard.
Similarly, the detection results for the fourth area (Figure 20) show that our method detects plastic bags and paperboard, whereas CSTM accurately detects only plastic bags.
From the detection results of the fifth area (Figure 21), it can be observed that the CSTM detection of foreign bodies not only can cause missed detection, but also may cause false detection.
The detection results for Camera 1 areas are shown in Figures 22–25.
Based on the experimental results, we can conclude that CSTM detection is highly accurate for foreign bodies whose color differs from that of tobacco, such as plastic bags and ribbons. For foreign bodies whose color is similar to that of tobacco, its false detection rate and missed detection rate are higher. However, by combining the CSTM algorithm with our gray level double threshold method, all three kinds of foreign bodies can be detected accurately.
We tested 500 sample images and counted the correct detection rate, false detection rate, and missed detection rate for the three kinds of foreign bodies in each region, as shown in Table 4. The CSTM has a detection rate of 99% for plastic bags and ribbons, but a false detection and missed detection rate of 30% for paperboard. Our method not only has a high detection rate for plastic bags and ribbons, but also achieves a detection rate of 97.8% for paperboard, which fully meets the requirements for tobacco production. In addition, we ran both methods on the same image using the same industrial computer. Our method took about 1.15 s and the CSTM took about 0.68 s, that is, our method needs about 0.47 s more per image, or roughly 1.7 times the CSTM time. In exchange for this additional time, the overall detection rate is much higher than that of the CSTM algorithm alone, and all three kinds of foreign bodies are detected at a high rate.
Comparison result of our method and CSTM on paperboard detection rate.
Figure 26 shows an area of false detection found among a large number of samples. As can be expected, the color of individual tobacco leaves and paperboard pieces may vary greatly. Combined with the influence of lighting, this makes some false detections inevitable. By analyzing the images with missed detections, such as the one shown in Figure 27, we concluded that the division of the image into ROIs often misses the edge of the tobacco packet. Therefore, a foreign body that is too small or too close to the edge of the tobacco packet is often missed.

Error detection.

Missed detection.
To remedy the situation described above, the following improvements can be implemented. First, we can perform tobacco packet inspection in a well-lit room and position cameras so that interference with light sources is eliminated. Secondly, when dividing photographs into ROIs, the regions close to the bottom of the tobacco packet can be divided further into smaller areas, and a dynamic threshold method can be adopted to set the threshold in a range and change it with the position of the tobacco packet.
Conclusion
In view of the high cost and low real-time performance of foreign body detection in the tobacco production industry, this paper proposes a low-cost scheme for fast foreign body detection based on machine vision. Firstly, three cameras arranged around the production line were used to photograph each region of the tobacco packet. Secondly, the color sorting table method, the gray threshold method and the double threshold method were used to identify foreign bodies in each region and facilitate their removal. Finally, actual samples collected in production were tested. The test results showed that the detection rate of this algorithm for foreign bodies with similar colors was much higher than that of the CSTM algorithm, and that it could meet the real-time demand of production lines.
In summary, the fast, multi-step hybrid recognition method proposed in this paper combines a high recognition rate and good real-time performance with low cost, which can greatly improve the operating efficiency of tobacco packaging production lines and improve the quality of tobacco. It has certain theoretical research significance and practical application value. However, because the method is designed as a low-cost solution, some problems remain: (1) in addition to the three main foreign bodies studied in this paper, other foreign bodies such as hemp rope may appear in other links of tobacco transportation, and our algorithm has not analyzed these foreign bodies; (2) the external light environment differs across the transportation and production process of tobacco, and these lighting factors can cause false detections of tobacco foreign bodies; and (3) our method still has certain limitations in detecting foreign bodies with similar colors. Therefore, in future work, we can consider using a higher-performance computer together with more complex algorithms, such as deep learning algorithms that implement a more accurate and flexible classifier, which may achieve better recognition accuracy. In addition, other sensing methods, such as X-ray and spectrometry, can also be considered for detecting foreign bodies; however, these methods are not economical. Therefore, more universal methods will be necessary in the future to satisfy all the requirements of tobacco transportation and production.
This paper mainly completes a low-cost tobacco foreign body detection device and algorithm. Through the improved threshold algorithm, it can detect foreign bodies with similar color to tobacco, which basically meets the detection requirements for common foreign bodies in tobacco transportation.
Footnotes
Acknowledgements
The authors would like to thank the editor and the anonymous referees for their helpful and careful comments.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the Science and Technology Commission of Shanghai Municipality (No.19595810700 and No.18295801100).
