Tomatoes are affected by different pests and diseases at different stages of growth, which is the main cause of reduced yield. Accurate identification and diagnosis of pests and diseases in tomato seedlings during the growing period, together with early detection and treatment, not only provides a healthy growing environment for tomatoes but also substantially increases yield [1]. In particular, the relatively high temperature and humidity, poor lighting, and poor air circulation in facility cultivation provide favorable conditions for the rapid spread of pathogens, greatly increasing the chance of infection and of serious outbreaks. The most serious diseases of tomato seedlings during the growing period are Bacterial Spot [2], Early Blight [3], Late Blight [4], and Leaf Mold [5], which have a greater than 50% chance of developing [6, 7]. These diseases mainly affect the leaves, starting with individual seedlings and then spreading rapidly outward from those plants to infect their neighbors. The main pest species are Aphid [8], Helicoverpa Armigera [9], Spider Mite [10], and White Fly [11], all of which reproduce quickly, grow fast, and spread widely when the environment is suitable; in addition to causing direct damage, they can also transmit diseases or promote secondary infection [12]. According to statistics, approximately 15% of global tomato production is affected by pests and diseases each year, with average yield reductions in severely affected regions reaching 40–80% [13, 14]. Careful control of pests and diseases is therefore key to reducing losses and increasing crop yields. Once a pest or disease has invaded a field, it must be detected in time for farmers to treat it and prevent it from spreading [15].
Therefore, it is necessary to select pests and diseases that cause serious damage to tomatoes as research objects, to collect and collate relevant information, to achieve accurate identification and detection, and to provide a theoretical basis for targeted early warning and prevention.
Traditional detection methods no longer meet the needs of research and production in terms of identification efficiency, accuracy, and application scenarios. With the continuous development of the Internet, information technology has provided new methods and ideas for the identification of crop pests and diseases. The successful application of deep learning in other fields has attracted the attention of many agricultural scholars, who have applied it to agriculture [16, 17]. Deep learning (DL) methods, especially those based on convolutional neural networks (CNNs), are widely used for object detection and classification in agriculture and have demonstrated excellent performance in applications such as plant pest detection and plant identification [18].
Current mainstream target detection algorithms can be divided into two types. The first is two-stage target detection based on candidate regions, which first generates proposals (pre-selected boxes that may contain the object to be detected) and then performs fine-grained object detection; examples include RCNN (Regions with CNN features), Fast RCNN, and Faster RCNN. Deng et al. [19] proposed a multiple pest detection technology based on federated learning (FL) and improved fast regional convolutional neural network (R-CNN). The improved R-CNN achieved an average accuracy of 90.27% for multiple pest detection in orchards with a detection time of only 0.05 s per image, enabling accurate identification of small pests and diseases in complex environments. Jiao et al. [20] proposed an anchor-free area proposal network (AFRPN) and combined it with Fast R-CNN to detect 24 types of pests in an end-to-end manner. The mAP and recall of the improved model were 7.5% and 15.3% higher, respectively, than those of Faster R-CNN, and its running time of 0.07 s per image met the requirement for real-time detection. Zhang et al. [21] proposed an improved Faster RCNN algorithm to detect diseased tomato leaves, using ResNet101 instead of VGG16 for feature extraction and the k-means clustering algorithm to cluster bounding boxes; accuracy increased by 2.71%, and the method could effectively detect and recognize tomato diseases. Xie et al. [22] proposed the Faster DR-IACNN algorithm on a self-built grape leaf disease dataset (GLDD). By introducing the Inception-v1 module, the Inception-ResNet-v2 module, and the SE module, the model reached an mAP of 81.1% and a detection speed of 15.01 FPS.
The second is single-stage target detection based on regression, which needs only a single pass through the network to predict all bounding boxes and extracts features directly from the network to predict the class and location of targets; examples include the Single Shot MultiBox Detector (SSD) and You Only Look Once (YOLO). Wang et al. [23] proposed a Deep Block Attention SSD (DBA_SSD) method for plant leaf disease identification by combining an improved VGG network and a channel attention mechanism, achieving 92.2% accuracy on the PlantVillage dataset. Sun et al. [24] built a new apple leaf disease detection model based on the Mobile AppleNet SSD algorithm using the MEAN module and the Inception module, which achieved 83.12% mAP and 12.53 FPS against complex backgrounds. Liu and Wang [25] constructed a dataset of tomato pests and diseases in a real natural environment and used image pyramids to optimize the feature layer of the YOLOv3 algorithm, realizing multi-scale feature detection that could accurately and quickly detect the location and type of tomato pests and diseases. Wang et al. [26], using YOLOv3 fused with expanded convolution and convolution factor decomposition, achieved early real-time detection of tomato pests and diseases with an F1 value of 94.77%, an AP value of 91.81%, and a false detection rate of only 2.1%. Liu et al. [27] proposed a tomato pest identification algorithm based on an improved YOLOv4 fused with a triple attention mechanism (YOLOv4-TAM), introducing the focal loss function and the K-means++ clustering algorithm and reaching an average identification accuracy of 95.2%. Qi et al. [28] added the Squeeze-and-Excitation (SE) module to the YOLOv5 model for the detection of tomato virus diseases against natural backgrounds, with an accuracy rate of 91.07%. Chen et al. [29] integrated the involute bottleneck module and the SE module into the original YOLOv5 network. The detection accuracy of the algorithm for powdery mildew and anthracnose was 86.5% and 86.8%, respectively.
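Several of the works above cluster the bounding boxes of a training set with k-means to obtain prior (anchor) boxes. The following is a minimal sketch of one common variant, which clusters box widths and heights with the distance d = 1 - IoU. It is not taken from any of the cited papers: the function names, the sample boxes, and the choice of distance are illustrative assumptions, and the initial centroids are drawn at random rather than with K-means++ seeding.

```python
import random


def iou_wh(box, centroid):
    """IoU of two boxes given as (width, height), aligned at a shared corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs using the distance d = 1 - IoU.

    Hypothetical helper for illustration; `boxes` is a list of (w, h) tuples.
    """
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # plain random initialisation
    for _ in range(iters):
        # Assignment step: each box joins the centroid it overlaps most.
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda j: iou_wh(box, centroids[j]))
            clusters[best].append(box)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for j, members in enumerate(clusters):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                updated.append((w, h))
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    # Return anchors ordered from smallest to largest area.
    return sorted(centroids, key=lambda c: c[0] * c[1])


# Made-up box sizes in pixels, for illustration only.
boxes = [(10, 12), (11, 13), (12, 11), (50, 60), (52, 58),
         (48, 62), (120, 100), (118, 104), (122, 98)]
anchors = kmeans_anchors(boxes, k=3)
print(anchors)
```

Using 1 - IoU instead of Euclidean distance keeps large boxes from dominating the clustering, which is why this distance is often preferred when anchors are fitted to a dataset containing many small targets.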
As the work above shows, deep learning-based target detection algorithms extract image features well and perform well in identifying agricultural objects, but relatively little research has addressed the identification of tomato pests and diseases during the growing period. In this paper, we created a tomato growing-period pest and disease dataset of 2547 images by collecting images of tomato pests and diseases in a facility environment and supplementing them with images from a public plant dataset and from Internet crawlers (the dataset has been shared at https://drive.google.com/file/d/1V1cRPVwtqrJuBJifGyQj0z2b0N0Dh_OS/view?usp=sharing). The dataset can support research on tomato pests and diseases during the growing period. To further improve the efficiency and accuracy of tomato pest and disease identification, we improved the YOLOv5 algorithm by introducing an attention mechanism, a weighted bi-directional feature pyramid network (BiFPN), and C3Ghost modules, aiming for both high accuracy and a lightweight model. Based on the improved algorithm and an analysis of the results of the k-means clustering algorithm, we designed and developed an online diagnosis platform for pests and diseases that can detect images or videos in real time; when detecting videos, it uses the k-means clustering algorithm to select among the detection results and outputs the optimal identification result and its confidence level to the platform interface. The platform was applied at a tomato production facility, and the experimental results met the need for rapid and accurate detection of pests and diseases during the growing period, which is of great significance for tomato pest and disease control research and the development of smart agriculture.
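To make the "weighted" part of BiFPN concrete, the sketch below shows the fast normalized fusion rule described in the EfficientDet work that introduced BiFPN, O = sum_i(w_i * I_i) / (eps + sum_j w_j), where each learnable weight w_i is kept non-negative with a ReLU. Whether the improved YOLOv5 in this paper uses exactly this form is an assumption; the function name is hypothetical, and feature maps are represented as flat lists of floats for simplicity.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped feature maps: O = sum_i(w_i * I_i) / (eps + sum_j w_j).

    Illustrative helper: `features` is a list of equally long lists of floats,
    and `weights` holds one scalar weight per feature map.
    """
    # ReLU keeps every weight non-negative, so each normalised weight
    # lies in [0, 1] without the cost of a softmax.
    w = [max(0.0, x) for x in weights]
    total = eps + sum(w)
    fused = [0.0] * len(features[0])
    for wi, feat in zip(w, features):
        for idx, value in enumerate(feat):
            fused[idx] += wi * value / total
    return fused


# Two made-up feature maps with equal weights: the result is close to their mean.
print(fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0]))
```

In a real network the weights would be trainable parameters and the inputs would be tensors resized to a common resolution before fusion; the small eps only guards against division by zero when all weights are clipped to zero.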