A new ball detection strategy for enhancing the performance of ball bees based on fuzzy inference engine

Sports video analysis has received much attention and has become an active research area in the field of image processing. This interest has opened opportunities to develop applications based on the analysis of different sports, especially soccer. Ball identification in soccer images is an essential task not only for goal‐scoring but also for performance evaluation. However, ball detection faces several hurdles such as occlusions, fast‐moving objects, shadows, poor lighting, color contrast, and other static background objects. Although several ball detection techniques have been introduced, such as Frame Difference, Mixture of Gaussian (MoG), and Optical Flow, ball detection in soccer games is still an open research area. In this paper, a new Fuzzy Based Ball Detection (FB2D) strategy is proposed for identifying the ball through a set of image sequences extracted from a soccer match video. FB2D can accurately identify the ball even if it is attached to the white lines drawn on the playground or partially occluded behind players. FB2D is compared to recent ball detection techniques. Experimental results show that FB2D outperforms these techniques, as it achieves both the highest detection accuracy in the testing stage and the lowest error.


| BACKGROUND AND BASIC CONCEPTS
This section briefly explains the camera setup for ball detection and tracking, the general procedure for soccer ball detection, and the ball bee. A ball bee is a distinctive type of quadcopter 11 that can track a moving ball. It was first introduced by Abulwafa et al. 12 and used to track the moving ball in soccer matches. A ball bee is a quadcopter with an integrated moving camera situated directly below it. Its camera can take high-quality photos from altitude and record clear shots. The camera frame is made of lightweight composite materials to reduce weight and increase flight maneuverability. The operator uses a remote control to launch, navigate, and land the bee. Controllers use radio waves to communicate with the bee via Wi-Fi. The bee carries many onboard sensors (e.g., speed and distance sensors, infrared and thermal sensors, image sensors, chemical sensors, GPS, etc.). 13,14 The ball detection technique and the hosting ball bee are engaged in a mutually beneficial relationship. On the one hand, the ball bee is a source of high-quality images in which the ball is occluded as little as possible, which simplifies the task of the detection technique that drives it. On the other hand, the ball detection technique is used to drive the ball bee itself. Hence, employing an accurate ball detection technique improves the performance of the ball bee, which in turn increases the accuracy of the driving detection technique.
Ball detection is an essential step of action recognition and ball tracking: the ball position must be detected before the ball can be tracked. The localization of the ball is the main factor not only in tracking performance but also in detecting the main events of the game. Although it looks simple, discovering the soccer ball in video frames is not a trivial task for the following reasons: (i) the ball normally has a spherical shape, but it might be observed as a blurred object due to its rapid motion and unpredictable acceleration; for the same reason, the static features of the ball might not be correctly observed, (ii) the ball state changes during the game as it is thrown, hit, or kicked by players, (iii) the ball is the smallest element on the game field; hence, it might be confused with other objects on the playing field such as the penalty spot, field lines, or parts of players, (iv) during the game, the ball is usually occluded by players, as it disappears behind their feet most of the game time, (v) when the ball is kicked high in the air, it sometimes suddenly leaves the camera field of view, (vi) it is difficult to detect the playfield due to shadows cast by sunlight, (vii) the features of the ball such as color, shape, and size change with conditions such as light and velocity, (viii) it is difficult to segment the ball from a player when the ball is possessed by that player, and (ix) fragments produced by inappropriate segmentation of players or field lines may resemble the ball. Recently, many techniques for soccer ball detection in video frames have been introduced in the literature. Some of these techniques use single-view mono cameras 15 while others employ multiple and mostly dedicated cameras (e.g., offside cameras). 16
The most successful techniques in the literature rely on two separate sequential phases, namely: (i) ball candidate extraction, and (ii) ball candidate validation. The purpose of the first phase is to select the regions that most probably contain the ball, while during the second phase, the candidate regions are accurately analyzed to recognize which of them contains the ball.

| Camera setup for ball detection and tracking
A pivotal part of any ball detection system is the camera setup, since a camera with high image quality and a clear field of view makes the detection process smoother. The cameras must be positioned to cover the complete playfield without ever interfering with the game. Frequently, other objects of interest such as the players and the court lines also have to be detected besides the ball, which must be considered when planning the camera setup.
As illustrated in Figure 1, four different approaches can be used for ball detection and tracking: (i) Single Camera Approach (SCA), (ii) Multiple Camera Approach (MCA), (iii) Broadcast TV Approach (BTVA), and (iv) Ball Bee Approach (B2A). In SCA, only a single camera is employed in the detection and tracking process because it is easy to use. Polceanu et al. 17 employed this approach, using a Raspberry Pi with a single wide-angle fish-eye lens camera to record the videos. The work introduced by Yan et al. 18 also used a single-camera view from behind the tennis court. MCA, on the other hand, is mostly used to deal effectively with occlusion or to map larger regions. Even though MCA offers a considerable advantage in defining the ball position, its setup is more complex than that of SCA. Owens et al. 19 used multiple high-performance cameras fixed in the tennis court for broadcasting purposes. Nine IP cameras and one overhead camera were utilized by Conaire et al. 20 This setup covers the whole court area from all the court corners. The system introduced by Fazio et al. 21 utilizes two dual-camera smartphones placed behind the court along the sidelines to record the videos. Using the accurate cameras of smartphones is an appropriate approach as they give high-quality imagery for ball detection.
Another way of obtaining a data set for both ball detection and tracking is BTVA, which utilizes the broadcast TV videos of the sport. Although BTVA makes it easier to obtain game videos, it suffers from varying camera angles and from the pan, tilt, zoom, and motion of the camera in a broadcast video. [22][23][24] The last approach that can be used for ball tracking is B2A, which utilizes a set of cooperative ball bees for tracking the ball across the football court. Abulwafa et al. 12 introduced it, using a quadcopter, called the ball bee, to track the soccer ball. The bee carries a moving camera fixed to the bottom of its body, and the photos taken by this camera are of high quality.

| General procedure for soccer ball detection
As shown in Figure 2, ball detection can be carried out through three phases: (i) Pre-Processing Phase (PP), (ii) Ball Candidate Phase (BCP), and (iii) Ball Detection Phase (BDP). During PP, two essential tasks are performed for system setup: background modeling and ball modeling. For background modeling, a set of input frames (initial frames) is employed to model the dominant background. For ball modeling, the employed ball (or set of balls) is used to build the ball model, which is employed through the following phases to identify the candidate balls. The generated ball model summarizes ball features such as size, color, shape, and so forth. After system setup (i.e., PP termination), the system is ready to start. BCP and BDP are the two sequential phases followed to detect the ball in each frame. The main target of BCP is to generate the ball candidates. During BCP, the background is subtracted from the input frame to detect the frame foreground. An appropriate segmentation technique is then applied to detect foreground shapes. After segmentation, morphological operations are employed to bridge gaps and eliminate noise. To identify the movable objects against the background, an edge detection technique is exploited. The result of segmentation is a group of ball candidates and other objects that look like balls. Then, circularity measures are applied to identify candidate balls. These ball candidates are tested against the ball features extracted during PP through the ball modeling process to identify the correct ball, if it exists. This is carried out during BDP through the ball validation process. Finally, the ball is localized within the frame to update the template and to guide later possible ball re-detection.
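The circularity test used to select ball candidates can be illustrated with a small sketch. The text does not state which circularity measure is used, so the isoperimetric quotient 4πA/P² is assumed here, and the function names and the 0.8 threshold are hypothetical choices for illustration only.

```python
import math


def circularity(area: float, perimeter: float) -> float:
    """Isoperimetric quotient 4*pi*A / P^2: 1.0 for a circle, lower otherwise."""
    if perimeter <= 0:
        raise ValueError("perimeter must be positive")
    return 4.0 * math.pi * area / (perimeter ** 2)


def is_ball_candidate(area: float, perimeter: float,
                      min_circularity: float = 0.8) -> bool:
    """Keep a segmented shape only if it is round enough.

    min_circularity is an assumed threshold, not a value from the paper.
    """
    return circularity(area, perimeter) >= min_circularity


# A circle of radius 5 and a 2x2 square as two segmented shapes.
r = 5.0
circle = circularity(math.pi * r * r, 2.0 * math.pi * r)
square = circularity(4.0, 8.0)
```

Under this assumed measure the circle scores 1 and the square scores π/4, so only the circle would pass the 0.8 threshold and reach the validation phase.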

| RELATED WORK
In soccer videos, the most significant object is the ball, and various applications motivate ball detection, including event detection, tactics analysis, automatic summarization, and object-based compression. 25 Fu and Han 26 update the background pixels using a recursive filter that depends only on the learning parameters, which may delay the background model updating. Yao and Ling 27 present an improvement of GMM; however, it cannot detect an object near a camouflage area. Techniques with a fast learning rate may delay object detection and lower the accuracy. On the other hand, techniques that depend on a low learning rate do not update the background model effectively. Thus, it may be valuable to use the regional level of processing to update the changing background pixels and to find the exact moving ones. Kumar and Yadav 28 detect the true nonstationary pixels among the foreground pixels using the regional level of processing by computing the entropy and the energy.
FIGURE 2 General procedure for soccer ball detection
Many papers have presented different methods for forming background models, since a good background subtraction method improves the results of object detection. For example, Vo and Park 29 introduce a method that estimates the variation of the background intensity of an image and subtracts this estimate from the selected image. As the computational cost of this technique is high, it is not appropriate for real-time applications. Cheng et al. 30 present an image pyramid consisting of two layers to find the sum of absolute differences (SAD) value for images with low dimensions. Also, the color variations between the foreground and background are modified to distinguish between nonstationary objects and noise. However, this technique cannot model a changing background. Clustering is another way to create a background model.
A ball detection procedure for real images under conditions of changeable light and unstable backgrounds has been proposed by D'Orazio et al. 31 Both the computational cost and the performance of the system have been considered in the applied technique. Moyyila 32 introduces a SURF-based algorithm that uses fast and robust features to detect the ball and display its information. This algorithm demonstrates robustness in detecting objects. Chiu et al. 33 propose a background subtraction technique called Entropy-Based Initial Background Extraction (EBBE), used for limited background conditions. This technique also decreases the production of false foregrounds in situations of high color interference, due either to complex conditions such as camera shake and shaking trees or to a very low frame rate. A unified model technique called Yolov3-Improved Non-Maximum Suppression (INMS) is introduced by Bondalapati et al. 34 to improve the detection of moving objects. Balaji et al. 35 propose a metaheuristic algorithm, the Cuckoo Search Algorithm, for detecting objects in sports videos. This algorithm also deals with object detection challenges such as shadows and lighting, and it achieves high accuracy and precision.
An automatic background subtraction technique is presented by Huang et al., 36 where a histogram learning technique is used to detect the background pixels. The foreground and background pixels are determined using color models via a training set of soccer videos. Another algorithm is proposed by Ali et al., 37 where the background color is assumed to be green. Mudjirahardjo et al. 38 and Mudjirahardjo et al. 39 introduce a method in which the foreground is separated from the background by using the Euclidean distance function of the selected pixels. To decide whether a selected pixel is foreground or not, the distance is required to be less than a specific threshold. Mudjirahardjo et al. 38 calculate the distance in RGB color space, whereas Mudjirahardjo et al. 39 calculate it in HSV color space. Chakraborty and Meher 40 propose a background subtraction strategy and a frame difference strategy over three sequential frames to detect moving objects. Background extraction is carried out by Durus 41 using the dominant color distribution. The target object is determined by Mudjirahardjo et al. 42 using the velocity histogram.
A deep learning strategy for 2D ball detection and tracking (DLBT) in soccer videos, which faces several challenges, is reported by Kamble et al. 43 For blob recognition of moving objects, a new 2-stage buffer median filtering background model is applied. Initially, a deep learning strategy is proposed for classifying an image patch into three classes: ball, player, and background. DLBT does not require human intervention to identify the ball from the earliest frames. A deep network-based object detector specialized for ball detection in long-shot videos is described by Komorowski et al. 44 Because it is fully convolutional, the algorithm works on images of any size and creates a ball confidence map encoding the position of the identified ball. Hiemann et al. 45 present a real-time ball detection strategy based on the YOLOv3 object detection model that addresses the detection of small and fast-moving balls in sport video data. To improve detection accuracy and speed, they make critical changes to the network design and training procedure.
As depicted in Table 1, the traditional ball detection strategies deal well with circumstances where the ball is visible as a single object and separated from the player's body. On the other hand, these strategies have difficulty detecting the ball when it is possessed or partially occluded by a player.

| PROBLEM STATEMENT
Generally, the goal of motion detection is to determine the points or regions in the image that have moved between two time instants. The motion of these points or regions is not recognized directly but rather through intensity changes. Intensity changes over time may also be caused by camera noise or illumination changes. Moreover, the motion of an object may produce only minor intensity variations or even none at all. Automatic detection of moving objects is the first critical phase of various applications such as pedestrian and vehicle tracking, action and event recognition, and video annotation. Undoubtedly, moving object detection is a vital topic in the computer vision field. For instance, applications like intelligent surveillance systems, ball recognition and tracking, human-computer interface (HCI), and robot vision need a moving object detection process. Many different strategies have been presented for the detection of moving objects in the case of stationary cameras. However, these strategies do not perform effectively when mobile or pan-tilt-zoom (PTZ) cameras are used, because of the unconstrained motion of movable cameras.
An essential capability of any system aiming at automatic analysis of football matches or players' progress is accurate ball detection and tracking in a video sequence. Unfortunately, ball detection in long-shot video frames of a soccer game is a challenging task for the following reasons: (i) the object of interest (i.e., the ball) has a very small size compared to other visible objects in the observed scene. (ii) The size of the ball differs according to its location on the court due to the perspective projection. (iii) The ball does not appear perfectly circular. (iv) When the ball is kicked, it moves with high velocity; consequently, it appears blurred and elliptical.

| THE PROPOSED FB2D STRATEGY
TABLE 1
Advantages: rectified the challenges in object detection due to shadows and lighting; detected the fast-moving player while hitting the ball, etc. Disadvantages: occlusion problem; efficiency problem.
2018 [47]. Technique: estimated the background intensity variation of an input image and subtracted the estimate from the input to achieve a flat-background image. Advantages: has great value in robust image binarization and image segmentation. Disadvantages: high computational cost; not suitable for real-time systems.
2017 [45]. Technique: integrated the regional level processing by evaluating the entropy and energy that provided the actual moving pixels on the foreground. Advantages: efficiently localizes the object in the scene. Disadvantages: not suitable for multiple object detection and tracking in unconstrained videos.
2015 [48]. Technique: the SURF algorithm is used for each frame image to find the object quickly and display its information. Advantages: SURF is preferred over SIFT. Disadvantages: cannot detect and differentiate players according to team.
2014 [44]. Technique: constructed an improved version of GMM with the frame difference method to extract the foreground. Advantages: used for very crowded situations; proposed a new shadow removal method based on RGB color space. Disadvantages: fails to detect the object near a camouflage region.
2012 [37]. Technique: detected the soccer ball and players within video footage. Advantages: used especially when the ball is attached to lines on the ground. Disadvantages: not suitable for real-time applications.

Ball detection can be defined as the discovery of circular features in image sequences when the ball appears as a circular shape, which depends on the light conditions. Under artificial lights during evening matches or under the shadows cast by sunlight during noon matches, the ball shape is more spherical. The ball shape and its direction differ based on the direction of the light in the playfield. Thus, ball detection and recognition methods must be able to detect the different shapes of the ball in the image. There are many different methods for moving object detection: (i) the optical flow technique, (ii) consecutive frame difference, and (iii) background subtraction. In the Optical Flow technique, object features are extracted automatically using a clustering method: the features are extracted using x-means clustering, then classified based on their predefined motion parameters and labeled as moving objects. Besides, this technique needs some additional components to be employed in real-time systems. The second technique, Consecutive Frame Difference, is a simple technique suitable for dynamic environments, even though it cannot detect the moving object perfectly. Background subtraction is the most popular technique; input frames are compared with the background model to extract the moving object. Although this method is accurate and has low computational time, its implementation in changing environments is more difficult due to variations in illumination, shadows, color similarity, and so forth. The strategy proposed in this paper is called the FB2D strategy. It consists of two main steps, as shown in Figure 4: Background Identification and Object Detection. These two steps, summarized in Algorithm 1, are explained in the following sections.
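As a point of comparison for the methods listed above, consecutive frame difference can be sketched in a few lines. This is a minimal illustration on grayscale frames stored as nested lists; the function name, the toy frames, and the threshold of 50 are assumptions made for the example and are not part of FB2D.

```python
def frame_difference(prev_frame, curr_frame, threshold):
    """Return a binary mask with 1 where |curr - prev| exceeds threshold."""
    if len(prev_frame) != len(curr_frame) or any(
        len(p) != len(c) for p, c in zip(prev_frame, curr_frame)
    ):
        raise ValueError("frames must have the same dimensions")
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


# A bright object moves one pixel to the right between two 3x3 frames.
previous = [[10, 10, 10],
            [10, 200, 10],
            [10, 10, 10]]
current = [[10, 10, 10],
           [10, 10, 200],
           [10, 10, 10]]

mask = frame_difference(previous, current, threshold=50)
```

The mask marks both the pixel the object left and the pixel it entered, which illustrates why frame differencing alone does not outline a moving object exactly.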

| Background identification
The process of identifying the background is an essential step for detecting objects, and it has been addressed in many studies. For example, Monnet et al. 47 directly addressed the modeling of dynamic backgrounds by proposing an auto-regressive model to predict the behavior of such backgrounds for effective detection of moving objects. Dynamic texture modeling methods can also be utilized for modeling and classifying dynamic backgrounds. The parameters of such a background model can be effectively used as a cue for moving object detection. Recently, Minematsu et al. 49 suggested an adaptive background model registration based on homography motion estimation to handle highly complex backgrounds when detecting moving objects with moving cameras. In general, the detection of moving objects in complex backgrounds requires good modeling of the background and precise estimation of camera motion.
In this paper, identifying the background model is essential for the following reasons: (i) it simplifies the ball detection process in the case of a moving camera. (ii) it provides robustness against illumination changes. In particular, soccer matches are played in outdoor playfields where the light intensity usually varies; therefore, a robust background modeling method must be able to adapt to gradual changes of the light in the environment. (iii) the playfield may be dynamic; namely, some regions of the background may contain movement. An excellent background modeling method should effectively identify the periodic or irregular movement of objects. (iv) it removes or ignores the irrelevant shadows of foreground regions. Foreground objects often have shaded areas owing to the influence of changing light, which usually affects the separation of foreground objects and the performance of subsequent modules of a background modeling algorithm. (v) it considers the effects of noise and copes with signals degraded by different types of noise, such as those caused by camera shaking, lens ageing, or sensor noise.

Algorithm 1. Fuzzy based ball detection (FB2D) strategy
Soccer matches are normally played on a grassy playfield, which is defined as the part of the image on which the match is taking place. To detect the ball, a useful first step is to detect the playfield pixels. The soccer playfield has one distinct and dominant color, a tone of green that may vary from one playground to another, as well as with the weather conditions and lighting variations within the same playground. Thus, no specific value for the field color is assumed in the suggested algorithm. This section delineates a method for defining the values of the playfield color. As the standard RGB color representation is not suitable, the RGB values of the captured images are converted to the corresponding coefficients in the HSV color space 50 to avoid the influence of illumination before the analysis process. HSV represents three distinct parameters: hue, saturation, and brightness. The hue component determines the dominant color wavelength and takes values from 0° to 360°. Brightness describes the white light level and ranges from 0 to 100, while saturation defines the proportion of the chromatic element in a color. The color depth of the soccer playfield ranges from dark green to bright green.
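The effect of the RGB to HSV conversion on playfield colors can be checked with a short sketch. It uses the `colorsys` module from the Python standard library and rescales its outputs to the ranges mentioned above (hue in degrees, saturation and brightness from 0 to 100); the helper name and the two sample colors are assumptions chosen for illustration.

```python
import colorsys


def rgb_to_hsv_degrees(r: int, g: int, b: int):
    """Convert 8-bit RGB to (hue in degrees, saturation 0-100, brightness 0-100)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s * 100.0, v * 100.0


# Two hypothetical grass tones: a dark green and a bright green.
dark_green = rgb_to_hsv_degrees(0, 100, 0)
bright_green = rgb_to_hsv_degrees(0, 255, 0)
```

Both tones map to a hue of about 120° and differ only in brightness, which is why the hue component is a convenient cue for a playfield whose color ranges from dark green to bright green.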
In this step, it is assumed that the playfield has 4 lighting towers, as shown in Figure 5, and that each tower has 16 lamps, so the total number of lamps is 64. The proposed algorithm for playfield background detection consists of five steps. Firstly, before the beginning of the soccer match, when the playfield is empty, one of the bees, called the Scanner Bee, flies and takes a set of frames along the diagonal line between one corner and the center of the playfield, as shown in Figure 6. The Scanner Bee is responsible for discovering the background model by moving from the corner to the middle of the field before the start of the match, and it can work again during the match if the lighting changes, especially when switching from natural to artificial lighting. Secondly, the white pixels that may appear in the taken frames are removed by using a threshold value. If the value of R (red), G (green), or B (blue) of any pixel in the frame exceeds the threshold value as in (1), that pixel is removed from the frame.

$$P(x, y) = \begin{cases} 1, & \text{if } R(x, y) > \beta \ \text{or}\ G(x, y) > \beta \ \text{or}\ B(x, y) > \beta \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$

where P is any pixel in the frame fe_i, (x, y) is the pixel location (the origin of the frame is in the top left corner), R(x, y), G(x, y), and B(x, y) are the red, green, and blue components of the pixel at (x, y), respectively, and β is the threshold value. In (1), "1" means that the pixel is white and "0" means that the pixel is not white. Consequently, any pixel with P equal to "1" is removed from the frame, while any pixel with P equal to "0" is kept for use in the next steps. Thirdly, the filtered frames from the previous step are converted from RGB to HSV color space. This conversion is important because the HSV representation separates hue from brightness, and it is useful for detecting specific color types (e.g., skin color, fire color, grass color, etc.). Although different color variations can be seen in the playfield areas, the hue of all these regions does not vary much, so hue values are useful in soccer playfield identification. The formulae for converting RGB to HSV color space can be found in [51]. Fourthly, for each frame fe_i in the frame set FE, the value of the hue component of each pixel P is determined to identify the range of hues, which is called Range_H(fe_i). This range is added to the H_Vector, which is a vector containing all the values of the hue component in all frames of the frame set FE. The duplicated values are then removed from this vector, and the remaining values are sorted in ascending order. Finally, in the fifth step, the extreme values are considered outliers and are removed from the H_Vector using the interquartile range (IQR) technique. 52 These outliers can come from anything seen in the playground.
Figure caption: The soccer playfield
In Figure 7, the playfield looks like an empty field, but it contains some unwanted objects, which produce extreme values (outliers) in the H_Vector. These values do not correspond to green pixels, so they have to be removed from the H_Vector.
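Steps two to four of the playfield background detection can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the threshold β = 200, the rounding of hues to whole degrees, and the toy frame are all assumptions, and frames are represented as nested lists of (R, G, B) tuples.

```python
import colorsys


def is_white(pixel, beta):
    """White-pixel test as described for (1): 1 if any channel exceeds beta."""
    r, g, b = pixel
    return 1 if (r > beta or g > beta or b > beta) else 0


def build_h_vector(frames, beta):
    """Collect the distinct hue values (whole degrees) of all non-white pixels."""
    hues = set()  # a set drops duplicated hue values automatically
    for frame in frames:
        for row in frame:
            for pixel in row:
                if is_white(pixel, beta):
                    continue  # step two: discard white pixels (lines, marks)
                r, g, b = pixel
                h, _, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
                hues.add(round(h * 360.0))  # steps three and four: hue only
    return sorted(hues)  # ascending order, as required for the H_Vector


# One hypothetical 2x2 frame of the empty playfield with a white line pixel.
frames = [[[(0, 100, 0), (250, 250, 250)],
           [(60, 160, 40), (0, 100, 0)]]]

h_vector = build_h_vector(frames, beta=200)
```

With these inputs the white pixel is discarded and the two grass tones contribute the hues 110 and 120, so the H_Vector is `[110, 120]`. Note that, with the test written as in the text, a very bright green pixel would also be discarded once its green channel exceeds β.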
Any set of data can be described by its five-number summary. These five numbers, which give the information needed to find patterns and outliers, are, in ascending order: (1) the minimum or lowest value of the data set (L), (2) the first quartile Q1, which lies a quarter of the way through the ordered list of data, (3) the median of the data set, which is the midpoint of the ordered list, (4) the third quartile Q3, which lies three-quarters of the way through the ordered list, and (5) the maximum or highest value of the data set (H). These five numbers summarize the data far more readily than inspecting all the values at once. The interquartile range (IQR) is a technique that helps to find outliers in continuously distributed data. The IQR is used to build a boxplot, a simple graphical representation of the interquartile range. In this method, the ordered data set is divided into four equal parts by the quartiles Q1, Q2, and Q3, named the first, second, and third quartiles, respectively. Figure 8 shows how the boxplot is built. To find the IQR, the first quartile is subtracted from the third quartile (IQR = Q3 − Q1).
The IQR shows how the data is spread about the median. The interquartile range can be used to detect outliers using the following steps: (i) calculate the IQR for the data. (ii) calculate the upper limit by multiplying the IQR by 1.5 (a constant used to discern outliers) and adding the result to the third quartile Q3; any number greater than this limit is a suspected outlier. (iii) calculate the lower limit by subtracting 1.5 × IQR from the first quartile Q1; any number less than this limit is a suspected outlier. By defining the normal data range with a lower limit L = Q1 − 1.5 × IQR and an upper limit H = Q3 + 1.5 × IQR, any data point outside this range is considered an outlier and is removed before further analysis.
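The outlier rule above can be sketched directly from the limits L = Q1 − 1.5 × IQR and H = Q3 + 1.5 × IQR. The quartile convention (linear interpolation between the closest ranks) is an assumption, since the text does not specify one, and the function names and the sample hue values are hypothetical.

```python
def quartiles(sorted_values):
    """Return (Q1, Q3) using linear interpolation between closest ranks."""
    def percentile(p):
        k = (len(sorted_values) - 1) * p
        lo = int(k)
        hi = min(lo + 1, len(sorted_values) - 1)
        return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (k - lo)
    return percentile(0.25), percentile(0.75)


def remove_outliers_iqr(values):
    """Keep only values inside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    data = sorted(values)
    if not data:
        return []
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in data if lower <= v <= upper]


# Hypothetical H_Vector: green hues plus two stray values from other objects.
h_vector = [20, 100, 105, 110, 115, 120, 125, 130, 300]
cleaned = remove_outliers_iqr(h_vector)
```

For this sample, Q1 = 105 and Q3 = 125, so IQR = 20 and the accepted range is 75 to 155; the values 20 and 300 fall outside it and are dropped.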

| Object detection
In this section, a new ball detection algorithm will be introduced. The proposed technique consists of four main steps, and they are as follows: Background Subtraction, Edge Detection, Ball Candidates Identification, and Classification using a Ball Fuzzy Inference Engine. These steps will be explained in detail in the following sections.

| Background subtraction
The background subtraction method is mainly employed to detect a moving object in an image by comparing two different frames. The difference value is compared with a specific threshold value calculated from the first few given frames. Object detection is used to check whether an object exists in a video and to locate it. The most common challenge facing the background subtraction method is frequent change in the scene due to illumination changes, motion changes, and background geometry changes. There are various background subtraction methods such as frame difference, Gaussian mixture model, kernel density estimation, codebook, and so forth.
The suggested background subtraction method consists of four steps. In the first step, when the soccer match starts, all the bees fly to search for the ball in the playfield and take frames with their cameras. In the second step, the taken frames are converted into HSV color space. In the third step, the hue component of each pixel in each frame is determined and compared with the predefined H_Vector of the background model from the previous section. Finally, in the fourth step, the pixels whose hue values match the predefined values are defined as background pixels and removed from the frame. As shown in Figure 10C, the remaining pixels in the frame are the moving objects, which are the players and the ball.
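The hue-matching steps can be illustrated with a small sketch. It assumes that hues are compared as whole degrees against the H_Vector and that a frame is a nested list of (R, G, B) tuples; the function name, the H_Vector range of 100° to 130°, and the toy frame are hypothetical.

```python
import colorsys


def subtract_background(frame, h_vector):
    """Return a mask with 0 for background pixels and 1 for foreground pixels."""
    background_hues = set(h_vector)
    mask = []
    for row in frame:
        mask_row = []
        for r, g, b in row:
            h, _, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            # A pixel whose hue appears in the H_Vector is treated as playfield.
            mask_row.append(0 if round(h * 360.0) in background_hues else 1)
        mask.append(mask_row)
    return mask


# Hypothetical background model: every whole-degree hue from 100 to 130.
h_vector = list(range(100, 131))

# 2x2 frame: grass, a white ball pixel, a red shirt pixel, grass.
frame = [[(0, 100, 0), (255, 255, 255)],
         [(200, 30, 30), (60, 160, 40)]]

mask = subtract_background(frame, h_vector)
```

Here the two grass pixels are removed as background, while the white and red pixels remain as foreground, giving the mask `[[0, 1], [1, 0]]`.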

| Edge detection
The edge detection process can be defined as the determination of sharp color discontinuities, which are sudden variations in the pixel intensity of an image. This process is used for both image segmentation and data extraction in fields like image processing, computer vision, and machine vision. Edge detection has many applications in image processing such as object recognition, motion analysis, pattern recognition, medical image processing, and so forth. There are two basic criteria: the first determines the areas where the magnitude of the first derivative of the intensity is greater than a certain threshold, and the second locates the areas where the second derivative of the intensity has a zero crossing. Generally, the edge function can be written as edge(I, M, params), where I is the input image, M is one of the edge detection techniques, and params are additional parameters. In this paper, the Canny edge detector 48 is employed to detect the soccer ball. Edges in this method are determined by isolating noise from the image without affecting the edge features and then applying the tendency to define the critical threshold.
In this section, the Canny edge detector is used to define the edges of the moving objects, as illustrated in Figure 9F. The algorithm of the Canny edge detector proceeds in the following phases. First, the noise of the image is filtered out using a Gaussian smoothing filter. Second, the edge intensity in the image is found by computing the gradient of the image, which determines the place of the actual edge. Third, the algorithm applies non-maximum suppression, which traces all pixels along the edge and sets any pixel that is not at the local maximum to 0 in order to thin down the edges. Finally, a double threshold and pixel connectivity are used to determine and connect the edges. A pixel is considered an edge if its magnitude is above the specified high threshold, and a non-edge if its magnitude is below the specified low threshold; a pixel between the two thresholds is kept only if it is connected to an edge pixel. This edge detector recovers a large number of edges, including both corner edges and circular edges; consequently, it is suitable for soccer ball detection.
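The last phase (double threshold plus pixel connectivity) can be illustrated with a short sketch. It assumes the gradient magnitudes have already been computed and thinned; the function name, the 8-connectivity choice, and the threshold values are assumptions made for the example, not parameters taken from the paper.

```python
def hysteresis(mag, low, high):
    """Classify gradient magnitudes with a double threshold and 8-connectivity.

    Pixels with magnitude >= high are strong edges. Pixels with magnitude
    between low and high are kept only if they are connected, directly or
    through other weak pixels, to a strong edge.
    """
    rows, cols = len(mag), len(mag[0])
    edge = [[False] * cols for _ in range(rows)]
    # Seed the search with every strong pixel.
    stack = [(r, c) for r in range(rows) for c in range(cols) if mag[r][c] >= high]
    for r, c in stack:
        edge[r][c] = True
    # Grow the edges through weak pixels that touch an accepted pixel.
    while stack:
        r, c = stack.pop()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not edge[nr][nc] and mag[nr][nc] >= low):
                    edge[nr][nc] = True
                    stack.append((nr, nc))
    return edge

# Toy gradient magnitudes; the thresholds 30 and 80 are illustrative only.
mag = [
    [0, 10, 90, 10, 0],
    [0, 40, 60, 0, 0],
    [0, 0, 0, 0, 45],
]
edges = hysteresis(mag, low=30, high=80)
```

Here the weak pixels next to the strong pixel (value 90) are accepted, while the isolated weak pixel in the bottom-right corner (value 45) is rejected.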
The Canny edge detector is among the most well-defined and commonly used edge detectors. Canny 53 proposed a set of edge detection goals and described an optimal way to achieve them. The first goal is good detection (a low error rate): the edge detector should respond only to edges and should find all of them, so that no edges are missed. The second goal is good spatial localization: the distance between the edge pixels found by the detector and the true edge should be as small as possible. Localization is measured by the reciprocal of the root-mean-squared distance of the marked edge from the center of the true edge, so the measure increases as localization improves. The third goal is a single response: the edge detector should not report several edge pixels where only one edge is present. The first criterion already implies only one response to a single edge, but the third criterion is stated to reject multiple responses explicitly. The zero crossings of the image's Laplacian of Gaussian (LoG) can be used to verify that closed edge contours are obtained. As a result, the Canny edge detector employs probability to determine error rate, localization, and responsiveness, as well as to improve the signal-to-noise ratio, detection, and noise sensitivity.

| Ball candidates identification
After the Edge Detection step is finished, the only objects left to examine in the playfield are the ball, the referee, the players, and some noise. Thus, the Circle Hough Transform (CHT) technique 54 is used to discover the circular objects with a predefined radius R. The CHT technique can detect imperfect circles in images. Each circular edge with radius R maps to a point in an accumulator matrix. The circle candidates are generated by casting "votes" in the Hough parameter space, and the element with the greatest number of "votes" is taken as the center of the required circular object. To use the CHT technique, the range of the expected ball radius values needs to be identified, as shown in Figure 10. Consequently, before the soccer match starts, the Scanner Bee takes two frames of the ball. In one of these two frames, the ball has to be in the center of the frame to determine the maximum radius $R_{max}$ of the ball. In the other frame, the ball is in the corner of the frame to find the minimum radius $R_{min}$. From these two positions of the ball in the frame, the range of ball radius values is determined as

$$R_{min} \le R \le R_{max} \qquad (2)$$
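The voting scheme can be sketched as follows. This is a minimal pure-Python illustration of the general Circle Hough Transform idea, not the implementation used in the paper: the function name, the angular sampling (72 directions), and the synthetic circle are assumptions, and the radius range 24 to 28 simply reuses the values reported later in the experiments.

```python
import math
from collections import Counter

def circle_hough(edge_points, r_min, r_max, n_angles=72):
    """Vote for circle parameters (cx, cy, r) with r_min <= r <= r_max.

    Every edge point votes for all centers that lie at distance r from it;
    the accumulator cell with the most votes is returned with its count.
    """
    votes = Counter()
    for x, y in edge_points:
        for r in range(r_min, r_max + 1):
            for k in range(n_angles):
                theta = 2.0 * math.pi * k / n_angles
                cx = round(x - r * math.cos(theta))
                cy = round(y - r * math.sin(theta))
                votes[(cx, cy, r)] += 1
    return votes.most_common(1)[0]

# Synthetic edge pixels of a circle centered at (40, 30) with radius 26.
true_cx, true_cy, true_r = 40, 30, 26
points = [
    (round(true_cx + true_r * math.cos(2.0 * math.pi * k / 72)),
     round(true_cy + true_r * math.sin(2.0 * math.pi * k / 72)))
    for k in range(72)
]
(best_cx, best_cy, best_r), n_votes = circle_hough(points, r_min=24, r_max=28)
```

Restricting the radius loop to the range measured by the Scanner Bee keeps the accumulator small, which is the practical reason for estimating $R_{min}$ and $R_{max}$ before the match.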

| Ball Fuzzy Inference Engine (BFIE)
After the ball candidates have been identified in the current frame, a fuzzy inference engine is used to decide whether each candidate is a ball or not. In this engine, the contour of each candidate is outlined and used to analyze and describe its shape features, which serve as the fuzzy sets. In this paper, three features are used to classify the ball candidates: Form Factor, Eccentricity, and Area Ratio, which are explained through the following definitions.
Definition 1. Form Factor, denoted as FF(BC), measures the degree of roundness of the ball candidate and is defined as the ratio of 4π times the area of the ball candidate to the square of its perimeter.
Definition 2. Eccentricity, denoted as EC(BC), measures the compactness of the ball candidate and is defined as the distance from the center to the focus divided by the semi-major axis of the candidate object.
Definition 3. Area Ratio, denoted as AR(BC), measures the ratio of the ball candidate area to the area of its minimal bounding rectangle (MBR).
The first feature is the Form Factor, which is calculated as in the following equation:

$$FF = \frac{4 \pi A}{P^{2}} \qquad (3)$$

where P and A are the perimeter and the area of the object, respectively. The more circular the shape is, the closer to one the form factor will be. For instance, if the object is a circle with radius r as in Figure 11A, its perimeter is $2 \pi r$ and its area is $\pi r^{2}$. Substituting in (3) gives

$$FF = \frac{4 \pi (\pi r^{2})}{(2 \pi r)^{2}} = \frac{4 \pi^{2} r^{2}}{4 \pi^{2} r^{2}} = 1.$$

On the other hand, if the object is a square with side length L as in Figure 11B, its perimeter is $4L$ and its area is $L^{2}$, so the form factor is

$$FF = \frac{4 \pi L^{2}}{(4L)^{2}} = \frac{\pi}{4} \approx 0.785.$$

The second feature is the Eccentricity, which is calculated as shown in the following equation:

$$EC = \frac{c}{a} \qquad (4)$$

where c is the distance between the center of the object and the focus, and a is the semi-major axis, that is, the distance between the center and the vertex of the object. The larger the eccentricity of the object is, the less likely it is to be a ball. For instance, if the object is a circle with radius r as in Figure 11A, then $EC = c/a = 0/r = 0$. This is because the center of the circle is also its focus and its semi-major axis is the radius; thus, the eccentricity of a circle is zero.

Finally, the third feature is the Area Ratio, which is computed as in the following equation:

$$AR = \frac{A_{Obj}}{A_{MBR}} \qquad (5)$$

where $A_{Obj}$ and $A_{MBR}$ are the object area and the area of its minimal bounding rectangle (MBR), respectively. For example, if the object is a circle with radius r as in Figure 11A, the area of the object is $\pi r^{2}$. The bounding rectangle of the circle is a square with side length $2r$, so its area is $(2r)^{2} = 4r^{2}$. The Area Ratio of the circle is therefore

$$AR = \frac{\pi r^{2}}{4 r^{2}} = \frac{\pi}{4} \approx 0.785.$$

Consequently, the optimal Area Ratio of a circle is approximately 0.8.
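As a quick numerical check of the three definitions, the following sketch evaluates them for an ideal circle and an ideal square. The function names and the sample dimensions (r = 26, L = 10) are illustrative assumptions; only the three formulas themselves come from the text.

```python
import math

def form_factor(area, perimeter):
    """FF = 4*pi*A / P^2; equals 1 for a perfect circle."""
    return 4.0 * math.pi * area / perimeter ** 2

def eccentricity(c, a):
    """EC = c / a, with c the center-to-focus distance and a the semi-major axis."""
    return c / a

def area_ratio(object_area, mbr_area):
    """AR = object area / area of the minimal bounding rectangle."""
    return object_area / mbr_area

# Ideal circle of radius r: focus coincides with the center, MBR is a 2r x 2r square.
r = 26.0
ff_circle = form_factor(math.pi * r ** 2, 2.0 * math.pi * r)
ec_circle = eccentricity(0.0, r)
ar_circle = area_ratio(math.pi * r ** 2, (2.0 * r) ** 2)

# Ideal square of side L: perimeter 4L, area L^2.
L = 10.0
ff_square = form_factor(L ** 2, 4.0 * L)
```

The circle yields FF = 1, EC = 0 and AR = π/4, while the square yields FF = π/4, which shows how the form factor separates round candidates from angular ones.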
As illustrated in the previous paragraphs, a candidate may be a ball when the Form Factor FF(BC) and the Area Ratio AR(BC) are high while the Eccentricity EC(BC) is low. On the other hand, the candidate may not be a ball when the Form Factor FF(BC) and the Area Ratio AR(BC) are low and the Eccentricity EC(BC) is high. In this paper, these three features are employed to establish the BFIE, which consists of a group of fuzzy if-then rules with suitable membership functions to generate the specified input-output pairs. The membership functions and the fuzzy rules are determined manually to model the outputs and explain the input-output relationship of the system. The architecture of the BFIE includes three steps: (i) Inputs Fuzzification, (ii) Fuzzy Rule Induction, and (iii) Defuzzification.

a. Inputs Fuzzification
The three features, FF(BC), AR(BC), and EC(BC), are treated as the fuzzy sets of the BFIE. In the fuzzification step, a membership function is determined for each fuzzy set, which gives the similarity value of an input with respect to that fuzzy set. The input values are then mapped to the corresponding linguistic variables, which are "Low" and "High." This process returns a value between 0.0 and 1.0, where 0.0 means nonmembership and 1.0 means full membership. The membership functions employed for the fuzzy sets (FF, AR, and EC) are illustrated in Figure 12.
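A fuzzification step of this kind can be sketched as below. The actual membership functions are those of Figure 12, which are not reproduced here, so this sketch assumes simple complementary linear ramps; the function names and the breakpoints 0.5 and 1.0 are hypothetical.

```python
def ramp_up(x, start, end):
    """Linear membership that rises from 0 at `start` to 1 at `end`."""
    if x <= start:
        return 0.0
    if x >= end:
        return 1.0
    return (x - start) / (end - start)

def fuzzify(x, start, end):
    """Map a crisp feature value to its "Low" and "High" membership degrees.

    Assumption: "Low" is the complement of "High", so the two degrees
    always sum to 1. The real shapes are defined in Figure 12.
    """
    high = ramp_up(x, start, end)
    return {"Low": 1.0 - high, "High": high}

# Hypothetical breakpoints for the Form Factor membership functions.
ff_membership = fuzzify(0.75, start=0.5, end=1.0)
```

With these assumed breakpoints, a Form Factor of 0.75 belongs to "Low" and "High" to the same degree, 0.5.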

b. Fuzzy Rule Induction
In this step, a group of fuzzy rules in the if-then pattern is used to combine the fuzzified inputs from the former step. These rules take the following form: IF (x is A) AND (y is B) AND (z is C), THEN (w is D), where x, y, and z are the input values (FF, AR, and EC), A, B, and C are the corresponding linguistic variables (Low or High), w is the rule output, and D is its linguistic variable (Low or High). The "IF" part of the rule, "x is A", is called the "antecedent" or "premise," while the "THEN" part, "w is D", is called the "consequence" or "conclusion." In this paper, eight fuzzy rules are used, which are listed in Table 2, where 'L' means "Low" and 'H' means "High." For example, the first rule in Table 2 states that IF FF is Low AND AR is Low AND EC is Low, THEN Output is Low.
Fuzzy rules are integrated by a process called "Composition." Common composition methods include max-product, max-min, drastic product, and sum-dot. In this paper, the max-min technique is employed: the min value is taken over all the premises of a rule, while the max value is used for aggregation. Since the premise and consequence are expressed as linguistic variables, the max-min inference rule can be written as in the following equation:

$$\mu_{D}(w) = \max_{k} \, \min\left( \mu_{A_k}(x), \mu_{B_k}(y), \mu_{C_k}(z) \right) \qquad (6)$$

where k ranges over the rules whose consequence is D, and $A_k$, $B_k$, and $C_k$ are the linguistic variables in the premise of rule k.

c. Defuzzification
In this step, the output of the former step, which is also a fuzzy set, is defuzzified to produce a crisp numeric value suitable for real-time applications. The most common defuzzification technique is the Center of Gravity (COG), which is used in this paper. In this technique, the weighted average of the area under the curve of the membership function is calculated as in (7) to find the crisp output value. Defuzzification is done using the output membership function illustrated in Figure 13. For a ball candidate BC whose input parameters are FF(BC), AR(BC), and EC(BC), the crisp output value of the defuzzification step is called the Similarity Value (SV) of the ball candidate, denoted SV(BC). Finally, the decision whether the ball candidate BC is the actual ball is taken according to a simple rule, represented by the step identification function illustrated in Figure 14.
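The max-min composition and the COG defuzzification can be sketched together as follows. Several pieces are assumptions made only so that the example runs: the rule table is hypothetical (Table 2 is not reproduced here; the sketch outputs "High" when at least two of the three ball-like conditions hold, which agrees with the first rule quoted above), the output universe is taken as [0, 10], and the output membership functions are assumed to be linear ramps rather than those of Figure 13.

```python
from itertools import product

def infer(memberships, rules):
    """Max-min inference: min over the premises of each rule,
    max over all rules that share the same consequence."""
    out = {"Low": 0.0, "High": 0.0}
    for (ff, ar, ec), consequence in rules.items():
        strength = min(memberships["FF"][ff],
                       memberships["AR"][ar],
                       memberships["EC"][ec])
        out[consequence] = max(out[consequence], strength)
    return out

def centre_of_gravity(out, lo=0.0, hi=10.0, steps=1000):
    """Discrete COG over an assumed output universe [lo, hi].

    Assumed output sets: Low falls linearly from 1 to 0, High rises
    linearly from 0 to 1. Each set is clipped at its inferred strength.
    """
    num = den = 0.0
    for i in range(steps + 1):
        w = lo + (hi - lo) * i / steps
        t = (w - lo) / (hi - lo)
        mu = max(min(out["Low"], 1.0 - t), min(out["High"], t))
        num += w * mu
        den += mu
    return num / den if den else (lo + hi) / 2.0

# Hypothetical rule table with eight rules (NOT Table 2 of the paper).
RULES = {}
for ff, ar, ec in product(("Low", "High"), repeat=3):
    ball_like = (ff == "High") + (ar == "High") + (ec == "Low")
    RULES[(ff, ar, ec)] = "High" if ball_like >= 2 else "Low"

# Illustrative membership degrees for one candidate.
memberships = {
    "FF": {"Low": 0.2, "High": 0.8},
    "AR": {"Low": 0.1, "High": 0.9},
    "EC": {"Low": 0.7, "High": 0.3},
}
out = infer(memberships, RULES)
sv = centre_of_gravity(out)
```

For this candidate the inferred "High" strength dominates, so the crisp value falls in the upper half of the assumed output range.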

| Ball detection optimization: Illustrative example
In this section, an example of ball candidate classification is explained in detail, as illustrated in Figure 15. A ball candidate $BC_i$ that has FF = 0.6, AR = 1, and EC = 1.4 is classified as a ball or not through the three steps of the fuzzy inference engine. In the first step, the fuzzy set membership values of this ball candidate are determined, as shown in Figure 15 Step (1). In the second step, the fuzzy rules are applied to the membership values from the previous step. As shown in Figure 15 Step (2), the fuzzy output of each rule is determined, and then the max-min method is used to calculate the output values of the fuzzy rules. Therefore, the output is High with membership $Output_H$ = MAX (0.6, 0.3, 0.3, 0.2) = 0.6 and Low with membership $Output_L$ = MAX (0.4, 0.2, 0.2, 0.2) = 0.4. In the last step, these output values are defuzzified using the output membership function, as in Figure 15 Step (3). The weighted average of the area bounded by the membership function curve is computed using the COG method as in (7). The resulting crisp value is the Similarity Value of the ball candidate $BC_i$, which is 9.05. According to Figure 14, if the crisp value is greater than β, the candidate is a ball; otherwise, it is not. Since the Similarity Value is greater than the assumed threshold β = 7, the given candidate $BC_i$ is classified as a ball.
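The aggregation and the final decision of this example can be replayed in a few lines. The rule strengths, the Similarity Value 9.05, and the threshold β = 7 are the values quoted in the example; the function name `is_ball` is a hypothetical helper, and the COG computation itself is not repeated because it depends on the output membership function of Figure 13.

```python
# Firing strengths of the rules that conclude "High" and "Low" (from the example).
high_strengths = [0.6, 0.3, 0.3, 0.2]
low_strengths = [0.4, 0.2, 0.2, 0.2]

# Max aggregation over rules that share the same consequence.
output_high = max(high_strengths)
output_low = max(low_strengths)

def is_ball(similarity_value, beta):
    """Step identification function: a candidate is a ball if SV > beta."""
    return similarity_value > beta

# Similarity Value reported for the candidate and the assumed threshold.
decision = is_ball(9.05, beta=7.0)
```

The same helper rejects any candidate whose Similarity Value does not exceed β.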

| PERFORMANCE ANALYSIS AND IMPLEMENTATION
In this section, the proposed FB2D strategy is evaluated. Table 3 lists the adjustable parameters that are employed, with their corresponding values.

| Performance metrics
TABLE 3 Adjustable parameters

| Parameter | Description |
| --- | --- |
| Bee height | The Bee altitude (height) from the ground. |
| Bee window size | The coverage area of the Bee, which depends on the Bee height. |
| Bee speed | The speed of the Bee. |
| Bee camera resolution | The resolution of the image taken by the camera. |

In this section, six strategies are compared based on various metrics in order to determine which of them detects the ball most accurately. The notations shown in Table 4 are the outcomes of the system used to evaluate the performance of each algorithm. In this paper, the metrics used are precision, recall, accuracy, error, and F1 score, as in (8), (9), (10), (11), and (13), respectively.
Precision (P) and Recall (R) can be combined into one metric called the F-measure, which is calculated as in the following equation:

$$F_n = \frac{(1 + n^{2}) \, P \, R}{n^{2} P + R} \qquad (12)$$

where n is a weighting factor that sets the relative weight of Precision (P) and Recall (R). The most commonly used F-measure is F1, in which n = 1 so that precision and recall are weighted equally, as in the following equation:

$$F_1 = \frac{2 \, P \, R}{P + R} \qquad (13)$$
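The metrics can be computed from the four outcome counts as sketched below. The sketch assumes the usual confusion-matrix notation (true/false positives and negatives) for the outcomes of Table 4 and the standard definitions of precision, recall, accuracy, and error; the counts in the example are invented for illustration and are not results from the paper.

```python
def detection_metrics(tp, tn, fp, fn, n=1.0):
    """Return precision, recall, accuracy, error and F-measure.

    tp, tn, fp, fn : counts of true positives, true negatives,
                     false positives and false negatives
    n              : weighting factor of the F-measure (n = 1 gives F1)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    error = 1.0 - accuracy
    denom = n ** 2 * precision + recall
    f_measure = (1.0 + n ** 2) * precision * recall / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "error": error,
        "f_measure": f_measure,
    }

# Invented counts for a 100-image sequence (50 with a ball, 50 without).
metrics = detection_metrics(tp=40, tn=45, fp=5, fn=10)
```

With n = 1 the F-measure reduces to 2·TP / (2·TP + FP + FN), which is a convenient way to cross-check the result by hand.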

| Competitors and the evaluation image sequences
Experiments were carried out on different image sequences taken from matches played in the FIFA World Cup Russia 2018. Our algorithm has been compared to six different algorithms, as shown in Table 5: the EBBE algorithm, 33 a deep learning approach for 2D ball detection and tracking (DLBT) in soccer videos, 43 a deep network-based object detector specialized for ball detection in long shot videos (DeepBall), 44 Improved Non-Maximum Suppression (INMS), 34 the Cuckoo Search Algorithm, 35 and a real-time ball detection approach based on the YOLOv3 object detection model (micro YOLOv3). 45 Supposing that the camera of the ball bee is at a fixed height, the ball corresponds to a circular region with radii in the range ($R_{min}$ = 24, $R_{max}$ = 28), depending on the distance of the ball from the optical center of the camera. The image size was 1280 × 720 pixels for every image in the sequences used. Ten image sequences have been considered in our experiments; they were taken under artificial light, and the ball appears in them as a complete circular shape. Each sequence contains 100 images: 50 images that contain the complete ball and 50 images that contain no ball.
TABLE 5 Competitors used for evaluation

| Classification technique | Description |
| --- | --- |
| Entropy-Based Initial Background Extraction (EBBE) Algorithm 33 | It was used to detect the initial backgrounds. First, through a color distance classifier. Second, the concept of color category entropy is proposed. Third, when the background is masked for a long time, the initial-background convergence time can be dynamically determined via the magnitudes of color category entropy to prevent erroneous detection results due to moving objects being mistaken for a background. |
| A deep learning approach for 2D ball detection and tracking (DLBT) in soccer videos 43 | A new 2-stage buffer median filtering background modeling is used for moving objects blob detection. A deep learning approach for classification of an image patch into three classes, i.e., ball, player, and background are initially proposed. |
| A deep network-based object detector specialized for ball detection in long shot videos (DeepBall) 44 | The network uses hypercolumn concept, where feature maps from different hierarchy levels of the deep convolutional network are combined and jointly fed to the convolutional classification layer. |
| Improved Non-Maximum Suppression (INMS) 34 | INMS is used to detect the multiple objects in less time with higher accuracy. The main objective of the proposed model is to address as many challenges as possible along with detection of multiple objects in a single image. |
| Cuckoo Search Algorithm 35 | In this algorithm, the fitness function evaluation is started by calculating the number of instances in total of each agent α and it also calculates the selection of number of instances in each agent β and the selection of number of boundary instances selected by agent γ. The boundary instances are selected by the algorithm and they are calculated and used for the evaluation of fitness value. |
| A real-time ball detection approach based on the YOLOv3 object detection model (micro YOLOv3) 45 | It applies specific adjustments to the network architecture and training process to enhance the detection accuracy and speed: We facilitate an efficient integration of motion information, avoiding a complex modification of the network architecture. |

| Evaluating the proposed FB2D strategy
The proposed FB2D strategy is compared to the other strategies described in Section 6.2 and Table 5. FB2D succeeds in detecting the ball in cases where the other strategies may fail to detect it. The experimental results are illustrated in Figures 16-20, which present the precision, recall, accuracy, error, and F1-measure, respectively. FB2D achieves the highest values of precision, recall, accuracy, and F1-measure in comparison to the other strategies, and it has the lowest error rate. Table 6 lists the precision, recall, accuracy, error, and F1-measure of the proposed algorithm for all ten image sequences.
The average values of precision, recall, accuracy, error, and F1-measure over 3, 5, 8, and 10 sequences are also computed for FB2D and illustrated in Figure 21. These results indicate that FB2D detects the ball effectively and that its performance is stable as more sequences are added. When only three sequences are used, the average precision, recall, accuracy, error, and F1-measure are 0.93, 0.92, 0.93, 0.07, and 0.93, respectively. When five sequences are used, they are 0.94, 0.92, 0.93, 0.07, and 0.93, respectively. When eight sequences are used, they are 0.94, 0.92, 0.93, 0.07, and 0.93, respectively. Finally, when all ten sequences are used, they are 0.95, 0.92, 0.93, 0.07, and 0.93, respectively.
The experiments were conducted on balls of different colors, as shown in Figure 22, and the proposed FB2D strategy successfully detects balls of various colors. The results shown in Figures 16-20 also indicate that the ball is located at different positions on the field. A frame with a fast-moving ball suffers from motion blur, which makes the ball appear larger and less white than a slower-moving or stationary ball. The white color percentage within the candidate ball must never be lower than 30% of the entire region. 46 Finally, the experimental results show that the performance of the proposed strategy exceeds that of the competitors with respect to precision, recall, and accuracy. There are three reasons for these results. The first is that the ball always has a circular shape because the camera is fixed orthogonally under the flying bee, whereas the camera in the competitors' strategies is static and faces only one side of the playfield, which distorts the shape of the ball. This makes the ball hard to detect reliably, as it may take several shapes depending on the projection angle of the camera. The second reason is that the flying bees effectively cover the whole area of the playground. By contrast, the majority of traditional detection strategies employ static cameras, which need complex zooming and calibration methods to perform ball detection efficiently, and strategies based on moving cameras still cannot detect the ball efficiently because of the random movement of the ball. The third and final reason is that FB2D relies on simple computations in comparison to traditional detection systems, which guarantees a fast response of the bee. Besides, there is no delay in identifying the ball, as fog computing is employed. Consequently, the detection algorithm is suitable for real-time detection systems.

| FB2D PROS AND CONS
In this section, the pros and cons of the proposed FB2D strategy are discussed. As depicted in Table 7, FB2D is a robust technique employing fuzzy logic, which is not very sensitive to changing environments or to erroneous or forgotten rules. The proposed FB2D is also reliable because it relies on the engagement between the ball detection technique and the hosting ball bee in a mutually beneficial relationship. The proposed system is efficient in detecting the playfield color, as it constructs an accurate range of green that represents the playing field. Also, the background model is continuously updated using the Scanner Bee; thus, FB2D has the capability of online training and adaptive response. Besides, rules in the fuzzy logic system can be added and deleted owing to the flexibility of fuzzy logic. In FB2D, the background modeling is done offline, and any update of the background model is done in parallel during system operation, which allows the system to be fast and consistent with the ball detection task. In FB2D, fewer values, rules, and decisions are required, so the fuzzy logic system can be easily constructed and deployed. On the other hand, the proposed FB2D suffers from some drawbacks: (i) initialization time, (ii) expert guidance, and (iii) occlusion. Concerning the first one, the system needs some time before the match starts to detect the playfield color. Concerning the second, fuzzy logic should be built with the complete guidance of experts. Concerning the third, the system has problems dealing with the ball when it is partially occluded. Details about the pros and cons of FB2D are given in Table 7.

TABLE 7 Pros and cons of FB2D

Pros
∘ FB2D employs an effective fuzzy inference engine (FIE), which mimics the way in which humans interpret linguistic values.
∘ The employed BFIE relates output to input without having to understand all the variables, permitting the design of a system to be more accurate and stable.
∘ FB2D has a fully convolutional architecture processing the entire image at once in a single pass. This is much more computationally effective than the sliding window approach introduced in [4].
Adaptive
∘ The background model is updated continuously by updating the H_Vector using the Scanner Bee. Thus, FB2D has the capability of online training and adaptive response.
Robust: FB2D is a robust system as it employs fuzzy logic, so that:
∘ No precise inputs are required.
∘ Fuzzy algorithms are often robust, in the sense that they are not very sensitive to changing environments and erroneous or forgotten rules.
Flexibility
∘ A fuzzy logic system is flexible and allows easy modification of the rules.
∘ Due to the flexibility of fuzzy logic, rules can be added to and deleted from the fuzzy logic system.
∘ FB2D can be successfully employed to detect different kinds of balls, not only the soccer ball.
Fast
∘ In FB2D, the reasoning process is simple compared to computationally precise systems, so computing power is saved. This is a very interesting feature, especially in ball detection as a real-time system.
∘ Fuzzy methods have a shorter development time than conventional methods.
∘ In FB2D, the background modeling is done offline. Furthermore, any update in the background model is done in parallel during the system operation, which allows the system to be fast and consistent with the ball detection task.
Simplicity
∘ The structure of the Fuzzy Logic System, which is the core of the proposed FB2D, is easy and understandable.
∘ In FB2D, fewer values, rules, and decisions are required.
∘ FB2D can be easily constructed and deployed.
Reliable
∘ FB2D is reliable due to the engagement between the used ball detection technique and the hosting ball bee in a mutually beneficial relationship.

Cons
Initialization time
∘ Needs some time before the match starts to detect the playfield color.
Expert guidance
∘ Fuzzy logic should be built with the complete guidance of experts.
∘ Setting exact fuzzy rules and membership functions is a difficult task.
Occlusion
∘ FB2D has problems dealing with the ball when it is partially occluded.

| CONCLUSION
This paper introduced a strategy for soccer ball detection using a Ball Bee. The proposed FB2D strategy is divided into two main phases: (a) background identification and (b) object detection. In the first phase, the HSV color space is used to find the hue values of the empty field before the game starts; these values are later compared to the corresponding values in the current frame of interest. The second phase consists of four subphases: (i) background subtraction, (ii) edge detection, (iii) ball candidates identification, and (iv) the ball fuzzy inference engine (BFIE). In the first subphase, the pixels of the captured frame whose hue values match the background are removed, and in the second subphase the Canny edge detector is applied. In the third subphase, the ball candidates among the objects found in the frame are identified using the Circle Hough Transform technique. Finally, in the fourth subphase, the ball fuzzy inference engine is employed with three fuzzy sets and eight rules to decide which candidate is the actual ball according to specific membership functions. Experimental results illustrate that soccer ball detection in real-time applications can be done even if the ball is occluded by other objects or players.
The system aims to help professional football analysts evaluate the players' performance by allowing automatic indexing and retrieval of interesting events. In spite of its effectiveness, the proposed FB2D consumes some time before the match starts, as the Scanner Bee takes a few frames of the empty playfield to determine its true color values. It also has problems detecting the ball when it is possessed or partially occluded by a player.