A Distributed Ant Colony Optimization in Edge Detection


 With the rise of image data and the increased complexity of edge detection tasks, conventional artificial intelligence techniques have been severely impacted. To solve even greater problems in the future, learning algorithms must maintain high speed and accuracy through economical means. Traditional edge detection approaches cannot detect edges in images in a timely manner due to memory and computational time constraints. In this work, a novel parallelized ant colony optimization technique in a distributed framework provided by the Hadoop/MapReduce infrastructure is proposed to improve edge detection capabilities. Moreover, a filtering technique is applied to reduce the noisy background of images and achieve a significant improvement in the accuracy of edge detection. Close examinations of the implementation of the proposed algorithm are discussed and demonstrated through experiments. The results reveal high classification accuracy and significant improvements in speedup, scaleup and sizeup compared to the standard algorithms.

Edges are a basic image feature: they occur between a target and its background, between two targets, two regions, or two primitives, and most of an image's information is carried in them. An image edge is usually a set of pixels around which the gray-level values exhibit a step change, so edge detection relies on the discontinuity of gray levels or textural characteristics between an object and its background. Edge detection has been one of the most frequently used techniques in computer vision since the early 1970s [1]. It is a fundamental preprocessing step for a variety of applications, including object tracking [2], active contours [3], and segmentation [4]. The detected image edges are grouped into contours or surfaces that represent the boundaries of objects. Edge detection aims to segment an image by locating sharp discontinuities in gray-level images. Many edge detection techniques have been reported in previous works, such as Canny [5], Sobel [6], Prewitt [7], LoG [8], etc. However, without optimization techniques, most existing detection techniques must search a huge space [9] for image edges. Hence, the edge detection task consumes considerable memory and time.
Ant colony optimization (ACO) [10] is an evolutionary computation-based algorithm inspired by the natural phenomenon of ants' foraging. In this optimization method, ants transfer information about food sources to the rest of the colony with the help of a natural chemical secretion (pheromone) [9]. The deposited pheromone slowly evaporates over time while the ants walk to and from the food source. Other ants observe the pheromone trails and follow the paths with the highest concentration of pheromone. The shortest path receives the greatest pheromone concentration as more ants travel over it; through this positive feedback mechanism, the ants eventually converge on the shorter path. Several techniques have been proposed to handle edge detection using ACO. The authors of [9] proposed an ACO edge detection technique based on the F-ratio. An ant colony optimization based mechanism to compensate for broken edges was presented by Lu et al.; it adopted four moving policies to reduce the computational load. An ACO-based edge detection approach was introduced in [15] to tackle the image edge detection problem. All these approaches indicate that ACO is a promising technique for edge detection.
With the advent of the big data era, traditional edge detection technologies face poor detection quality and long runtimes; thus, this technology needs to be further analyzed and studied. Conventional approaches to edge detection are computationally expensive because each set of operations is conducted for every pixel. As the size of an image increases, the computation time grows quickly. In addition, because of the sheer volume and increasing complexity of image data being collected and created in the field of computer vision, conventional approaches are inadequate for handling very large image data in a timely manner. To process this big data, both increased computer performance and improved algorithms that allow the exploration of large parameter spaces are necessary. MapReduce [13] is one of the most efficient big data solutions; it enables the processing of a massive volume of data in parallel with many low-end computing nodes. This programming paradigm is a scalable and fault-tolerant data processing tool that was developed to provide significant improvements in large-scale data-intensive applications on clusters.
This work proposes a novel algorithm in the field of image edge detection that uses a MapReduce-based ant colony optimization with filtering to extract edge information from the pheromone matrix. The proposed algorithm takes advantage of both ACO and parallel computing. It is tested on a greyscale image set, a colored-greyscale image set and a color image set. The experimental results show that, both statistically and visually, this method performs better than earlier reported techniques.
The rest of the paper is organized as follows. Section 2 reviews some related ACO work in literature. The proposed algorithm is demonstrated in Section 3. The experimental results and analysis are described in Section 4, and Section 5 concludes the paper.

Related Work
Different techniques of image edge detection have been presented in the literature. In [14], an ant colony optimization with an enhanced equation for the energy calculation of each state is proposed for edge detection; it extracts image edges better than the standard technique. An ACO-based edge detection approach is introduced to tackle the image edge detection problem in [15]. The study shows that the approach is able to establish a pheromone matrix that represents the edge information present at each pixel position of the image, according to the movements of a number of ants dispatched to move on the image. Furthermore, the movements of these ants are driven by the local variation of the image's intensity values. The experimental results demonstrate the superior performance of the approach.
In [16], a new algorithm for edge detection using ant colony search is proposed. The problem is represented by a directed graph in which the nodes are the pixels of an image. Some modifications of the original ant colony search algorithm (ACSA) are applied to adapt it to the problem, and the algorithm parameters are determined through a large number of experiments. The results suggest the effectiveness of the proposed algorithm. In [17] and [18], a swarm intelligence technique is applied to image feature extraction. An ant colony system is applied to build the perceptual graph of digital images, and the edge feature is extracted based on the proposed machine vision model. The experimental results show that the ant colony system can effectively extract image features.
For edge detection through MapReduce, [19] proposed a parallel design and implementation model that runs on the Hadoop platform. The proposed parallel Otsu-Canny edge detection algorithm performs better than other traditional edge detection algorithms. The parallel approach reduced the running time by approximately 67.2% on a Hadoop cluster architecture consisting of 5 nodes with a dataset of 60,000 images.
In [20], a MapReduce (MR) approach to perform edge detection of satellite images using a nature-inspired algorithm Artificial Bee Colony (ABC) is proposed. The study compares two edge detection approaches on Hadoop with respect to scalability parameters such as scaleup and speedup. The results suggest that the proposed algorithm scales comparatively better when compared to Canny Edge.
In [21], cloud services for line detection in high-resolution video streams, using Canny edge detection followed by the Hough transform, are proposed and evaluated. Both the Canny edge detector and the Hough transform are implemented in Hadoop and Spark. The experimental evaluation shows that the Spark implementations offer excellent scalability and performance compared to the Hadoop and standalone implementations for both algorithms.
To summarize, research on ACO has been carried out along various dimensions, but improving the runtime efficiency of ACO in edge detection remains an open challenge. This motivates the design of a MapReduce-based ACO: an efficient distributed ACO algorithm built on a highly scalable MapReduce implementation for edge detection. The aforementioned studies demonstrate the effectiveness of parallel implementations of computer vision algorithms in achieving good scalability for real-world applications.

Proposed Approach
The proposed approach has three major steps: a preprocessing step, a map step and a reduce step.

Preprocessing Step
The proposed algorithm starts with a preprocessing step. The images need to be preprocessed to convert them into a computable format that the map step can process. Each image is converted into an image matrix pixel by pixel. For a greyscale image, a two-dimensional x × y array is created; for a color image with 24-bit color, a three-dimensional x × y × 3 array is created, where x is the number of pixel columns in the input image, y is the number of pixel rows, and 3 is the number of color channels.
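As an illustration, the pixel-by-pixel conversion can be sketched as follows. The `to_image_matrix` helper and the packed-integer input format are hypothetical stand-ins for whatever image decoder the real pipeline uses:

```python
def to_image_matrix(raw, x, y):
    """Build a y-row by x-column by 3-channel matrix from a flat list
    of packed 0xRRGGBB pixel values (row-major order)."""
    matrix = []
    for row in range(y):
        matrix_row = []
        for col in range(x):
            pixel = raw[row * x + col]
            r = (pixel >> 16) & 0xFF   # red channel
            g = (pixel >> 8) & 0xFF    # green channel
            b = pixel & 0xFF           # blue channel
            matrix_row.append([r, g, b])
        matrix.append(matrix_row)
    return matrix

# Example: a 2x2 image (white, red / green, blue).
raw = [0xFFFFFF, 0xFF0000, 0x00FF00, 0x0000FF]
m = to_image_matrix(raw, x=2, y=2)
print(m[0][1])  # -> [255, 0, 0]
```

For a greyscale image the same loop would simply store a single intensity value per pixel instead of the three-element channel list.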

Map Step
The next step of the proposed algorithm is the map step, which is the crucial part of the algorithm. Each preprocessed image matrix is the input of the map step. The image is segmented with grids of 5 × 5, 10 × 10, 15 × 15, 20 × 20, and 25 × 25 pixels, respectively. The ant colony algorithm uses the segmentation results to detect the image's edges, which are emitted as the output of the map step. A detailed explanation is given in Algorithm 1.
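A minimal sketch of the grid segmentation that feeds the map tasks might look like the following; the `segment` helper is an illustrative assumption, not the paper's implementation:

```python
def segment(matrix, s):
    """Split a 2-D image matrix into s x s pixel blocks (row-major order).
    Blocks at the right/bottom edges may be smaller when the image
    dimensions are not multiples of s."""
    rows, cols = len(matrix), len(matrix[0])
    blocks = []
    for r0 in range(0, rows, s):
        for c0 in range(0, cols, s):
            block = [row[c0:c0 + s] for row in matrix[r0:r0 + s]]
            blocks.append(block)
    return blocks

# A 10x10 test image with distinct pixel values.
image = [[c + r * 10 for c in range(10)] for r in range(10)]
blocks = segment(image, 5)
print(len(blocks))  # -> 4 blocks for a 10x10 image with s = 5
```

Each block can then be handed to an independent map task, which runs the ant colony on its own sub-image.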
where (i, j) denotes the neighboring pixel under consideration, P[i, j] is the current pheromone level at that pixel in the pheromone matrix P, I[i, j] is the pixel value of the original image, and α and β are the weights given to the pheromone level and the pixel value, respectively, within this equation.
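Under these definitions, an ant's choice of its next pixel can be sketched as below. The exact normalization and neighborhood handling are assumptions based on standard ACO transition rules, not taken verbatim from the paper:

```python
def move_probabilities(neighbors, P, I, alpha=1.0, beta=2.0):
    """Probability of an ant stepping to each neighboring pixel (i, j),
    proportional to pheromone^alpha * pixel value^beta, normalized so
    the probabilities over the neighborhood sum to one."""
    weights = [(P[i][j] ** alpha) * (I[i][j] ** beta) for i, j in neighbors]
    total = sum(weights)
    return [w / total for w in weights]

P = [[0.1, 0.4], [0.2, 0.3]]   # toy pheromone matrix
I = [[10, 50], [20, 5]]        # toy pixel intensity values
neighbors = [(0, 0), (0, 1), (1, 0)]
probs = move_probabilities(neighbors, P, I)
print(round(sum(probs), 6))  # -> 1.0
```

The ant then samples its next position from this distribution, so pixels with both strong pheromone and high local intensity variation attract more traffic.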
Another function to consider is the pheromone deposit. The pheromone drop quantity is determined by Eq. 4, where d is the pheromone drop amount after each ant step, I[i, j] is the pixel value as before, and λ is the scaling factor of the pheromone deposit.
After every ant has completed a single step, the pheromone levels are reduced to simulate the evaporation observed with ants in nature. This helps reduce or remove pheromone deposits that were dropped in incorrect or less ideal locations, so ants become less likely to visit these undesirable locations. The pheromone evaporation update is given in Eq. 5.
where ϕ is the pheromone evaporation factor. Once all steps are completed, the pheromone matrix P is filled with different levels of pheromone at each pixel. To reduce the amount of excess edge data that lower pheromone amounts could cause, a cutoff level is used to select the useful edge data. The cutoff is set to the mean value of the entire matrix: values below the cutoff are ignored and values above it are retained. A filter method is adopted to select the cutoff level (see Algorithm 3). The detailed procedure of the proposed filtered ant colony optimization (FACO) is listed in Algorithm 4. The final edge results are passed to the following reduce step.
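The deposit, evaporation and mean-based cutoff described above can be sketched as follows. The exact forms of Eqs. 4 and 5 are assumptions inferred from the text (a drop of d = λ·I[i, j] per visited pixel, and multiplicative evaporation by 1 − ϕ):

```python
def update_pheromone(P, visited, I, lam=0.1, phi=0.05):
    """One round of pheromone bookkeeping: deposit lam * pixel value on
    each visited pixel (assumed Eq. 4), then evaporate everywhere by a
    factor of (1 - phi) (assumed Eq. 5)."""
    for i, j in visited:
        P[i][j] += lam * I[i][j]
    for row in P:
        for j in range(len(row)):
            row[j] *= (1.0 - phi)
    return P

def mean_cutoff(P):
    """Keep only pixels whose pheromone exceeds the matrix mean,
    producing a binary edge map."""
    flat = [v for row in P for v in row]
    mean = sum(flat) / len(flat)
    return [[1 if v > mean else 0 for v in row] for row in P]

P = [[0.9, 0.1], [0.2, 0.8]]
print(mean_cutoff(P))  # -> [[1, 0], [0, 1]]
```

After the final iteration, `mean_cutoff` turns the accumulated pheromone matrix into the edge map emitted by each map task.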

Filtered Ant Colony Optimization (FACO)
Noise in an image affects how the image is perceived both by a person and by a computer: where a person sees a slightly less clear image, a computer reads extra, inaccurate data. For this experiment, two variations of the edge detection algorithm were implemented and applied. These two approaches differ in how they determine pheromone drop quantities. The first has already been covered and will be referred to as the unfiltered approach. The second, referred to as the filtered approach, adds additional processing to each pheromone drop. This change has a major effect, removing unwanted noise and producing more accurate edges. The filtered version operates as shown in Algorithm 3.

Algorithm 3 Filtering pheromone drop amounts.

Environmental Setup
The experiments are conducted both on a single-node Windows machine (Intel Core i7-7500U Kaby Lake, 2.7 GHz up to 3.5 GHz, 4 MB cache, 2 cores) and on m3.xlarge EC2 instances (Intel Xeon E5-2670 v2 Ivy Bridge, 2.50 GHz, 25 MB cache, 4 cores) from AWS. The parameter values are listed in Table 1. 30 images [22], which can be categorized into greyscale, colored-greyscale and color, are used for experimental purposes. The proposed parallel algorithm is run using the following distributed computing environment:
• Hadoop: The cloud compute cluster used to assign the namenode and datanodes and to conduct workload balancing, resource scheduling and data replication.
• HDFS: The distributed file system that provides fault tolerance and high throughput access to large datasets.
• YARN: Provides job scheduling and resource management.
• SPARK: An extension of the MapReduce framework which makes use of Resilient Distributed Datasets (RDDs) as a fault tolerant data structure operated on in parallel.

Experimental Results with ACO in a Single Node Cluster
Detection results for selected images using the proposed MapReduce-based unfiltered ACO and the proposed FACO are illustrated in Fig. 1, with segmentation steps of 5, 10, 15, 20 and 25, respectively. The first image is a sample from the greyscale image set. The filtered edge contains less background noise than the unfiltered edges at all segmentation steps (Fig. 1: Detection results using unfiltered ACO and filtered ACO).

Statistical Analysis
Mean Squared Error (MSE) (see Eq. 6) and Peak Signal-to-Noise Ratio (PSNR) (see Eq. 7) are used to compare the performance of the unfiltered ACO and the filtered ACO (FACO). Y is the vector of observed values and Ŷ is the vector of predicted values. MAX is the maximum possible pixel value of the image, set to 255 in this case.
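For reference, the two metrics can be computed as in the straightforward sketch below (a direct transcription of Eqs. 6 and 7 for flat pixel vectors):

```python
import math

def mse(Y, Y_hat):
    """Mean squared error between observed and predicted pixel vectors (Eq. 6)."""
    return sum((y - yh) ** 2 for y, yh in zip(Y, Y_hat)) / len(Y)

def psnr(Y, Y_hat, max_val=255):
    """Peak signal-to-noise ratio in dB (Eq. 7); higher means less noise.
    Identical signals give infinite PSNR, since the MSE is zero."""
    e = mse(Y, Y_hat)
    return float('inf') if e == 0 else 10 * math.log10(max_val ** 2 / e)

Y = [255, 0, 128, 64]
Y_hat = [250, 5, 120, 60]
print(mse(Y, Y_hat))  # -> 32.5
```

A lower MSE and a higher PSNR against the reference edges therefore both indicate a cleaner detection result.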
The results for a segmentation step of 25 over the whole image sets are listed in Table 2. They indicate that the proposed FACO consistently performs better than the unfiltered ACO in terms of MSE and PSNR. Moreover, an unpaired t-test with a significance level of 0.05 is performed on the data sets of the two approaches, with the null hypothesis used to verify the effectiveness of the proposed FACO. The P-values in Table 3 are promising enough to conclude that the improvements of the filtered edges over the unfiltered edges are not merely coincidental.

Detection and Scalability Results using Multi-Node Clusters
The proposed FACO was later implemented and run on Amazon Web Services to explore its scalability. The detection results for selected images are shown in Fig. 2 and are consistent with the experiments conducted on a single-node cluster: the filtered ACO performs better than the unfiltered ACO in terms of background noise. Next, we conduct a scalability analysis in which the segmentation step is increased in increments of 5 for the whole image set. In Fig. 3 we observe a linear trend in the execution time as the segmentation step increases with the data set fixed. The last experiment is a scalability analysis in which the number of data nodes is increased with the same image set and a fixed segmentation step of 10. The result is illustrated in Fig. 4. The execution time of the proposed algorithm follows a linear trend as the number of data nodes ranges from 2 to 16; at 16 data nodes the behavior deviates from linear due to throughput limitations.

Conclusion
With the rise of image data and the increased complexity of tasks, the efficiency of conventional machine learning algorithms has been severely impacted by long training times and high computational cost. To solve even greater problems in the future, machine learning algorithms must maintain high speed and accuracy through economical means. MapReduce, one of the most efficient big data solutions, enables the processing of a massive volume of images in parallel with many low-end computing nodes. This programming paradigm is a scalable and fault-tolerant data processing tool developed to provide significant improvements in large-scale data-intensive applications on clusters. In this paper, a novel ACO image edge detection algorithm with Hadoop MapReduce was successfully developed. The proposed FACO method takes advantage of the parallelism of ACO, its rapid convergence, and the virtues of the MapReduce paradigm. As shown in Section 4, the results demonstrate the effectiveness of applying an ACO edge detection algorithm on a distributed system such as Hadoop MapReduce. In the future, we plan to apply the algorithm in an image processing pipeline for object recognition and object tracking with improved speed and accuracy.
