Robust Semi-Automatic Annotation of Object Datasets with Bounding Rectangles

Abstract—Object datasets used in the construction of object detectors are typically manually annotated with horizontal or rotated bounding rectangles. The optimality of an annotation is obtained by fulfilling two conditions: (i) the rectangle covers the whole object; (ii) the area of the rectangle is minimal. Building a large-scale object dataset requires annotators with equal manual dexterity to carry out this tedious work. When an object is horizontal, it is easy for the annotator to reach the optimal bounding box within a reasonable time. However, if the object is rotated, the annotator needs additional time to decide whether the object will be annotated with a horizontal rectangle or a rotated rectangle. Moreover, in both cases, the final decision is not based on any objective argument, and the annotation is generally not optimal. In this study, we propose a new method of annotation by rectangles, called Robust Semi-Automatic Annotation, which combines speed and robustness. Our method has two phases. The first phase consists in inviting the annotator to click on the most relevant points located on the contour of the object. The second phase relies on an algorithm we develop, called RANGE-MBR, which determines, from the points selected on the contour of the object, a rectangle enclosing these points in linear time. The rectangle returned by RANGE-MBR always satisfies optimality condition (i). We prove that optimality condition (ii) is always satisfied for objects with isotropic shapes. For objects with anisotropic shapes, we study optimality condition (ii) by simulations. We show that the rectangle returned by RANGE-MBR is quasi-optimal for condition (ii), and that its performance increases with dilated objects, which is the case for most of the objects appearing on images collected by aerial photography.


I. INTRODUCTION
The problem of detecting objects in an image consists in finding the position of all existing objects in that image and assigning a class to each object found. The first task is called object localization, while the second is called object classification. In computer vision, there are several ways to define the shape and location of an object, such as bounding rectangles, bounding polygons, segmentation masks, key-points and landmarks, and so on ([1], [2]). However, the bounding rectangle is the simplest and most commonly used representation for locating objects in computer vision applications. We say that an object is well detected if the bounding rectangle has a minimum area and covers the maximum of visible parts of the object.
The development of a robust object detector generally goes through two or even three phases, namely learning, testing, and tuning. Each phase requires an independent set of images meticulously prepared by a team of human experts called annotators. The main function of the annotator is to draw, with the highest possible accuracy, a rectangle around the object of interest, and to choose from a list of classes the one that describes the object. These rectangles are commonly known as the ground truth bounding boxes. The combination of the raw image, the ground truth bounding box, and the object label constitutes the elements of an annotated image. The annotation is done using commercial or free software. Some tools are optimized for a given type of annotation, while others offer many features that allow the user to process various types of annotations.
To create a dataset for object detection, we need to collect a large number of images, annotated with the same method, and distributed according to a predefined list of object classes. For each object class, the dataset must contain several images corresponding to different instances of this class. We can divide the existing annotation methods into two main classes. The first class includes all manual annotation methods, while the second class groups all semi-automatic annotation methods.
Advanced object detectors are based on the convolutional neural network approach. These detectors can be divided into two main classes [3]. The first class involves two-stage detectors (one stage for localization and the other for classification) such as Faster Region-Based Convolutional Neural Networks (Faster R-CNN) [4] and Cascade R-CNN [5], while the second class includes one-stage detectors (for which localization and classification are carried out in a single stage) such as You Only Look Once (YOLOv3) [6] and the Single Shot Detector (SSD) [7]. The first class of detectors is distinguished by its accuracy, while the second class is distinguished by its inference speed [3]. A convolutional neural network-based detector uses the first subset of the annotated dataset to conduct the training stage.
When an object appearing on an image is aligned with the horizontal axis, it is natural to represent it by a horizontal rectangle. In contrast, if the object is aligned with an oblique axis, then the representation of the object by a horizontal rectangle certainly leads to a bad localization and identification of the object, since this rectangle includes a more or less important part of the image background. The worst situation is when the object of interest is aligned with an oblique axis making an angle of 45 degrees with the horizontal axis. Based on the orientation of the object of interest, the object detection problem can be divided into two sub-problems: Horizontal Object Detection (HOD) and Rotated Object Detection (ROD) [8]. The discussion so far has dealt with HOD. In order to complete the description of the object detection problem, we give here a brief overview of ROD and we specify the difficulties inherent in this topic.
Advanced rotated object detectors are also based on the convolutional neural network approach. The learning and testing sets used in the different phases of these detectors are annotated with rotated rectangles. Most of these datasets contain images collected from aerial photographs, where objects are rarely aligned horizontally. As examples of commonly used rotated datasets, we cite UCAS-AOD [9], whose objects are restricted to vehicles and planes, HRSC2016 [10], which contains only ship instances, and DOTA [11], which is considered to be the richest rotated dataset, with fifteen object classes and 2806 aerial images. All the above-mentioned object datasets are hand-annotated by experts with rotated rectangles. Like HOD methods, ROD methods are also divided into two main classes. The first class involves two-stage detectors such as CIN [12], the Region of Interests (RoI) Transformer [13], and the Small, Cluttered and Rotated objects Detector (SCRDet) [14], while the second class includes one-stage detectors such as the Efficient and Accurate Scene Text detector (EAST) [15] and the Refined Rotation RetinaNet Detector (R3Det) [16]. In addition, the prediction bounding rectangles generated by rotated object detectors are also assumed to be rotated.

A. Manual annotation
There are two main methods of manually annotating objects with bounding boxes, which are used in the construction of most large-scale object datasets. The first one is called the consensus method and the second one is called the sequential tasks method. For each instance of an object in an image, the first method asks several annotators to draw a rectangle around the object, then defines the position of the object by the rectangle elected by the majority of annotators. To annotate an object with the second method, we need at least three annotators. The first one is asked to draw a rectangle around a single instance of the object. The task of the second annotator is to validate the drawn rectangle. The third person checks whether there are other instances of the same object class that need to be annotated. Determining an accurate bounding rectangle requires much more human and material resources than checking the annotation, which is why the sequential tasks method is more efficient than the consensus method. The quality of the annotation has a direct impact on the development of precise object detectors. The construction of a large-scale object dataset requires expert annotators, significant working time, and remuneration in line with the desired accuracy. As examples of commonly used datasets annotated with the manual methods, we cite PASCAL VOC [17], MS COCO [18], and IMAGENET [19].

B. Semi-automatic annotation
The traditional semi-automatic approach of annotating objects with rectangles consists of four phases.
First phase: the images of the object dataset are divided into two subsets of unequal sizes, and the smallest subset is annotated with rectangles by the manual method.
Second phase: an object detector is chosen and trained with the images of the subset already annotated.
Third phase: once the detector is ready, it is used to annotate the images of the second subset with its prediction rectangles.
Fourth phase: the annotator validates the objects correctly annotated during the third phase, and manually draws the bounding rectangles of the poorly annotated objects.
As this study is limited to locating objects by bounding boxes, we cite the Faster Bounding Box (FBB) method as an example of a semi-automatic annotation method [20]. This method was used to generate the Tampere University of Technology (TUT) indoor dataset: the largest subset is annotated with prediction rectangles generated by the Faster R-CNN object detector [4], trained on the smallest, manually annotated subset.

C. Performance measure of detectors using rectangles
The evaluation of the effectiveness of an object detector is then performed on the second subset of the annotated dataset. Since we are only dealing here with annotations using bounding boxes, the most used criterion to compare two rectangles is Jaccard's similarity index, also known as the Intersection over Union (IoU). For each image of the testing dataset, the IoU measures the percentage of overlap between the prediction bounding rectangle Â generated by the detector and the ground truth bounding box A, as follows:

IoU(A, Â) = Area(A ∩ Â) / Area(A ∪ Â).   (1)

Note that the IoU score lies in the interval [0, 1]. The closer this index gets to 1, the better the detection of the object. We say that an object is well detected (or is a true positive) if the IoU score is greater than or equal to a threshold agreed upon by experts. In most studies, the threshold value is set at 0.5 [21]. The IoU score causes a problem when its value is zero: this value is uninformative, since it does not indicate how far the prediction bounding rectangle is from the ground truth bounding box. To work around this problem, Rezatofighi et al [21] suggest replacing the IoU with the Generalized Intersection over Union (GIoU) index. The GIoU between two rectangles A and Â is defined as follows:

GIoU(A, Â) = IoU(A, Â) − Area(C \ (A ∪ Â)) / Area(C),   (2)

where C is the smallest convex set enclosing both A and Â.
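For axis-aligned boxes, both indices reduce to a few lines of code. The sketch below is ours (written in Python rather than the MATLAB used later in the paper) and assumes boxes given as (x_min, y_min, x_max, y_max) tuples:

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # intersection rectangle (may be empty, hence the max(0, .) clamps)
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def giou(a, b):
    """GIoU = IoU - Area(C \\ (A U B)) / Area(C), C = smallest enclosing box."""
    inter = max(0.0, min(a[2], b[2]) - max(a[0], b[0])) * \
            max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    # smallest axis-aligned box enclosing both A and B
    c_area = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / union - (c_area - union) / c_area
```

For two disjoint boxes the IoU is 0 regardless of their distance, while the GIoU decreases toward −1 as the boxes move apart, which is exactly the motivation given above.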
To the best of our knowledge, in the literature of ROD, the IoU between the ground truth box A and the predicted bounding rectangle B has never been calculated explicitly, as it is in the case of HOD. Liu et al proposed in [22] to approximate IoU(A, B) by the angle-related IoU defined by relation (3):

IoU_ar(A, B) = Area(A′ ∩ B) / Area(A′ ∪ B),   (3)

where α and β are the angles of A and B respectively, and A′ coincides with the rotation of A, round the center A_0, by the angle θ = (β − α) measured counterclockwise from the positive x-axis (as shown in Figure 1). If A and B are parametrized as described in Section IV, then α (resp. β) is the angle between the positive x-axis and the vector A_1A_2 (resp. B_1B_2), and Ã (resp. B̃) is the horizontal rectangle obtained by rotating A (resp. B) round the center A_0 (resp. B_0), as shown in Figure 2.

D. Real-life use of annotation with bounding rectangles
We present in this paragraph a non-exhaustive list of applications using annotation with bounding rectangles.
1) Performing missions with drones: object detectors used by drones are trained with rotated object datasets containing images collected from aerial photographs. These object detectors help drones locate and identify their targets during a future mission.
2) Autonomous vehicles: the development of driving systems for autonomous vehicles uses images of road traffic annotated with bounding rectangles to train detectors of objects commonly encountered on the roads, such as pedestrians, cars, motorcycles, bicycles, traffic signs, and so on. Thus, these object detectors can help autonomous vehicles move safely while avoiding all types of obstacles.
3) Online retail: web search engines used by online sales sites are trained with images of clothing and accessories annotated with bounding rectangles. Thus, customers can easily find the items they want to purchase using images containing those items.
For more details on annotation methods, performance metrics, object datasets, and real-life applications, we refer the reader to [24] and [25].

III. MOTIVATION FOR THE STUDY
The manual and semi-automatic annotation methods described in Section II have made it possible to build many large-scale object datasets. These datasets have given rise to very powerful object detectors. However, we cannot ignore the following concerns:
1) According to paragraph II-A, accurate manual annotation is expensive, time consuming, and requires annotation experts.
2) In many situations, the annotator can be faced with an object of ambiguous orientation, as shown in Figure 3. In this case, he will take a considerable time to decide whether the object should be annotated with a horizontal rectangle or a rotated rectangle. Moreover, in both cases, the final decision is not based on any objective argument and depends solely on the dexterity of the annotator.
3) The semi-automatic annotation defined in paragraph II-B does not guarantee the optimality of the annotation rectangle, in the sense that it must have a minimum area and cover the whole of the object.
4) According to paragraph II-C, the use of any approximation method for IoU(A, B), whether to measure the accuracy of a rotated object detector or to compare the performance of two rotated object detectors, could lead to biased results. Indeed, the function g(x) = x / (a + b − x) is strictly increasing over the interval [0, a + b[, where a and b denote the areas of A and B respectively, and IoU(A, B) = g(Area(A ∩ B)). Consequently, an underestimation (resp. overestimation) of the overlapping area between A and B directly induces an underestimation (resp. overestimation) of IoU(A, B).

Contribution of this study
In this paper,
• we develop an algorithm called RANGE-MBR, which determines, from a set M_n of the n most relevant (in the sense given by Definition 3.1 below) points picked on the object outline, a rectangle enclosing M_n and having a quasi-minimal area, in O(n) time,
• we propose a new approach to simultaneously build HOD and ROD datasets from a large-scale image bank, based on both the RANGE-MBR algorithm and threshold angles,
• we conduct a large experimental study to quantify the performance of the RANGE-MBR algorithm,
• we compare the performance of RANGE-MBR to that of the benchmark algorithm RC-MBR, which determines the minimum rectangle enclosing M_n in O(n ln(n)) time.
Definition 3.1 (relevant point of an object): Any point located on the contour of the object is said to be relevant if it is a local maximum, a local minimum, the rightmost point, the leftmost point, a cusp of the first type, a cusp of the second type, an inflection point, and so on.
The remainder of this manuscript is structured as follows. Section IV deals with the parametrization of rectangles. Section V is devoted to the development of the RANGE-MBR algorithm. We explain in Section VI how to use the RANGE-MBR (or RC-MBR) algorithm to simultaneously generate horizontal and rotated datasets from a large-scale image bank. In Section VII, we perform various numerical experiments to evaluate the performance of the RANGE-MBR algorithm.

IV. PARAMETERIZATION OF RECTANGLES
The polar angle of a point M = (x, y)^t relative to a point G = (x_G, y_G)^t is the angle illustrated in Figure 4. This angle can be determined using the operation mod(atan2(y − y_G, x − x_G), 2π), where mod designates the modulo operation and atan2 is a mathematical function available in most programming languages.
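The mod(atan2(·, ·), 2π) recipe for the polar angle can be sketched in a few lines (Python shown here as a neutral stand-in; the paper's own experiments use MATLAB, where atan2 is also available):

```python
import math

def polar_angle(p, g):
    """Polar angle of point p relative to g, mapped into [0, 2*pi)
    via mod(atan2(dy, dx), 2*pi)."""
    return math.atan2(p[1] - g[1], p[0] - g[0]) % (2 * math.pi)
```

The modulo shifts atan2's output from ]−π, π] into [0, 2π), so points below the horizontal axis get angles larger than π instead of negative ones.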

A. Sorting the vertices of a rectangle
We use the notation R = [R 1 , R 2 , R 3 , R 4 ] to denote a rotated rectangle whose vertices are R 1 , R 2 , R 3 , and R 4 . We assume that these vertices are sorted in counterclockwise order, such that R 1 is the left vertex with the smallest vertical coordinate.
To sort the vertices set {A 1 , A 2 , A 3 , A 4 } of the rectangle A as described above, we first consider the barycenter G of the rectangle A. Then, we calculate the polar angle of each vertex relative to the barycenter G. Denote by A (1) , . . . , A (4) the sorted vertices in ascending order of their polar angle.
Algorithm 1 summarizes the steps for sorting the vertices of A.
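A minimal version of this sorting step can be sketched as follows (a Python sketch of the procedure described above, not the authors' Algorithm 1 itself; the tie-breaking rule for choosing the starting vertex is our assumption):

```python
import math

def sort_vertices(pts):
    """Sort the vertices counterclockwise by polar angle about the barycenter G,
    then start the cycle at the vertex with smallest y (ties broken by smallest x)."""
    gx = sum(x for x, _ in pts) / len(pts)
    gy = sum(y for _, y in pts) / len(pts)
    # polar angle relative to G, mapped into [0, 2*pi)
    ccw = sorted(pts, key=lambda p: math.atan2(p[1] - gy, p[0] - gx) % (2 * math.pi))
    k = min(range(len(ccw)), key=lambda i: (ccw[i][1], ccw[i][0]))
    return ccw[k:] + ccw[:k]
```

Rotating the sorted list so that it starts at the lowest-left vertex matches the convention that R_1 is the left vertex with the smallest vertical coordinate.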

B. Parameterization of horizontal rectangles
The minimum parameterization of a horizontal rectangle A requires four real parameters (cf. Figure 5). The object dataset MS COCO [18] annotates the rectangle A with (x_4, y_4, w, h), where w and h denote the lengths of the line segments [A_1, A_2] and [A_1, A_4] respectively, and (x_4, y_4) are the coordinates of the vertex A_4 (cf. Figure 5). The object dataset IMAGENET [19] annotates the rectangle A with (x_0, y_0, w, h), where w and h denote the lengths of the line segments [A_1, A_2] and [A_1, A_4] respectively, and (x_0, y_0) are the coordinates of the center A_0 of A (cf. Figure 5).
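Switching between a corner-based and a center-based parameterization is a half-width/half-height shift. The helpers below are a hypothetical sketch (using the corner with the smallest coordinates, which may differ from the specific vertex conventions of the datasets cited above):

```python
def corner_to_center(xc, yc, w, h):
    """(corner x, corner y, width, height) -> (center x, center y, width, height)."""
    return (xc + w / 2.0, yc + h / 2.0, w, h)

def center_to_corner(x0, y0, w, h):
    """(center x, center y, width, height) -> (corner x, corner y, width, height)."""
    return (x0 - w / 2.0, y0 - h / 2.0, w, h)
```

The two functions are exact inverses of each other, so round-tripping an annotation loses nothing.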

C. Parameterization of rotated rectangles
The minimum parameterization of a rotated rectangle B requires five real parameters (cf. Figure 5). The parameter θ ∈ [0, π/2] defines the acute angle between the line ∆ and the line (B_1, B_2), where ∆ is the horizontal line passing through B_1. This parametrization is used to annotate the object datasets HRSC2016 [10] and UCAS-AOD [9]. For the sake of simplicity, other object datasets, such as DOTA [11], parameterize a rectangle using the eight coordinates of its vertices. This over-parameterization avoids additional calculations to plot the rectangle, or to find the image of this rectangle under an affine transformation.
In this study, we opt for the parametrization of a rectangle B with the eight-tuple (x_1, y_1, ..., x_4, y_4), where (x_i, y_i)^t are the coordinates of the vertex B_i, i = 1, ..., 4, knowing that one can easily switch from one parametrization to another.
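As an illustration of switching between parametrizations, the following Python sketch (the function name and exact conventions are ours) expands a five-parameter form (x_1, y_1, w, h, θ) into the eight-tuple of vertices adopted here:

```python
import math

def rect5_to_vertices(x1, y1, w, h, theta):
    """Vertices [B1, B2, B3, B4] of a rotated rectangle from (x1, y1, w, h, theta),
    where (x1, y1) is vertex B1, w = |B1B2|, h = |B1B4|, and theta is the
    counterclockwise angle of the edge (B1, B2) with the positive x-axis."""
    c, s = math.cos(theta), math.sin(theta)
    b1 = (x1, y1)
    b2 = (x1 + w * c, y1 + w * s)          # move along the edge direction
    b4 = (x1 - h * s, y1 + h * c)          # move along the perpendicular
    b3 = (b2[0] - h * s, b2[1] + h * c)    # opposite corner
    return [b1, b2, b3, b4]
```

For θ = 0 this reduces to the usual horizontal rectangle, which is a quick sanity check on the conventions.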

V. DETERMINATION OF THE BOUNDING RECTANGLE
It is natural to annotate the object with the minimum rectangle enclosing the set of relevant points. This problem is formulated as follows:
Problem 5.1: Given a set of points M_n = {M_i = (x_i, y_i)^t ∈ R²; i = 1, ..., n}, find a rectangle with the smallest area enclosing M_n. Such a rectangle is called a Minimum Bounding Rectangle of M_n and denoted by MBR(M_n).
The solution of Problem 5.1 is not unique, as shown in Figure 6. So, MBR(M_n) denotes any solution of Problem 5.1.
Freeman and Shapira proved in 1975 that one edge of MBR(M_n) must be collinear with an edge of the convex hull CH(M_n) of M_n. They proposed in [27] a natural algorithm to find MBR(M_n) in O(n²) time, based on sweeping all minimum rectangles enclosing M_n and having an edge collinear with an edge of CH(M_n). In 1978, Shamos proposed in [28] the famous rotating calipers algorithm, which returns all pairs of antipodal vertices of an n-sided convex polygon in O(n) time. In 1983, Toussaint used the rotating calipers technique and developed the algorithm RC-MBR to find MBR(M_n) in O(n ln(n)) time [29]. In 2006, Dimitrov et al proposed in ([30], [31]) the algorithm PCA-MBR, which approximates MBR(M_n) in O(n ln(n)) time by the minimum bounding rectangle aligned with the eigenvectors of the covariance matrix of CH(M_n). They also proved that the relative error between PCA-MBR(M_n) and MBR(M_n) is bounded from above. The main drawback of the algorithm PCA-MBR is that it admits an infinite number of solutions if the covariance matrix of CH(M_n) has a double eigenvalue. To overcome the problem of non-uniqueness inherent in the algorithms RC-MBR and PCA-MBR, we propose the method RANGE-MBR to approximate MBR(M_n) in O(n) time.
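To make the Freeman–Shapira observation concrete, here is a compact Python sketch (ours, not the RC-MBR implementation) that sweeps every hull edge, projects the points onto the edge direction and its normal, and keeps the smallest area; it runs in O(nh) for h hull edges rather than the O(n ln n) of RC-MBR, but returns the exact MBR area:

```python
import math

def convex_hull(pts):
    """Counterclockwise convex hull (Andrew's monotone chain)."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def mbr_area(pts):
    """Exact MBR area: one edge of MBR(M_n) is collinear with a hull edge
    (Freeman & Shapira), so it suffices to sweep the hull edges."""
    h = convex_hull(pts)
    best = float('inf')
    for i in range(len(h)):
        ex = h[(i + 1) % len(h)][0] - h[i][0]
        ey = h[(i + 1) % len(h)][1] - h[i][1]
        norm = math.hypot(ex, ey)
        if norm == 0:
            continue
        ux, uy = ex / norm, ey / norm              # edge direction
        s = [p[0] * ux + p[1] * uy for p in pts]   # projection on the edge
        t = [-p[0] * uy + p[1] * ux for p in pts]  # projection on the normal
        best = min(best, (max(s) - min(s)) * (max(t) - min(t)))
    return best
```

The product of the two projection ranges is exactly the area of the candidate rectangle aligned with that edge, which is the quantity RANGE-MBR approximates statistically in the next subsection.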

A. The RANGE-MBR algorithm
Let B = (O, i, j) be a Cartesian frame of the two-dimensional affine plane. All angles we refer to here are measured counterclockwise from the positive x-axis. Let ∆ be the line passing through O and making an angle θ with the axis (O, i), and let ∆⊥ be the line perpendicular to ∆ passing through O. Let P_1 and Q_1 (resp. P_2 and Q_2) be the extreme points of the orthogonal projection of M_n on ∆ (resp. ∆⊥). Let A = [A_1, A_2, A_3, A_4] be the minimum bounding rectangle having an edge collinear with ∆. Then A_1, ..., A_4 are the intersection points of the two lines parallel to ∆⊥ passing through P_1 and Q_1 respectively, with the two lines parallel to ∆ passing through P_2 and Q_2 respectively (cf. Figure 7). The area of A is given by the product of the lengths of the line segments [P_1, Q_1] and [P_2, Q_2]. From now on, the MBR A enclosing M_n and having an edge collinear with the direction making an angle θ with the positive x-axis will be denoted by MBR(M_n, θ).
Let u = (cos(θ), sin(θ))^t (resp. v = (−sin(θ), cos(θ))^t) be a unit direction vector of ∆ (resp. ∆⊥). Then B = (O, i, j) and B′ = (O, u, v) are two Cartesian frames of the two-dimensional affine plane. For all i = 1, ..., n, denote by (x_i, y_i)^t (resp. (s_i, t_i)^t) the coordinates of M_i with respect to the frame B (resp. B′). Let R_θ be the rotation matrix of angle θ; then (s_i, t_i)^t = R_θ^t (x_i, y_i)^t, that is, s_i = u^t (x_i, y_i)^t and t_i = v^t (x_i, y_i)^t, where u^t is the transpose of u.
Let X be a random variable with mean µ and standard deviation σ. Let x_{1:n} = (x_1, ..., x_n) be n observations of the variable X, and let x_(1) ≤ ... ≤ x_(n) be the order statistics of x_{1:n}. We define the range of x_{1:n} by Range(x_{1:n}) = x_(n) − x_(1). Therefore, the lengths of [P_1, Q_1] and [P_2, Q_2] are given by:

P_1Q_1 = Range(s_{1:n}) and P_2Q_2 = Range(t_{1:n}).   (4)

Denote by σ̂_n any estimation of σ obtained from x_{1:n}.
On the basis of numerous works carried out on the estimation of the standard deviation from the sample range ([32], [33], [34]), we can assume that there is a real constant α(X, n) such that σ̂_n = Range(x_{1:n}) / α(X, n). Besides, it is well known that Var(x_{1:n}) = (1/n) Σ_{i=1}^{n} (x_i − x̄)², with x̄ = (1/n) Σ_{i=1}^{n} x_i, is an estimator of σ². It follows that there is a real constant α(X, n) such that:

Range(x_{1:n})² ≈ α(X, n)² Var(x_{1:n}).   (5)

Assume that the x-coordinates (resp. the y-coordinates) of the set M_n are n observations of a random variable X (resp. Y).
Since Area(A) = P_1Q_1 · P_2Q_2, combining equations (5) and (4) gives an approximation Â²_n(θ) of Area(A)², satisfying the following relation:

Â²_n(θ) = α(S, n)² α(T, n)² Var(s_{1:n}) Var(t_{1:n}),   (6)

where S and T denote the random variables whose observations are s_{1:n} and t_{1:n} respectively. To alleviate notations, we will also use V_x and C_{x,y} to designate Var(x_{1:n}) and Cov(x_{1:n}, y_{1:n}). Since (x_i, y_i)^t = R_θ (s_i, t_i)^t, we obtain:

s_i = x_i cos(θ) + y_i sin(θ) and t_i = −x_i sin(θ) + y_i cos(θ).   (7)

Using (7) and some trigonometric identities, we prove that:

V_s = V_x cos²(θ) + V_y sin²(θ) + C_{x,y} sin(2θ) and V_t = V_x sin²(θ) + V_y cos²(θ) − C_{x,y} sin(2θ),   (8)

hence

V_s V_t = (1/8)(K − f(θ)), where K = 2(V_x + V_y)² − (V_x − V_y)² − 4C²_{x,y} and f(θ) = ((V_x − V_y)² − 4C²_{x,y}) cos(4θ) + 4(V_x − V_y) C_{x,y} sin(4θ).   (9)

Combining equations (6), (8) and (9) gives:

Â²_n(θ) = (α(S, n)² α(T, n)² / 8)(K − f(θ)).   (10)

Thus, the area of MBR(M_n, θ) is approximately equal to α(S, n) α(T, n) √((K − f(θ))/8). Therefore, we propose to approximate MBR(M_n) by MBR(M_n, θ*) such that:

θ* = argmax_θ f(θ).   (11)

The first derivative of f with respect to θ is given by:

f′(θ) = 4λ sin(4(θ* − θ)),   (12)

where,

λ cos(4θ*) = (V_x − V_y)² − 4C²_{x,y} and λ sin(4θ*) = 4(V_x − V_y) C_{x,y}.   (13)

The function f has a unique critical point θ* (modulo π/4) such that:

tan(4θ*) = 4(V_x − V_y) C_{x,y} / ((V_x − V_y)² − 4C²_{x,y}).   (14)

Lemma 5.1: The second derivative of f at the critical point θ* has the same sign as −λ.
Proof: By (12) and (13), f(θ) = λ cos(4(θ − θ*)), so the second derivative of f with respect to θ is f″(θ) = −16λ cos(4(θ − θ*)), and in particular f″(θ*) = −16λ.
Fig. 7. When θ is fixed, the minimum bounding rectangle enclosing the blue polygon (object) is the rectangle A whose vertices A_1, ..., A_4 are the intersection points of the two lines parallel to ∆⊥ passing through P_1 and Q_1 respectively, with the two lines parallel to ∆ passing through P_2 and Q_2 respectively. A is the rectangle whose edges pass through the red dotted segments.
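The closed form above translates directly into code. The following Python sketch is our reconstruction (not the authors' MATLAB implementation): it computes V_x, V_y, C_{x,y}, selects the angle maximizing f via θ* = atan2(ν, µ)/4 with µ = (V_x − V_y)² − 4C²_{x,y} and ν = 4(V_x − V_y)C_{x,y}, and returns the bounding rectangle aligned with θ*:

```python
import math

def range_mbr(pts):
    """RANGE-MBR sketch: choose the orientation maximizing
    f(theta) = mu*cos(4*theta) + nu*sin(4*theta), then return the
    bounding rectangle of the points aligned with that orientation."""
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    vx = sum((x - mx) ** 2 for x, _ in pts) / n
    vy = sum((y - my) ** 2 for _, y in pts) / n
    c = sum((x - mx) * (y - my) for x, y in pts) / n
    mu = (vx - vy) ** 2 - 4 * c * c
    nu = 4 * (vx - vy) * c
    theta = math.atan2(nu, mu) / 4.0           # maximizer of f in one period
    u = (math.cos(theta), math.sin(theta))
    v = (-u[1], u[0])
    s = [x * u[0] + y * u[1] for x, y in pts]  # projections on Delta
    t = [x * v[0] + y * v[1] for x, y in pts]  # projections on Delta-perp
    s1, s2, t1, t2 = min(s), max(s), min(t), max(t)
    area = (s2 - s1) * (t2 - t1)
    verts = [(a * u[0] + b * v[0], a * u[1] + b * v[1])
             for a, b in [(s1, t1), (s2, t1), (s2, t2), (s1, t2)]]
    return theta, area, verts
```

Writing f(θ) = √(µ² + ν²) cos(4θ − atan2(ν, µ)) shows that 4θ* = atan2(ν, µ) indeed picks the maximum, so no explicit sign check on λ is needed in the non-degenerate cases.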
Proposition 5.1: If the elements of M_n are the vertices of a regular n-sided polygon, then V_x = V_y and C_{x,y} = 0.
Proof: In the general case, the vertices of a regular polygon are uniformly distributed over a circle with radius r and center G equal to the barycenter of M_n. Thus, without loss of generality, we can assume that r = 1, G = (0, 0)^t, and the k-th vertex of M_n is M_k = (cos(2πk/n), sin(2πk/n))^t, k = 0, ..., n − 1. Consider the complex sequence (z^k)_{0≤k≤n−1}, where z = exp((2π/n)i) and i is the imaginary unit. Then Σ_{k=0}^{n−1} z^k = 0 and Σ_{k=0}^{n−1} z^{2k} = 0, so the means of the x- and y-coordinates vanish. Since cos(2x) = 2cos²(x) − 1 = 1 − 2sin²(x), we get V_x = (1/n) Σ_{k=0}^{n−1} cos²(2πk/n) = 1/2 and, likewise, V_y = 1/2. Moreover, using 2cos(x)sin(x) = sin(2x), we deduce that C_{x,y} = (1/2n) Σ_{k=0}^{n−1} sin(4πk/n) = 0.
• For case 2, the application of an affine isometry to the set of points M_n makes it possible to migrate to another case for which the angle θ is well determined. We then obtain the solution of the initial problem by applying the inverse isometry to the solution of the transformed problem, since an isometry preserves areas.
• For case 6, the use of an extra point M_{n+1} belonging to the convex hull of the set M_n allows one to migrate to another case for which the angle θ is well determined. Another alternative consists in asking the user to click on points of the object's outline until we get out of this state.
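Proposition 5.1 is easy to check numerically. The short Python sketch below (ours) computes the empirical variances and covariance of the vertices of a regular n-gon inscribed in the unit circle:

```python
import math

def polygon_stats(n):
    """Empirical V_x, V_y, C_xy for the vertices of a regular n-gon
    inscribed in the unit circle centered at the origin."""
    pts = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
           for k in range(n)]
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    vx = sum((x - mx) ** 2 for x, _ in pts) / n
    vy = sum((y - my) ** 2 for _, y in pts) / n
    c = sum((x - mx) * (y - my) for x, y in pts) / n
    return vx, vy, c
```

For every n ≥ 3 the result is V_x = V_y = 1/2 and C_{x,y} = 0 up to rounding error, which is exactly the degenerate configuration (case 6) where the angle θ* is indeterminate.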
It is not always possible to determine the sign of λ directly, since the quantity (V_x − V_y)² − 4C²_{x,y} may be negative because of C²_{x,y}. To get rid of C²_{x,y}, we have just to follow the same reasoning on a set of points whose coordinates x′_{1:n} = (x′_1, ..., x′_n) and y′_{1:n} = (y′_1, ..., y′_n) are uncorrelated. Let Γ be the covariance matrix of the elements of M_n, centered at M̄ = (x̄, ȳ)^t. Let Γ = U∆U^t be an eigendecomposition of Γ, where U ∈ O_2(R) (the set of 2 × 2 orthogonal matrices) and ∆ is diagonal. In the frame defined by the columns of U, the coordinates of the points of M_n are uncorrelated, so C_{x′,y′} = 0 and, by (14), the function f has a unique critical point θ* = 0. Using Lemma 5.1, f(0) is the maximum of f(θ).
Case 6: to get out of the indeterminacy of case 6, we propose to add to the set M_n an artificial point M_{n+1} located in the convex hull of M_n, so that the empirical covariance matrix of the new set M_{n+1} = M_n ∪ {M_{n+1}} differs from a scalar matrix. Assume that (x_{n+1}, y_{n+1})^t are the Cartesian coordinates of M_{n+1} with respect to B_G = (G, i, j), where G = (x̄, ȳ)^t is the barycenter of M_n. Without any loss of generality, we assume that x_{1:n+1} (resp. y_{1:n+1}) is the vector of the x-coordinates (resp. y-coordinates) of the points of M_{n+1} with respect to B_G = (G, i, j). Denote by V̄_x = Var(x_{1:n+1}) and V̄_y = Var(y_{1:n+1}). Lemma 5.2 gives the relation between V̄_x, V̄_y, C̄_{x,y} and V_x, V_y, C_{x,y} respectively.
Lemma 5.2: If V_x = V_y and x̄ = ȳ = C_{x,y} = 0, then

V̄_x − V̄_y = (n/(n+1)²)(x²_{n+1} − y²_{n+1}) and C̄_{x,y} = (n/(n+1)²) x_{n+1} y_{n+1}.

Proof: The means of x_{1:n+1} and y_{1:n+1} are x̄′ = x_{n+1}/(n+1) and ȳ′ = y_{n+1}/(n+1). On the one hand, since x̄ = 0, we have Σ_{i=1}^{n} x_i² = nV_x, hence V̄_x = (1/(n+1)) Σ_{i=1}^{n+1} x_i² − x̄′² = nV_x/(n+1) + n x²_{n+1}/(n+1)², and similarly for V̄_y; subtracting and using V_x = V_y gives the first identity. On the other hand, since Σ_{i=1}^{n} x_i y_i = 0, we have C̄_{x,y} = (1/(n+1)) Σ_{i=1}^{n+1} x_i y_i − x̄′ ȳ′ = n x_{n+1} y_{n+1}/(n+1)².
Let f̄(θ) be the function defined from M_{n+1} in the same way that f(θ) was defined from M_n in (9). Using the identities above, f̄ depends only on x_{n+1} and y_{n+1}, and its critical point is well determined as soon as M_{n+1} avoids the lines D_1, ..., D_4 through G on which the angle remains indeterminate. In summary, any point M_{n+1} = (x̄ + x_{n+1}, ȳ + y_{n+1})^t, defined as a convex combination of M_1, ..., M_n and such that x_{n+1} y_{n+1} ≠ 0, overcomes the indeterminacy problem posed by case 6. However, the resulting RANGE-MBR(M_n) depends on the values of x_{n+1} and y_{n+1}. We tested this technique on a set M_n composed of the vertices of a regular n-sided polygon, and we observed that the choice of M_{n+1} influences the returned rectangle.

VI. ROBUST SEMI-AUTOMATIC ANNOTATION
In this section, we provide a new approach for building object detection datasets, based on both an MBR algorithm and threshold angles. This method is semi-automatic because it consists of a manual step followed by a computer-assisted step. In addition, this method is robust because the bounding rectangle generated by our algorithm is insensitive to the dexterity of the annotator.
• The problem of the MBR was dealt with in Section V. If the user needs an optimal annotation in the sense given by Definition 7.1, then he calls the RC-MBR algorithm. Otherwise, he calls the RANGE-MBR algorithm for a quasi-optimal annotation. However, a gain in optimality comes at the cost of a higher complexity.
• By default, the threshold angle is equal to zero. It can also be adjusted by experts in object detection, or it can be defined experimentally as the largest angle between the ground truth bounding box and the positive x-axis that gives no significant difference between the performance of horizontal and rotated detectors when tested on rotated objects. The experimental determination of the threshold angle requires HOD and ROD object detectors, as well as already existing ROD datasets.

A. Properties of the robust semi-automatic annotation method
By construction, the robust semi-automatic annotation method ensures the following properties:
1) The bounding rectangle generated by our approach is quasi-optimal, in the sense that it covers the whole object and its area is close to the area of the MBR enclosing the object.
2) The angle of the rectangle is determined by an algorithm based on some relevant points collected on the contour of the object.
3) The bounding rectangle is insensitive to the annotator, provided all relevant points on the object have been selected.
4) The determination of the bounding rectangle requires O(n) elementary operations, where n is the number of relevant points.
5) The method allows the user to build simultaneously, from an image bank, two datasets: one for horizontal objects and another for rotated objects.
In summary, the robust semi-automatic annotation provides a simple solution to all the drawbacks mentioned in Section III, which are inherent in the older annotation methods. Let:
• θ_0 be a threshold angle fixed by the user (by default, θ_0 = 0),
• M_n be the set of relevant points selected on the contour of the object,
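The routing rule implied by the threshold angle θ_0 can be sketched as follows (a hypothetical Python fragment; the function name, the snapping of near-horizontal boxes to axis-aligned form, and the handling of angles near multiples of π/2 are our assumptions, not the paper's exact procedure):

```python
import math

def dispatch_annotation(theta, vertices, theta0=0.0):
    """Route one object to the HOD or ROD dataset, given the angle `theta`
    of the rectangle returned by RANGE-MBR (or RC-MBR) and its `vertices`."""
    # acute deviation of the rectangle from the horizontal axis, in [0, pi/4]
    dev = abs(math.remainder(theta, math.pi / 2))
    if dev <= theta0:
        xs = [x for x, _ in vertices]
        ys = [y for _, y in vertices]
        # horizontal dataset: snap to the axis-aligned bounding box
        return 'HOD', (min(xs), min(ys), max(xs), max(ys))
    return 'ROD', vertices
```

With the default θ_0 = 0, only rectangles that are exactly axis-aligned (including those at 90 degrees) go to the horizontal dataset; raising θ_0 routes mildly tilted objects there as well.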

VII. EXPERIMENTAL STUDY
This experimental study was carried out exclusively with the MATLAB R2007b software. We implemented the RC-MBR, PCA-MBR, and RANGE-MBR algorithms in the MATLAB language, and we wrote a script (see the Appendix) which:
1) reads the image and then displays it,
2) asks the annotator to click on the most relevant points of the object (M_n is a 2 × n matrix),
3) determines the rectangle A corresponding to RC-MBR(M_n),
4) determines the rectangle B corresponding to RANGE-MBR(M_n),
5) draws the two rectangles in red and green, respectively.
The images used in experiments VII-B, ..., VII-D are free of rights and were collected on the Internet. Moreover, the optimality criterion of an annotation is given in Definition 7.1.
Definition 7.1 (optimal annotation): A rectangle A = [A 1 , . . . , A 4 ] enclosing an object is said to be optimal, if it fulfills the conditions (i) and (ii) below: (i) the rectangle A covers the whole object, (ii) the area of A is minimal.

A. Experiment 1
This experiment consists of studying the optimality condition (ii) of the RANGE-MBR algorithm. Equation (25) describes how to generate the coordinates of the vertices M_n = {(x_k, y_k); k = 1, ..., n} of a random n-sided polygon, where each rand is a call to a generator of uniform random numbers on the interval [0, 1]. The parameter v_y, called the factor of dilation, controls the aspect ratio of the polygon, as shown in Figure 10. The more v_y exceeds 1, the more the polygon is dilated in the direction of the y-axis. Note that such a polygon is not necessarily convex, as shown in Figure 9. This is also the case for any polygon whose vertices are defined by the relevant points of an object.
Fig. 9. P_1 and P_2 are two random heptagons, generated according to Equation (25) with v_y = 4. P_1 is concave while P_2 is convex.
Since, 1) the optimal annotation criterion that we have chosen results in a rectangle which has a minimum area and which covers the maximum of visible parts of the object, 2) RC-MBR is the fastest algorithm that determines the smallest rectangle enclosing a set of points, it seems natural to consider this algorithm as a reference in the comparative study that we carried out.
For each value of the pair (n, v_y), we generate r = 10000 random n-sided polygons. For each polygon M_n^(i), i = 1, ..., r, we determine the relative error e_i between the areas of RANGE-MBR(M_n^(i)) and RC-MBR(M_n^(i)), as well as the CPU times t̂_i and t_i used by the algorithms RANGE-MBR and RC-MBR to compute RANGE-MBR(M_n^(i)) and RC-MBR(M_n^(i)) respectively. Finally, we denote by ē the mean and by Std(e) the standard deviation of the sequence (e_i)_{1≤i≤r}, and by t̂ and t the means of the sequences (t̂_i)_{1≤i≤r} and (t_i)_{1≤i≤r}. Figures 11 and 12 represent ē and Std(e) versus n and v_y, for n ∈ {4, ..., 20} and v_y ∈ {1, ..., 4}. We have retained the ranges of values [4, 20] for n and [1, 4] for v_y because, in our estimation, they are those which correspond most to reality. In practice, objects that have more than 20 relevant points or fewer than 4 relevant points are rare.
We deduce from Figures 11 and 12 that: 1) the more the polygon is dilated in one direction, the more accurate and precise the RANGE-MBR algorithm is; 2) the more vertices the polygon has, the more accurate and precise the RANGE-MBR algorithm is. We ran other simulations with larger values of n and v_y and reached the same conclusions. The parameter v_y controls the dilation of the polygon in the vertical direction; in fact, we could have chosen any other direction, since the RANGE-MBR algorithm is not sensitive to the direction of dilation. On the other hand, it is sensitive to the number of vertices of the polygon and to its dilation.
Out of the 680000 generated polygons, we did not encounter any case for which V_x = V_y and C_x,y = 0. In addition, the algorithm RANGE-MBR is about 9 times faster than the algorithm RC-MBR. The mean CPU times of RANGE-MBR and RC-MBR are both independent of n and v_y: about 1.38 × 10^-4 seconds for RANGE-MBR, and 1.20 × 10^-3 seconds for RC-MBR. Since the complexity of RANGE-MBR is O(n) and that of RC-MBR is O(n ln(n)), a dependence of the CPU time on n would only be observed for sufficiently large n. Based on the response time per click given in [35], and considering the mean CPU time of the RANGE-MBR algorithm, we can state that determining a bounding rectangle requires 2.5 + 1.5(n − 1) seconds, where n is the number of relevant points collected on the contour of the object.
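The timing estimate can be written as a small helper; the constants come from the formula as stated (the mean RANGE-MBR CPU time is negligible by comparison), and the function name is ours.

```python
def annotation_time_seconds(n_points):
    # Expected time to annotate one object: 2.5 s for the first relevant
    # point plus 1.5 s for each of the remaining n - 1 points. The mean
    # RANGE-MBR CPU time (~1.4e-4 s) is negligible by comparison.
    if n_points < 1:
        raise ValueError("at least one relevant point is required")
    return 2.5 + 1.5 * (n_points - 1)
```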

B. Experiment 2
We compared RANGE-MBR and PCA-MBR on sets of vertices of regular n-sided polygons, with n = 4, ..., 20. We observe that the relative error of the RANGE-MBR algorithm is always equal to 0, while that of the PCA-MBR algorithm is different from 0 for the values of n reported in Table I. Although case 6 does not contain only regular polygons, this experiment shows that the RANGE-MBR method achieves optimality on regular polygons and that, for the other shapes (which fall into case 6), it offers a better solution than the one obtained by PCA-MBR.
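The paper's PCA-MBR implementation is not shown here, but a common PCA-based heuristic aligns the rectangle with the principal axes of the vertex covariance matrix. The sketch below (hypothetical function name, not the paper's code) also illustrates why such a heuristic can fail on regular polygons: for a square rotated by 45°, the covariance matrix is isotropic, so PCA provides no orientation information and the returned rectangle is suboptimal.

```python
import math

def pca_mbr_area(points):
    # PCA-based bounding rectangle heuristic: rotate the points into the
    # frame of the principal axes of their covariance matrix, then take
    # the axis-aligned bounding box there.
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Entries of the 2x2 covariance matrix
    vxx = sum((x - mx) ** 2 for x, _ in points) / n
    vyy = sum((y - my) ** 2 for _, y in points) / n
    vxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Orientation of the first principal axis
    theta = 0.5 * math.atan2(2 * vxy, vxx - vyy)
    c, s = math.cos(theta), math.sin(theta)
    u = [((x - mx) * c + (y - my) * s, -(x - mx) * s + (y - my) * c)
         for x, y in points]
    w = max(p[0] for p in u) - min(p[0] for p in u)
    h = max(p[1] for p in u) - min(p[1] for p in u)
    return w * h
```

For the diamond with vertices (±1, 0), (0, ±1), the covariance is isotropic, theta defaults to 0, and the heuristic returns area 4, whereas the optimal rectangle has area 2.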

C. Experiment 3
This experiment studies the effect of the dilation factor on the performance of the RANGE-MBR algorithm. Figure 13 represents a basic experiment, with the relevant points M_n colored in yellow. We observe that RANGE-MBR(M_n) is close to RC-MBR(M_n). Although the bird on the left is smaller than the bird on the right, the relative error e equals 0.65% for the bird on the right and 3.12% for the bird on the left. These scores are expected, since the bird on the right has a more elongated shape than the bird on the left, and RANGE-MBR is more effective on dilated objects.

D. Experiment 4
This experiment confirms the conclusion obtained in Experiment 3. Figure 14 contains two mangoes: the one on the left is almost circular, while the one on the right is clearly elongated. The relevant points M_n are colored in yellow; the red rectangle corresponds to RANGE-MBR(M_n), while the green rectangle corresponds to RC-MBR(M_n). The relative error e equals 6.93% for the left mango and 1.86% for the right mango. This real example agrees with the simulation results shown in Figure 11: the mango on the right is more dilated than the one on the left, which is why the relative error for the circular mango is greater than that for the oval mango.

E. Experiment 5
In order to verify the robustness of our annotation method with respect to the annotator, we asked a colleague to click on the relevant points of the two mangoes. Figure 15 illustrates the result of this experiment: the relative error e equals 8.92% for the left mango and 2.32% for the right mango. The relative errors reported in Figure 15 are larger than those reported in Figure 14. We explain this difference by the number of points used in each experiment: the more points we use, the more the RANGE-MBR algorithm reduces the relative error. This real example agrees with the simulation results shown in Figure 11.

F. Experiment 6
This experiment compares our annotation method to the FBB method introduced in Section II-B. For this, we chose a random image from the TUT indoor dataset and annotated the objects it contains with FBB, RC-MBR, and RANGE-MBR. Note that the FBB annotations use blue rectangles, the RC-MBR annotations use green rectangles, the RANGE-MBR annotations use red rectangles, and the relevant points are marked in yellow.
Based on Figure 16 and Table II, it can be seen that the annotation by the FBB method is not optimal. Indeed,
• for the upper extinguisher and the exit sign, condition (i) is violated;
• for the lower extinguisher, condition (ii) is violated.
In addition, the annotation by the RANGE-MBR method satisfies condition (i), and the relative error between the areas of RANGE-MBR and RC-MBR is 12% for the exit sign, 3% for the upper extinguisher, and 0.3% for the lower extinguisher. We can conclude that condition (ii) is almost satisfied by the RANGE-MBR method.
• The values of the relative error agree with the simulation results presented in Figure 11: in terms of dilation, the lower extinguisher is the most dilated, followed by the upper extinguisher, then the exit sign.
• The exit sign and the upper extinguisher have regular shapes, while the lower extinguisher has an irregular shape. We have already underlined, in Section V-A1, that the RANGE-MBR method is sensitive to regular shapes. This example therefore illustrates the situation of case 6.
For all these reasons, the annotation of the lower extinguisher by the RANGE-MBR method is the best.

G. Experiment 7
This experiment highlights the influence of the annotation method on the computation of the IoU. Figure 17 corresponds to image P1128 from the DOTA dataset, in which the objects of interest are the airplanes. We use the BBAVectors detector [25] to generate the red prediction bounding boxes. The ground truth bounding boxes used in [25] are colored green, and the blue rectangles correspond to the annotations of the airplanes by the RANGE-MBR method. For each i = 1, 2, 3, we denote by IoU_i^g (resp. IoU_i^b) the Intersection over Union between the green (resp. blue) rectangle enclosing airplane A_i and the corresponding red rectangle. The results of this experiment are reported in Table III. The use of minimum ground truth rectangles best reflects the performance of an object detector.
In view of the results of Table III, it is reasonable to rely on the results of the second row rather than those of the first row.
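The IoU values in Table III depend directly on which ground-truth rectangles are used. For reference, the standard computation for axis-aligned boxes is sketched below; rotated boxes, as used in DOTA annotations, require a polygon intersection instead, so this is illustrative rather than the paper's code.

```python
def iou(box_a, box_b):
    # Intersection over Union for axis-aligned boxes given as
    # (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

A tighter ground-truth rectangle shrinks the union term, which is why the choice of annotation method shifts the reported IoU.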

VIII. CONCLUSION
In this article, we have developed a new robust and semi-automatic object annotation method. Based on the experimental study, we can state that Robust Semi-Automatic Annotation: 1) is quasi-optimal in the sense given by Definition 7.1, and its optimality increases with dilated objects, which is the case for most objects appearing in images collected by aerial photography; 2) is fast and robust, in the sense that the bounding rectangle is insensitive to the dexterity of the annotator; 3) is easy to implement and could easily be integrated into annotation platforms; 4) is sensitive to the annotation of objects exhibiting symmetry with respect to one or more directions; in this case, the relevant points should not follow the same symmetry, so that the generated rectangle is not far from the optimal rectangle. The threshold angle, as defined in Section VI, could be the subject of an experimental study based on a large-scale rotated object dataset and advanced horizontal and rotated object detectors. Once the threshold angle is well estimated, the Robust Semi-Automatic Annotation algorithm can be used to build, from an image bank, two datasets simultaneously: one for horizontal objects and another for rotated objects.

IX. DECLARATIONS
1) The author declares that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
2) The author confirms having obtained the agreement of the scientific committee of Qassim University, to carry out all the experiments of the experimental study.
3) The author confirms that all the experiments were carried out in accordance with relevant guidelines and regulations.
4) The author states that the scientific committee of Qassim University approves all the experiments in this section.
5) The author confirms that informed consent was obtained from all subjects.
6) The author assumes responsibility for all his statements.