Macaque neuron instance segmentation only with point annotations based on multiscale fully convolutional regression neural network

In the field of biomedicine, instance segmentation / individualization is important in analyzing the number, the morphology and the distribution of neurons in whole slide images. Traditionally, biologists apply the stereology technique to manually count the number of neurons in regions of interest and estimate the number in anatomical regions or the entire brain. This is very tedious and time-consuming. In this paper, we propose a multiscale fully convolutional regression neural network combined with a competitive region growing technique to individualize size-varying and touching neurons in the major anatomical regions of the macaque brain. Given that neuron instance or contour annotations are infeasible to obtain in certain regions, such as the dentate gyrus where thousands of touching neurons are present, we ask an expert to perform point annotations at the center location of neurons (noted as neuron centroids) for training. Thanks to the multiscale resolution achieved by parallel multiple receptive fields and different network depths, our proposed network succeeds in detecting the centroids of size-varying and touching neurons. Competitive region growing is then applied on these centroids to achieve neuron instance segmentation. Experiments on the macaque brain data suggest that our proposed method outperforms the state-of-the-art methods in terms of neuron instance segmentation performance. To our knowledge, this is the first deep learning research work to individualize size-varying and touching neurons using only point annotations in major anatomical regions of the macaque brain.


Introduction
Accurate neuron instance segmentation / individualization is important for neuron analysis, providing useful information such as the number, the morphology and the distribution of neurons, which helps to study brain development, aging and cytoarchitecture [3,4,17,19,28,30,45,48]. In pathological brains, for example, it helps to quantify neuron loss [39,42,44]. In practice, the stereology technique [11,15,47], the gold standard in the field of neuroscience, is used to estimate the number of neurons in anatomical regions. However, this manual operation is cumbersome and time-consuming, and the accuracy of the estimation depends on the expert's experience. Moreover, it is difficult to estimate the accuracy in certain heterogeneous regions, such as the hippocampus. Therefore, accurate automated methods are needed. So far, a large number of methods have been proposed, but most of them have not addressed the individualization of large numbers of size-varying (Fig. 1a1) and touching neurons (Fig. 1a2). Mathematical morphology [27,37] often leads to under-individualization of numerous touching neurons because of the lack of background pixels between neuron pixels. For instance, when neurons of different sizes are aggregated, missed neurons or under-individualization often occur due to an inappropriate structuring element size. Approaches based on concavity detection [5,18,31,35,55] cannot give accurate individualization results either, because the noise present in neuron contours hampers the computation of clear concavity points. Graph cut methods [2,9,13,23] often cause over- and under-individualization problems since the inter- and intra-neuron pixel relations in numerous touching neurons are difficult to model. Region growing methods [1,58] can correctly separate touching cells if seeds are correctly defined.
The watershed algorithm [8,51] is often used to separate neurons, but it can also lead to over- and under-individualization due to noise. In recent years, the emergence of deep learning techniques has led to several applications, such as cell counting [49,54] and segmentation [20,41,52] from histology sections. Since it is almost impossible to obtain highly accurate manual labels in the field of cell segmentation, researchers tend to use point annotations for training. Xie et al. [49] proposed FCRN (Fully Convolutional Regression Network) to regress a cell spatial density map to detect cell centroids, but in roughly rounded cell clumps, this method cannot give correct detection results. Yoo et al. [52] proposed PseudoEdgeNet to guide the network to recognize nuclei edges with point annotations. However, the nuclei edge information used for training is not optimal for numerous touching neurons; the examples shown in their paper merely concerned single individual nuclei. Li et al. [20] proposed a weakly supervised method for mitosis detection based on expansion of point annotations. They used FCN [22] to extract mitosis features and designed a concentric loss function which considers mitosis pixels inside a disk of radius r and background pixels outside a disk of radius R (R > r) for training. But this method cannot separate touching cells, and tuning r and R is difficult for neurons of various sizes. U-net [10,36] and U-net++ [57] can be applied for cell counting, detection and segmentation, but with point annotations only, their accuracy is limited in the case of size-varying and large numbers of touching neurons.
In this paper, we propose a new image processing protocol which aims at automatically separating size-varying and touching neurons. First, the point annotations are in practice disks with a predefined radius placed at the center of neurons, as shown in Fig. 1b1 & b2. They are then processed by a Gaussian filter to obtain a probability map which describes the probability of each pixel being a neuron centroid. This map is used as ground truth to train the neural network, as shown in Fig. 1c1 & c2. The higher the intensity of a pixel in the probability map, the more probable it is a neuron centroid. Then, the predicted probability map of neuron centroids is obtained by applying the proposed multiscale fully convolutional regression neural network. Next, a post-processing step with a series of mathematical morphology operations (erosion, reconstruction, closing) is applied to locate neuron centroids. Finally, based on the neuron centroids, the competitive region growing method proposed by You et al. [53] is applied to obtain neuron individualization results. We applied our method to the major anatomical regions of the macaque brain. The results demonstrate that our method outperforms the state-of-the-art methods, such as FCRN [49], U-net [36], U-net++ [57] and multiscale CNN [54].

Related work

NeuN-stained neurons
NeuN expression is observed in most neuronal cell types throughout the nervous system [25]. An image of NeuN-stained neurons in the cortex of a macaque brain is shown in Fig. 2a, and the contours based on the intensity of this image are shown in Fig. 2b. Since the distribution of neuron pixel intensity is similar to a Gaussian distribution, a Gaussian filter is used for denoising (Fig. 2c). It can be clearly seen that the neurons show obvious dark areas at their center positions. Meanwhile, from the edge of a neuron to its center, the pixel intensity varies from high to low accordingly. This kind of image allows studying the neuron number, morphology and distribution at the microscopic level of the macaque brain, which is a model closer to the human brain than murine models [29].
In order to perform this study, neuron individualization is crucial, and numerous algorithms have been proposed, mainly based on unsupervised segmentation, fully supervised learning and weakly supervised learning methods.

Unsupervised cell segmentation methods
Unsupervised methods perform instance segmentation by analyzing cell features (intensity, shape, texture, etc.). He et al. [13] proposed to segment touching cells based on features including cell size, spatial location, intervening contours and concavity detection. Zhang et al. [56] used the spatial relationship of the non-overlapping and overlapping areas as well as an Overlapping Translucency Light Transmission Model to segment overlapping cervical cells. Ma et al. [24] proposed a saliency-based method to detect oligodendrocyte progenitor cells and then used a marker-controlled watershed algorithm to segment them. You et al. [53] applied a series of Gaussian filters at multiple scales for denoising and then performed neuron instance segmentation by applying a min-max filter and a competitive region growing technique. By analyzing Dice values among the series of results, they selected the optimal Gaussian filter and reiterated the instance segmentation.

Fully supervised learning-based methods
For highly accurate cell segmentation, fully supervised learning-based methods are the most important. U-net is a classical method that achieved top ranks in biomedical data-analysis benchmarks [10,36,40]. The network can learn either from cell instance segmentation masks marked by an expert or from contour pixels relying on cell outlines drawn by an expert. Due to its classical U-shaped and skip connection design, numerous methods have been derived from it. Rad et al. [32] proposed a multi-resolution ensemble of stacked dilated U-nets for inner cell mass segmentation. Vuola et al. [43] performed nuclei segmentation by combining the predictions of an ensemble of U-net and Mask R-CNN. Huang et al. [14] used U-net and an improved level-set to segment overlapping cervical smear cells. There are also studies which learn cell characteristics through a transformation of the ground truth. Naylor et al. [26] addressed cell segmentation as a regression task on a distance map, computed for each foreground pixel as the chessboard distance to the closest background pixel. However, these methods require cell instance segmentation masks or cell contour masks for training, which is time-consuming and labor-intensive work. Moreover, in regions where cells are highly aggregated, experts cannot identify cell instances or contours.

Weakly supervised learning-based methods
The staining of cells often shows a certain pattern; for NeuN-stained neurons, the staining intensity gradually increases from the center of the neuron to its edge. Therefore, experts can relatively easily mark the center location of the neurons, i.e., the neuron centroids. Once centroids are marked, researchers can apply weakly supervised learning-based methods to cell segmentation. Li et al. [20] expanded the weak label of a mitosis centroid into a novel label with concentric circles and designed a concentric loss to train the network to perform mitosis segmentation. The architecture proposed by Yoo et al. [52] is composed of a segmentation network (FPN [21] + ResNet-50 [12]) and PseudoEdgeNet. The former is used to localize nuclei as blobs and the latter is used to extract fine boundary information without edge annotations by applying four convolution layers and an attention module of an FPN with a ResNet-18. The methods mentioned above perform semantic segmentation rather than instance segmentation. It is very difficult to achieve cell instance segmentation using weakly supervised learning, so such methods are rare. However, weakly supervised learning methods [6,38,46] from the domain of natural image processing can serve as references in medical image processing to achieve cell instance segmentation.

Cell detection with point annotations
Instead of directly performing cell instance segmentation from point annotations, cell detection can first be conducted. Xue et al. [50] applied a CNN to regress a fixed-length vector encoded by a random Gaussian matrix; a k-sparse signal is acquired by sensing cell locations via the random Gaussian matrix, and cell locations are then recovered through L1 minimization / compressed sensing theory. Kainz et al. [16], Xie et al. [49], Raza et al. [33] and You et al. [54] regarded cell detection as a regression problem. The first used a Random Forest to regress a function of the distance to the center of the closest cell. The last three designed, respectively, an FCRN, MapDe and a multiscale CNN to regress a cell spatial density / heat map. They then identified cell centers by computing local extrema or performing a deconvolution operation on the generated score map. With the detected cell centroids, methods like region growing can be applied to achieve cell instance segmentation.

Biological datasets
NeuN-stained 40-µm-thick serial coronal sections of a healthy macaque brain were digitized using an AxioScan.Z1 (Zeiss) with an in-plane resolution of 0.22 µm/pixel (×20 magnification). Two datasets from You et al. [53] were used in this study: (1) a segmentation dataset and (2) an individualization dataset. The segmentation dataset is used to obtain a neuron mask that is robust to the staining intensity differences existing between brain sections. The individualization dataset is used to learn to detect neuron centroids and perform neuron individualization. In order to use deep learning methods, each image of size 5000 × 5000 pixels in the individualization dataset was cropped into 144 small images of size 512 × 512 pixels, as shown in Fig. 3, with 104 overlapping pixels between any two adjacent images in both the horizontal and vertical directions. The fifty images of the individualization dataset were thus cropped into a total of 7200 images. In this work, a training set of 152 representative images (i.e., about 2% of the individualization dataset) from different anatomical regions was used to train the deep learning network, and the 50 images reconstructed from the 7200 predicted images were used to validate it. Each neuron was identified by an expert by manually marking a disk (5 pixels in radius) at its center location, whose intensity is the lowest (darkest) among its neighbors, as shown in Fig. 1b1 & b2.
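With 512 × 512 patches and a 104-pixel overlap, adjacent patch origins are 408 pixels apart, giving 12 positions per axis and hence 12 × 12 = 144 patches per 5000 × 5000 image. This cropping can be sketched as follows (the function name and the row-major ordering are illustrative, not taken from the original code):

```python
import numpy as np

def crop_patches(image, patch=512, overlap=104):
    """Crop a square image into overlapping patches (row-major order).

    With patch=512 and overlap=104 the stride is 408 pixels, so a
    5000 x 5000 image yields 12 x 12 = 144 patches."""
    stride = patch - overlap
    side = image.shape[0]
    starts = list(range(0, side - patch + 1, stride))
    tiles = [image[y:y + patch, x:x + patch] for y in starts for x in starts]
    return np.stack(tiles), starts
```

Applied to a 5000 × 5000 array, the returned stack has shape (144, 512, 512) and the last patch origin is 4488, so 4488 + 512 = 5000 covers the image exactly.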

Preprocessing of point annotations
The expert manually marked a disk (5 pixels in radius) at the center of each neuron, as shown in Fig. 1b1 & b2. In order to obtain more accurate detection results, the expert annotations were preprocessed by applying a Gaussian filter (σ = 3 pixels). A probability map was then obtained describing the probability of each pixel being a centroid, as shown in Fig. 1c1 & c2. These images are used as the ground truth of the proposed network. The input of the proposed network is the normalized grayscale image I, computed from the red, green and blue channels (R, G and B) of the original color image following [7].
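The ground-truth generation described above can be sketched as follows, assuming the blurred map is rescaled so that its peak value equals 1 (the exact normalization is not specified in the text, and the function name is illustrative):

```python
import numpy as np
from scipy import ndimage

def centroid_probability_map(shape, centroids, disk_radius=5, sigma=3):
    """Regression target: 5-pixel-radius disks at the annotated centroids,
    blurred by a Gaussian filter (sigma = 3 pixels) and rescaled so the
    peak probability equals 1 (the rescaling is an assumption)."""
    target = np.zeros(shape, dtype=np.float32)
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    for cy, cx in centroids:
        # paint a binary disk at each expert-annotated centroid
        target[(yy - cy) ** 2 + (xx - cx) ** 2 <= disk_radius ** 2] = 1.0
    target = ndimage.gaussian_filter(target, sigma=sigma)
    if target.max() > 0:
        target /= target.max()
    return target
```

The resulting map peaks at the annotated centers and decays smoothly toward the background, matching the description in Fig. 1c1 & c2.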

Neuron centroids prediction
The neuron diameter ranges between 5 and 30 µm (i.e., between 22 and 137 pixels at a resolution of 0.22 µm/pixel) [3]. Using the classical U-net or its derivatives with deeper layers, neurons of various sizes can be detected. However, as the network deepens, the features of small-sized neurons gradually disappear. Especially for the large numbers of aggregated small-sized neurons usually present in the dentate gyrus, a network with deeper layers leads to numerous missed detections of such neurons.
Research on the automated detection of highly aggregated neurons has shown that parallel multiple receptive fields can effectively capture the features of small-sized neurons [54]. Therefore, a module that learns the features of small-sized neurons through parallel multiple receptive fields is integrated into the shallow structure of the deep learning model. In addition, attaching different decoders to the encoders at different network depths favors the supervision of neurons of different sizes. This motivates us to propose a network with parallel multiple receptive fields, optimal layer depth and multiple decoders to detect single individual, size-varying and touching neurons. Figure 4 presents the architecture of the proposed network, which is composed of convolution, ReLU (Rectified Linear Unit), max pooling, and sigmoid functions. All the kernels used for the convolution operations are of size 1 × 1, 2 × 2 or 3 × 3 pixels. Lxy denotes a layer, where x is the depth in the vertical direction and y the depth in the horizontal direction.
The proposed network consists of two parts: the parallel multiscale encoder network and the decoder network. L11 and L21 are the modules of parallel multiple receptive fields, inspired by You et al. [54], which better learn the features of numerous aggregated neurons. The path L11 → L21 → L31 → L22 → L13 with skip connections can supervise neuron features in 17 receptive fields of sizes from 1 to 5 pixels in steps of 1 pixel, and from 6 to 14 pixels and from 20 to 32 pixels in steps of 2 pixels. As the parallel multiple receptive field modules are constructed in the shallow structure, the deep structures are affected and also contain parallel multiple receptive field modules. The path L11 → L21 → L31 → L41 → L32 → L23 → L14 with skip connections supervises larger neuron features in 7 more receptive fields of sizes from 56 to 68 pixels in steps of 2 pixels. The path L11 → L21 → L31 → L41 → L51 → L42 → L33 → L24 → L15 with skip connections supervises even larger neuron features in 7 more receptive fields of sizes from 128 to 140 pixels in steps of 2 pixels. The path L11 → L21 → L31 → L41 → L51 → L61 → L52 → L43 → L34 → L25 → L16 with skip connections supervises the largest neuron features in 7 more receptive fields of sizes from 272 to 284 pixels in steps of 2 pixels. The maximum receptive field size is more than twice the largest neuron size, which helps to supervise the relation between adjacent touching neurons. The network was trained by combining four separate loss functions associated with the four paths. We applied a sigmoid function to the outputs (i.e., L13, L14, L15 and L16) of the 4 paths. The detection loss was defined as the sum of the four cross-entropy terms:

L = - \sum_{i=1}^{4} \frac{1}{N} \sum_{n=1}^{N} \left[ y_{i,n} \log p_{i,n} + (1 - y_{i,n}) \log (1 - p_{i,n}) \right]

where y_{i,n} is the target label and p_{i,n} is the predicted probability of the n-th pixel being of the neuron centroid class in the i-th path, and N is the number of pixels in one batch. At the inference phase, the predictions from the 4 paths were averaged.
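A minimal numpy sketch of this summed loss, applied to the four sigmoid outputs; the function name is illustrative, and in practice the loss would be computed on the framework's tensors during training:

```python
import numpy as np

def detection_loss(targets, predictions, eps=1e-7):
    """Sum of the four per-path binary cross-entropy terms, one per
    supervised output (L13, L14, L15, L16); each term is averaged over
    the N pixels of the batch."""
    total = 0.0
    for y, p in zip(targets, predictions):
        p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
        total += -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    return total
```

A perfect prediction drives the loss toward zero, while an uninformative prediction of 0.5 everywhere gives log 2 per path, i.e., 4 log 2 in total.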
Once the prediction results of the 7200 images were obtained, we reconstructed the 50 images of size 5000 × 5000 pixels from the corresponding 144 images of size 512 × 512 pixels by summing the prediction maps at the corresponding pixel positions (i.e., the reverse process of Fig. 3).
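This patch reassembly can be sketched as follows, assuming the same 408-pixel stride as the cropping step (the function name is illustrative):

```python
import numpy as np

def reconstruct_image(patches, side, patch, overlap):
    """Reverse of the cropping step (Fig. 3): paste the overlapping
    prediction patches back into one map, summing the values where
    patches overlap."""
    stride = patch - overlap
    out = np.zeros((side, side), dtype=np.float32)
    starts = list(range(0, side - patch + 1, stride))
    it = iter(patches)  # patches are expected in the same row-major order as the crop
    for y in starts:
        for x in starts:
            out[y:y + patch, x:x + patch] += next(it)
    return out
```

Because overlapping areas are summed, the reconstructed map is not uniform in scale across pixels; the subsequent post-processing only thresholds the map at zero, so this does not affect centroid extraction.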

Neuron individualization
Based on the segmentation dataset, a U-net model was trained (learning rate 1e-4, Adam optimizer, cross-entropy loss function, 300 epochs). The neuron mask of the individualization dataset, noted I_m (Fig. 5b), was then obtained using this U-net model. On the probability map (Fig. 5c) predicted by the proposed network, the pixels with intensity greater than 0 were extracted as neuron centroid candidates, noted I_p (Fig. 5d). Since numerous small receptive fields are applied in the proposed network, mathematical morphology operations are needed to reduce the noise. I_p was first eroded using a structuring element of 5 pixels in radius, corresponding to the size of the disks annotated by the expert. After the erosion, reconstruction and closing operations, the sparse pixels were removed (Fig. 5e). In order to remove all non-neuronal pixels, I_p was masked by I_m (Fig. 5f), and the neuron centroids were then computed as the centers of mass of the connected components (Fig. 5g).
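The post-processing chain can be sketched with SciPy's binary morphology, using binary_propagation for the morphological reconstruction step; as a simplification, a square structuring element replaces the 5-pixel-radius disk, and the function name is illustrative:

```python
import numpy as np
from scipy import ndimage

def locate_centroids(prob_map, neuron_mask, radius=5):
    """Erosion -> morphological reconstruction -> closing -> masking,
    then centers of mass of the remaining connected components."""
    candidates = prob_map > 0
    # square structuring element as a simplification of the 5-pixel-radius disk
    selem = np.ones((2 * radius + 1, 2 * radius + 1), dtype=bool)
    eroded = ndimage.binary_erosion(candidates, structure=selem)
    # binary morphological reconstruction: grow the eroded markers back
    # inside the candidate regions only
    recon = ndimage.binary_propagation(eroded, mask=candidates)
    cleaned = ndimage.binary_closing(recon, structure=selem)
    cleaned &= neuron_mask.astype(bool)  # keep only pixels inside the neuron mask
    labels, n = ndimage.label(cleaned)
    return ndimage.center_of_mass(cleaned, labels, range(1, n + 1))
```

Components smaller than the structuring element (sparse noise pixels) have no surviving marker after erosion, so the reconstruction removes them while restoring the full extent of the genuine candidate blobs.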

Evaluation metrics
The precision of neuron counting was evaluated using the relative error e, defined as

e = |N_a - N_e| / N_e

where N_a is the number of neurons detected by the automated method and N_e is the number of neurons marked by the experts. The smaller the relative error, the better the performance of the automated method. In addition, another criterion considering the positions of the individualized neurons and the expert centroids was applied. For each individualized neuron, the number of expert centroids contained in the neuron is computed. If exactly one expert centroid is contained, the neuron is considered correctly individualized. Otherwise, it is either over-individualized (zero expert centroids) or under-individualized (more than one expert centroid, counted as one correctly individualized neuron thanks to the one-to-one correspondence). Recall (R), Precision (P) and F-score (F) are defined as

R = N_t / N_e,  P = N_t / N_a,  F = 2PR / (P + R)

where N_t is the number of correctly individualized neurons, and N_e and N_a are, respectively, the numbers of neurons annotated by the expert and computed by the automated method. The larger the F-score, the better the performance of the automated method.
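These metrics reduce to a few lines of code, taking the relative error as e = |N_a - N_e| / N_e (the absolute value is an assumption consistent with "the smaller the relative error, the better"; the function name is illustrative):

```python
def individualization_metrics(n_auto, n_expert, n_correct):
    """Counting error e plus recall, precision and F-score.

    n_auto:    neurons found by the automated method (N_a)
    n_expert:  neurons marked by the expert (N_e)
    n_correct: correctly individualized neurons (N_t)"""
    e = abs(n_auto - n_expert) / n_expert
    recall = n_correct / n_expert
    precision = n_correct / n_auto
    f_score = 2 * precision * recall / (precision + recall)
    return e, recall, precision, f_score
```

For example, 100 detected neurons against 90 expert annotations with 80 correct matches gives e ≈ 0.111, R ≈ 0.889, P = 0.8 and F ≈ 0.842.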

Implementation
In terms of the parameters of the proposed network, the learning rate was set to 1e-5, the Adam optimizer was selected and a cross-entropy loss function was used. The proposed network was trained on a computer with an NVIDIA GeForce RTX 2080 Ti GPU with 11 GB of memory. Training for 500 epochs took about 10 h.

Qualitative visual inspection
We applied the proposed network, the unsupervised method proposed by You et al. [53], and the deep learning methods U-net [36], U-net++ [57], FCRN [49] and multiscale CNN [54] to obtain the prediction maps of centroids, respectively. The final individualization results were calculated using the same post-processing protocol as described in Sect. 3.4. Figure 6 shows typical results obtained on 9 representative images by applying the different automated methods. From the 1st to the 9th row, the images come from the anatomical regions of the caudate, claustrum, cortex, hippocampus CA1 (cornu ammonis), hippocampus CA3, hippocampus dentate gyrus, putamen, subiculum and thalamus, respectively. The red points in the 1st column, overlaid on the original color images, are the expert annotations of neuron centroids. It is obvious that the neuron characteristics differ across anatomical regions. Figure 6a1 represents simple images with a few isolated neurons of different sizes in the caudate. Figure 6a2 shows some elongated neurons in the claustrum. The neurons in Fig. 6a3, from the cortex, are generally bigger than those in the caudate and claustrum. Figure 6a4 and a5 show two different subregions of the hippocampus in which the neuron expression differs from that of normal NeuN-stained neurons (darker in the center and lighter on the edges). Figure 6a6 represents an extremely complex case in which large numbers of neurons are aggregated and their edges are difficult to distinguish. The neurons in Fig. 6a7 and a8 are distributed sparsely and are relatively small; moreover, the staining of neurons in Fig. 6a7 is lighter than that in Fig. 6a8. The neurons in Fig. 6a9 are distributed even more sparsely and span a wide range of sizes.

Comparison experiments of manual and automated neuron counting
Neuron counting is important in the field of biomedicine, so we counted the number of neurons in the individualization dataset using the unsupervised method proposed by You et al. [53].

Comparison of automated neuron individualization
Besides the number of neurons, the quality of neuron individualization is important. Therefore, an evaluation of the different methods based on the F-score (Eq. (4)) was performed. Figure 7 shows the F-scores obtained by the different automated methods applied to the individualization dataset. We observe that in the caudate and subiculum regions, all the automated methods perform well, with almost all F-scores greater than 0.8. In the claustrum, putamen, thalamus and cortex regions, the performance of the different automated methods differs, because both light- and dark-stained neurons, size-varying and touching neurons, and neurons of different shapes are present, which increases the difficulty of correctly detecting neurons. As for the challenging, heterogeneous hippocampus region, the performance of all the methods decreases. However, in general, we can still see that the proposed network provides good individualization performance in every anatomical region.
The average F-score for each anatomical region is presented in Table 2. We observe that the average F-score of the proposed method is the highest in every anatomical region. Except for the complex heterogeneous hippocampus region, the average F-score values are all greater than 0.8577. In particular, in the caudate, putamen and subiculum regions, the average F-score values are greater than 0.9. The F-scores of the hippocampus region, varying from 0.7793 to 0.8896, are lower than those of the other anatomical regions, because the hippocampus is a very complex heterogeneous region. It contains different kinds and distributions of neurons which do not follow the general NeuN-stained pattern. As presented in Fig. 6a4 & a5, the different staining of these kinds of neurons (lighter staining in the neuron center position) may be due to an unsuitable staining marker [25] corresponding to different levels of antigen expression. Nevertheless, the proposed network successfully individualized most neurons in the hippocampus, and the calculated average F-score of 0.8287 was improved by 1.5%, 28.4%, 27.9%, 1.6% and 15.3% compared to the unsupervised method proposed by You et al. [53] and the deep learning techniques U-net [36], U-net++ [57], FCRN [49] and multiscale CNN [54], respectively. In addition, the average F-score value over all the anatomical regions obtained by the proposed network is also the highest (1.6%, 19.1%, 15.1%, 1.5%, and 7.7% improvement compared to the reference methods). Moreover, we calculated the p-value to evaluate the significance of the differences among these methods. A p-value as low as 1.935e-04 was obtained (below 0.01), which indicates that the results obtained by the different methods are significantly different.
Further considering the largest average F-score value and the smallest standard deviation value of the proposed network, we can conclude that our proposed network provides the best and the most robust performance for neuron individualization in both simple and complex cases in the macaque brain data.

Individualization results with different numbers of images in the training set
The influence of the number of images in the training set on the individualization performance of the automated methods is presented in Table 3. The F-scores of most methods increased with the size of the training set. The F-score of the multiscale CNN first increased then decreased, because this method is specially designed for the detection of highly aggregated neurons: it cannot correctly extract the features of neurons of different sizes. The individualization dataset is composed of the major anatomical regions of the macaque brain, which contain not only highly aggregated neurons but also size-varying neurons; therefore, its performance decreases when the size of the training set increases. Overall, we observe that among all the methods in Table 3, the proposed network consistently provides the best (the largest average F-score) and the most robust (the smallest standard deviation) individualization performance.

Discussion
We proposed a novel scheme of a multiscale fully convolutional regression neural network combined with a competitive region growing technique to individualize size-varying and touching neurons in the major anatomical regions of the macaque brain, using only point annotations for training. Thanks to the multiscale resolution achieved by parallel multiple receptive fields and different network depths, our proposed network succeeds in detecting the centroids of size-varying and touching neurons. Competitive region growing applied on these centroids achieves neuron individualization, which will help study the morphology and the distribution of neurons in the entire brain.
The proposed method provides satisfying individualization performance both qualitatively and quantitatively. Figure 6 shows examples of excellent individualization results obtained by the proposed method in different anatomical regions. Based on Fig. 7, the proposed method individualizes neurons best in most images compared to the reference methods. However, in the images hippocampus_L_2, hippocampus_L_6, hippocampus_L_7, hippocampus_L_8, hippocampus_R_6, hippocampus_R_7 and hippocampus_R_8, the neurons individualized by the proposed method are not as good as those obtained by the unsupervised method proposed by You et al. [53]. That is because the dentate gyrus, a special anatomical region containing thousands of small-sized touching neurons, exists in these images. The unsupervised method of You et al. [53] exploits the fact that the size of the neurons in the dentate gyrus is stable, so a parameter is specially fixed for neuron individualization in this region, resulting in better individualization results than those obtained by deep learning methods which use a single architecture for all anatomical regions. Nevertheless, the F-scores of the proposed method remain competitive with those of the unsupervised method: 0.8548 vs. 0.8748 for hippocampus_L_2, 0.8680 vs. 0.8856 for hippocampus_L_6, 0.8618 vs. 0.8954 for hippocampus_L_7, 0.8271 vs. 0.8404 for hippocampus_L_8, 0.8896 vs. 0.8976 for hippocampus_R_6, 0.8259 vs. 0.8845 for hippocampus_R_7 and 0.8784 vs. 0.8782 for hippocampus_R_8, respectively. The average F-score of 0.8287 in the hippocampus region achieved by the proposed method represents a 1.5% performance improvement over the unsupervised method, demonstrating better individualization results in the other hippocampus subregions (CA1-CA4) despite their containing different kinds and distributions of neurons. In any case, the hippocampus remains the most challenging anatomical region in neuroscience, and it is still worth studying.
According to Table 2, the average F-score of the proposed method is the highest in every anatomical region. These satisfying results can be attributed to the design of the parallel multiple receptive fields and the appropriate network depth. The proposed network, implementing a total of 38 receptive fields with sizes from 1 × 1 to 284 × 284 pixels, successfully extracts the features of neurons of different sizes (30 µm in diameter at most, i.e., 137 pixels). The extraction of richer neuron features succeeds in individualizing single individual, size-varying and touching neurons. Table 3 reflects the influence of the training-set size on the individualization performance of the automated methods. The proposed method always presents the best and most robust individualization performance. As the size of the training set doubles, the neuron individualization performance increases by at most 0.55%. Therefore, we can use small samples to train the network model to perform satisfying neuron individualization of the entire brain, and/or perform data augmentation to improve the individualization performance. This can alleviate the marking work of the experts.
Neuron individualization is still an essential and challenging research topic in neuroscience. Various unsupervised and deep learning methods have demonstrated their efficiency over the past years. However, most of them focused only on specific anatomical regions where both the neuron expression and distribution are simple. This is because the 2D images are obtained through a single scanning focal plane, which makes it difficult to individualize touching neurons, especially touching neurons of different sizes. In order to obtain a better and more general individualization method, it may be better to work in 3D. But this raises new problems: 3D imaging modalities are limited by the field of view of light-sheet imaging, and most computers cannot afford to handle such large 3D volume data [34].
The proposed network can help address major biological challenges, such as improving our understanding of brain development or aging, deciphering pathology mechanisms, or evaluating novel therapies in neurodegenerative diseases.

Conclusion
This paper presents a new image processing protocol to individualize size-varying and touching neurons in the major anatomical regions of the macaque brain based on a deep learning approach. The neuron detection and individualization performance obtained by the proposed protocol outperforms the state-of-the-art methods in both qualitative and quantitative assessments. Since automated neuron individualization in the macaque brain is challenging, an in-depth comparison of the strengths and complementarities of stereology and image processing methods should be carefully addressed in the future. In terms of perspectives, this work will be exploited to perform neuron individualization at large scale in entire brain sections in order to analyze neuron morphology and distribution. It will provide new tools to explore brain development, brain aging and neurodegenerative diseases.

Data availability The datasets applied in this paper come from MIRCen CEA France (Molecular Imaging Research Center, The French Alternative Energies and Atomic Energy Commission). The data will be made available later in the form of an article and an access procedure, because it corresponds to a large amount of information.

Declarations
Conflicts of interest The authors declare that they have no conflict of interest.
Ethical approval The animal study was reviewed and approved by an ethics committee accredited by the MESR: CETEA DSV - Comité n°44.