Sunspot extraction and hemispheric statistics of YNAO sunspot drawings using deep learning

Sunspot drawings around the globe provide long historical records for understanding the long-term trends in the solar activity cycle. Yunnan Astronomical Observatory (YNAO) in China contributes to the relatively continuous sunspot drawings from 1957 to 2015. This paper proposes a new deep learning method named SPR-mask to extract pores, spots, umbrae and penumbrae in the YNAO sunspot drawings. SPR-mask consists of three parts: backbone, shared head and mask branch. It especially adopts a scale-aware attention network (SAAN) and a PointRend module in the mask branch to improve the accuracy of target edge segmentation. Besides that, each sunspot belonging to the northern or southern (N-S) hemisphere is determined by transforming its cartesian coordinates to spherical coordinates after extracting $P$, $B_{0}$ and $L_{0}$ handwritten in sunspot drawings using a revised Lenet-5 deep learning method. The precision, recall and $AP$ of SPR-mask are 0.92, 0.93, and 0.92, respectively.
The test results show the SPR-mask method has a good performance. The numbers and areas of pores, spots, umbrae and penumbrae for the N-S hemisphere are presented and analyzed separately. The YNAO data are also compared with Royal Greenwich Observatory (RGO), Kanzelhöhe Observatory (KSO) and Purple Mountain Astronomical Observatory (PMO) data. The results show similar trends, high correlations, and N-S asymmetries. All data of YNAO are publicly shared at https://github.com/yzs64/YNAO_sd/, which are abundant and complementary to the other sunspot catalogs in the world.


Introduction
Sunspots have a close relationship with all other solar active phenomena, such as prominences, filaments, flares, and coronal mass ejections (Li et al. 2004; Yan and Qu 2007; Çakmak 2014b; Yan et al. 2016, 2018a). Solar activity is believed to originate from large-scale solar magnetic fields generated by the magnetic dynamo deep in the solar convection zone, which in turn are responsible for sunspots appearing on the solar surface (Charbonneau 2010). Therefore, studying sunspots in depth is a practical way to understand and predict these solar activities.
The sunspot numbers and sunspot areas are used to represent the level of solar activity and have thus become the most fundamental indexes, especially in the field of historical long-term solar activity cycle research (Zirin 1988; Hathaway and Wilson 2004; Li et al. 2004; de Toma et al. 2013; Çakmak 2014a; Clette et al. 2014; Lefevre and Clette 2014; Tang 2014, 2015; Lin et al. 2019). In addition, the sunspot numbers and sunspot areas in the northern and southern (N-S) hemispheres provide important evidence to support N-S asymmetry, in which the magnetic field systems originating in the N-S hemispheres and their evolution in the course of solar cycles are only weakly coupled (Newton and Milsom 1955; Antonucci et al. 1990; Vizoso and Ballester 1990; Carbonell et al. 1993; Oliver and Ballester 1994; Krivova and Solanki 2002; Vernova et al. 2002; Ballester et al. 2005). The size and morphology of sunspots are indications of the complexity of magnetic fields, in which the umbral and penumbral areas are each believed to be proportional to the magnitude of the magnetic field strength (Zirin 1988; Schlichenmaier et al. 2010). Umbrae and penumbrae are also suggested to have different mechanisms of formation and persistence (Li et al. 2018). Therefore, to study historical long-term solar activity cycles from multiple aspects, not only are the sunspot numbers and sunspot areas over the full Sun needed, but also more detailed information, such as the sunspot numbers and sunspot areas in the N-S hemispheres and the numbers and areas of each component of sunspots. These sunspot characteristics can be derived from the historical sunspot drawings made around the globe in recent centuries, in which the sunspots were drawn and recorded in detail.
China seems to be the first country to have long-term text records of sunspots (traceable to around 364 B.C., Leo 2011). The systematic record-keeping of sunspots on charts began in the time of Galileo Galilei, with Harriot and Johann Fabricius in 1611 (Ravindra et al. 2020). Staudacher made solar drawings from 1749 to 1796 (Arlt 2008). Sunspot drawing continued in the 19th century, e.g., by Herschel and Carrington (Carrington 1863), Spörer (Spoerer 1861), Christian Horrebow (Jørgensen et al. 2019), etc. In the 20th century, observatories around the world recorded sunspot drawings for a long time, e.g., Mount Wilson Observatory (MWO), Royal Greenwich Observatory (RGO), Kanzelhöhe Observatory (KSO) and Sunspot Index and Long-term Solar Observations (SILSO), six stations in China, etc.
The valuable information contained in sunspot drawings worldwide has been derived and shared with the public (see Table 1 in Lin et al. 2019). The Chinese sunspot database is shared at http://Sun.bao.ac.cn/SHDA_data/ (Lin et al. 2019); its data were extracted from the handwritten characters recorded in the drawings using a handwritten character recognition method (Zheng et al. 2016). Xu et al. (2021) proposed a deep learning method to classify and segment the drawn sunspots recorded at the Purple Mountain Astronomical Observatory (PMO); however, they reported that the segmentation edges of sunspots need to be improved. This paper proposes a new deep learning method named SPR-mask to detect, classify, and segment sunspots in the Yunnan Astronomical Observatory (YNAO) sunspot drawings at the pixel level. The SPR-mask method is designed to improve the accuracy of target edge segmentation: it adopts a residual network and a Feature Pyramid Network (FPN) as the backbone, a center-ness technique in the shared head, and a scale-aware attention network (SAAN) and a PointRend module in the mask branch. Meanwhile, the sunspots are classified into pores and spots, and the spots are subdivided into holes, umbrae, and penumbrae. In addition, this paper revises the Lenet-5 deep learning method to recognize 13 categories in order to extract the values of $P$, $B_0$ and $L_0$ recorded in the sunspot drawings. Each sunspot can therefore be assigned to the northern or southern hemisphere by transforming its cartesian coordinates to spherical coordinates. Finally, the number and area of each component of sunspots are counted and measured separately. These detailed YNAO sunspot data complement the Chinese sunspot database and other sunspot observations worldwide, and supply more detailed information for research on the long-term solar dynamo and solar activity cycle.
The structure of this paper is as follows. Section 2 introduces the data source and the data set of the deep learning method. Section 3 describes the methods for detecting, classifying, segmenting sunspots, and determining their N-S hemispherical belonging. Section 4 presents the detection, classification, and segmentation results of some instances, and statistical results in detail. Section 5 discusses the performance of this method by comparing it with other methods and other public data. Section 6 concludes this work briefly.

Data
The YNAO sunspot drawings contribute the largest part of the Chinese sunspot drawing archive owing to their continuous observation from 1957 to 2015 (Lin et al. 2019). They are of good quality for supplementing the global sunspot data because the YNAO daily sunspot area data were found to have the smallest random error in the world (Baranyi et al. 2016). Figure 1 shows a scanned YNAO sunspot drawing on 24 October 2013.
YNAO used the traditional projection method, wherein sunspots were drawn on the preprinted recording paper by projecting an enlarged image of the Sun onto a projection plate after ensuring that the solar projection overlapped with the solar limb preprinted on the recording paper. After the sunspots were drawn, other conventional observing information was measured and recorded. The information in the upper left corner is related to the observation time, including the serial number of the observing day within the year, the observation date, the local time, and UTC. The upper right corner records $P$ (the position angle between the geocentric north pole and the solar rotational north pole), $B_0$ (heliographic latitude of the center of the solar disk at zero universal time on that day), $L_0$ (heliographic longitude of the center of the solar disk at zero universal time on the observation day), and $L$ (heliographic longitude of the center of the solar disk at the time of the observation). The information in the lower left corner is related to the sunspot number, including $g$, the number of sunspot groups; $f$, the number of sunspots surrounding sunspot groups; and $R$, the Wolf number ($10g + f$). The columns N and S give the numbers in the northern and southern hemispheres, respectively, and the column NS gives the total number on the whole solar surface. $K$ represents the normalized coefficient of the sunspot relative number, which is related to the site. Here, $K_2$ means the normalized coefficient of the sunspot area.
Besides that, the information of each sunspot group is handwritten around the disk at a random position, including serial number, longitude, latitude, McIntosh classification, α (total area of the sunspot group in units of the millionth of the solar hemisphere (μHem)), α (the area of the largest sunspot in the sunspot group (μHem)), r (linear distance between the center of mass of the sunspot group and the center of the solar disk in units of mm).
There are 15 752 scanned YNAO sunspot drawings shared at https://sun.bao.ac.cn/SHDA_data/ynao_sd/, spanning from 1957 to 2015. A total of 15 634 sunspot drawings were retained after deleting 118 redundant drawings from days with several records. The majority of the retained 15 634 drawings were observed at 1:30 UTC, and a few at around 1:30 UTC. This means that over 59 years there were drawings on 15 634 days, and about 27% of the days have no drawing. Table 1 lists all periods without sunspot drawings longer than 15 days. Of all 12 periods, only one is before 2009.
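The quoted fraction of missing days can be checked with a short calculation. This is only a sketch: it assumes the archive is measured against the full calendar years 1957 to 2015, which the text does not state explicitly.

```python
from datetime import date

# Figures quoted in the text.
RETAINED_DRAWINGS = 15_634          # one drawing per observing day
FIRST_YEAR, LAST_YEAR = 1957, 2015

# Assumption: coverage is measured against complete calendar years.
total_days = (date(LAST_YEAR, 12, 31) - date(FIRST_YEAR, 1, 1)).days + 1
missing_days = total_days - RETAINED_DRAWINGS
missing_fraction = missing_days / total_days

print(total_days, missing_days, round(100 * missing_fraction, 1))
```

Under this assumption the missing fraction rounds to 27%, in agreement with the value given above.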
Deep learning methods need data sets, including a training set and a testing set. Fortunately, owing to the similarity of the sunspot drawing styles of YNAO and the Purple Mountain Astronomical Observatory (PMO), the data set built by Xu et al. (2021) was adopted directly without any change in this work. They labeled a total of 39 055 samples as the training set. We labeled 3 000 samples using YNAO data as a new test set. Note that the ground truths were labeled entirely according to the human drawings. For the convenience of extracting the different components of sunspots, these samples were labeled as four types: pore, spot, umbra, and hole. Figure 2 shows a subregion cut from a YNAO drawing with some pores and a spot consisting of umbrae, penumbrae, and a hole. A pore is a single small spot without penumbrae. A spot is a sunspot with a clear penumbra.
Umbrae are black regions within the spot. A hole represents a hollow region within the spot, but it does not belong to the spot. Holes are relatively rare within the spot. Note that the penumbra corresponds to the spot minus its umbrae and holes. Labeling the four types makes it easy to calculate the numbers and areas of pores, spots, umbrae and penumbrae separately.

Preprocessing
To focus on the drawn sunspots within the solar disk, the solar disk of each drawing was extracted by the method used by Xu et al. (2021). In brief, the preprinted information on the recording paper was extracted using a method based on an RGB threshold (Otsu 1978). The solar disk, including its center and diameter, was then derived using the Hough transform (Ho and Chen 1995). The extracted solar disk is input to the network shown in Fig. 4 and described in Sect. 3.2.
The data recorded in the upper right region of the drawings, $P$, $B_0$, $L_0$ and $L$, need to be extracted because they are used for calculating the latitude and longitude of each sunspot and thus for judging its N-S hemisphere. First, the subregion including $P$, $B_0$, $L_0$ and $L$ (see Fig. 3 (a)) is cropped according to its fixed position printed in the drawings. Then, each line is segmented by projection analysis (Baskan et al. 2002), see Fig. 3 (b). Next, each character is segmented by connected components analysis (Kim and Cho 2013) and projection analysis (Baskan et al. 2002), see Fig. 3 (c). Note that the degree character is filtered out by a combination of its position, relatively small area, and hole. Finally, each segmented character is identified by a revised Lenet-5 model, see Fig. 3 (d). The revised model is based on the Lenet-5 deep learning method (Lecun et al. 1998). It aims to recognize 13 classes: the 10 digits, the plus and minus signs, and the decimal point. After a total of 600 handwritten plus-sign, minus-sign and decimal-point samples were added to the MNIST dataset, the revised Lenet-5 model was retrained and retested.
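Projection analysis, as used for the line and character segmentation steps, can be illustrated with a minimal sketch. It assumes a binarized crop in which ink pixels are 1 and background pixels are 0; the helper names and the toy image are hypothetical, and the connected components step mentioned in the text is not shown.

```python
def split_runs(profile, min_ink=1):
    """Return (start, end) index pairs of consecutive entries that contain ink."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value >= min_ink and start is None:
            start = i                      # a run of ink begins
        elif value < min_ink and start is not None:
            runs.append((start, i))        # the run ends before index i
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(binary):
    """Horizontal projection: count ink pixels per row, then split into text lines."""
    return split_runs([sum(row) for row in binary])


def segment_columns(binary, top, bottom):
    """Vertical projection inside one text line: split it into characters."""
    width = len(binary[0])
    profile = [sum(row[c] for row in binary[top:bottom]) for c in range(width)]
    return split_runs(profile)


# Toy 6 x 6 binary crop with two text lines of two characters each.
toy = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
lines = segment_lines(toy)
chars = [segment_columns(toy, top, bottom) for top, bottom in lines]
print(lines, chars)
```

Touching characters cannot be separated by projection alone, which is one reason a connected components step is useful alongside it.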
Given that each sunspot in the drawing is extracted following the method described in Sect. 3.2, the cartesian coordinates of the centroid of a sunspot, $x_p$ and $y_p$, can be determined easily. Some parameters have been derived in the above step, such as the cartesian coordinates of the solar disk center, $x_0$ and $y_0$, and the radius of the solar disk, $r$. With the addition of $P$, $B_0$, and $L_0$, the heliographic coordinates of the centroid of a sunspot, $B_p$ and $L_p$, can be derived approximately by equations (1)-(8), which represent the relation between the three-dimensional cartesian and spherical coordinates (Çakmak 2014c). Finally, a sunspot is assigned to the northern hemisphere if $B_p \geq 0$; otherwise, to the southern hemisphere.
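The conversion from disk coordinates to heliographic coordinates can be sketched as follows. This is a standard orthographic approximation and is not necessarily identical to equations (1)-(8). Several conventions are assumptions made only for this illustration: the image y axis grows downward, celestial north is up and west is to the right, and $P$ is measured eastward from celestial north. The function names are hypothetical.

```python
import math


def heliographic(xp, yp, x0, y0, r, P_deg, B0_deg, L0_deg):
    """Approximate heliographic latitude and longitude (degrees) of a disk point."""
    P, B0 = math.radians(P_deg), math.radians(B0_deg)
    # Offsets from the disk centre in units of the disk radius, y pointing up.
    dx, dy = (xp - x0) / r, (y0 - yp) / r
    # Rotate by P so that the solar rotation axis points up.
    xs = dx * math.cos(P) + dy * math.sin(P)
    ys = -dx * math.sin(P) + dy * math.cos(P)
    # Component toward the observer on the unit sphere (orthographic projection).
    zs = math.sqrt(max(0.0, 1.0 - xs * xs - ys * ys))
    # Undo the tilt B0 of the solar equator toward the observer.
    sin_b = ys * math.cos(B0) + zs * math.sin(B0)
    Bp = math.degrees(math.asin(max(-1.0, min(1.0, sin_b))))
    dL = math.degrees(math.atan2(xs, zs * math.cos(B0) - ys * math.sin(B0)))
    Lp = (L0_deg + dL) % 360.0
    return Bp, Lp


def hemisphere(Bp):
    """Northern hemisphere if the latitude is non-negative, otherwise southern."""
    return "N" if Bp >= 0 else "S"


# The disk centre maps to (B0, L0) by construction.
Bc, Lc = heliographic(500, 500, 500, 500, 400, 10.0, 5.0, 120.0)
print(Bc, Lc, hemisphere(Bc))
```

If a scan is mirrored or rotated relative to these conventions, the signs of `dx`, `dy`, or `P` would have to be adjusted accordingly.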

Extracting sunspots
We built a new deep learning model to extract, classify, and segment sunspots in drawings. The model is called an SPR-mask because it is designed based on traditional mask networks, with the addition of a Scale-aware attention network and PointRend technology. Figure 4 shows the main structure of the SPR-mask, which includes three main parts: backbone, shared head, and mask branch.
The backbone is a network for extracting features from the input images. Some outstanding backbones have been proposed in recent years, such as VGG (Simonyan and Zisserman 2015), GoogLeNet (Szegedy et al. 2015), DenseNet (Huang et al. 2017), ResNet (He et al. 2016), and so on. Considering the huge differences in the sizes and characteristics of sunspots in one drawing, we adopted a backbone structure based on residual-network-50 (ResNet-50, He et al. 2016) and a Feature Pyramid Network (FPN, Lin et al. 2017a). The shortcut connections of ResNet increase its information flow and alleviate the vanishing of gradients during backpropagation caused by excessive depth. FPN combines low-level detail information with high-level semantic information, which increases the receptive field of the layer at each level so that it obtains more contextual information; the low-level layers in particular play a crucial role in detecting small objects.
ResNet-50 is constructed from five stages that extract five levels of feature maps with different sizes and channels: stages 1 to 5 generate feature maps whose sizes range from 1/4 to 1/32 of the input image, and whose channel numbers range from 64 to 2 048. Stage 1 is a normal convolution operator, which consists of a 7 × 7 convolution and a 3 × 3 max pooling. Stages 2 to 5 consist of 3, 4, 6, and 3 residual building blocks, respectively. The residual block is the basic module of ResNet; it includes three common convolution layers and an identity shortcut connection from input to output that overcomes the degradation problem of deep networks (He et al. 2016). There are a total of 50 layers after adding a final fully connected layer.
Due to the very large size of C1, only the feature maps C5, C4, C3, and C2 are selected for top-down fusion by FPN. This step represents objects of different sizes in feature maps at different levels, which are named P6 to P2. All of P6 to P2 have 256 channels, unified by a 1 × 1 convolution named the lateral connection. P6 is obtained by downsampling the C5 layer. P5 is directly copied from C5. For P4, the feature map C5 is first upsampled and then added to its next layer C4 for feature fusion; P3 and P2 are generated in turn in the same way. From P5 to P2, a 3 × 3 convolution is applied to each fused feature map, which reduces the aliasing effect of upsampling. As the output of the backbone, P3, P4, P5, and P6 are fed to the shared head, and P2, P3, and P4 are fed to the mask branch.
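The top-down fusion can be illustrated with a toy sketch on single-channel maps stored as nested lists. It assumes nearest-neighbour 2x upsampling and omits the 1 × 1 lateral convolutions, the 3 × 3 smoothing convolutions, and P6; the function names are hypothetical and this is not the authors' implementation.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D list."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]   # repeat every column twice
        out.append(wide)
        out.append(list(wide))                      # repeat every row twice
    return out


def add_maps(a, b):
    """Element-wise sum of two maps of equal size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down(laterals):
    """Fuse maps ordered coarsest first: [C5, C4, C3, C2] -> [P5, P4, P3, P2]."""
    outputs = [laterals[0]]                         # P5 is taken directly from C5
    for lateral in laterals[1:]:
        outputs.append(add_maps(upsample2x(outputs[-1]), lateral))
    return outputs


c5 = [[1.0]]
c4 = [[1.0, 2.0], [3.0, 4.0]]
c3 = [[0.0] * 4 for _ in range(4)]
c2 = [[1.0] * 8 for _ in range(8)]
p5, p4, p3, p2 = top_down([c5, c4, c3, c2])
print(p4)
```

Each finer level therefore carries its own detail plus the upsampled context of all coarser levels, which is what helps with small objects such as pores.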
The shared head shares the same weights across the feature maps of different levels from the backbone, which makes it parameter-efficient and thereby improves the detection performance. There are two paths in the shared head: one for classification, and another for regression and center-ness.
The fused feature maps of P3, P4, P5, and P6 are input into the classification tower separately, with shared weights. The classification tower consists of four sequential convolutional layers, a group normalization, and a ReLU activation layer to generate further refined and normalized features. This is followed by a convolution layer and four binary classifiers to predict four probability maps, each corresponding to one category: pores, spots, umbrae, and holes.
Another path refines P3, P4, P5, and P6 with shared weights using the bounding box tower, which has the same structure as the classification tower but with different parameters. After that, it is subdivided into two sub-branches for regressing the bounding box and predicting center-ness separately.
The regression branch is a multi-level prediction, which directly regresses the target bounding box at each location. A location is considered a positive sample if it falls into any ground-truth box and its class label agrees with that of the ground-truth box. In addition, a 4D vector $(l, t, r, b)$ encoding the location of a bounding box is regressed at each foreground pixel; the vector represents the distances from the location to the left, top, right, and bottom boundaries of the bounding box, respectively. Most overlaps happen between objects of different sizes, so if a location falls into more than one bounding box, the bounding box with the minimal area is simply chosen as its regression target.
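The assignment of regression targets can be sketched in a few lines. The box format `(x1, y1, x2, y2)`, the function names, and the toy boxes are assumptions made for this illustration.

```python
def ltrb(x, y, box):
    """Distances from (x, y) to the left, top, right and bottom box edges."""
    x1, y1, x2, y2 = box
    return (x - x1, y - y1, x2 - x, y2 - y)


def inside(x, y, box):
    """A location is inside a box when all four distances are positive."""
    return all(d > 0 for d in ltrb(x, y, box))


def box_area(box):
    x1, y1, x2, y2 = box
    return (x2 - x1) * (y2 - y1)


def regression_target(x, y, gt_boxes):
    """Pick the smallest ground-truth box containing (x, y).

    gt_boxes is a list of (label, box); returns (label, (l, t, r, b)),
    or None when the location is background.
    """
    candidates = [(label, box) for label, box in gt_boxes if inside(x, y, box)]
    if not candidates:
        return None
    label, box = min(candidates, key=lambda item: box_area(item[1]))
    return label, ltrb(x, y, box)


# An umbra nested inside a spot: the location inside both is assigned to the umbra.
gt = [("spot", (10, 10, 50, 50)), ("umbra", (20, 20, 30, 30))]
print(regression_target(25, 25, gt))
print(regression_target(15, 40, gt))
```

The minimal-area rule matters for sunspot drawings because umbrae and holes always lie inside the bounding box of their parent spot.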
The center-ness branch is designed in parallel with the regression branch to suppress the low-quality detected bounding boxes produced by locations far away from the center of an object, without introducing any hyper-parameters (Tian et al. 2019). Given a regression target $(l, t, r, b)$, the center-ness value, $cn$, is defined following Tian et al. (2019) as:
$$cn = \sqrt{\frac{\min(l, r)}{\max(l, r)} \times \frac{\min(t, b)}{\max(t, b)}} \qquad (9)$$
The final score of a target is obtained by multiplying the corresponding classification score by $cn$. Only those targets close to the center of a ground-truth box are considered positive samples. The low-quality bounding boxes can then be filtered out by the final non-maximum suppression (NMS, Neubeck and Gool 2006) process. As a result, the classification labels and bounding box coordinates of the detected targets close to the center of any ground truth are fed to the mask branch.
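As a small worked example, the center-ness weighting can be sketched as below. It assumes the FCOS definition of Tian et al. (2019), namely the square root of min(l, r)/max(l, r) times min(t, b)/max(t, b), and a location strictly inside its box so that all four distances are positive. The function names are hypothetical.

```python
import math


def center_ness(l, t, r, b):
    """FCOS-style center-ness: 1 at the box centre, approaching 0 near the edges."""
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))


def final_score(cls_score, l, t, r, b):
    """Classification score down-weighted by center-ness before NMS."""
    return cls_score * center_ness(l, t, r, b)


centered = center_ness(5, 5, 5, 5)      # location at the box centre
off_center = center_ness(1, 5, 9, 5)    # location close to the left edge
print(centered, off_center, final_score(0.9, 1, 5, 9, 5))
```

A detection predicted from an off-centre location therefore enters NMS with a reduced score and is more likely to be suppressed by a better-centred one.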
In the mask branch, we adopted a scale-aware attention network (SAAN, Liang et al. 2020), which uses a spatial attention mechanism (SAM, Woo et al. 2018) and a channel attention mechanism (CAM, Woo et al. 2018) to aggregate features at different levels, ensuring the robustness of instance masks of different sizes. SAM focuses on 'where' the informative part is, and CAM focuses on 'what' is meaningful in an input image. The details of SAM and CAM are given below.
The aim of SAM is to enhance the relevant features of the target and suppress the background noise; its structure is shown in Fig. 5 (a). Max-pooling and average-pooling operations are first applied to the feature maps along the channel axis. The results are then concatenated, convolved by a standard convolution layer, and finally activated by a sigmoid function.
CAM aims to exploit the inter-channel relationship of features; its structure is shown in Fig. 5 (b). The spatial information of a feature map is aggregated using both global average-pooling and global max-pooling operations, generating two different spatial context descriptors. They are then forwarded to a 1 × 1 convolution layer and a fully connected layer separately, and finally merged by an element-wise addition followed by a sigmoid function.
The structure of SAAN is shown in Fig. 5 (c). P2, P3, and P4 are chosen as the inputs to SAAN due to their relatively large sizes. The feature maps of P2, P3, and P4 are each upsampled to 1/4 of the original size and then processed by SAM to generate spatial attention maps. Each spatial attention map is applied to its feature map by element-wise multiplication followed by element-wise addition. After that, the outputs of P2, P3, and P4 are concatenated, giving 768 channels. Following the sequential arrangement of SAM and CAM, the concatenated feature maps are input into CAM to generate channel attention maps, which are then applied to the features by element-wise multiplication followed by element-wise addition. At this point, the fused attention maps of P2, P3, and P4 with 768 channels are produced.
After SAAN, these attention maps are convolved again to change the number of channels to 256. They are then fed to the final step, the Point-based Rendering (PointRend) module, which performs point-based segmentation prediction at adaptively selected locations using an iterative subdivision algorithm (Kirillov et al. 2020). The details of PointRend are shown in Fig. 6. There are two inputs: the attention feature maps coming from SAAN, and the classification labels and bounding boxes coming from the shared head. Based on its coarse-to-fine fashion, two feature types are used in PointRend: fine-grained features and coarse prediction features. The regions of the attention feature maps corresponding to the detected bounding boxes are directly adopted as fine-grained features for extracting fine segmentation details. The coarse prediction masks are generated by a standard lightweight segmentation head, which is applied to each bounding box in the fine-grained features. The size of the coarse prediction masks is only 7 × 7 pixels.
Each detected bounding box coming from the shared head is processed in turn as follows.
To refine the coarse prediction, each detected bounding box is rendered iteratively in a coarse-to-fine fashion. In each iteration of adaptive subdivision, the coarse prediction is upsampled using bilinear interpolation, and a set of uncertain points in the upsampled map is selected. Most of them are at locations where the value is likely to differ significantly from its neighbors, e.g., points located at object boundaries. For each selected point, a point-wise feature representation is computed by concatenating its coarse prediction features and the corresponding fine-grained features. This representation is then forwarded to a small multi-layer perceptron (MLP) that shares weights across all points, so that the segmentation label of each point is predicted independently. The prediction refines the uncertain regions of the predicted mask on the finer grid. This process is repeated until the segmentation is upsampled to the desired grid.
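The selection and refinement of uncertain points in one subdivision step can be sketched as follows. The sketch assumes a single-class foreground probability map and measures uncertainty as the distance of the probability from 0.5; the bilinear upsampling is omitted, and `toy_point_head` is a hypothetical stand-in for the MLP, not the trained network.

```python
def most_uncertain_points(prob, n_points):
    """Return (row, col) of the n_points pixels whose probability is closest to 0.5."""
    scored = [
        (abs(p - 0.5), r, c)
        for r, row in enumerate(prob)
        for c, p in enumerate(row)
    ]
    scored.sort()
    return [(r, c) for _, r, c in scored[:n_points]]


def refine(prob, points, point_head):
    """Re-predict only the selected points; all other pixels keep the coarse value."""
    refined = [row[:] for row in prob]
    for r, c in points:
        refined[r][c] = point_head(r, c, prob[r][c])
    return refined


def toy_point_head(r, c, p):
    """Hypothetical point head: snaps the probability to a hard 0/1 label."""
    return 1.0 if p >= 0.5 else 0.0


coarse = [
    [0.02, 0.10, 0.45],
    [0.05, 0.55, 0.90],
    [0.40, 0.95, 0.99],
]
points = most_uncertain_points(coarse, 3)
refined = refine(coarse, points, toy_point_head)
print(sorted(points))
```

Only the boundary-like pixels are re-evaluated, which is why the point-based approach stays cheap while sharpening edges.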
The loss function is crucial for continuously adjusting the weight of each parameter of a model to minimize the loss during training. The loss function of SPR-mask is defined as a multitask loss:
$$L = L_{cls} + \lambda_{1} L_{reg} + \lambda_{2} L_{center-ness} + \lambda_{3} L_{point} \qquad (10)$$
where $L_{cls}$, $L_{reg}$ and $L_{center-ness}$ are the losses of classification, regression, and center-ness, respectively (Lin et al. 2017b), and $L_{point}$ is the loss of the PointRend module (Kirillov et al. 2020). $\lambda_{1}$, $\lambda_{2}$, and $\lambda_{3}$ are coefficients for balancing the different losses, all of which are set to 1 after experiments.

Training & testing
Some hyperparameters of deep learning methods must be set manually. After extensive trial and error, the hyperparameters were tuned for segmenting sunspots in the YNAO drawings. Details are as follows: the batch size is set to 1, the momentum to 0.9, Pos-Radius to 2.0, weight_decay to 0.0001, and the optimizer to SGD. The learning rate increases gradually up to 0.001 with a warm-up factor of 0.001 over the first 1 000 steps, which is called the linear warm-up strategy.
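The warm-up schedule can be sketched as below. The interpolation formula is the one commonly used in detection frameworks and is an assumption here, as is keeping the learning rate constant after the warm-up; the text gives only the base rate, the warm-up factor, and the number of warm-up steps.

```python
BASE_LR = 0.001        # learning rate reached at the end of the warm-up
WARMUP_FACTOR = 0.001  # fraction of BASE_LR used at step 0
WARMUP_STEPS = 1000


def learning_rate(step):
    """Linear warm-up from BASE_LR * WARMUP_FACTOR to BASE_LR."""
    if step >= WARMUP_STEPS:
        return BASE_LR                      # assumed constant after warm-up
    alpha = step / WARMUP_STEPS             # progress through the warm-up, 0..1
    return BASE_LR * (WARMUP_FACTOR * (1 - alpha) + alpha)


for step in (0, 500, 1000):
    print(step, learning_rate(step))
```

Starting from a very small rate avoids large, unstable updates while the randomly initialized heads are still far from a useful solution.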
Besides that, the point selection strategy of the PointRend module is crucial. We choose a non-iterative strategy based on random sampling, which ensures that the randomly sampled points are more densely located near the instance edges.
After training the network, the testing set was fed to the trained network. The precision, recall, and average precision ($AP$) are used to evaluate the overall performance of this method. The definitions are as follows:
$$\mathrm{Precision} = \frac{TP}{TP + FP} \qquad (11)$$
$$\mathrm{Recall} = \frac{TP}{TP + FN} \qquad (12)$$
$$AP = \int_{0}^{1} P(r)\, dr \qquad (13)$$
where $TP$ is the number of true positives, i.e., positive samples that the model predicts as positive. $FP$ is the number of false positives, i.e., negative samples incorrectly labeled as positive. $FN$ is the number of false negatives, i.e., positive samples that the model predicts as negative. In this work, only predicted targets whose IoU with the ground truth is greater than 0.6 were considered correct predictions. $AP$ is the average of the precision over all recall values, with a range from 0 to 1, and $P(r)$ represents the precision-recall curve. The precision, recall, and $AP$ of SPR-mask are 0.92, 0.93, and 0.92, respectively, which means that SPR-mask performs well in detection, classification, and segmentation. Based on this satisfactory evaluation, all 15 634 YNAO drawings were fed to the trained model to detect, classify, and segment the sunspots. It takes less than 1 minute per drawing on average. Figure 7 shows the detection, classification and segmentation results of the drawing shown in Fig. 1. Each type is highlighted in a different color: pore in red, spot in yellow, umbra in green, and hole in blue. For the convenience of viewing, the dividing line between the N-S hemispheres is overdrawn in the image. The sunspot groups circled in the hand drawing are labeled from R1 to R6. Most sunspots show well-fitted contours in the zoom-in view. Table 2 lists the detected numbers of the four sunspot types, the detected sunspot numbers in the N-S hemispheres, and the hand records in the drawing. The comparison results are satisfactory, and only the largest region R1 shows a difference in sunspot number.
We also checked the other detected drawings and found that such a difference usually exists in dense sunspot groups or early drawings. This might be due to the pencil drawings blurring over time. Table 3 lists the sunspot area of each sunspot group. The differences between the human-labeled data and ours are less than 0.1 μHem, except for R4. The relatively large difference of R4 is mainly due to the less-than-perfect edge of the biggest sunspot in R4. Figure 8 shows some results of other typical sunspots. It can be seen that the detected edges of sunspots closely coincide with those drawn in the hand drawings, especially in obviously sharp regions. Note that there are some false detections and missed detections, e.g., the two arrowed umbrae in panel (a) are detected as one, the arrowed pores in panels (b) and (c) are missed, and the arrowed location in panel (d) is falsely detected as an umbra. Table 4 lists the sunspot numbers and sunspot areas of each panel in Fig. 8. All differences between human-labeled and detected sunspot numbers are less than 1, except for panel (b). The sunspot number difference of panel (b) is 4; however, the sunspot areas are close because the missing sunspots are tiny pores. Note that the human-labeled sunspot area of panel (a), observed on 19 February 1957, was left blank in the drawing. Except for that, all sunspot areas are very close.
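The evaluation quantities used in this section (IoU, precision, recall, and AP) can be sketched as below. Masks are represented as sets of pixel coordinates, and AP is approximated by trapezoidal integration of the precision-recall curve; the integration scheme and the counts in the example are assumptions for illustration, not values from the paper.

```python
def mask_iou(a, b):
    """Intersection over union of two masks given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(recalls, precisions):
    """Trapezoidal area under P(r); recalls must be sorted in ascending order."""
    ap = 0.0
    for i in range(1, len(recalls)):
        width = recalls[i] - recalls[i - 1]
        ap += width * (precisions[i] + precisions[i - 1]) / 2.0
    return ap


iou = mask_iou({(0, 0), (0, 1)}, {(0, 1), (0, 2)})
is_correct = iou > 0.6                      # threshold used in the text
p, r = precision_recall(tp=92, fp=8, fn=7)  # illustrative counts only
ap = average_precision([0.0, 0.5, 1.0], [1.0, 1.0, 0.5])
print(iou, is_correct, p, r, ap)
```

Benchmark suites often use an interpolated precision-recall curve instead of the plain trapezoid rule, which can change AP slightly.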

Statistics
The numbers of pores, spots, umbrae, and holes in the N-S hemispheres are counted automatically after segmentation. The sunspot numbers include the pore numbers and the umbra numbers. The area is calculated as $SA = A \times 10^{6} / (\pi R^{2})$, where $A$ is the number of pixels in the mask and $R$ is the radius of the solar disk. The unit is a millionth of the solar hemisphere (μHem). This study adopts the area corrected for the solar projection effect, which is calculated as $SA_{p} = SA \sec[\arcsin(r/R)]$, where $SA$ and $R$ are as described above and $r$ is the radial distance. Note that the area of a spot equals the area of the entire spot region after subtracting the area of its internal holes. The area of a penumbra equals the spot area minus the sum of the areas of its internal umbrae. The daily sunspot area equals the sum of the pore areas and the spot areas in one drawing. All data are publicly shared at https://github.com/yzs64/YNAO_sd/. Figure 9 (a) shows the daily sunspot numbers of the YNAO drawings from 1957 to 2015. The green curve represents the 13-month smoothed curve of the YNAO sunspot numbers. The orange curve represents that of the PMO sunspot numbers from 1957 to 2011 for comparison (the data come from Xu et al. (2021)). The vertical dashed lines mark the minima between successive solar cycles, and the solar cycle numbers are labeled in the figure. Figures 9 (b) and (c) show the sunspot numbers in the N-S hemispheres, respectively. Both the red and blue curves represent 13-month smoothed curves. Figure 9 (d) compares them in the same panel using an enlarged vertical coordinate. All trends of the total, northern hemispheric, and southern hemispheric sunspot numbers are consistent with the solar cycles. However, it can be seen that N-S asymmetries are obvious in the declining and increasing phases around the maxima in all solar cycles.
The N-S peaks are shifted relative to each other, earlier or later, by several months up to over two years, with no regular pattern. The declining phase of solar cycle 19 reveals an excess of activity in the northern hemisphere, the most striking asymmetry, lasting about two years. This kind of asymmetry dominated by the northern hemisphere persists before the solar maximum of solar cycle 20; after the maximum the southern hemisphere dominates, although this southern dominance is not striking. From solar cycle 21 to 23, the N-S asymmetries are similar to those of solar cycle 20, and all of them are relatively striking. The northern hemisphere starts major activity during the increasing phase before solar maximum, whereas the southern hemisphere starts major activity during the declining phase after solar maximum. Figure 10 shows the daily sunspot areas after correcting the solar projection effect, together with the 13-month smoothed curves; the contents of each panel are the same as in Fig. 9. In panel (a), the YNAO sunspot areas are very close to the PMO sunspot areas. The trends of the sunspot areas also follow the 11-year solar activity cycle (Li et al. 2011). Compared with the hemispheric sunspot numbers in Fig. 9 (d), Fig. 10 (d) shows that the N-S asymmetries are obvious in the same way. The trends of the hemispheric sunspot areas are synchronized with those of the hemispheric sunspot numbers. The details of these solar cycles are very consistent with the results obtained from sunspot relative numbers by Temmer et al. (2006).
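The 13-month smoothed curves used throughout Figs. 9 and 10 can be reproduced along the following lines. The text does not specify the window weights, so this sketch assumes the conventional tapered running mean, in which the two end months carry half weight, applied to monthly means that have already been computed from the daily values. The function name is hypothetical.

```python
def smooth_13_month(monthly_values):
    """Tapered 13-month running mean; None where the window is incomplete."""
    n = len(monthly_values)
    smoothed = [None] * n
    for i in range(6, n - 6):
        inner = sum(monthly_values[i - 5:i + 6])             # 11 central months, full weight
        ends = monthly_values[i - 6] + monthly_values[i + 6]  # 2 end months, half weight
        smoothed[i] = (2 * inner + ends) / 24.0
    return smoothed


# Hypothetical monthly means; only indices 6, 7 and 8 have a complete window.
example = smooth_13_month([float(v) for v in range(15)])
```

Because the window is symmetric, the first and last six months of a series have no smoothed value, which is why smoothed curves are shorter than the underlying record.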
In detail, Fig. 11 shows the 13-month smoothed curves of the numbers of pores, spots, umbrae, and holes separately. Figure 11 (a) compares the YNAO numbers with those of PMO in each category. The YNAO pore numbers are larger than those of PMO, while the other categories differ only slightly. Figures 11 (b) and (c) show the numbers of the four categories in the northern and southern hemispheres, respectively. The numbers of pores are distinctly higher than those of the other categories in both hemispheres. By contrast, the numbers of holes are very low: 97.9% of them are zero for the full disk, and the maximum is only 3. The trends of pores, spots, and umbrae all follow the solar cycle, differing only in amplitude. N-S asymmetries are very obvious in each category in all solar cycles. The times of the peaks and the numbers of peaks show large differences. Figure 12 shows the 13-month smoothed curves of the sunspot areas, after correcting the solar projection effect, of pores, penumbrae, umbrae, and holes separately; the contents of each panel are the same as in Fig. 11. In panel (a), the areas of each category of YNAO differ slightly from those of PMO. In all three panels, the areas of penumbrae account for a large proportion, regardless of hemisphere. The areas of umbrae and pores are very close, low, and flat. The areas of holes are practically zero, which is consistent with the number of holes.
It is found that even though pores are by far the most numerous, their share of the total area is very small. Unlike the numbers, the penumbra areas show a very clear solar cycle, while the areas of umbrae and pores show a less obvious 11-year cycle. The N-S asymmetries are also evident in the areas of each category in all solar cycles. The peak times and the numbers of peaks also show clear differences.
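The asymmetries above are described qualitatively. One common way to put a number on them is the normalized asymmetry index (N - S) / (N + S). The text does not say that this index was used, so the sketch below is only an illustration, with a hypothetical function name and hypothetical values.

```python
def asymmetry_index(north, south):
    """Return (N - S) / (N + S), or None when both hemispheres are empty."""
    total = north + south
    if total == 0:
        return None  # the index is undefined on spotless days
    return (north - south) / total


# Hypothetical hemispheric values for one time step.
northern_excess = asymmetry_index(30, 10)
southern_excess = asymmetry_index(10, 30)
```

The index is positive when the northern hemisphere dominates and negative when the southern hemisphere dominates. It becomes noisy near solar minimum, where N + S is small, so it is usually applied to smoothed or monthly data rather than to daily values.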
After comparing many sunspot drawings obtained from YNAO and PMO, we believe that the differences between YNAO and PMO seen in Figs. 9 to 12 are mainly due to differences in the hand drawings and in the labeling. Although both observatories are located in China, they have different observation conditions, such as longitude and latitude (YNAO: 25° N, 102° E; PMO: 32° N, 118° E), weather, observer habits, and so on. In addition, applying different processing methods introduces some differences.

Discussion
We carefully compared the segmentation results obtained by the HTC method (Xu et al. 2021), the CondInst method (Tian et al. 2020), and the SPR-mask method. The HTC method, which is based on a hybrid task cascade (HTC) model, was used to segment sunspots in PMO drawings. CondInst has in recent years been regarded as a more suitable segmentation method for handling instances with different aspect ratios and irregular shapes. Table 5 lists the metrics for evaluating these methods on the same testing set. The precision, recall, and AP of SPR-mask are slightly higher than those of the others. Figure 13 shows some instances with widely separated recording times, segmented by HTC (second row), CondInst (third row), and SPR-mask (fourth row). Table 6 lists the sunspot numbers and sunspot areas of each panel in Fig. 13. The sunspot numbers detected by SPR-mask are closest to the human-labeled data, with fewer missed detections; HTC is second, and CondInst is the worst, with more missed detections. Fortunately, the missed detections are tiny pores. Therefore, the sunspot areas in each panel are still very close, even though some sunspots are missed. In terms of segmentation, SPR-mask works best, followed by CondInst. HTC has the worst performance, smoothing out the roughness of sunspot edges. This also explains why the sunspot areas of both HTC and CondInst in panel (c) are larger than the human-labeled and SPR-mask areas. In terms of classification, all three methods perform excellently; the only error is an umbra at the boundary of a sunspot in panel (b), which HTC misclassifies as a pore.
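For reference, precision and recall follow the standard definitions based on matched detections. The sketch below uses hypothetical counts and a hypothetical function name. How detections are matched to human labels (for example, the overlap threshold) is not stated in this section, and the computation of AP is omitted.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Standard precision and recall from detection counts."""
    detected = true_positives + false_positives   # everything the model reported
    labeled = true_positives + false_negatives    # everything the human labeled
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / labeled if labeled else 0.0
    return precision, recall


# Hypothetical counts: 9 correct detections, 1 false detection, 3 missed sunspots.
precision, recall = precision_recall(9, 1, 3)
```

In this framing, the missed tiny pores discussed above lower recall, while false detections such as a spurious umbra lower precision.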
In short, SPR-mask performs well in detecting, classifying, and segmenting drawn sunspots. The PointRend module in SPR-mask plays an important role: most points located at object boundaries are iteratively refined to their accurate segmentation labels in a coarse-to-fine fashion. Besides that, the combination of ResNet and FPN extracts multi-level features, which is very helpful for detecting sunspots of different sizes; the SAAN structure aggregates features at different levels to ensure the robustness of instance masks of different sizes. All of these contribute to the satisfying segmentation results of SPR-mask. The HTC method also achieves satisfactory metrics but gives relatively rough edges. The sunspot edges extracted by the CondInst method are better than those of HTC, but it performs unsatisfactorily on small pores, for which false and missed detections are prominent. Of course, there is still room for improvement in SPR-mask, such as reducing missed and false detections.
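The coarse-to-fine refinement can be illustrated by its point selection step. In PointRend, each subdivision step re-predicts only the most uncertain locations of the upsampled mask, that is, those whose foreground probability is closest to 0.5. The sketch below shows only that selection on a hypothetical probability grid; it is not the SPR-mask implementation, and the upsampling and the per-point prediction network are omitted.

```python
def select_uncertain_points(probabilities, num_points):
    """Return the (row, col) positions whose mask probability is closest to 0.5."""
    scored = [
        (abs(p - 0.5), row, col)
        for row, line in enumerate(probabilities)
        for col, p in enumerate(line)
    ]
    scored.sort()  # smallest distance from 0.5 first; ties fall back to row, then col
    return [(row, col) for _, row, col in scored[:num_points]]


# Hypothetical coarse mask: background at the top left, sunspot at the bottom right.
coarse = [
    [0.02, 0.10, 0.45],
    [0.05, 0.60, 0.95],
    [0.30, 0.90, 0.99],
]
boundary_points = select_uncertain_points(coarse, 3)
```

The selected points lie along the diagonal that separates background from sunspot, which is where a boundary refinement should concentrate its effort.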
In Sect. 4.2, the YNAO data were compared with the PMO data. In addition, we compared the YNAO sunspot numbers with the sunspot relative numbers derived from sunspot drawings made at the Kanzelhöhe Observatory (KSO) in Austria (the Central European Solar Archive, http://cesar.kso.ac.at/, Steinegger et al. 2001). In Figure 14, the YNAO sunspot numbers and the KSO sunspot relative numbers are represented by black plus marks and blue cross marks, respectively. The solid red and dashed green lines represent the 13-month smoothed YNAO and KSO data, respectively. The KSO monthly-averaged data are higher than the YNAO data in solar cycles 19-21, especially before 1960. On the contrary, the YNAO data are slightly higher than the KSO data in solar cycles 22 and 23. A possible reason is the large numbers of pores recorded during these two cycles, which can be seen in Fig. 11. The 13-month smoothed YNAO and KSO data are close for the full disk and for the northern and southern hemispheres alike. The YNAO sunspot areas were also compared with the RGO (Royal Greenwich Observatory) sunspot areas (https://solarscience.msfc.nasa.gov/greenwch.shtml). Figure 15 shows the monthly-averaged YNAO and RGO sunspot areas from 1957 to 2015 for the full disk and the northern and southern hemispheres in panels (a), (b), and (c), respectively. The 13-month smoothed curves show similar double or multiple peaks at corresponding positions. Unlike the sunspot numbers, which are significantly lower than the KSO data before 1960, the YNAO sunspot areas are slightly greater than those of RGO.

Conclusion
This paper presents a deep learning method, SPR-mask, for segmenting the components of each sunspot in the YNAO hand drawings spanning 1957 to 2015. Sunspots are classified into four types: pores, spots, umbrae, and holes. The SPR-mask method consists of three main parts: backbone, shared head, and mask branch. The backbone combines ResNet-50 and FPN, which better extracts sunspot features at different resolution levels. The shared head performs classification, regression, and center-ness tasks. The mask branch applies the SAAN technology, which fuses and enhances multi-layer features by spatial and channel attention mechanisms. PointRend is adopted to perform mask prediction for instances using an adaptive subdivision algorithm iteratively in a coarse-to-fine fashion. After training and testing, the evaluation metrics indicate that the method performs well, with precision, recall, and AP values of 0.92, 0.93, and 0.92, respectively. The results show that the SPR-mask method is good at segmenting precise contours of sunspots. Besides that, we modified the LeNet-5 deep learning method to identify the values of $P$, $B_0$, $L_0$ and L characters, for determining whether each sunspot belongs to the northern or southern hemisphere.

Fig. 15 Panel (a) shows the monthly-averaged sunspot areas after correcting the solar projection effect ($SA_p$) of YNAO and RGO over the entire solar surface from 1957 to 2015. Panels (b) and (c) show the monthly-averaged sunspot areas for the northern and southern hemispheres, respectively. The YNAO and RGO sunspot areas are represented by black plus marks and blue cross marks, respectively. The solid red and dashed green lines represent the 13-month smoothed YNAO and RGO data, respectively. The vertical dashed lines indicate the minima of solar cycles in October 1965, June 1976, September 1986, May 1996, and November 2008
As a result, the daily sunspot numbers and areas, divided into pores, spots, umbrae, and holes, for the full solar disk, the northern hemisphere, and the southern hemisphere are all publicly shared at https://github.com/yzs64/YNAO_sd/. We also compared the YNAO sunspot numbers with those of the Kanzelhöhe Observatory (KSO), the YNAO sunspot areas with those of the Royal Greenwich Observatory (RGO), and the YNAO sunspot numbers and areas with the Purple Mountain Astronomical Observatory (PMO) data. The comparison results show that the YNAO data extracted by the SPR-mask method are valuable for enriching the global historical sunspot databases. They are helpful for a better understanding of long-term solar activity cycles and for N-S asymmetry analysis.
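The hemisphere assignment summarized above depends on converting a position in the drawing to a heliographic latitude using $P$ and $B_0$. The following sketch uses the standard spherical relation sin B = sin B0 cos ρ + cos B0 sin ρ cos(P - θ). It is not the authors' implementation. It assumes that celestial north is up and east is to the left in the image, that the pixel y-axis points downward, and that the apparent angular radius of the Sun can be neglected, so that ρ = arcsin(r/R). All function and parameter names are hypothetical.

```python
import math


def heliographic_latitude(x, y, center_x, center_y, disk_radius, p_deg, b0_deg):
    """Heliographic latitude in degrees of the image point (x, y)."""
    dx, dy = x - center_x, y - center_y
    r = math.hypot(dx, dy)
    if r > disk_radius:
        raise ValueError("point lies outside the solar disk")
    rho = math.asin(r / disk_radius)  # angular distance from the disk center
    theta = math.atan2(-dx, -dy)      # position angle, from north through east (assumed layout)
    p, b0 = math.radians(p_deg), math.radians(b0_deg)
    sin_b = (math.sin(b0) * math.cos(rho)
             + math.cos(b0) * math.sin(rho) * math.cos(p - theta))
    return math.degrees(math.asin(max(-1.0, min(1.0, sin_b))))


def hemisphere(latitude_deg):
    """Label a latitude as northern ('N') or southern ('S')."""
    return "N" if latitude_deg >= 0 else "S"


# Hypothetical drawing: disk center at (500, 500) pixels, radius 450 pixels.
above_center = hemisphere(heliographic_latitude(500, 100, 500, 500, 450, 0.0, 0.0))
below_center = hemisphere(heliographic_latitude(500, 900, 500, 500, 450, 0.0, 0.0))
```

With $P = 0$ and $B_0 = 0$ the dividing line is simply the horizontal diameter. Non-zero values tilt and curve it, which is why these handwritten quantities must be read from each drawing before the sunspots are counted by hemisphere.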