Network design
For automated gold particle detection in EM images, we chose to employ a conditional generative adversarial network (cGAN) architecture (Fig. 1), based on a previously developed approach named Pix2Pix 6. A cGAN uses variants of convolutional neural networks for both its generator and discriminator. During training, an input image (Fig. 1a) is paired with both human-generated annotations (Fig. 1b) and annotations produced by the generator network (Fig. 1c). Ground truth annotations are applied as a square pixel mask over each gold particle (Fig. 1b). The annotated images are then fed to the discriminator network, which is tasked with distinguishing between the output image created by the generator (“fake”) and the ground truth created by a human expert (“real”) (Fig. 1d). The weights of both networks are then updated: the generator is directed to improve its annotation of the images that the discriminator detects as fake, while the discriminator sharpens its ability to detect them. This iterative re-adjustment of weights improves overall network performance, ultimately enabling the generator to produce human-level image annotations 7.
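The adversarial training loop described above can be summarized in a short PyTorch sketch. This is a minimal illustration, not the published implementation: the generator G, the discriminator D, and the weighting of the L1 term follow common Pix2Pix conventions and are assumptions here.

```python
import torch
import torch.nn as nn

# Minimal sketch of one Pix2Pix-style training step. G (generator) and
# D (discriminator) are assumed to be U-Net- and PatchGAN-style networks;
# their definitions are omitted.
adv_loss = nn.BCEWithLogitsLoss()  # real/fake classification loss
l1_loss = nn.L1Loss()              # pixel-wise reconstruction term
lambda_l1 = 100.0                  # L1 weight (Pix2Pix default; an assumption here)

def train_step(G, D, opt_G, opt_D, image, target_mask):
    fake_mask = G(image)

    # Discriminator: distinguish "real" (image, human mask) pairs from
    # "fake" (image, generated mask) pairs.
    opt_D.zero_grad()
    real_logits = D(torch.cat([image, target_mask], dim=1))
    fake_logits = D(torch.cat([image, fake_mask.detach()], dim=1))
    d_loss = (adv_loss(real_logits, torch.ones_like(real_logits)) +
              adv_loss(fake_logits, torch.zeros_like(fake_logits)))
    d_loss.backward()
    opt_D.step()

    # Generator: fool the discriminator while staying close to ground truth.
    opt_G.zero_grad()
    fake_logits = D(torch.cat([image, fake_mask], dim=1))
    g_loss = (adv_loss(fake_logits, torch.ones_like(fake_logits)) +
              lambda_l1 * l1_loss(fake_mask, target_mask))
    g_loss.backward()
    opt_G.step()
```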
Generating training data for Gold Digger required identifying colloidal gold particles, overlaying them with a colored mask, and removing the background (Fig. 2a-c). Gold Digger was trained on 6 nm and 12 nm particles. The training dataset was balanced to include images containing both particle sizes to avoid labeling bias by the algorithm. Thus, in our implementation, Gold Digger creates a mask proportional to the size of the gold particle (Fig. 2b). By building in this functionality, we increased the utility of the algorithm to enable automated analysis of tissue with multiple immunogold labels, such as when different protein distributions must be assessed in the same sample using a different gold particle size for each target of interest. We achieved this using connected component analysis, in which neighboring pixels in a mask are joined to form object components. By finding the center of mass and surface area of every identified component, Gold Digger can discern size clusters from the list of component surface areas in an analyzed image (Fig. 2c). This allows Gold Digger to assign annotations to size groups within the population of identified gold particles (Fig. 2d). Once Gold Digger has assigned gold particles to a size group, it creates a file with the pixel coordinates of every gold particle within that group.
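As a concrete illustration of this size-grouping step, the sketch below uses scikit-image's connected component analysis. The two-class midpoint threshold and the function name are ours, a simple stand-in for whatever clustering rule the released tool applies.

```python
import numpy as np
from skimage import measure

def group_particles_by_size(mask):
    """Group masked gold particles into two size classes.

    `mask` is a binary array in which generator-labeled pixels are True.
    """
    labeled = measure.label(mask)                    # connected component analysis
    props = measure.regionprops(labeled)
    areas = np.array([p.area for p in props])        # surface area per component
    centers = np.array([p.centroid for p in props])  # center of mass (row, col)

    # Split components at the midpoint between the smallest and largest area.
    threshold = (areas.min() + areas.max()) / 2.0
    return {"small": centers[areas <= threshold],    # e.g., 6 nm particles
            "large": centers[areas > threshold]}     # e.g., 12 nm particles
```

The returned per-group centroid lists correspond to the per-group coordinate files described above.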
The cGAN was trained to identify colloidal gold particles in FRIL images. Because colloidal gold is regular in shape, we reasoned that its detection by an algorithm was feasible. We purposefully trained the network on a maximally heterogeneous representation of FRIL images to avoid overfitting 8. FRIL datasets are especially large and have characteristically varied background features (Fig. 3a). Accordingly, the gold particles in our training images were set against the varied backgrounds of our samples, which included uneven gray scale and contrast (Fig. 3b) that shifted with the topological structures casting local shadows (Fig. 3c). Notably, these features typically present significant challenges to computer-vision-guided gold particle detection. Our final annotated dataset consisted of six different FRIL images obtained from brain tissue, automatically divided into 3000 sub-sections of 256 by 256 pixels each, as shown in Fig. 1a. By building image sub-sectioning into our analysis workflow, we ensured that the algorithm can analyze datasets of any starting size, including those obtained with high-magnification tile scans, a common approach for imaging large regions of brain tissue at the resolution needed to visualize colloidal gold particles and to quantify the fine structure of the long, complex dendritic appendages of neurons (Fig. 3a).
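The sub-sectioning step can be expressed in a few lines of NumPy. This is a minimal sketch assuming grayscale input and zero-padded edge tiles; the actual padding or overlap strategy is not specified in the text.

```python
import numpy as np

def subdivide(image, tile=256):
    """Split an arbitrarily large 2D EM image into tile x tile sub-sections."""
    h, w = image.shape
    pad_h = (-h) % tile  # padding needed to reach a multiple of the tile size
    pad_w = (-w) % tile
    padded = np.pad(image, ((0, pad_h), (0, pad_w)), mode="constant")
    tiles = []
    for y in range(0, padded.shape[0], tile):
        for x in range(0, padded.shape[1], tile):
            # Keep each tile's origin so annotations can be mapped back
            # into the coordinate frame of the full image.
            tiles.append(((y, x), padded[y:y + tile, x:x + tile]))
    return tiles
```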
Network performance
We compared Gold Digger's ability to annotate gold particles in EM images with that of human annotators. We quantified the time, accuracy, and variability of four individuals who were asked to annotate gold particles in FRIL images and compared these measurements with those collected from Gold Digger. Two reference images were used (2537 by 3413 pixels, “small”; 3326 by 7043 pixels, “medium”); both images included two sizes of gold particles (6 and 12 nm), totaling 612 and 1810 particles, respectively. Gold particle annotations were evaluated by an expert to determine a ground truth for the number of particles as well as their respective locations. When comparing Gold Digger's annotation accuracy with that of the human annotators, the network performed at a comparable level, independent of gold particle size (Fig. 4a). Gold Digger's analysis speed, on the other hand, was orders of magnitude faster than human annotation (Fig. 4b). It is important to note that these reference images were relatively small; tile-scanned FRIL images from brain tissue can easily exceed 16000 by 1600 pixels (Fig. 3a), reinforcing the point that manually counting gold particles is tedious and slow. Finally, the center-mark locations for gold particles identified by Gold Digger were less variable than those tagged by human annotators, with the lowest variability for larger gold particles (Fig. 4c).
To compare Gold Digger's accuracy with that of an alternative automated approach, we implemented a computer-vision solution based on a previously published algorithm 3. Briefly, this algorithm applies a fixed-parameter gray-level threshold to the image. Next, the remaining binary components are filtered between predetermined minimum and maximum area parameters, chosen based on the known sizes of gold particles contained in the image (i.e., 6 and 12 nm). Lastly, a circularity threshold is applied to each component to exploit the characteristic roundness of colloidal gold particles. This threshold-area-circularity (TAC) algorithm showed high accuracy on the smaller reference image, comparable to the level achieved by Gold Digger, but its performance degraded when analyzing the larger reference image, where it missed nearly half of the gold particles (Fig. 4a). The drop-off in the TAC algorithm's performance is likely attributable to the difficulty of establishing a single effective gray-level threshold that differentiates all gold particles in large FRIL images, given the increasing variability of background gray values inherent in larger, more complex images.
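The three TAC stages can be sketched with scikit-image as follows. All parameter values are placeholders; in practice they would be tuned to the pixel size and the known particle diameters, and the assumption that particles are darker than the background reflects the electron density of colloidal gold.

```python
import numpy as np
from skimage import measure

def tac_detect(gray, thresh=80, min_area=10, max_area=200, min_circ=0.8):
    """Threshold-area-circularity (TAC) detection sketch."""
    binary = gray < thresh                  # stage 1: fixed gray-level threshold
    labeled = measure.label(binary)
    centroids = []
    for p in measure.regionprops(labeled):
        if not (min_area <= p.area <= max_area):
            continue                        # stage 2: min/max area filter
        # Stage 3: circularity = 4*pi*A / P^2 (1.0 for a perfect circle).
        circ = 4.0 * np.pi * p.area / (p.perimeter ** 2 + 1e-9)
        if circ >= min_circ:
            centroids.append(p.centroid)
    return centroids
```

The sketch also makes the failure mode plain: a single `thresh` value must hold across the entire image, which becomes untenable as background gray values grow more variable.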
Pretraining and transfer learning
To reduce the amount of training data necessary to achieve acceptable gold-particle annotation performance, we pretrained our network using the UT Zappos50k dataset 9,10. As opposed to training a network from scratch, pretraining is advantageous because it allows the network to apply what it learns from one dataset to the next in a process called transfer learning 11. Transfer learning moves the weights of the network from their random initialization values toward values closer to those needed to solve the problem at hand. Although these pretraining images bear little resemblance to EM data, this first round of pretraining equips the network for general image analysis by teaching it the basic features that exist in images, such as edges, shades, and shapes. The cGAN can thus discriminate between features in newly introduced training sets more easily, reducing the need for additional training data.
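In code, transfer learning amounts to restoring pretrained weights before fine-tuning rather than starting from random initialization. A minimal sketch, reusing the G and D networks from the earlier training step; the checkpoint file names are hypothetical, and the optimizer settings are common Pix2Pix defaults rather than the published configuration.

```python
import torch

# Restore weights saved after pretraining on UT Zappos50k (hypothetical files).
G.load_state_dict(torch.load("generator_zappos_pretrained.pt"))
D.load_state_dict(torch.load("discriminator_zappos_pretrained.pt"))

# Fine-tune on FRIL data: all weights remain trainable, but they start
# from the pretrained values instead of random initialization.
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
```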
To gauge the benefit of pretraining Gold Digger, we measured the L1 distance between the generator's output and the ground truth, which is also implemented as the loss function used to measure the network's success during training. We compared the loss value of a naïve network, which had seen no prior data, with that of a network pretrained on the UT Zappos50k dataset. We observed large decreases both in the initial loss value of the pretrained network and in the number of training epochs required for the loss to converge (Fig. 4d). This was likely the result of the initialization process described above, which gives pretrained networks a set of learned image-processing features. Just as pretraining improved Gold Digger's current performance, it is expected to facilitate future training of the network as well.
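For clarity, the comparison metric can be computed as follows; the function name and the use of a data loader are illustrative assumptions.

```python
import torch
import torch.nn as nn

l1 = nn.L1Loss()

@torch.no_grad()
def generator_l1(G, loader):
    """Mean L1 distance between generated and ground-truth masks."""
    total, n = 0.0, 0
    for image, target_mask in loader:
        total += l1(G(image), target_mask).item()
        n += 1
    return total / n
```

Evaluating this quantity epoch by epoch for the naïve and pretrained networks yields the two loss curves compared in Fig. 4d.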
Generalization
Gold Digger's deep learning architecture harnesses the power of generalization to vastly increase its applicability to heterogeneous data. Generalization is related to the transfer learning obtained through training, in that it allows the learned evaluative features of a convolutional neural network to reach a similar level of performance on previously unseen data, even when that data differs significantly from the original training data 7. We tested Gold Digger's generalization ability by evaluating its annotation performance on images collected at magnifications, and with sample preparations, not present in the training set. While FRIL training images were collected at 43k magnification, we employed image scaling to create “faux” magnifications in the training set. We then examined Gold Digger's ability to identify colloidal gold particles in datasets collected at two different microscope magnifications (43k and 83k). Gold Digger quantified gold particles within the same area of interest imaged at both magnifications, achieving similar accuracy at both resolutions despite its lack of previous training on high-magnification images (Fig. 4e).
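The "faux" magnification augmentation can be sketched as a rescale-and-recrop step. The scale factors below are illustrative, as the text does not state which factors were used; rescaled images are re-cropped to the 256 by 256 input size expected by the network.

```python
from skimage import transform

def faux_magnifications(image, scales=(0.5, 0.75, 1.25, 1.5), tile=256):
    """Rescale a 43k-magnification image to mimic other magnifications."""
    augmented = []
    for s in scales:
        scaled = transform.rescale(image, s, anti_aliasing=True)
        h, w = scaled.shape[:2]
        if h >= tile and w >= tile:
            # Center-crop back to the network's fixed input size.
            y, x = (h - tile) // 2, (w - tile) // 2
            augmented.append(scaled[y:y + tile, x:x + tile])
    return augmented
```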
Colloidal gold particles are used as electron-dense markers for labeling FRIL samples as well as those prepared using the post-embedding approach. To determine whether Gold Digger could generalize its annotation of colloidal gold particles to images obtained from an EM preparation different from the one used for training, we analyzed data from pre- and post-embedding immunogold-labeled brain samples, in which the immunoreaction is performed before or after embedding the sample in resin, respectively. Pre- and post-embedded sections were immunolabeled with a single size of gold particle (12 nm), imaged at 43k magnification (1.11 nm/pixel resolution) (Fig. 5b-c), and compared with results collected using FRIL (Fig. 5a). Although Gold Digger had been trained to identify gold particles only in FRIL images (Fig. 5a), it achieved a similar level of annotation accuracy in images from post-embedded samples (Fig. 5b, d). However, Gold Digger did not achieve notable labeling accuracy on pre-embedded samples (Fig. 5c, d). Overall, these results indicate that Gold Digger is not narrowly overfit to its training dataset and is capable of annotating colloidal gold particles in images of varying magnifications and sample preparations.