Automatic cell segmentation from brightfield microscopy images of pseudohyphal cell-aggregates

Background: The automatic segmentation of pseudohyphal cell-aggregates from brightfield microscopy images for counting forming cells is a challenging task because the cells have heterogeneous optical appearances, as they may lie on different focal planes. The current cell counting method is based on time-consuming manual counting of stained cells on a hemocytometer and, in most cases, yields estimates of low statistical significance because of the effort needed to prepare and analyze many samples. In this work, we evaluated the effectiveness of a marker-controlled watershed algorithm for automatic segmentation of pseudohyphae from brightfield microscopic images. The cell heterogeneity problem was addressed by processing intracellular contents of focused and defocused cells to extract initial foreground markers for the watershed method. Properly segmenting cells of different classes within a pseudohypha increases the number of cells analyzed and thus contributes to more reliable estimates. To facilitate the evaluation of the proposal by acquiring images containing a diversity of cell appearances, we utilized in situ microscopy, an imaging system used to capture images directly from suspensions.
Results: The performance of the method was evaluated on 120 portraits of a yeast exhibiting a diversity of pseudohyphal morphologies. Automatic results were compared with manual references obtained by visual inspection of the images. Despite the simultaneous occurrence of a representative mixture of focused, over-focused, and under-focused cells, the method produced robust results with an average segmentation sensitivity, specificity, and accuracy of 76%, 89%, and 76%, respectively. On average, each microscopic image was processed within 3 s.
Conclusions: Our approach was capable of segmenting pseudohyphae formed by cells exhibiting a large diversity of appearances. The application of a marker-controlled watershed algorithm as a simple yet effective technique for segmenting pseudohyphae demonstrated satisfactory overall performance to support automated analysis of pseudohyphal cell-aggregates from brightfield images.

Background
Single-cell yeast suspensions are usually preferable in industrial yeast fermentation processes over cells in aggregates. Single cells exhibit a larger contact area with the nutrient medium, which in many cases helps to optimize the process. However, under deprivation of glucose or nitrogen [1,2] or in the presence of fusel alcohols [3], it is commonly observed that Saccharomyces cerevisiae strains switch between cell division with separated daughter cells and the generation of pseudohyphae, a morphology that exhibits several chains of still attached daughter cells growing at arbitrary orientations.
Besides affecting fermentation time and efficiency, pseudohyphal yeast morphology may also impair centrifugation systems. Centrifugation is an expensive and complex process aimed at recovering yeast cells for subsequent fermentation cycles [4]. Pseudohyphal morphology is characteristic of a very common rough type of yeast colony on solid media. If it is found in biofuel fermentation processes, it normally exhibits a higher tolerance with respect to osmotic and ethanolic stresses [5]. This biotype of yeast cell morphology is also called snowflake [6]. Studies have shown that a wild S. cerevisiae strain exhibiting pseudohyphal morphology is often associated with lower ethanol production and higher residual sugar content in the fermented must [7,8]. In contrast, in the brewing and chemical manufacturing industry, pseudohyphal yeasts have been exploited to change flavour and aroma in beverage products and to produce acetic acid, respectively [9].
Because pseudohyphal morphology is so relevant to many industrial processes, the delivery of morphological information at high rates can assist the personnel of a production process in implementing assertive actions within practical times.
Direct counting using a hemocytometer in conventional light microscopy is an inexpensive and easy-to-use technique. However, it is also laborious and time-consuming if used for enumeration of single cells, budding cells and pseudohyphal cell-aggregates.
Only a few proposals for analysis of yeasts exhibiting pseudohyphal morphology have been presented.
As an example, a computational algorithm was developed to detect both yeast and pseudohyphal form structures from a database of phase contrast microscopy images of the human pathogen Candida glabrata, a species that exhibits elongated pseudohyphal growth structures [10]. Another work combined image cytometry with fluorescent staining to determine several parameters of the yeast Brettanomyces during beer fermentation [9], a strain that also forms pseudohyphae. Using acridine orange and propidium iodide stains to identify live and dead cells, respectively, cells imaged on a dedicated counting chamber allowed measurements of cell concentration and viability of pseudohyphae. Although highlighting stained nuclei improves the accuracy and efficiency of cell counting, large statistics from non-correlated samples are only possible by extracting, preparing and processing many samples.
Optical density measurements can automate the measurement of biomass without needing sampling.
However, their accuracy rests on the assumption that a homogeneous culture of suspended single cells is under analysis. As a non-imaging approach, optical density cannot assess, for instance, a transient development of cell morphology.
When observing pseudohyphal cell-aggregates through a light microscope, forming cells may exhibit different optical appearances as they may lie at different distances from the objective focal plane. Thus, only a fraction of the pseudohyphal structure may be imaged in focus, an imaging condition usually chosen in routine microscopy because cellular morphology and internal structures of the specimen are easily recognizable. Cells located outside the focal plane appear defocused and, under larger amounts of defocus, ultimately become unrecognizable, so the number of forming cells may be underestimated when segmentation algorithms consider only in-focus cells.
However, since defocused cells are still part of the pseudohyphae under analysis, more representative estimates with respect to the number of forming cells could be obtained if not only cells in-focus but also defocused ones could be considered.
In this work, intracellular contents of cells in focus as well as those located above and below the focal plane were processed to create initial foreground markers for a marker-controlled watershed algorithm that performs automatic segmentation of pseudohyphal cell-aggregates from brightfield images. The markers were extracted by applying the same sequence of morphological reconstructions and conventional morphological operations, all with fixed parameters requiring no per-image adjustment, to all pseudohyphae input images regardless of the focus condition of the forming cells. The strategy was applied to microscopic images, each containing pseudohyphae with a representative mixture of focused, over-focused, and under-focused cells. The performance of the segmentation algorithm was evaluated by comparing automatic results with manual references obtained by visual inspection of the images.

Results
A total of 2207 in situ microscopy (ISM) images were acquired. An exemplary image is shown in Fig. 1.
Unlike dispersed S. cerevisiae single yeast cells, see e.g. [12], the present strain exhibits a varying amount of branched chains of cells with different orientations so that their optical appearance depends on their distance from the objective focal plane (Fig. 1). Cells located between the light source and the focal plane act as convex lenses and generate bright spots with smooth dark edges. On the other hand, cells located between the objective lens and the focal plane appear as dark, blurred spots [19,25]. In this study, the first group of cells is termed over-focused and the second is termed under-focused.
As seen in Fig. 1, pseudohyphae with a representative mixture of focused, slightly over-focused and slightly under-focused cells were imaged. The coexistence of these different focusing conditions poses a challenge to cell detection with fixed parameters such as the threshold values and the cut-off frequencies of the filters. These parameters were optimized by experiment and kept constant during the whole experiment. To test the performance of the cell segmentation technique, single-pseudohypha portraits automatically cropped from ISM images were analyzed by the segmentation algorithm. The portraits were classified manually into four categories of pseudohyphae (Fig. 2). The estimate of the number of pseudohyphae was based on the number of portraits generated by the algorithm. From 6690 pseudohyphae counted manually directly in the 2207 ISM images, the algorithm generated 6632 portraits. Table 1 displays the manual classification results from the portraits. The algorithm detection error of 4% (i.e., 244 missing pseudohyphae) was due to strongly blurred structures (probably from pseudohyphae located far away from the over-focused plane) and moderate background intensity heterogeneities erroneously detected as pseudohyphae.
The distribution was determined by manual classification of 6446 pseudohyphae portraits from 2207 ISM images.
Compared to the input image in Fig. 3a, the high-pass filtering enhanced the contrast of intracellular content and of edges between conjoined cells and background (Fig. 3b). However, this step also increased background noise, which was significantly suppressed by applying a moderate low-pass filtering (Fig. 3c).

Segmentation of pseudohyphae and of their individual cells
Organelles generate image objects interpreted as foreground markers by the watershed algorithm.
Opening-closing by reconstruction followed by regular morphological opening helped to smoothen intracellular regions without significantly impairing the cell contours (Fig. 3d-f). During this process, grayscale irregularities over the entire image are also smoothened to avoid under- and over-segmentations. As undesired regional minima may still exist after this morphological filtering, they are suppressed by applying an h-minima transform, whose effect is illustrated in Fig. 4. The h-value was determined by trial and error, aiming at minimizing segmentation errors as assessed manually on one hundred portraits.
Fig. 4 Effect of the h-minima transform on the segmentation performance. a Input image to the watershed transform with a reference line for measuring the grayscale profile. b Red curve: original grayscale profile along the line shown in a; blue curve: grayscale profile after the h-minima transform. All regional minima with depth smaller than or equal to 2 (the adopted h-value) were removed (as shown by the large arrow), while the height of the remaining regional minima was increased by 2 (indicated by the thinner arrows). Wherever blue curve segments are identical with red segments, they are shown only in blue. c Watershed segmentation of the input image without processing with the h-minima transform. The over-segmentation occurred due to the existence of two catchment basins inside a single cell. d Watershed segmentation of the input image previously processed by the h-minima transform. Portraits correspond to 250 × 250 pixels.
Finally, the watershed transform segmented the cells forming the pseudohyphae as watershed regions (Fig. 3h and 4d). A merit of the segmentation technique is that the bright center spots within cells (shown in Fig. 3g) are not segmented as separate objects, because the catchment basins within them had been flattened out by the preceding morphological operations and the h-minima transform (Fig. 4d).
To evaluate the influence of the focusing conditions on the segmentation performance, the algorithm was applied to the four pseudohyphal categories (Fig. 5). The overall performance of the segmentation algorithm was evaluated by comparing its results with the portraits segmented manually. Table 2 shows the results in detail.
The algorithm was applied to portraits containing pseudohyphae at different focusing conditions. The number of cells in pseudohyphae, as determined by the algorithm, is shown as mean ± standard deviation of true positives, false negatives, and false positives. For each focusing condition, 30 single-pseudohypha portraits were segmented by the algorithm and subsequently examined manually.
In summary, the mean specificity values (always > 0.88) indicate satisfactory segmentation for all focusing conditions investigated. The accuracy values over all categories show that 77% of the pseudohyphal structures were analyzed correctly.
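For readers who want to recompute such figures from their own counts, the sketch below uses the conventional definitions of sensitivity, specificity and accuracy. This is an assumption: the exact formulas used in this study, and in particular how true negatives were counted for cell segmentation, are not given in this section. The counts in the example are hypothetical and are not taken from Table 2.

```python
def segmentation_metrics(tp, fn, fp, tn):
    """Conventional confusion-matrix metrics (assumed definitions).

    tp: cells correctly segmented        fn: cells missed
    fp: non-cell regions segmented       tn: non-cell regions correctly rejected
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy

# Hypothetical counts for a single portrait, for illustration only.
sens, spec, acc = segmentation_metrics(tp=8, fn=2, fp=1, tn=9)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Averaging these per-portrait values over the 30 portraits of a focusing condition would give mean values comparable in form to those reported in the text.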
The average processing time of the algorithm was 3 s per ISM image on an Intel PC, Quad-CPU, 2.66 GHz, 4.0 GB RAM.

Discussion
Because pseudohyphal cell-aggregates are typically heterogeneous with respect to the number and appearance of forming cells, estimating the number of cells in pseudohyphae from images acquired at a single focal plane may not provide reliable information about the broadening pseudohyphal structure. Properly processing focused and defocused cells within the same pseudohyphal structure made it possible to quantify larger numbers of forming cells, thus improving estimates of the number of cells in pseudohyphae. However, while the generation of information on pseudohyphal cell-aggregates from brightfield microscopy images relies on accurate analysis of many non-correlated images, automated segmentation of pseudohyphal cell-aggregates posed a challenge due to changes in the appearance of the imaged forming cells, as they may lie on different planes of focus.
In applications where the lack of intracellular information is not an issue, the easiest way to segment brightfield images using conventional image processing is based on the detection of over-focused cells through thresholding-based techniques, since the cells exhibit a bright spot of focused light which is sharply imaged on a darker background [19,25]. Because defocused brightfield microscopy produces contrast-enhanced images, this image acquisition strategy has been applied to several specimens, including yeast cells [14], mammalian cells [26], human cells [27], and CHO cells [23].
However, because pseudohyphae formed by chains of cells growing at arbitrary orientations, as in the present work, may lie at different distances from the objective focal plane, forming cells may exhibit different appearances in the acquired image. Therefore, reliable estimates of the number of cells in pseudohyphae require the acquisition and proper segmentation of many non-correlated images.
To automate image analysis of pseudohyphae from brightfield microscopic images, we applied a standard marker-controlled watershed algorithm. Before applying the watershed method, the input images were preprocessed with the aim of extracting initial foreground markers, ideally with a one-to-one correspondence between markers and forming cells. To this end, we exploited the contents of the cell body, a common feature observed in most acquired images, provided that the cell borders exhibit enough contrast against the background. In this process, the same sequence of morphological reconstructions and conventional morphological operations, all with fixed parameters requiring no adjustment, was applied to every input image to homogenize intracellular contents, regardless of their appearance. Inverting the modified image so that formerly dark cell borders become bright structures helped to obtain more realistic cell shapes in the segmented image. Over-segmentations were minimized by calculating the h-minima transform of the inverted image; however, it is worth mentioning the difficulty of empirically selecting an h-value that yields satisfactory segmentation over all classes of pseudohyphae, given their heterogeneous appearance.
Inspection of the values of sensitivity, specificity and accuracy revealed that the segmentation algorithm performed best on over-focused cells. The reason is that these cells are displayed with high contrast. In contrast, the performance of the algorithm was lower for under-focused or focused cells due to over- and under-segmentations. Over-segmentation occurred due to higher-contrast intracellular contents that were wrongly segmented as single cells. Under-segmentation occurred mainly because edges between conjoined cells were hardly discernible when the cells were in focus, and therefore many cell groups were incorrectly segmented as single cells. In the case of under-focused cells, over-segmentation occurred mainly due to the lack of contrast between cell border and background. Bright halos, artifacts observed mainly in under-focused yeasts, generated false positives. This effect was significantly suppressed by applying the h-minima transform. Similarly, most intensity irregularities in the background were not segmented as cells. A moderate number of cells, mostly cells in focus, were not recognized (i.e., were false negatives) because their contours were missing, so that the inside of the cells was falsely regarded as background. Some false negatives were caused by under-focused cells overlying other cells in the pseudohyphal structure.
In contrast to comparable approaches using, e.g., the fluorescence capability of an image cytometer to detect stained cells with higher accuracy, our approach utilized brightfield images. This imaging modality is a crucial prerequisite for the automated analysis of many non-correlated samples, as it requires no time-consuming steps to prepare samples with fluorescence labels.
Although the automated cell segmentation method allowed the analysis of many non-correlated images of pseudohyphal cell-aggregates, the proposal has limitations: while applying the watershed method to all classes of forming cells was a key consideration in obtaining more representative estimates, precise size and shape information about over-focused cells might not be extractable from the segmented images. Due to the convex-lens effect producing a bright spot in the cell center, the apparent size of over-focused cells may differ from their actual size. Likewise, cell shape can hardly be determined in over-focused cells since their edges are less defined.
Despite the demonstrated capability of our straightforward segmentation approach to support automated analysis of pseudohyphae based on a label-free imaging modality, there remains considerable room for applying more sophisticated image analysis algorithms aimed at improving segmentation accuracy. For example, one could use the overall image contrast and brightness as additional parameters, allowing machine learning techniques to adapt the cell segmentation algorithm to the image conditions.

Conclusions
In this work, we evaluated the effectiveness of an image analysis approach based on a standard marker-controlled watershed method to automate cell segmentation for cell counting in pseudohyphal cell-aggregates. The proposal was designed for segmenting forming cells from brightfield microscopy images of pseudohyphae imaged at different levels of focus and produced satisfactory accuracy, sensitivity, and specificity levels when compared to manual evaluation. The overall segmentation performance over a wide variety of cell appearances showed that the approach supports analysis of pseudohyphal cell-aggregates from brightfield images in a fully automated fashion. By incorporating different classes of cells within the same pseudohypha into the process of estimating the number of forming cells, this work contributes an alternative approach to the problem of providing more representative estimates. Improved estimates for pseudohyphal cell-aggregates are particularly important for enabling better insight into biological processes and, ultimately, for improving their performance.

Methods
Microorganism, culture medium and growth conditions
A rough-colony yeast strain of S. cerevisiae displaying pseudohyphae was isolated from a fuel ethanol facility in São Paulo State, Brazil, and utilized in this experiment. The strain (originally termed 'strain 52') was identified by the sequencing of the D1/D2 region of the large subunit (26S) rRNA and deposited at the culture collection 'Coleção de Culturas Tropical' of 'Fundação André Tosello', Campinas-SP-Brazil under the code CCT7787. This yeast strain was utilized in previous studies of fermentation characteristics [5,7,8].
The yeast strain was stored in YPD medium (10 g/L yeast extract, 20 g/L glucose, 20 g/L peptone and 20 g/L agar) in slants at 4 °C. For analysis, 10 mL of YPD broth in Falcon tubes were inoculated with two loops of the yeast cells and cultivated at 160 rpm and 30 °C for 24 h. After 24 h, the yeast suspension was centrifuged at 580 × g for 10 minutes and the resulting yeast pellet was suspended in saline solution (0.85% NaCl) for the microscopic observations. The experiment was conducted at room temperature in an unbaffled glass vessel (250 mL working volume) equipped with the in situ microscope and a magnetic stirrer (150 rpm).

Image acquisition
Microscopic images were acquired directly from the suspension by using a custom-built high-resolution (0.5 µm) in situ microscope developed at the Mannheim University of Applied Sciences, Mannheim, Germany [11,24]. Essentially, it consists of a transmitted brightfield microscope that is directly coupled to moving suspensions to capture micrographs of the suspended objects.
The images were acquired immediately after the yeast suspension was prepared.

Image analysis
An image analysis algorithm was implemented using the MATLAB Image Processing Toolbox (MathWorks, Ismaning, Germany). Essentially, the algorithm comprises two main functions: (i) detect pseudohyphae in ISM images, and (ii) segment individual cells within the pseudohyphae to estimate the number of cells within each pseudohypha. All steps involved in the algorithm are described as follows.
The first step is segmenting pseudohyphae from the input ISM image to create portraits containing single pseudohyphae. Due to the illumination by a thin light fiber, the ISM images suffer from some vignetting, i.e., the image periphery is darker than its center [24]. This effect is compensated by using the intensity mean of the first 30 images for brightness normalization of the entire ISM image.
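A brightness normalization of this kind can be sketched as a flat-field correction. The text only states that the mean of the first 30 images is used for normalization. Dividing by that mean image and rescaling to its average level, as done below, is an assumption, as are the function names and the toy 2 × 2 images.

```python
def mean_image(stack):
    """Pixel-wise mean of a list of equally sized 2D images (lists of lists)."""
    n = len(stack)
    rows, cols = len(stack[0]), len(stack[0][0])
    return [[sum(img[r][c] for img in stack) / n for c in range(cols)]
            for r in range(rows)]

def normalize_brightness(image, reference):
    """Divide by the reference image and rescale to its average gray level.

    Assumes every reference pixel is non-zero.
    """
    rows, cols = len(reference), len(reference[0])
    level = sum(sum(row) for row in reference) / (rows * cols)
    return [[image[r][c] / reference[r][c] * level for c in range(cols)]
            for r in range(rows)]

# Toy reference stack (two images instead of 30): the right column is darker,
# mimicking vignetting towards the image periphery.
stack = [[[98, 49], [98, 49]], [[102, 51], [102, 51]]]
reference = mean_image(stack)            # [[100.0, 50.0], [100.0, 50.0]]

# A new image with the same shading pattern and row-dependent content.
image = [[90, 45], [110, 55]]
flat = normalize_brightness(image, reference)
print(flat)  # left and right pixels of each row now have (nearly) equal values
```

After the correction the shading between the two columns is gone, while the difference between the two rows, which reflects image content, is preserved.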
Thereafter, the local variance [28] of the normalized image is computed in order to find objects of interest. By applying an intensity threshold to the variance image, a binary image containing segmented objects as groups of connected white pixels on a black background is created. Structures touching the image border are removed by applying a border-cleaning operator. Morphological dilation using a 3-pixel linear structuring element followed by a hole-filling operation is performed on the remaining objects. Objects smaller than 1500 pixels are also removed. Another morphological dilation (line-shaped, 7 pixels) followed by hole filling is carried out. Objects smaller than 800 pixels in the resulting image are removed.
A morphological erosion (disk-shaped, 3 pixels) followed by removal of objects smaller than 350 pixels and border cleaning creates the final binary image.
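The core of this detection stage (local variance, thresholding, and removal of small objects) can be sketched in plain Python as follows. This is a simplified illustration rather than the authors' MATLAB code: the dilation, hole-filling and border-cleaning steps are omitted, the 3 × 3 window, the 4-connectivity and the threshold values are assumptions, and all function names are hypothetical.

```python
from collections import deque

def local_variance(image, radius=1):
    """Variance of each pixel's neighbourhood (window clipped at the borders)."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [image[rr][cc]
                      for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                      for cc in range(max(0, c - radius), min(cols, c + radius + 1))]
            mean = sum(window) / len(window)
            out[r][c] = sum((v - mean) ** 2 for v in window) / len(window)
    return out

def remove_small_objects(binary, min_area):
    """Keep only 4-connected components with at least min_area pixels."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not seen[r][c]:
                component, queue = [], deque([(r, c)])
                seen[r][c] = True
                while queue:  # breadth-first traversal of one component
                    y, x = queue.popleft()
                    component.append((y, x))
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                if len(component) >= min_area:
                    for y, x in component:
                        out[y][x] = 1
    return out

def detect_objects(image, var_threshold, min_area):
    """Threshold the local-variance image, then discard small components."""
    mask = [[1 if v > var_threshold else 0 for v in row]
            for row in local_variance(image)]
    return remove_small_objects(mask, min_area)

# A flat background has zero local variance, so nothing is detected.
flat_background = [[7] * 4 for _ in range(4)]
print(detect_objects(flat_background, var_threshold=1.0, min_area=1))
```

The local variance responds to textured regions such as cell borders and intracellular content regardless of their absolute brightness, which is presumably why it suits images that mix bright over-focused and dark under-focused cells.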
Afterwards, the centroid of each object in the final binary image is computed and this information is utilized to crop 250 × 250 pixel micrographs, called portraits, from the original ISM image. As the generated portraits may still contain more than one object each, a segmentation using a Sobel operator [28] is performed. To generate portraits containing only one object centered inside the portrait, the following steps are performed: morphological dilation (line-shaped, 7 pixels), followed by hole filling and discarding of objects smaller than 1000 pixels. The area of the objects inside the portrait is computed and the largest object is selected. Afterwards, its centroid is computed, and this information is used to center the selected object within the portrait.
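Centroid-based cropping of a portrait can be sketched as below. The helper names are hypothetical, and shifting the crop window so that it stays inside the image is an assumption, since the text does not say how objects near the image border are cropped. The toy example uses a 4 × 4 window instead of 250 × 250.

```python
def centroid(pixels):
    """Mean (row, col) position of a list of (row, col) object pixels."""
    n = len(pixels)
    return (sum(p[0] for p in pixels) / n, sum(p[1] for p in pixels) / n)

def crop_portrait(image, center, size=250):
    """Crop a size x size window centred on `center`.

    The window is shifted where necessary so that it stays inside the image.
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    top = min(max(int(round(center[0])) - half, 0), max(rows - size, 0))
    left = min(max(int(round(center[1])) - half, 0), max(cols - size, 0))
    return [row[left:left + size] for row in image[top:top + size]]

# Toy 10 x 10 image whose pixel value encodes its position (row * 10 + col).
image = [[r * 10 + c for c in range(10)] for r in range(10)]

# A 3 x 3 object occupying rows 3..5 and columns 3..5.
object_pixels = [(r, c) for r in range(3, 6) for c in range(3, 6)]
center = centroid(object_pixels)              # (4.0, 4.0)
portrait = crop_portrait(image, center, size=4)
print(portrait[0][0], portrait[3][3])         # 22 55
```

Selecting the largest object in a portrait and re-centering it, as described above, amounts to applying the same two functions again to the pixels of that object.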
The second step is to segment individual cells within each single-pseudohypha portrait. For this task, this study utilized a marker-controlled watershed transform [29], one of the most widely used segmentation techniques for separating touching objects. This method considers a grayscale image as a topographic surface where the intensity of each pixel represents the elevation at this point. In this interpretation, "catchment basins" correspond to dark regions surrounded by bright structures. The segmentation process can be visualized by the idea that the basins are "flooded" starting from certain seed pixels or pixel regions which are designated as "markers". Here, we use minima in the cell bodies as markers. The flooding stops at ridge lines where water coming from different basins would meet, separating adjacent catchment basins as the objects of interest. In this way, foreground and background pixels are generated. If the cell borders exhibit enough contrast against the background or against neighboring cells in a hypha, it is natural to exploit them as ridge lines between catchment basins in the watershed algorithm. The foreground objects should have a one-to-one relationship to individual cells, no matter whether these are single cells or cells in hyphae. For this to happen, each cell should get only one marker, i.e., possess one local minimum. Therefore, the cell portraits must be preprocessed in order to eliminate noise and multiple local brightness minima within the cell bodies.
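The flooding idea can be illustrated with a small priority-queue implementation. This is a minimal sketch of marker-based flooding, not the watershed routine used in this study: it assigns every pixel to one of the marker regions without drawing explicit watershed lines, uses 4-connectivity, and breaks ties on a ridge by queue order. The function name and the toy profile are hypothetical.

```python
import heapq

def marker_watershed(elevation, markers):
    """Flood a 2D elevation map from labelled markers (0 means unlabelled).

    Pixels are visited in order of increasing water level, and each unlabelled
    pixel takes the label of the first flooded neighbour that reaches it.
    """
    rows, cols = len(elevation), len(elevation[0])
    labels = [row[:] for row in markers]
    heap, order = [], 0          # `order` keeps heap entries uniquely sortable
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                heapq.heappush(heap, (elevation[r][c], order, r, c))
                order += 1
    while heap:
        level, _, r, c = heapq.heappop(heap)
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] == 0:
                labels[nr][nc] = labels[r][c]
                # Water can only rise: never flood below the current level.
                heapq.heappush(heap, (max(level, elevation[nr][nc]), order, nr, nc))
                order += 1
    return labels

# One image row with two dark basins (cell bodies) separated by a bright
# ridge (cell border) of height 5, and one marker in each basin.
profile = [[1, 0, 1, 5, 1, 0, 1]]
seeds = [[0, 1, 0, 0, 0, 2, 0]]
labels = marker_watershed(profile, seeds)
n_regions = len({v for row in labels for v in row if v})
print(labels, n_regions)
```

With one marker per basin, the two basins end up as two regions that meet at the ridge, which is why the preprocessing aims at exactly one minimum per cell: the number of regions then equals the number of cells.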
For this task, the input portraits are preprocessed as follows: First, high-frequency noise is reduced by averaging over 3 × 3 neighborhoods [28]. Then, the brightness inside portraits is normalized. High-pass filtering is used to enhance the object borders and a final smoothening is applied.
Opening-by-reconstruction [28] is the first procedure aiming at extracting foreground markers from the cell body by homogenizing intracellular contents. However, there are still some remaining inhomogeneities that can generate multiple local minima and thus several markers within a single cell. In order to avoid over-segmentation by multiple markers, these inhomogeneities are flattened by applying an opening-closing by reconstruction [28]. As a result, the intracellular region is further smoothened without altering the overall shape of the object. The last steps before applying the watershed operation are:
1. regular morphological opening (square-shaped, 6 pixels) for further smoothening,
2. inverting the image so that the formerly dark cell borders become bright and can act as ridge lines,
3. border cleaning to get rid of objects adhering to the image borders,
4. h-minima transform [30] in order to obtain only one minimum per cell for a well-defined marker.
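How opening-by-reconstruction homogenizes a region without changing its outline can be seen on a 1D toy profile. The sketch below assumes the usual definition (erosion followed by reconstruction by dilation under the original signal) with a 3-sample window. It is not the authors' MATLAB code, and the function names and the profile are hypothetical.

```python
def erode(signal):
    """1D grayscale erosion with a 3-sample window (clipped at the borders)."""
    return [min(signal[max(0, i - 1): i + 2]) for i in range(len(signal))]

def dilate(signal):
    """1D grayscale dilation with a 3-sample window (clipped at the borders)."""
    return [max(signal[max(0, i - 1): i + 2]) for i in range(len(signal))]

def reconstruct_by_dilation(marker, mask):
    """Grow the marker by repeated dilation, never exceeding the mask."""
    current = marker
    while True:
        updated = [min(d, m) for d, m in zip(dilate(current), mask)]
        if updated == current:
            return current
        current = updated

def opening_by_reconstruction(signal):
    """Erode, then reconstruct under the original signal."""
    return reconstruct_by_dilation(erode(signal), signal)

# Hypothetical profile: an isolated narrow peak (5) and a wide bright region
# (6, 8, 6, 6) that contains a small internal spike.
profile = [0, 5, 0, 0, 6, 8, 6, 6, 0]
print(opening_by_reconstruction(profile))  # [0, 0, 0, 0, 6, 6, 6, 6, 0]
```

The narrow peak is removed and the spike inside the wide region is flattened, while the wide region keeps its full extent. Closing by reconstruction is the dual operation and treats dark details in the same way.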
Finally, the watershed operation is applied and the number of cells forming the object (a single cell or a pseudohypha) in a portrait is determined as the number of watershed regions inside that portrait. Please see Fig. 3 for details.

Performance evaluation
