Latin American applied research
versión ISSN 0327-0793
Lat. Am. appl. res. vol.39 no.3 Bahía Blanca jul. 2009
Automatic marker determination algorithm for watershed segmentation using clustering
† Signal Processing Lab., Electronics Engng. Department, Universidad Nacional de Mar del Plata / CONICET
Abstract Biomedical image processing is a difficult task because of the presence of noise, textured regions, low contrast and high spatial resolution. The objects to be segmented show a great variability in shape, size and intensity whose inaccurate segmentation conditions the ulterior quantification and parameter measurement. The partition of an image in regions that allow the experienced observant to obtain the necessary information can be done using a Mathematical Morphology tool called the Watershed Transform (WT). This transform is able to distinguish extremely complex objects and is easily adaptable to various kinds of images.
The success of the WT depends essentially on the existence of unequivocal markers for each of the objects of interest. The standard methods of marker detection are highly specific, they have a high computational cost and they determine markers in an effective but not automatic way when processing highly textured images. This paper proposes the use of clustering techniques for the automatic detection of markers that allows the application of the WT to biomedical images. The results allow us to conclude that the method proposed is an effective tool for the application of the WT.
Keywords Image Segmentation. Watershed Transform. Mathematical Morphology. Pattern Recognition.
Biomedical images are formed by objects with a high variability in shapes, sizes and intensities (Glasbey and Horgan, 1994). They are highly textured and have a high level of noise, low contrast and great spatial resolution. Watershed Transform (WT) is a powerful morphological tool to segment textured images into regions of interest. This transform is adaptable to different types of images and capable to distinguish extremely complex objects.
The WT is a segmentation method based on regions, which classifies pixels according to their spatial proximity, the gradient of their gray levels and the homogeneity of their textures. To avoid over-segmentation a single marker for each object of interest has to be selected.
Classical Watershed Transform is based on regional minima to initiate the flooding algorithm. That is the reason why this classical method produces oversegmentation in textured images. Unlike the classical W.T. that uses regional minima, the marker-based Watershed Transform uses binary markers to start the flooding algorithm.
The selection of adequate markers on these kinds of images is a painful and sometimes unsuccessful task. Hence, the experienced observer uses to define markers in a semiautomatic way (Gauch, 1999; Jackway, 1996; Wang, 1997). The automatic determination of markers is still a difficult goal to achieve. The current determination algorithms are highly dependent on the structure to be segmented (Gorsevski et al., 2003; Li and Harnarneh, 2007). Moreover, they have a high computational cost and they determine markers in an effective but not automatic way when processing highly textured images (Gonzalez and Ballarin, 2006a; Gonzalez and Ballarin, 2006b).
In this work, we present a two-step watershed algorithm for biomedical image segmentation. The main contribution of this work is that the markers are selected by a clustering technique applied to the oversegmented watershed regions.
First we applied the classical WT using the regional minima as markers. Then we measure texture features inside each region. We applied a cluster algorithm to assign to each region one class. WT markers are selected as the cluster that represents the objects of interest. Finally a marker-based WT was applied to the gradient of the original image using the markers computed previously. The results showed an effective and robust segmentation of highly textured images.
A. Watershed Transform
A gray scale image can be interpreted as the topographic image of landscape (gray intensities of higher amplitude correspond to plains and mountains and the lower intensity ones correspond to valleys and rivers) (Clasteman, 1979; Glasbey and Horgan, 1994; Gonzalez and Woods, 1996). Using the features of these images, the technique of digital image processing called Watershed Transform (WT), which through the flooding of the valleys, is capable of recognizing similar topographical areas, surrounded by mountain ridges. The WT is a segmentation method based on regions, which classifies pixels according to their spatial proximity, the gradient of their gray levels and the homogeneity of their textures (Beucher and Meyer, 1993; Couprie and Bertrant, 1997; Vincent and Soille, 1991).
With the objective of segmenting a gray level image, prior to the application of the WT, a gradient image must be obtained, where the levels of the contours of the objects to be segmented represent an area of elevated gray intensity. The areas of low intensity give way to the basins where the water would flow and flood the topography of the image. The elevations in gray levels generated by the contours would remain and give way to the segmentation of the image through the resulting watershed lines (Beucher and Meyer, 1993; Vincent and Soille, 1991). Mathematical morphology allows us to obtain a gradient which is highly adaptable to different kinds of images with a higher precision than conventional algorithms. In this paper we used the morphological gradient to obtain the intermediate image before applying the WT (Dougherty and Astola, 1993; Facon, 1996).
The classic WT floods the gradient image from its regional minima. In non homogeneous or noise embedded images there is not a one to one relation between regional minima and objects of interest. This results in an over segmentation in the majority of images, in other words, after WT each of the objects is represented by more than one region (Beucher and Meyer, 1993; Vincent and Soille, 1991; Gauch, 1999; Jackway, 1996; Stanislav and Straber, 2000). To avoid this over segmentation a single marker for each object of interest is usually defined. These markers or seeds initiate the flooding algorithms indicating the sector that gives rise to the basins. The success of the WT depends mainly upon the characteristics of the markers.
A method for segmentation and marker determination is to cluster pixels that have similar texture characteristics (Gonzalez and Woods, 1996; Gorsevski et al., 2003). The texture is determined through the use of a window that goes through pixels analyzing the intensity distribution of their neighborhoods. We use a similar technique for grouping regions instead of pixels, using the texture inside the region, instead of the texture inside the window.
The K-means algorithm is a classical cluster technique (Mc Queen 1967; Gonzalez and Woods, 1996; Li and Harnarneh, 2007). This unsupervised algorithm requires the specification of the number of classes in which the data set is going to be partitioned. To each class is assigned a cluster center so that the distance of each pattern to its center is minimal. The partition is done by measuring, in an iterative process, the distance between each pattern to its cluster center, and computing again the centers until there is no change (Beucher and Meyer, 1993; Vincent and Soille, 1991).
The successful application of the classical WT is mostly marker dependent (Rodriguez and Pacheco, 2005; Rodriguez and Alarcon, 2007). The selection of adequate markers in textured images is a very difficult and sometimes an ineffective task. The following algorithm improves markers definition.
B. Proposed Algorithm
We propose a two-step watershed algorithm to avoid the oversegmentation of classical watershed (using minima regional markers) where the markers are selected by clustering applied to the oversegmented watershed regions.
There are different authors that propose to group these oversegmented regions applying region merging techniques in order to obtain the final segmented image (Hernandez and Barner, 2006). However we propose to use these oversegmented regions to obtained new markers for a marked-based WT. The fundamental reasons for the development of the algorithm are to reduce the sensitivity of the algorithm to the noise and the presence of irrelevant objects and also to increase its robustness.
The oversegmented regions were processed to assign features to each one of them in order to distinguish the objects of interest based on their texture. The features used were mean and standard deviation evaluated inside each regions. We applied the classical K-means algorithm to cluster the regions and a morphological operator to obtain the internal markers.
The markers of the background or external markers were obtained by morphological erosion of the internal markers complements. This procedure resulted in adequate segmentations when the internal markers and the objects of interest are of similar size.
The gradient of the biopsies does not have a visible contrast. That makes the application of the flooding algorithms difficult. Consequently, obtaining internal and external markers of a considerable size, the results were improved.
The main steps of the algorithm finally implemented are the following:
Step 1: First we applied the classical WT using the regional minima as markers. The result is a labeled image where each pixel indicates the regional minima to which it belongs.
Step 2: The regions resulting from the WT were processed to assign features vector to each one of them. The reason of this is to distinguish the objects of interest based on their texture. The features used were mean and standard deviation inside each region. The features were used in the next step.
Step 3: The feature vectors were grouped by similarity using the k-means algorithm. We used four classes due to the existence of four components of the biopsies (traveculae, fat, blood cells, and intracellular space). Each region was assigned to one of the four classes.
Step 4: Pixels in the regions belonging to one of the clusters (traveculae) were used to define the markers for the next step via a binary image.
Step 5: The final internal markers were obtained applying, to the previous binary image, morphological opening and closing operators with a 4x4 disk structuring element. (Dougherty and Astola, 1993; Facon, 1996). These operators join adjacent regions (assumed to correspond to the same object). We applied an erosion to eliminate the regions that belong to the possible borders of the objects. This way we obtained the image to be used as the internal marker for a new application of the WT.
Step 6: In order to obtain the background markers (external markers) we applied morphological erosion to the complement of the internal markers.
Step 7: Finally we applied the marker-based watershed transform to the gradient of the original image using the internal and external markers computed in the previous steps.
For this work, 56 bone marrow biopsies were used in order to evaluate the proposed algorithms. In these images we need to segment the trabeculae in order to make a diagnosis. Even though the image is formed by different biological structures, the trabeculae are the ones that are most difficult to segment. Thus, they were used to evaluate the algorithms. The images were acquired through a microscope where light, resolution and contrast were established by the lab technician to obtain the best visualization possible. Although the original biopsies are in color, we turned them into gray levels to facilitate their processing. Due to the great variability of theses conditions it is impossible to define markers automatically with the methods developed so far.
All the algorithms were implemented in Matlab® R14. We worked with the standard functions of this language and a specific library SDC Morphology Toolbox (SDC, 2001) with functions of mathematical morphology.
Figure 1 shows the partial results of the application of the proposed algorithm. Figure 1-a) shows two bone marrow biopsies. Figure 1-b) shows the result of the clustering of the regions obtained by WT in four classes (trabreculae, fat, blood cells, and intracellular space). This clustering is obtained from the over-segmentation produced by the direct application of the WT using mean and the standard deviation of the intensities of each region. Fig. 1-c) shows an image that contains only the class that distinguishes the objects of interest (the trabeculae).
Fig. 1. Intermediate stages in the application of the algorithm. a) two bone marrow biopsies, b) classification in four classes (trabreculae, fat, blood cells, and intracellular space), c) only the class that distinguishes the objects of interest (the trabeculae), d) internal markers for the WT, e) internal and external markers superimposed with the gradient obtained f) the resulting segmentation.
To obtain the object markers, or internal markers, morphological opening and closing operators are applied. As it can be seen in Fig 1-d the adjacent regions that correspond to the same object were merged and the isolated pixels that do not belong to the wanted objects were eliminated. Then the result is eroded to eliminate the regions that belong to the possible borders of the objects. As a result the internal markers for the WT are obtained. This last result is observed in Fig. 1-d).
The background markers or external markers are obtained eroding the complement to the internal markers. In Fig. 1-e ) internal and external markers superimposed with the gradient are shown.
Finally, the WT is again calculated with the internal and external markers previously determined and the morphological gradient of the original image. Figure 1-f) shows the resulting segmentation.
In Fig. 2 we can observe another two successful segmentations for the trabeculae applying the proposed algorithm. It was possible to analyze biopsies of different patients that could not be segmented with other marker detection algorithms.
Fig 2. Resulting segmentation for other biopsies sample images.
The quality of the resulting segmentation is evaluated by the index µ (Pastore et al., 2006). This index combines a measurement of spatial similarity through the Hausdorff metrics and the difference in the contour areas based on the symmetric difference between sets. The values for this index vary from 0 to 1. A value close to 1 means a great difference between the figure segmented by the WT and the segmentation produced by an experienced observer. A value close to 0 means figures that differ in a small number of pixels. Figure 3 shows the variation of the μ index measure in the whole set of test images. The mean value was 0.065 and the standard deviation was 0.198.
Fig. 3. µ index evaluation for 56 biopsies
In Table 1 we compare the index calculated using the markers' binary image as a segmentation result (step 5) and the segmentation obtained applying the marker-based watershed transform (step 7). In Fig. 1-d we can observe that the markers do not identify traveculae correctly. Using the proposed algorithm, the value of µ index for biopsies segmentation goes from 0.04 to 0.13. In contrast, the value of the µ index for the segmentation obtained using only the markers goes from 0.08 to 0.18 (see Table 1). For all analysed biopsies it can be confirmed by the index values that it is necessary to apply the second WT to obtain a more precise segmentation.
Table 1: Comparison between segmentation using only the markers and the using the second-step WT from the markers.
The analysis of the results shows that the segmentation done with the proposed algorithm is precise, simple and robust. The algorithm adapts automatically to the different characteristics of the biopsies without having to change the process of biopsies for different patients.
The proposed algorithm is an effective tool to obtain suitable internal and external markers for the marker-based watershed transform. It is possible to apply it on highly textured images, like biomedical images, whose process is complex and cannot be done through conventional methods.
The use of the classical WT reduces the noise in the resulting images and therefore it allows to define unique, homogenous and greater markers unlike the other existing methods.
It also reduces the computational cost of the algorithm. Biomedical images have a high spatial resolution and the computational cost of the markers detection is added to the cost of the gradient computation and WT. This is why the computational cost of the algorithm has to be taken into account during its development.
1. Beucher, S. and F. Meyer, "The morphological approach to segmentation: The Watershed Transformation," Mathematical Morphology and its application to image processing. Optical engineering, 433-482 (1993). [ Links ]
2. Castleman, K.R., Digital Image Processing, Prentice Hall (1979). [ Links ]
3. Couprie, M. and G. Bertrant, "Topological grayscale Watershed Transform," SPIE Vision Geometry Proceedings, 3168, 136-146 (1997). [ Links ]
4. Dougherty, E.R. and J. Astola, An introduction to nonlinear image processing, Ed. New York: Marcel Decker, 433-481 (1993). [ Links ]
5. Facon, J., Morfología Matemática. Teoría y ejemplos, Curitiba Brasil, CITS (1996). [ Links ]
6. Gauch, J.M., "Image segmentation and analysis via multiescale gradient watershed hierarchies," IEEE Trans. on Image Processing, 8, 69-79 (1999). [ Links ]
7. Glasbey, C.A. and G.W. Horgan, Image analysis for the biological science, Statistics in Practice, Series Editor Vic Barnett., John Wiley and Sons (1994). [ Links ]
8. Gonzalez, M.A. and V.L. Ballarin, "Determinación automática de marcadores para la Transformada Watershed mediante técnicas de clasificación de patrones," 9vo Simposio Argentino de Informática y Salud (SIS), Universidad Nacional de Cuyo (2006a). [ Links ]
9. Gonzalez, M.A. and V.L. Ballarin, "Segmentación de imágenes mediante la Transformada Watershed: Obtención de marcadores mediante Morfología Matemática", 9vo Simposio Argentino de Informática y Salud(SIS), Universidad Nacional de Cuyo (2006b). [ Links ]
10. González R. and R. Woods., Tratamiento Digital de imágenes, Addison Wesley (1996). [ Links ]
11. Gorsevski, P.V., P.E. Gessler and P. Jankowski, "Integrating a fuzzy k-means classification and bayesian approach for spatial prediction of landslide hazard," J. of Geographical Systems, 5, 5930-5949, (2003). [ Links ]
12. Hernandez, S.E and K.E. Barner, "Joint region merging criteria for Watershed-Based image segmentation," IEEE Trans on Image Processing, (2000) [ Links ]
13. Jackway, P., "Gradient Watersheds in morphological scale-space," IEEE trans. Image Processing, Machine Intell., 5, 913-921 (1996). [ Links ]
14. Li, X. and G. Harnarneh, "Modeling prior shape and appearance knowledge in watershed segmentation," Image Vision Comput (2007). [ Links ]
15. Mc Queen, J.B., "Some Methods for Classification and Analysis of Multivariate Observation," Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley University of California Press, 281-297 (1967). [ Links ]
16. Pastore, J.I., E. Moler and V.L. Ballarin, "Comparison of segmentation techniques in medical images by means of non-euclidean distances," Digital Signal Processing, under revision (2006). [ Links ]
17. Rodríguez, R. and O. Pacheco, "A Strategy for Atherosclerosis Image Segmentation by using Robust Markers," Journal Intelligent & Robotic System, 50, 121-140 (2007). [ Links ]
18. Rodríguez, R., T.E. Alarcón and O. Pacheco, "A New Strategy to Obtain Robust Markers for Blood Vessels Segmentation by using the Watersheds Method," Journal of Computers in Biology and Medicine, 35, 665-686 (2005). [ Links ]
19. SDC, Morphology Toolbox for MATLAB: User's Guide. SDC Information Systems (2001). [ Links ]
20. Stanislav, L.S. and W. Straber, "Extracting regions of interest appying a local watershed transformation," Proc. of IEEE Visualization (2000). [ Links ]
21. Vincent, L. and P. Soille, "Watersheds in digital spaces: An efficient algorithm based on immersion simulations," IEEE trans. Pattern Anal, Machine Intell., 13, 583-598 (1991). [ Links ]
22. Wang, D., "A multiscale gradient algorithm for Image Segmentation using Watersheds", Pattern Reconition, 30, 2043-2052 (1997). [ Links ]
Received: October 18, 2007.
Accepted: October 23, 2008.
Recommended by Guest Editors D. Alonso, J. Figueroa, E. Paolini and J. Solsona.