Cluster Validity Techniques in Medical Imaging

From MeliWiki
Jump to: navigation, search

<Contributed to MeLi Wiki by Professor Anke Meyer-Baese, Department of Scientific Computing, Florida State University>


Assessment of Clustering

For most clustering techniques the number of clusters is considered to be known. This assumption holds only in most cases where the data is slowly time-varying or is very small. In all other cases when we deal with an unknown set of data, such as it is the case with medical imaging data, this becomes a very unnatural hypothesis. Thus, it becomes evident that two fundamental questions are representative for any clustering problem: (a) how many clusters are found in the data and (b) how accurate becomes the clustering itself. The goal is to determine measures for quantitative evaluation of the results of a clustering algorithm. This is accomplished by cluster validity. This topic has been a broad research area in pattern recognition [1] [2]. A comparison of several validity indices for data being represented by non-overlapping distinct clusters and classified by hierarchical clustering algorithms is given in [3]. Determining the optimal number of clusters represents one of the most crucial classification problems. This task is known as cluster validity.

The chosen validity function enables the validation of an accurate structural representation of the partition obtained by a clustering method. While a visual visualization of the validity is relatively simple for two-dimensional data, in case of multidimensional data sets this becomes very tedious. In this sense, the main objective of cluster validity is to determine the optimal number of clusters that provide the best characterization of a given multi-dimensional data set. An incorrect assignment of values to the parameter of a clustering algorithm results in a data partitioning scheme that is not optimal and thus leading to wrong decisions.


Methods in Biomedical Image Analysis

In the last decades, several advanced noninvasive medical imaging techniques such as magnetic resonance imaging (MRI) and functional MRI (fMRI) have been introduced into clinical practice. These new techniques are not only limited to the mere imaging of morphological structures but also emphasize the biological function. As a consequence, the analysis and visualization of medical image time series data poses a new challenge to both basic research and medical application.

Functional and dynamic MRI time series experiments produce a wealth of data and subsequently information packed into two or three spatial dimensions as a function of time. The fMRI and MRI data are both multivariate by nature since each volume contains information about either brain activation or tissue response at thousands of measured voxels. Multivariate techniques have been successfully and frequently applied to fMRI and MRI studies. They are mostly used to detect brain or tissue response to stimuli, to reduce noise artifacts, to capture the variability of the data in a precise way, and to determine a model to be applied on the data. Another important aspect that can be analyzed by multivariate techniques is both the mapping of activities in the brain to cognitive states and also of enhancement patterns to anatomical, physiological and pathological states. These mappings can be treated as a pattern recognition problem. Thus, algorithms have to be designed that learn and later classify or predict spatial patterns representing multivariate fMRI or MRI experimental data. Such experiments are able to localize brain or tissue response to several stimuli types ranging from pharmacological to artificial. These data are abundant in signal information but poorly characterized in terms of signal and noise structure.

Techniques for the analysis of both functional or dynamic MRI time series are either model-free or model-based. The model-based, supervised data analysis methods rely on knowledge of experimental conditions and model assumptions, and combine problem-specific physiological parameters to a mathematical model of the temporal MRI signal. However, frequently the spatiotemporal behavior of the cerebral or tissue response is not known a priori or is variable and poses a challenge to an evaluation based on model-based methods alone. Therefore, model-based analysis methods impose some limitations on data analysis under complicated experimental conditions. Model-free, on the other hand, are not based on explicit signal models nor on the a priori knowledge of the underlying physiological process. Thus, different signal classes have not to be mathematically described and labeled training data are in general acquired during standard evaluation procedures. Therefore, analysis methods that do not rely on any assumed model are considered more powerful and relevant.

On the other hand, time series clustering represents an emerging technique for temporal data mining research by providing important information in several domains. The core problem addressed by this technique is to determine out of a group of unlabeled time series clusters or groups of similar time series based on a clustering algorithm of choice [4].

Thus, the choice of the right cluster validity index is of crucial importance in biomedical image processing. The most established techniques are: Kim's index, Calinski Harabasz (CH) index, and the intraclass index.

These indices were successfully applied before in biomedical time--series analysis [5].

An overview about these methods is given below:


1. Calinski Harabasz index [6]: this index is computed for <math>m</math> data points and <math>K</math> clusters as

<math>CH=\frac{[trace \mathbf{B}/(K-1)]}{[trace \mathbf{M}/(m-K)]}</math>

where <math>\mathbf{B}</math> and <math>\mathbf{W}</math> represent the between and within cluster scatter matrices. The maximum hierarchy level is used to indicate the correct number of partitions in the data.


2. Intraclass index [5]: This index is given as

<math>I_W=\frac{1}{n}\sum_{k=1}^{N}\sum_{i=1}^{n_k}\left \| x_i-w_k \right \|^2</math>

where <math>n_k</math> is the number of points in cluster <math>k</math> and <math>w_k</math> is a prototype associated with the <math>k</math>-th cluster. <math>I_W</math> is computed for different cluster numbers. The maximum value of the second derivative of <math>I_W</math> as a function of cluster number is taken as an estimate for the optimal partition. This index provides a possible way of assessing the quality of a partition of <math>K</math> clusters.


3. Kim's index [7]: This index equals the sum of the over--partition <math>v_o(K,X,W)</math> and under--partition <math>v_u(K,X,W)</math> function measure

<math>I_{Kim}=\frac{v_u(K)-v_{umin}}{v_{umax}-v_{umin}}+\frac{v_o(K)-v_{omin}}{v_{omax}-v_{omin}}</math>

where <math>v_u(K)</math> is the under-partitioned average over the cluster number of the mean intra-cluster distance and measures the structural compactness of each class, <math>v_{umin}</math> is its minimum while <math>v_{umax}</math> the maximum value. <math>v_u(K,X,W)</math> is given by the average of the mean intra-cluster distance over the cluster number <math>K</math> and measures the structural compactness of each and every class. <math>v_o(K,X,W)</math> is given by the ratio between the cluster number <math>K</math> and the minimum distance between cluster centers, describing inter-cluster separation.

<math>X</math> is the matrix of the data points and <math>W</math> is the matrix of the prototype vectors. Similarly, <math>v_o(K)</math> is the over-partitioned measure defined as the ratio between the cluster number and the minimum distance between cluster centers measuring the inter-cluster separation. <math>v_{omin}</math> is its minimum while <math>v_{omax}</math> the maximum value. The goal is to find the optimal cluster number with the smallest value of <math>I_{Kim}</math> for a cluster number <math>K=2</math> to <math>Kmax</math>.


Experimental Results

Cerebrovascular stroke is the third leading cause of mortality in industrial countries after cardiovascular disease and malignant tumors. Therefore, the analysis of cerebral circulation, has become an issue of enormous clinical importance.

Measurement of tissue perfusion yields important information about organ viability and function. Dynamic susceptibility contrast MR imaging, also known as contrast agent bolus tracking represents a non-invasive method for cerebrovascular perfusion analysis. In contrast to other methods to determine cerebral circulation such as iodinated contrast media in combination with dynamic X-ray computed tomography (CT) and the administration of radioactive tracers for positron emission tomography (PET) blood flow quantification studies, it allows high spatial and temporal resolution and avoids the disadvantage of patient exposure to ionizing radiation. Below is shown the diffusion weighted MR image of a subject with cerebral infarct [8].


Error creating thumbnail: File missing


The main objective of dynamic contrast--enhanced MRI is to separate abnormal from normal contrast enhancement reflecting disturbed hemodynamics. Perfusion-weighted imaging is based on the analysis of signal changes during the first pass of circulation of a rapid bolus of contrast agent. The most popular analysis is based on a gamma-variate function, a nonlinear model which has the typical problems of fitting nonlinear nonlinear models to noisy data. To overcome the above mentioned difficulties, clustering of voxels by their temporal characteristics based on unsupervised techniques has proven to be an adequate method [8]. The main drawback comes from the association difficulty between the voxel time series and parametric model where model misspecifications subsequently can lead to the fact that really active voxels are lost.

Dynamic susceptibility contrast-enhanced perfusion weighted MRI was performed on a 1.5 T system (Magnetom Vision, Siemens, Erlangen, Germany) using a standard circularly polarized head coil for radio frequency transmission and detection. MR imaging allows assessment of regional cerebral blood flow (rCBF), regional cerebral blood volume (rCBV), and mean transit time (MTT), for definitions see e.g. [9].

In clinical praxis, the computation of rCBV, rCBF, and MTT values from the MRI signal dynamics has been demonstrated to be relevant, even if its underlying theoretical basis may be weak under pathological conditions. The analysis of perfusion MRI data by unsupervised clustering methods provides the advantage that it does not imply speculative presumptive knowledge on contrast agent dilution models, but strictly focuses on the observed complete MRI signal time-series.

Clustering results based on an unsupervised clustering algorithm (minimal free energy VQ) for a 38 scan dynamic contrast-enhanced MRI perfusion study in a patient with a subacute stroke affecting the right basal ganglia are presented in the following. Clustering these pixel time courses (PTC)s identifies groups of pixels with similar signal dynamics. Figure below shows the cluster assignment maps overlaid onto an EPI scan of the perfusion sequence in a patient with stroke in the supply region of the left middle cerebral artery. In these maps, all the pixels are highlighted that belong to a specific cluster.


Figure 2 Cluster Validity.jpg


The following Figure shows the prototypical cluster-specific concentration-time curves (CTCs) belonging to the pixel clusters of the preceding Figure. Cluster numbers correspond to the former Figure. MTT values are indicated as multiples of the scan interval (1.5s), rCBV values are normalized with respect to the maximal value (cluster #2).


Figure 3 Cluster Validity.jpg


The region of the cerebrovascular insult in the right basal ganglia is clearly depicted by the clusters #1, #2, #3 and #4. It can be seen the small corresponding CTC amplitude, i.e. the small cluster-specific rCBV, rCBF as well as the large MTT. Cluster #14 contains peripheral regions. Clusters #5, #6, and #7 may be attributed to large, intermediate and small parenchymal vessels of the non-affected left side represented by subsequently increasing MTT, decreasing rCBF, rCBV, and smaller recirculation peaks.


The effectiveness of the different cluster validity indices and clustering methods in automatically evolving the appropriate number of clusters is demonstrated experimentally in form of cluster assignment maps for the above perfusion MRI data set with the number of clusters varying from two to thirty-six.

The Table below shows the optimal cluster number <math>K^*</math> obtained for the perfusion MRI data set based on the different cluster validity indices {MLWH07].


Index Number of Clusters
<math>K^*_{Kim}</math> 18
<math>K^*_{CH}</math> 24
<math>K^*_{Interclass}</math> 3


The results show that based on the indices <math>K^*_{Kim}</math> and <math>K^*_{Interclass}</math> a larger number of clusters is needed to represent the data set.

As an example, the Figure below shows the cluster assignment maps for different cluster numbers corresponding to the optimal cluster number <math>K^*</math> and <math>K= K^*\pm 1</math>. The cluster assignment maps correspond to the cluster-specific concentration-time curves exhibiting the minimum regional cerebral blood volume (rCBV) [8].


Figure 4 Cluster Validity.jpg


Project

Construct simplified models of data as obtained by perfusion MRI experiments. Standard normally distributed numbers should be concatenated in order to simulate the baseline level during a control condition. Activations during the cerebral infarct conditions are simulated by several basic patterns according to the functions <math>y=a</math> (box-car pattern), <math>y=b+cx</math>, and <math>y=dx\sin x</math>, where <math>a</math>, <math>b</math>, <math>c</math> and <math>d</math> are constants. These functions are selected to produce different shapes of signals. The basic patterns are superimposed on the noise signals with regular spacing in order to complete the simulated perfusion MRI signals.

Choose a simple unsupervised clustering algorithm (Kohonen map) and determine the optimal number of clusters according to the three cluster validity indices. Plot and visualize the partition results.


Wiki Assessment

  1. What is cluster validity and why does it become an important issue?
  2. What are the differences between time-series classification and traditional stationary features?
  3. Why are unsupervised methods in general more suitable than supervised?
  4. What is model-free versus model-based classification?
  5. On which criteria are the presented cluster validity indices operating?
  6. When does it make sense to apply empirically methods as cluster validity measures for bioimaging signals?


References

  1. R. Duda and P. Hart. “Pattern Classification and Scene Analysis”, John Wiley, New York, 1973.
  2. S. Theodoridis and K. Koutroumbas. “Pattern Recognition”, Academic Press, San Diego, 1998.
  3. W. Milligan and M. C. Cooper. “An examination of procedures for determining the number of clusters in a data set”, Psychometrika, 50:159-179, 1985.
  4. A. Meyer-Baese, A. Saalbach, O. Lange and A. Wismuller. “Unsupervised clustering of fMRI and MRI time series”, Biomedical Signal Processing and Control, 2:295-310, 2007.
  5. 5.0 5.1 C. Goutte, P. Toft, E. Rostrup, F. Nielsen, and L.K. Hansen, “On clustering fmri series”, Neuroimage, 9:298-310, 1999.
  6. R. B. Calinski and J. Harabasz “A dendrite method for cluster analysis”, Psychometrika, 3:1-27, 1974.
  7. D. Kim, Y. Park and D. Park, “A novel validity index for determination of the optimal number of clusters,” IEICE Transactions on Inf. and Syst., E84-D(2):281-285, 2001.
  8. 8.0 8.1 8.2 A. Meyer-Baese, O. Lange, A. Wismueller and, M. Hurdal, “Analysis of dynamic susceptibility contrast MRI time series based on unsupervised clustering methods,” IEEE Transactions on Information Technology in Medicine, 11:563-573, 2007
  9. K.A Rempp, G. Brix, F. Wenz, C.R. Becker, F. Gueckel and W.J. Lorenz, Quantification of regional cerebral blood flow and volume with dynamic susceptibility contrast-enhanced {MR} imaging, “Radiology”, 193:637-641, 1994.