Classification

Classification, or cluster analysis, groups similar objects into distinct clusters (usually called classes in the EM field) while minimizing the variance (of a given parameter) within each cluster/class. In image processing, classification is used to find similar 2D images within a dataset, as well as similar 3D structures within a set of structures.

A prerequisite for a feasible classification is reducing the complexity of the input dataset. This can be achieved with a Principal Component Analysis (PCA), which can be carried out either internally by the classification logic or beforehand by the separate PCA logic. In addition, the user has to define the input set and the number of expected classes/clusters. The logic then splits the dataset, based on the information provided by the PCA, into the specified number of classes/clusters, aiming for i) minimal variance within each class/cluster and ii) a maximized signal-to-noise ratio.
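The two-stage idea described above (reduce complexity with a PCA, then split the reduced data into a fixed number of classes) can be illustrated with a minimal NumPy sketch. This is not the implementation used by the classification logic: the clustering algorithm of the logic is not specified here, so plain k-means is swapped in as a stand-in, and the function names `pca_reduce` and `kmeans` as well as the synthetic test images are assumptions made for illustration only.

```python
import numpy as np

def pca_reduce(images, n_components):
    """Project flattened images onto their first n_components principal axes."""
    data = images.reshape(len(images), -1).astype(float)
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal axes; reshaped to image size they
    # correspond to eigen images.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    eigen_images = vt[:n_components]
    return centered @ eigen_images.T, eigen_images

def kmeans(coords, n_classes, n_iter=50, seed=0):
    """Plain k-means, used here only as a stand-in clustering step."""
    rng = np.random.default_rng(seed)
    centers = coords[rng.choice(len(coords), n_classes, replace=False)]
    for _ in range(n_iter):
        # Distance of every point to every center, then nearest-center labels.
        dists = np.linalg.norm(coords[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_classes):
            members = coords[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)
    return labels

# Synthetic input stack: 20 noisy copies each of two different patterns.
rng = np.random.default_rng(1)
pattern_a = np.zeros((8, 8))
pattern_a[:, :4] = 1.0
pattern_b = np.zeros((8, 8))
pattern_b[:4, :] = 1.0
images = np.stack([p + 0.1 * rng.standard_normal((8, 8))
                   for p in [pattern_a] * 20 + [pattern_b] * 20])

coords, eigen_images = pca_reduce(images, n_components=2)
class_ids = kmeans(coords, n_classes=2)
print(class_ids)
```

Clustering the low-dimensional coordinates instead of the raw pixels is what makes the split tractable: each image is described by a handful of numbers (one per eigen image) rather than by every pixel.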

This classification mode uses an internal/external PCA to reduce the dataset's complexity before splitting it into a defined number of classes.

Parameters | Description
Eigen images location | Defines whether the Eigen images used for complexity reduction are generated on the fly internally (intern) or provided as input from an external source (extern)
→ Number of eigen images | How many Eigen images (and therefore dimensions) are used as components in the linear combination
Split up method | Determines whether large classes are split into smaller classes i) to obtain classes containing a similar number of images/volumes (Cluster size) or ii) to minimize the internal variance within each class, as measured by the cross-correlation coefficients (cccVariance)
Number of classes | Number of resulting classes/clusters
Remove duplicated images | Duplicate images identified by the classification are removed

Input | Description
Input | Stack of input images
Pre Eigen Images (only available if Eigen images location = extern, i.e. Eigen images precomputed with the Principal Component Analysis (PCA) logic) | Stack of Eigen images, with the sum of all images and a mask as the last two images

Output | Description
Output | Stack of all images with added/altered classID header information
Sums | Stack of one image per class/cluster, representing the average of all images within that class/cluster

Output | Description
ClassID | ID of the class/cluster the image belongs to
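
The difference between the two Split up method options can be made concrete with a small sketch. How the logic actually selects and divides classes is not documented here, so the following only illustrates the two criteria: the helper `class_to_split`, the `ccc` function, and the choice of measuring each member against its class average are all assumptions, not the logic's own code.

```python
import numpy as np

def ccc(a, b):
    """Normalized cross-correlation coefficient of two equally sized images."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def class_to_split(images, class_ids, method):
    """Return the ID of the class a split step would pick (hypothetical helper)."""
    scores = {}
    for cid in np.unique(class_ids):
        members = images[class_ids == cid]
        if method == "Cluster size":
            # Criterion i): the class with the most members is split first.
            scores[int(cid)] = len(members)
        elif method == "cccVariance":
            # Criterion ii): the class whose members agree least uniformly
            # with their own average is split first (assumed measure).
            average = members.mean(axis=0)
            scores[int(cid)] = np.var([ccc(m, average) for m in members])
        else:
            raise ValueError(f"unknown split up method: {method}")
    return max(scores, key=scores.get)

rng = np.random.default_rng(0)
pattern_a = np.zeros((8, 8))
pattern_a[:, :4] = 1.0
pattern_b = np.zeros((8, 8))
pattern_b[:4, :] = 1.0

def noisy(pattern, n):
    return pattern + 0.05 * rng.standard_normal((n, 8, 8))

# Class 0: large and homogeneous. Class 1: small but mixing two patterns.
images = np.concatenate([noisy(pattern_a, 30), noisy(pattern_a, 7), noisy(pattern_b, 3)])
class_ids = np.array([0] * 30 + [1] * 10)

by_size = class_to_split(images, class_ids, "Cluster size")
by_ccc = class_to_split(images, class_ids, "cccVariance")
print(by_size, by_ccc)
```

In this toy dataset the size criterion targets the large homogeneous class, while the variance criterion targets the small mixed class, which shows why the two options can lead to different class layouts for the same input.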

Once clusters/classes of images have been found, the images within each class can be averaged (see SumByClassNumber) to improve the signal-to-noise ratio substantially. 1)
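
Why averaging within a class helps can be shown with a short numerical sketch. It assumes that all images of a class share the same underlying signal and carry independent noise; under that assumption the noise in the average drops roughly with the square root of the number of images. The function `class_sums` is a hypothetical helper written for this example and is not the SumByClassNumber logic itself.

```python
import numpy as np

def class_sums(images, class_ids):
    """Average all images sharing a class ID; returns {class_id: average image}."""
    return {int(cid): images[class_ids == cid].mean(axis=0)
            for cid in np.unique(class_ids)}

rng = np.random.default_rng(0)

# One noise-free signal, observed 100 times with independent unit-variance noise.
signal = np.zeros((16, 16))
signal[4:12, 4:12] = 1.0
n_images = 100
images = signal + rng.standard_normal((n_images, 16, 16))
class_ids = np.zeros(n_images, dtype=int)  # all images belong to class 0

average = class_sums(images, class_ids)[0]

# Residual noise before and after averaging.
noise_single = (images[0] - signal).std()
noise_average = (average - signal).std()
ratio = noise_single / noise_average
print(ratio)  # expected to lie near sqrt(n_images) for independent noise
```

The same reasoning explains why misclassified images are harmful: they add a different signal rather than independent noise, so they blur the class average instead of sharpening it.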


1) van Heel, M. (1984). Multivariate statistical classification of noisy images (randomly oriented biological macromolecules). Ultramicroscopy, 13(1-2), 165–183. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/6382731