Classification

Classification, or cluster analysis, groups similar objects into distinct clusters (usually called classes in the EM field) while minimizing the variance of a chosen parameter within each cluster/class. In image processing, classification is used to find similar 2D images within a dataset, and also similar 3D structures within a set of structures.

A prerequisite for a feasible classification is a reduction of the complexity of the input dataset. This can be achieved with a Principal Component Analysis (PCA), which can be carried out either internally by the classification logic or beforehand by the separate PCA logic. In addition, the user has to define the input set and the number of expected classes/clusters. The logic then splits the dataset, according to the information provided by the PCA, into the requested number of classes/clusters, aiming for i) minimal variance within each class/cluster and ii) a maximized signal-to-noise ratio.
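The two steps described above (complexity reduction, then splitting into a fixed number of classes) can be illustrated with a minimal NumPy sketch. It is not the implementation used by the classification logic: the function names `pca_project` and `kmeans` are hypothetical, and plain k-means is used here only as a simple stand-in, since this page does not specify the clustering algorithm.

```python
import numpy as np

def pca_project(images, n_components):
    """Project images onto their first n_components principal axes.

    Returns the low-dimensional coordinates and the principal axes
    (the "eigen images", flattened to one row each).
    """
    data = images.reshape(len(images), -1).astype(float)
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    eigen_images = vt[:n_components]
    return centered @ eigen_images.T, eigen_images

def kmeans(coords, n_classes, n_iter=50):
    """Assign each row of coords to one of n_classes clusters (assumed stand-in)."""
    # Deterministic farthest-point initialisation: start with the first point,
    # then repeatedly add the point farthest from all centers chosen so far.
    centers = [coords[0]]
    for _ in range(1, n_classes):
        d = np.min([np.linalg.norm(coords - c, axis=1) for c in centers], axis=0)
        centers.append(coords[d.argmax()])
    centers = np.array(centers, dtype=float)

    for _ in range(n_iter):
        # Distance of every point to every center, then nearest-center labels.
        dists = np.linalg.norm(coords[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its members (skip empty classes).
        for k in range(n_classes):
            members = coords[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)
    return labels
```

In this sketch, `n_components` plays the role of the number of eigen images and `n_classes` the role of the number of classes; the images are clustered in the reduced coordinate space rather than pixel by pixel.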

This classification mode uses an internal or external PCA to reduce the dataset's complexity before splitting it into a defined number of classes.

Parameters	Description
Eigen images location	Define whether the Eigen images used for complexity reduction are generated internally on the fly (intern) or provided as input from an external source (extern)
→ Number of eigen images	How many Eigen images (and therefore dimensions) are used as components in the linear combination
Split up method	Determine whether large classes are split into smaller classes i) to obtain classes containing a similar number of images/volumes (Cluster size) or ii) to minimize the internal variance within each class, as measured by the cross-correlation coefficients (cccVariance)
Number of classes	Number of resulting classes/clusters
Remove duplicated images	Duplicate images identified by the classification are removed
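The difference between the two split-up methods can be sketched as two ways of choosing which class to split next. This is only an illustration under stated assumptions: `ccc` and `pick_class_to_split` are hypothetical helpers, the method names `"cluster_size"` and `"ccc_variance"` merely mirror the options above, and the variance is assumed to be taken over the cross-correlation coefficients between each member and its class average. The actual logic may proceed differently.

```python
import numpy as np

def ccc(a, b):
    """Normalised cross-correlation coefficient between two images."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def pick_class_to_split(images, labels, method):
    """Return the class ID that would be split next under the given method."""
    best_class, best_score = None, -np.inf
    for c in np.unique(labels):
        members = images[labels == c]
        if method == "cluster_size":
            # Largest class first, so class sizes become more similar.
            score = len(members)
        elif method == "ccc_variance":
            # Class whose members agree least consistently with their average.
            average = members.mean(axis=0)
            score = np.var([ccc(m, average) for m in members])
        else:
            raise ValueError(f"unknown split-up method: {method}")
        if score > best_score:
            best_class, best_score = int(c), score
    return best_class
```

With the size criterion a large but homogeneous class is split first, whereas with the variance criterion a small but heterogeneous class takes priority.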

Input	Description
Input	Stack of input images
Pre Eigen Images	(Only available if Eigen images location = extern, i.e. Eigen images precomputed with the Principal Component Analysis (PCA) logic) Stack of Eigen images with the sum of all images and a mask as the last two images

Output	Description
Output	Stack of all images with added/altered classID header information
Sums	Stack with one image per class/cluster, each representing the average of all images within that class/cluster

Output	Description
ClassID	ID of the class/cluster the image belongs to

Once clusters/classes of images are found, they can be averaged (see SumByClassNumber) to improve the signal-to-noise ratio substantially.¹⁾
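A minimal NumPy sketch of such per-class averaging is given below. The function `class_sums` is a hypothetical illustration and not the SumByClassNumber implementation; it assumes that the class IDs are available as a plain array with one entry per image.

```python
import numpy as np

def class_sums(images, class_ids):
    """Return a dict mapping each class ID to the average image of that class."""
    images = np.asarray(images, dtype=float)
    class_ids = np.asarray(class_ids)
    return {
        int(c): images[class_ids == c].mean(axis=0)
        for c in np.unique(class_ids)
    }
```

If the noise in the individual images is independent, averaging N images of the same view reduces the noise standard deviation by roughly a factor of √N while the common signal is preserved, which is where the gain in signal-to-noise ratio comes from.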

¹⁾ van Heel, M. (1984). Multivariate statistical classification of noisy images (randomly oriented biological macromolecules). Ultramicroscopy, 13(1-2), 165–183. http://www.ncbi.nlm.nih.gov/pubmed/6382731
