Journey Into Hyperspace II: Dimensionally Challenged Sorting
Francis L. Battye
The Walter & Eliza Hall Institute
Flow cytometry cluster analysis can explicitly define a sub-population of cells on the basis of an arbitrarily large set of measurements made on each cell. These measurements may be thought of as the axes spanning the flow-cytometric hyperspace. The position and extent of the region containing the sub-population can be described exactly for computing purposes when all cells falling within its boundary are to be sorted. However, this classification cannot be translated readily to standard cell sorters, which usually base their sort criteria on projections of the data onto a set of two-dimensional regions. While a combination of such 2D regions clearly cannot delineate an arbitrary region of a hyperspace (typically of 6 or 8 dimensions) perfectly, it may be hoped that an adequate approximation can be achieved.
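To make the limitation concrete, the following minimal C sketch shows how a sort decision built from 2D regions might look. It assumes the sorter combines its regions with a logical AND and, for brevity, uses rectangular regions rather than the polygons a real sorter would accept. The type `Gate2D` and the function `passes_all_gates` are hypothetical names introduced for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rectangular gate on one 2D projection of the data.
   Real sort regions are usually polygons; rectangles keep the sketch short. */
typedef struct {
    size_t x_param, y_param;   /* indices of the two measurements plotted */
    double x_min, x_max;       /* gate limits on the first measurement */
    double y_min, y_max;       /* gate limits on the second measurement */
} Gate2D;

/* True if the event lies inside one gate's rectangle (edges inclusive). */
static bool in_gate(const double *event, const Gate2D *g)
{
    double x = event[g->x_param];
    double y = event[g->y_param];
    return x >= g->x_min && x <= g->x_max &&
           y >= g->y_min && y <= g->y_max;
}

/* Assumed combination rule: an event is sorted only if it passes
   every 2D gate (logical AND across projections). */
bool passes_all_gates(const double *event, const Gate2D *gates, size_t n_gates)
{
    for (size_t i = 0; i < n_gates; ++i) {
        if (!in_gate(event, &gates[i]))
            return false;
    }
    return true;
}
```

Under this rule the accepted volume is the intersection of the prisms obtained by extending each 2D region along the axes it does not use, which is why an arbitrarily shaped cluster can only be approximated.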
Hence, an algorithm has been developed that selects the best set of 2D projections, then the boundaries within them, and translates these into sort criteria for any given cell sub-population.
The choice of 2D projections is based on a measure of the "exposure" of the required cells, i.e. the proportion of these cells not overlapped by other populations. The sort regions start as the boundaries in these projections that enclose a large proportion (~99%) of the required cells. These boundaries are then iteratively whittled back by removing edges at which the local purity is lowest. As expected, the extent to which the sort regions need to be whittled back depends on a trade-off between purity and recovery, and the end point must be determined by the human operator.
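The quantities named above can be sketched in C as follows. This is an illustration under stated assumptions, not the implementation described here: it assumes that "overlap" is judged on a 64 x 64 binned histogram of the projection, so that a required cell counts as exposed when its bin holds no cell from any other population, and it uses the usual definitions of purity and recovery. The names `GRID`, `bin_of`, `exposure`, `purity` and `recovery` are hypothetical.

```c
#include <stddef.h>

#define GRID 64   /* assumed histogram resolution per axis */

/* Map a value in [lo, hi) to a bin index in [0, GRID-1], clamping outliers. */
static size_t bin_of(double v, double lo, double hi)
{
    if (v <= lo) return 0;
    if (v >= hi) return GRID - 1;
    size_t b = (size_t)((v - lo) / (hi - lo) * GRID);
    return b < GRID ? b : GRID - 1;
}

/* Exposure of the required population in one 2D projection: the fraction
   of required events whose bin holds no event from another population
   (assumed definition of "not overlapped").
   x, y         the two projected measurements of each event
   is_target[i] nonzero if event i belongs to the required cluster
   lo, hi       common axis range used for binning */
double exposure(const double *x, const double *y, const int *is_target,
                size_t n, double lo, double hi)
{
    unsigned char occupied[GRID][GRID] = {{0}};
    size_t n_target = 0, n_exposed = 0;

    /* First pass: mark every bin that contains a non-target event. */
    for (size_t i = 0; i < n; ++i) {
        if (!is_target[i])
            occupied[bin_of(x[i], lo, hi)][bin_of(y[i], lo, hi)] = 1;
    }
    /* Second pass: count target events that fall in unmarked bins. */
    for (size_t i = 0; i < n; ++i) {
        if (is_target[i]) {
            ++n_target;
            if (!occupied[bin_of(x[i], lo, hi)][bin_of(y[i], lo, hi)])
                ++n_exposed;
        }
    }
    return n_target ? (double)n_exposed / (double)n_target : 0.0;
}

/* Purity: fraction of events inside the sort region that are required cells. */
double purity(size_t target_in, size_t other_in)
{
    size_t total = target_in + other_in;
    return total ? (double)target_in / (double)total : 0.0;
}

/* Recovery: fraction of all required cells that fall inside the sort region. */
double recovery(size_t target_in, size_t target_total)
{
    return target_total ? (double)target_in / (double)target_total : 0.0;
}
```

Ranking the candidate projections by `exposure` and re-evaluating `purity` and `recovery` after each whittling step would give the operator the figures needed to choose an end point.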
The algorithm has been implemented in a C program which outputs sort regions readable by the Acquisition and Sort Processor (ASP) of a FACStar cell sorter.