Probability Density Plotting

Francis L. Battye

The Walter & Eliza Hall Institute

A common problem in interpreting the two-parameter correlated displays that result from two-colour immunofluorescence experiments in flow cytometry is the identification and location of small minority subpopulations. In contour-line or "dot plot" displays, these minority groups are often camouflaged by the statistical "noise" surrounding the majority populations. Because of constraints on time, data storage space, or sample material, it is not always possible to count larger samples to improve the statistical accuracy. As an alternative, one may use the data from the limited cell sample to calculate a probability estimator function that approximates the data display expected for an infinitely large cell sample. A good description of the theoretical basis has been given by Moore & Kautz ["Handbook of Experimental Immunology", ed. D.M. Weir, Vol 1, Ch 30, 1986]. Put simply, the technique may be likened to smoothing the two-dimensional array by convolution with a "kernel" function, in this case of the form $K(y) \propto (1 - (y/h_n)^2)^2$, where $h_n$ is a local "nearest neighbour" distance reflecting the density of cells actually counted in that region.
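The kernel smoothing described above can be illustrated with a minimal brute-force sketch in Python. It is not the author's Fortran implementation, and several details are assumptions made for illustration: the function names `biweight` and `nn_density`, the choice of the distance to the k-th nearest event as the local width h_n, the default `k=10`, the floor `min_h` of one channel, and the 3/pi factor that normalises the kernel over the unit disc.

```python
import math


def biweight(u2):
    """Kernel shape (1 - u^2)^2 inside the unit disc, zero outside.

    Takes u2 = (y / h_n)^2 so that no square root is needed.
    """
    return (1.0 - u2) ** 2 if u2 < 1.0 else 0.0


def nn_density(events, point, k=10, min_h=1.0):
    """Estimate the probability density at `point` from (x, y) events.

    Assumption: h_n is the distance from `point` to its k-th nearest
    event, so the kernel is narrow where cells are dense and wide where
    they are sparse. `min_h` (in channels) guards against h_n = 0 when
    k events share the same channel pair.
    """
    if not 1 <= k <= len(events):
        raise ValueError("k must be between 1 and the number of events")
    px, py = point
    # Squared distances from the evaluation point, nearest first.
    d2 = sorted((x - px) ** 2 + (y - py) ** 2 for x, y in events)
    h2 = max(d2[k - 1], min_h ** 2)
    total = 0.0
    for d in d2:
        if d >= h2:
            break  # sorted, so every remaining event lies outside the kernel
        total += biweight(d / h2)
    # 3/pi makes the kernel integrate to 1 over the unit disc; dividing by
    # n * h_n^2 turns the kernel sum into a density per unit area.
    return (3.0 / math.pi) * total / (len(events) * h2)
```

Evaluating `nn_density` at every channel pair of the display grid gives the smoothed array. This version sorts all events for every grid point, so it is far slower than a production implementation would need to be.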

The author's implementation of this scheme, a Fortran program running on a VAX computer, employs a number of computational simplifications and shortcuts to reduce the CPU processing time to around 10 to 20 seconds for 25,000 cells of 1024-channel-resolution list-mode data. The data, having been smoothed in this way, is plotted as a contour line display with the contour levels chosen so that fixed percentiles of the cells lie between each line. The program has shown excellent resolution of sub-1% subpopulations in close proximity to larger groups as defined in multi-colour immunofluorescence cytometry.
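One way to choose contour levels so that fixed percentiles of the cells lie between adjacent lines is sketched below in Python. The abstract does not say how the program computes its levels, so this is an assumed approach: sort the smoothed bin values from highest to lowest, accumulate them, and record the value at which each target fraction of the total is first reached. The function name `percentile_contour_levels` and its arguments are hypothetical.

```python
def percentile_contour_levels(grid, fractions):
    """Return one density level per requested cumulative fraction.

    `grid` is a 2-D list of smoothed densities (or counts) per bin.
    For each fraction f, the returned level L is chosen so that the
    bins with value >= L hold at least a fraction f of the total.
    Levels are returned in order of increasing fraction, which means
    decreasing density.
    """
    values = sorted((v for row in grid for v in row if v > 0), reverse=True)
    if not values:
        raise ValueError("grid contains no positive values")
    total = sum(values)
    targets = sorted(fractions)
    levels = []
    running = 0.0
    i = 0
    for v in values:
        running += v
        # One bin may satisfy several closely spaced targets at once.
        while i < len(targets) and running >= targets[i] * total:
            levels.append(v)
            i += 1
    # Targets missed only through rounding fall back to the lowest value.
    levels.extend([values[-1]] * (len(targets) - i))
    return levels
```

For example, `fractions=[0.2, 0.4, 0.6, 0.8]` would give four lines with roughly 20% of the estimated cell mass between each adjacent pair, subject to the resolution of the grid.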
