The analysis of G-banded chromosomes remains the most important tool available to the clinical cytogeneticist. The analysis is laborious when performed manually, and the utility of automated chromosome identification algorithms has been limited because the classification accuracy of these methods seldom exceeds about 80% in routine practice. In this study, we use four new approaches to automated chromosome identification, namely singular value decomposition (SVD), principal components analysis (PCA), Fisher discriminant analysis (FDA), and hidden Markov models (HMM), to classify three well-known chromosome data sets (Philadelphia, Edinburgh, and Copenhagen), and we compare these approaches with neural networks (NN). We show that the HMM is a particularly robust approach to identification: it attains classification accuracies of up to 97% for normal chromosomes and retains classification accuracies of up to 95% when chromosome telomeres are truncated or small portions of the chromosome are inverted. Compared with NN-based methods, this represents a substantial improvement in classification accuracy for normal chromosomes and a doubling in classification accuracy for truncated chromosomes and those with inversions. HMMs thus appear to be a promising approach for the automated identification of both normal and abnormal G-banded chromosomes.
@article{CoKoOlOl00,
author = {John M. Conroy and Tamara G. Kolda and Dianne P. O'Leary and Timothy J. O'Leary},
title = {Chromosome Identification Using Hidden {Markov} Models: Comparison with Neural Networks, Singular Value Decomposition, Principal Components Analysis, and {Fisher} Discriminant Analysis},
journal = {Laboratory Investigation},
volume = {80},
number = {11},
pages = {1629--1641},
month = nov,
year = {2000},
doi = {10.1038/labinvest.3780173},
}