We present preliminary results from applying clustering analysis and unsupervised classification techniques to the catalogs of objects from the digital scans of the POSS-II. The data set consists of matched catalogs of approximately 8 million objects, for each of which a number of attributes have been measured. The objects come from 15 sky survey fields near the Galactic poles, each measured in 2 or 3 colors. Smaller, CCD-based catalogs were also used to investigate how a limited amount of data with superior image quality can help us understand correlations found in the attribute spaces, and to give us hints on how to explore the large data space produced from the POSS-II plate scans more efficiently. We apply Bayesian clustering algorithms to both the plate-based and CCD-based catalogs. Our first experiments have shown that the program we used, AutoClass, was able to find sensible categories from a few simple attributes of the object images. The data were separated into four distinct classes: stars, galaxies with bright central cores, galaxies without bright cores, and stars with fuzz around them. The two classes of galaxies show different (g-r) color distributions and populate different loci in the concentration index versus mean surface brightness diagram; presumably they correspond to the early and late Hubble types. It is important to emphasize that none of these three attributes (color, concentration index, and mean surface brightness) was given to AutoClass, so they constitute an independent check of its performance. We will also present the first results from new algorithms that use these software tools to define and discover clusters and groups of galaxies in an objective manner.