Abstract
We present a novel pairwise clustering method. Given a proximity matrix of pairwise relations (i.e., pairwise similarity or dissimilarity estimates) between data points, our algorithm extracts the two most prominent clusters in the data set. The algorithm, which is completely nonparametric, iteratively applies a two-step transformation to the proximity matrix. The first step represents each point by its relations to all other data points, and the second step re-estimates the pairwise distances using a statistically motivated proximity measure on these representations. Using this transformation, the algorithm iteratively partitions the data points until it converges to two clusters. Although the algorithm is simple and intuitive, it generates complex dynamics of the proximity matrices. Based on this bipartition procedure we devise a hierarchical clustering algorithm, which applies the basic bipartition algorithm in a straightforward divisive manner. The hierarchical algorithm addresses the model validation problem using a general cross-validation approach, which may be combined with various hierarchical clustering methods.
We further present an experimental study of this algorithm, examining its properties and performance on synthetic and ‘standard’ data sets. The experiments demonstrate the robustness of the algorithm and indicate that it produces a good clustering partition even when the data are noisy or corrupted.
Dubnov, S., El-Yaniv, R., Gdalyahu, Y. et al. A New Nonparametric Pairwise Clustering Algorithm Based on Iterative Estimation of Distance Profiles. Machine Learning 47, 35–61 (2002). https://doi.org/10.1023/A:1013631728342