Cluster Analysis using Rough Set Theory

Girish Kumar Singh, Shrabanti Mandal


A cluster is a group of objects of similar type, and clustering is the process of finding clusters in a dataset. Finding a set of clusters in a dataset is only one part of data mining; the clusters should then be analysed further to extract knowledge. This paper presents a method, based on the concepts of Rough Set Theory, to analyse the outcome of a clustering process. The proposed method is able to explain the existence of clusters and why two clusters differ.
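As an illustration only, the idea of explaining why two clusters differ can be sketched with the standard rough-set notion of discernibility: for every pair of objects drawn from two different clusters, record the attributes on which the pair disagrees. This is not the authors' algorithm, which the abstract does not spell out. The toy data, the attribute names and the function names below are all hypothetical.

```python
from itertools import product


def discerning_attributes(x, y):
    """Return the set of attribute names on which objects x and y differ."""
    return {a for a in x if x[a] != y[a]}


def cluster_discernibility(cluster_a, cluster_b):
    """For every cross-cluster pair, collect the attributes that tell the pair apart."""
    return [discerning_attributes(x, y) for x, y in product(cluster_a, cluster_b)]


def core_attributes(entries):
    """Attributes that are the sole way to discern at least one pair (singleton entries)."""
    return {next(iter(e)) for e in entries if len(e) == 1}


# Hypothetical toy clusters; each object is a dict of categorical attribute values.
cluster_a = [
    {"colour": "red", "size": "small", "shape": "round"},
    {"colour": "red", "size": "large", "shape": "round"},
]
cluster_b = [
    {"colour": "blue", "size": "small", "shape": "round"},
    {"colour": "blue", "size": "large", "shape": "square"},
]

entries = cluster_discernibility(cluster_a, cluster_b)
core = core_attributes(entries)
print(core)
```

In this toy case every cross-cluster pair differs on `colour`, and one pair differs on nothing else, so `colour` is the attribute that cannot be dropped when explaining the separation of the two clusters.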


Data mining; Clustering; Rough set theory






eISSN 0975-5748; pISSN 0974-875X