Semi-supervised community detection on attributed networks using non-negative matrix tri-factorization with node popularity

Di JIN , Jing HE , Bianfang CHAI , Dongxiao HE

Front. Comput. Sci. ›› 2021, Vol. 15 ›› Issue (4) : 154324

DOI: 10.1007/s11704-020-9203-0
RESEARCH ARTICLE


Abstract

The World Wide Web generates ever more data with links and node contents, which are commonly modeled as attributed networks. Identifying network communities plays an important role in helping people understand and use the semantic functions of such data. A few methods based on non-negative matrix factorization (NMF) have been proposed to detect community structure with semantic information in attributed networks. However, previous methods have not modeled several key factors that jointly affect the link-generation process: prior information, the heterogeneity of node degree, and the interactions among communities. These three factors have been shown to be the main influences on the results. In this paper, we propose a semi-supervised community detection method on attributed networks that considers all three factors simultaneously. First, a semi-supervised non-negative matrix tri-factorization model with node popularity (PSSNMTF) is designed to detect communities from the network topology. Node contents are then integrated into the PSSNMTF model to find semantic communities more accurately, yielding the PSSNMTFC model. The parameters of the PSSNMTFC model are estimated using gradient descent. Experiments on real and artificial networks show that the new method is more accurate than several related state-of-the-art methods.
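To make the tri-factorization idea concrete, the following is a minimal sketch of a generic symmetric non-negative matrix tri-factorization, A ≈ H S Hᵀ, fitted by projected gradient descent with a backtracking step size. It is not the PSSNMTF or PSSNMTFC model: the abstract does not give the objective function, so node popularity, prior information, and node contents are all omitted here. The function names (`nmtf_loss`, `symmetric_nmtf`), the squared Frobenius loss, the step-size rule, and the toy network are assumptions made only for illustration. Under this reading, H holds node-to-community memberships and S holds the interactions among communities.

```python
import numpy as np


def nmtf_loss(A, H, S):
    """Squared Frobenius reconstruction error ||A - H S H^T||_F^2."""
    R = H @ S @ H.T - A
    return float(np.sum(R * R))


def symmetric_nmtf(A, k, n_iter=200, step=0.1, seed=0):
    """Projected gradient descent for A ~ H S H^T with H >= 0 and S >= 0.

    Generic illustration only (hypothetical helper, not the paper's model).
    A is assumed to be a symmetric adjacency matrix; k is the number of
    communities.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    H = rng.random((n, k))   # node-to-community memberships
    S = rng.random((k, k))   # community-to-community interactions
    S = (S + S.T) / 2.0
    history = [nmtf_loss(A, H, S)]

    for _ in range(n_iter):
        R = H @ S @ H.T - A
        grad_H = 2.0 * (R @ H @ S.T + R.T @ H @ S)
        grad_S = 2.0 * (H.T @ R @ H)

        # Backtracking: halve the step until the loss strictly decreases.
        t = step
        for _ in range(30):
            H_new = np.maximum(H - t * grad_H, 0.0)  # project onto H >= 0
            S_new = np.maximum(S - t * grad_S, 0.0)  # project onto S >= 0
            new_loss = nmtf_loss(A, H_new, S_new)
            if new_loss < history[-1]:
                H, S = H_new, S_new
                history.append(new_loss)
                break
            t /= 2.0
        else:
            break  # no improving step found; treat as converged

    return H, S, history


# Toy network (assumed data): two 4-node cliques joined by one edge.
A = np.zeros((8, 8))
A[:4, :4] = 1.0
A[4:, 4:] = 1.0
np.fill_diagonal(A, 0.0)
A[3, 4] = A[4, 3] = 1.0

H, S, history = symmetric_nmtf(A, k=2)
labels = H.argmax(axis=1)  # assign each node to its strongest community
print(labels, history[0], history[-1])
```

The backtracking step is a design choice for this sketch: it keeps the loss from increasing without tuning a fixed learning rate. The result depends on the random initialization, so the recovered labels are not guaranteed to match the two cliques.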

Keywords

community detection / non-negative matrix tri-factorization / node popularity / attributed networks

Cite this article

Di JIN, Jing HE, Bianfang CHAI, Dongxiao HE. Semi-supervised community detection on attributed networks using non-negative matrix tri-factorization with node popularity. Front. Comput. Sci., 2021, 15(4): 154324 DOI:10.1007/s11704-020-9203-0



RIGHTS & PERMISSIONS

Higher Education Press
