Topic Splitting: A Hierarchical Topic Model Based on Non-Negative Matrix Factorization

Rui Liu; Xingguang Wang; Deqing Wang; Yuan Zuo; He Zhang; Xianzhu Zheng

doi:10.1007/s11518-018-5375-7

Journal of Systems Science and Systems Engineering, 2018, Vol. 27, Issue 4: 479-496. DOI: 10.1007/s11518-018-5375-7

Article

Abstract

Hierarchical topic models have been widely applied in real-world applications because they build a hierarchy over topics while preserving topic quality. Most traditional methods build the hierarchy by adopting low-level topics as new features from which high-level topics are constructed, which often causes semantic confusion between low-level and high-level topics. To address this problem, we propose a novel topic model named hierarchical sparse NMF with orthogonal constraint (HSOC), which is based on non-negative matrix factorization and builds the topic hierarchy by splitting super-topics into sub-topics. In HSOC, we introduce global independence, local independence and information consistency to constrain the split topics. Extensive experimental results on real-world corpora show that the proposed model achieves comparable performance on topic quality and better performance on semantic feature representation of documents compared with baseline methods.

Keywords

Hierarchical topic model / non-negative matrix factorization / hierarchical NMF / topic splitting

Cite this article

Rui Liu, Xingguang Wang, Deqing Wang, Yuan Zuo, He Zhang, Xianzhu Zheng. Topic Splitting: A Hierarchical Topic Model Based on Non-Negative Matrix Factorization. Journal of Systems Science and Systems Engineering, 2018, 27(4): 479-496. DOI: 10.1007/s11518-018-5375-7
