Topic modeling for large-scale text data
Xi-ming LI, Ji-hong OUYANG
This paper develops a novel online algorithm, moving average stochastic variational inference (MASVI), which reuses the results obtained in previous iterations to smooth out noisy natural gradients. We analyze the convergence properties of the proposed algorithm and conduct experiments on two large-scale collections that contain millions of documents. Experimental results show that, compared with stochastic variational inference (SVI) and stochastic gradient Riemannian Langevin dynamics (SGRLD), our algorithm achieves a faster convergence rate and better performance.
Keywords: Latent Dirichlet allocation (LDA) / Topic modeling / Online learning / Moving average
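The abstract describes MASVI only at a high level: a moving average over the results of recent iterations smooths the noisy natural gradient used by stochastic variational inference. The following is a minimal, hypothetical Python sketch of that idea, not the authors' implementation; it assumes the standard SVI update for LDA's topic parameter matrix lambda, and the names window, tau0, kappa, and svi_step_with_moving_average are illustrative.

import numpy as np

def svi_step_with_moving_average(lam, hat_history, lam_hat, t,
                                 window=5, tau0=1.0, kappa=0.7):
    """One smoothed stochastic variational update of the K x V topic
    parameter matrix `lam` for LDA.

    `lam_hat` is the noisy per-minibatch intermediate estimate of the
    topic parameters (prior plus rescaled minibatch sufficient
    statistics). Averaging the last `window` such estimates before
    blending reduces the variance of the natural-gradient step.
    """
    hat_history.append(lam_hat)
    if len(hat_history) > window:
        hat_history.pop(0)
    smoothed_hat = np.mean(hat_history, axis=0)  # moving average of recent estimates
    rho = (t + tau0) ** (-kappa)                 # Robbins-Monro step size
    return (1.0 - rho) * lam + rho * smoothed_hat

Because the natural gradient in SVI is the difference between the intermediate estimate and the current parameters, averaging the last few intermediate estimates is equivalent to averaging the corresponding noisy natural gradients, so the blended step moves along a lower-variance direction.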