Topic modeling for large-scale text data
Xi-ming LI, Ji-hong OUYANG
This paper develops a novel online algorithm, moving average stochastic variational inference (MASVI), which reuses the results obtained in previous iterations to smooth out noisy natural gradients. We analyze the convergence properties of the proposed algorithm and conduct experiments on two large-scale collections that contain millions of documents. Experimental results show that, compared with stochastic variational inference (SVI) and stochastic gradient Riemannian Langevin dynamics (SGRLD), our algorithm achieves a faster convergence rate and better performance.
Keywords: Latent Dirichlet allocation (LDA) / Topic modeling / Online learning / Moving average
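The abstract describes MASVI only at a high level: a moving average over the results of recent iterations smooths the noisy natural gradient used by stochastic variational inference. The following is a minimal, hypothetical Python sketch of that idea, not the authors' implementation; it assumes the standard SVI update for LDA's topic parameter matrix lambda, and the names window, tau0, kappa, and svi_step_with_moving_average are illustrative.

import numpy as np

def svi_step_with_moving_average(lam, hat_history, lam_hat, t,
                                 window=5, tau0=1.0, kappa=0.7):
    """One smoothed stochastic variational update of the K x V topic
    parameter matrix `lam` for LDA.

    `lam_hat` is the noisy per-minibatch intermediate estimate of the
    topic parameters (prior plus rescaled minibatch sufficient
    statistics). Averaging the last `window` such estimates before
    blending reduces the variance of the natural-gradient step.
    """
    hat_history.append(lam_hat)
    if len(hat_history) > window:
        hat_history.pop(0)
    smoothed_hat = np.mean(hat_history, axis=0)  # moving average of recent estimates
    rho = (t + tau0) ** (-kappa)                 # Robbins-Monro step size
    return (1.0 - rho) * lam + rho * smoothed_hat

Because the natural gradient in SVI is the difference between the intermediate estimate and the current parameters, averaging the last few intermediate estimates is equivalent to averaging the corresponding noisy natural gradients, so the blended step moves along a lower-variance direction.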