Syntactic word embedding based on dependency syntax and polysemous analysis
Zhong-lin YE, Hai-xing ZHAO
Most word embedding models have the following problems: (1) in models based on bag-of-words contexts, the structural relations of sentences are completely neglected; (2) each word uses a single embedding, which makes the model indiscriminative for polysemous words; (3) learned word embeddings are easily biased toward the contextual structural similarity of sentences. To solve these problems, we propose an easy-to-use representation algorithm for syntactic word embedding (SWE). The main procedures are: (1) polysemous words are tagged for sense-specific representation using the latent Dirichlet allocation (LDA) algorithm; (2) the symbols ‘+’ and ‘−’ are adopted to indicate the directions of the dependency relations; (3) stopwords and their dependencies are deleted; (4) dependency skip is applied to connect indirect dependencies; (5) dependency-based contexts are fed into a word2vec model. Experimental results show that our model generates desirable word embeddings in similarity evaluation tasks. In addition, semantic and syntactic features can be captured from dependency-based syntactic contexts, exhibiting less topical and more syntactic similarity. We conclude that SWE outperforms single-embedding learning models.
Dependency-based context / Polysemous word representation / Representation learning / Syntactic word embedding
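To make the context-construction steps of the abstract concrete, the following minimal Python sketch builds dependency-based (word, context) pairs with ‘+’/‘−’ direction marks, drops stopwords, and reattaches their dependents to the grandparent as a simple stand-in for the paper's dependency skip. The function name, the illustrative stopword list, and the toy parse are assumptions for illustration only; the full SWE pipeline additionally tags polysemous words with LDA-derived sense labels before building contexts and then trains a word2vec-style model on the resulting pairs.

from typing import List, Tuple

STOPWORDS = {"the", "a", "an", "of", "to", "in", "with"}  # illustrative stopword list

def dependency_contexts(tokens: List[str],
                        heads: List[int],
                        rels: List[str]) -> List[Tuple[str, str]]:
    """Build (word, context) pairs from a dependency parse.

    tokens[i] depends on tokens[heads[i]] via relation rels[i];
    heads[i] == -1 marks the root. Direction is encoded with '+'
    (head -> modifier) and '-' (modifier -> head); stopwords are
    dropped and their dependents are re-attached to the grandparent,
    a simplified stand-in for the paper's dependency skip.
    """
    def effective_head(i: int) -> int:
        # Skip over stopword heads so indirect dependencies stay connected.
        h = heads[i]
        while h != -1 and tokens[h].lower() in STOPWORDS:
            h = heads[h]
        return h

    pairs = []
    for i, word in enumerate(tokens):
        if word.lower() in STOPWORDS:
            continue
        h = effective_head(i)
        if h == -1:
            continue
        head_word, rel = tokens[h], rels[i]
        # The modifier sees its head marked '-', the head sees the modifier marked '+'.
        pairs.append((word, f"{head_word}/{rel}-"))
        pairs.append((head_word, f"{word}/{rel}+"))
    return pairs

# Toy parse of "scientist discovers star with telescope"
tokens = ["scientist", "discovers", "star", "with", "telescope"]
heads  = [1, -1, 1, 1, 3]
rels   = ["nsubj", "root", "dobj", "prep", "pobj"]
for w, c in dependency_contexts(tokens, heads, rels):
    print(w, c)

In this example, deleting the stopword "with" and skipping over it links "telescope" directly to "discovers", so the resulting contexts encode the indirect syntactic relation that a linear bag-of-words window would miss.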