Analysis of Patents Related to COVID-19 - Based on Patent Clustering Model in Specific Fields

Fu Nan , Li Qian , Yuan Hongmei

Asian Journal of Social Pharmacy ›› 2024, Vol. 19 ›› Issue (4) : 371 -382.


Abstract

Objective To improve the efficiency of clustering COVID-19-related patents by combining a topic extraction algorithm with the BERT model, and to help researchers understand patent applications concerning the novel coronavirus. Methods The weights of the topic vector and the BERT model vector were adjusted with a cross-entropy loss algorithm to obtain a joint vector. After dimension reduction, the k-means++ algorithm was applied for patent clustering. Results and Conclusion The model was applied to patents for coronavirus drugs, and five clustering topics were generated. Comparison shows that the clustering results of this model are more compact and that the separation between clusters is significant. The five clusters were analyzed visually to reveal the development status of patents for coronavirus drugs.
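The pipeline described in the abstract (weighted joint vector → dimension reduction → k-means++ clustering) can be sketched as follows. This is a minimal illustration, not the authors' implementation: random arrays stand in for the LDA topic distributions and BERT embeddings, the fixed weight `alpha` replaces the paper's cross-entropy-based weight adjustment, and PCA is used as a generic dimension-reduction step.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)

# Stand-ins for the two representations described in the abstract:
# a topic-distribution vector (e.g. from LDA) and a BERT sentence embedding.
n_patents = 200
topic_vecs = rng.dirichlet(np.ones(5), size=n_patents)   # shape (200, 5)
bert_vecs = rng.normal(size=(n_patents, 768))            # shape (200, 768)

def joint_vector(topic_vecs, bert_vecs, alpha=0.5):
    """Weighted concatenation of topic and BERT vectors.

    `alpha` is a hypothetical fixed hyperparameter standing in for the
    weight the paper tunes with a cross-entropy loss algorithm.
    """
    return np.hstack([alpha * topic_vecs, (1 - alpha) * bert_vecs])

X = joint_vector(topic_vecs, bert_vecs)

# Dimension reduction, then k-means++ clustering into five topics,
# mirroring the five clusters reported in the abstract.
X_low = PCA(n_components=20, random_state=0).fit_transform(X)
labels = KMeans(n_clusters=5, init="k-means++", n_init=10,
                random_state=0).fit_predict(X_low)

# Silhouette score (reference [20]) is one of the indices commonly used
# to judge how compact and well-separated the resulting clusters are.
print(silhouette_score(X_low, labels))
```

With real data, `topic_vecs` would come from an LDA model fit on the patent corpus and `bert_vecs` from a pretrained BERT encoder; the silhouette score then offers one way to compare this joint representation against either vector used alone.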

Keywords

coronavirus / patent clustering / patent analysis / BERT model

Cite this article

Fu Nan, Li Qian, Yuan Hongmei. Analysis of Patents Related to COVID-19 - Based on Patent Clustering Model in Specific Fields. Asian Journal of Social Pharmacy, 2024, 19(4): 371-382 DOI:


