Analysis of the methods of word sense disambiguation in the biomedical domain
https://doi.org/10.35596/1729-7648-2019-123-5-60-65
Abstract
A method for resolving the lexical ambiguity of biomedical terms is proposed. The method is based on comparing bags of words obtained from the context with those obtained from definitions and from information on related terms in the UMLS Metathesaurus [1]. A modification of the method that weights words by their importance using the TF-IDF statistical measure is also proposed. The method was verified experimentally on the open MSH WSD test data set [2], which was developed to support research on lexical ambiguity resolution.
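The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy sense descriptions and the sense labels are hypothetical, and the exact weighting used in the paper is not given here, so a common smoothed TF-IDF form is assumed. In practice the sense texts would be assembled from UMLS definitions and related terms; here they are short hand-written strings.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and return its word tokens as a multiset."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def idf_weights(sense_bags):
    """Smoothed IDF over the sense bags (assumed form, not taken from the paper).

    Words that occur in every sense bag receive the lowest weight.
    """
    n = len(sense_bags)
    df = Counter()
    for bag in sense_bags.values():
        df.update(bag.keys())
    return {w: math.log((1 + n) / (1 + d)) + 1.0 for w, d in df.items()}


def disambiguate(context, sense_texts, use_tfidf=True):
    """Pick the sense whose bag of words overlaps most with the context.

    sense_texts maps a (hypothetical) sense label to the text describing it.
    With use_tfidf=False the score is the plain count of shared words.
    """
    context_bag = bag_of_words(context)
    sense_bags = {s: bag_of_words(t) for s, t in sense_texts.items()}
    idf = idf_weights(sense_bags) if use_tfidf else {}
    scores = {}
    for sense, bag in sense_bags.items():
        shared = context_bag.keys() & bag.keys()
        if use_tfidf:
            total = sum(bag.values()) or 1
            scores[sense] = sum((bag[w] / total) * idf[w] for w in shared)
        else:
            scores[sense] = float(len(shared))
    best = max(scores, key=scores.get)
    return best, scores


# Toy example: two hand-written sense descriptions for the term "cold".
sense_texts = {
    "common_cold": "viral infection of the upper respiratory tract "
                   "causing cough and runny nose",
    "cold_temperature": "absence of heat, a low temperature of the "
                        "environment or body",
}
context = ("The patient presented with a cold, cough and runny nose "
           "after a viral infection.")

best_plain, _ = disambiguate(context, sense_texts, use_tfidf=False)
best_tfidf, _ = disambiguate(context, sense_texts, use_tfidf=True)
print(best_plain, best_tfidf)
```

The weighted variant reduces the influence of words such as "the" that appear in every sense description, which is the motivation for the TF-IDF modification mentioned above.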
About the Authors
A. V. Pashuk
Belarus
Pashuk Aleksandr Vladimirovich - postgraduate student of the control systems department.
220013, Minsk, P. Brovka str., 6, tel. +375-29-875-23-34
A. B. Gurinovich
Belarus
PhD, associate professor of computational methods and programming department.
220013, Minsk, P. Brovka str., 6
N. A. Volorova
Belarus
PhD, associate professor of the informatics department of Belarusian state university of informatics and radioelectronics.
220013, Minsk, P. Brovka str., 6
A. P. Kuznetsov
Belarus
D.Sci, professor of control systems department.
220013, Minsk, P. Brovka str., 6
References
1. Unified Medical Language System (UMLS) // U.S. National Library of Medicine. URL: https://www.nlm.nih.gov/research/umls/ (date of access: 20.11.2018).
2. Word Sense Disambiguation (WSD) Test Collections // U.S. National Library of Medicine. URL: https://wsd.nlm.nih.gov/ (date of access: 30.11.2018).
3. Statistical Reports on MEDLINE/PubMed Baseline Data // U.S. National Library of Medicine. URL: https://www.nlm.nih.gov/bsd/licensee/baselinestats.html (date of access: 16.11.2018).
4. Ide N., Veronis J. Introduction to the special issue on word sense disambiguation: the state of the art // Computational Linguistics - Special issue on word sense disambiguation. 1998. № 24. P. 2-40.
5. Navigli R. Word sense disambiguation: a survey // ACM Computing Surveys. 2009. № 41. P. 1-69.
6. Lesk M. Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone // SIGDOC '86: Proceedings of the 5th annual international conference on Systems documentation. Toronto, Ontario, Canada: ACM, 1986. P. 24-26.
7. Leacock C., Miller G.A. Using corpus statistics and WordNet relations for sense identification // Computational Linguistics - Special issue on word sense disambiguation. 1998. № 24. P. 147-165.
8. Preiss J., Stevenson M. DALE: A Word Sense Disambiguation System for Biomedical Documents Trained using Automatically Labeled Examples // Proceedings of the 2013 NAACL HLT Demonstration Session. Atlanta, Georgia: Association for Computational Linguistics, 2013. P. 1-4.
9. Liu H., Teller V., Friedman C. A Multi-aspect Comparison Study of Supervised Word Sense Disambiguation // Journal of the American Medical Informatics Association. 2004. № 11. P. 320-331.
10. Word sense disambiguation across two domains: Biomedical literature and clinical notes / G.K. Savova [et al.] // Journal of Biomedical Informatics. 2008. № 41. P. 1088-1100.
11. Jimeno-Yepes A. J., Aronson A. R. Knowledge-based biomedical word sense disambiguation: comparison of approaches // BMC Bioinformatics. 2010. № 11. P. 569-581.
Review
For citations:
Pashuk A.V., Gurinovich A.B., Volorova N.A., Kuznetsov A.P. Analysis of the methods of word sense disambiguation in the biomedical domain. Doklady BGUIR. 2019;(5):60-65. (In Russ.) https://doi.org/10.35596/1729-7648-2019-123-5-60-65