Comparison of Methods for Assessing the Semantic Similarity of Text Fragments

K. Krez; E. Shneiderov; P. Shish; E. Kondratenko

doi:10.35596/1729-7648-2026-24-2-85-91

Abstract

As the volume of text data grows rapidly, methods are needed that can effectively compare text fragments by meaning, including cases of paraphrasing, synonym substitution, and sentence restructuring. A pressing challenge is to compare the output of model-based semantic similarity methods against human perception of semantic similarity. This article presents an expert method for assessing the semantic similarity of text fragments based on ratings given by survey participants. The method builds an interpretable semantic similarity scale from human perception of text content and uses that scale to analyze how consistent various methods are with it. To obtain the “human” assessment, a survey of 138 participants was conducted. The comparative analysis showed that semantic similarity assessment methods differ in how closely they agree with human perception of the semantic similarity of texts.
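The keywords name Pearson and Spearman correlation as the measures of agreement. As a rough illustration only, the sketch below shows how scores from an automatic method could be compared with averaged human ratings using both coefficients. It is not the authors' implementation: the function names and the two score lists are hypothetical placeholders, and the numbers are not data from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical placeholder values, NOT data from the article:
# one entry per pair of text fragments.
human_scores = [0.9, 0.2, 0.6, 0.4]   # assumed mean survey ratings
model_scores = [0.8, 0.1, 0.7, 0.5]   # assumed scores from some method

print("Pearson: ", pearson(human_scores, model_scores))
print("Spearman:", spearman(human_scores, model_scores))
```

Pearson measures linear agreement between the two score lists, while Spearman only checks whether the method orders the fragment pairs the same way people do, so reporting both separates scale mismatch from ranking mismatch.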

Keywords

semantic similarity, natural language processing, expert method, text comparison, rating scaling, questionnaire, Pearson correlation, Spearman correlation

About the Authors

K. Krez

Belarusian State University of Informatics and Radioelectronics
Belarus

Krez Karina, Postgraduate, Assistant at the Department of Information and Computer Systems Design

220013, Minsk, P. Brovki St., 6

Tel.: +375 29 952-75-56

E. Shneiderov

Belarusian State University of Informatics and Radioelectronics
Belarus

Cand. Sci. (Tech.), Associate Professor at the Department of Information and Computer Systems Design

Minsk

P. Shish

Belarusian State University of Informatics and Radioelectronics
Belarus

Student

Minsk

E. Kondratenko

Belarusian State University of Informatics and Radioelectronics
Belarus

Student

Minsk

For citations:

Krez K., Shneiderov E., Shish P., Kondratenko E. Comparison of Methods for Assessing the Semantic Similarity of Text Fragments. Doklady BGUIR. 2026;24(2):85-91. (In Russ.) https://doi.org/10.35596/1729-7648-2026-24-2-85-91

This work is licensed under a Creative Commons Attribution 4.0 License.

ISSN 1729-7648 (Print)
ISSN 2708-0382 (Online)
