References

bsuir

Doklady BGUIR

ISSN: 1729-7648, 2708-0382

BSUIR

10.35596/1729-7648-2023-21-4-110-117

bsuir-3689

Research Article

ELECTRONICS, RADIOPHYSICS, RADIOENGINEERING, INFORMATICS

Combined Method for Informative Feature Selection for Speech Pathology Detection

Likhachov D. S.

Denis S. Likhachov - Cand. of Sci. (Tech.), Associate Professor, Associate Professor at Computer Engineering Department.

220013, Minsk, P. Brovki St., 6. Tel.: +375 17 293-85-05

likhachov@bsuir.by

Vashkevich M. I.

Maxim I. Vashkevich - Dr. of Sci. (Tech.), Associate Professor, Associate Professor at Computer Engineering Department.

220013, Minsk, P. Brovki St., 6

Petrovsky N. A.

Nick A. Petrovsky - Cand. of Sci. (Tech.), Associate Professor, Associate Professor at Computer Engineering Department.

220013, Minsk, P. Brovki St., 6

Azarov E. S.

Elias S. Azarov - Dr. of Sci. (Tech.), Associate Professor, Head of Computer Engineering Department.

220013, Minsk, P. Brovki St., 6

Belarusian State University of Informatics and Radioelectronics

2023

29.08.2023

Vol. 21, No. 4, pp. 110–117

2023

Likhachov D. S., Vashkevich M. I., Petrovsky N. A., Azarov E. S.

This work is licensed under a Creative Commons Attribution 4.0 License.

https://doklady.bsuir.by/jour/article/view/3689

Voice pathology detection is characterized by a small amount of data available for training, so classification systems that work with low-dimensional data are the most relevant. We propose combining the LASSO (least absolute shrinkage and selection operator) and BSS (backward stepwise selection) methods to select the most significant features for detecting voice pathologies, in particular amyotrophic lateral sclerosis. Two feature sets are used: features based on mel-frequency cepstral coefficients, traditionally used in speech signal processing, and features based on a discrete estimate of the spectral envelope of an autoregressive process. The latter are extracted with a generative method, which computes the discrete Fourier transform of a sequence of samples generated by an autoregressive model of the input voice signal. The sequence is generated so as to account for the periodic nature of the Fourier transform, which improves the accuracy of the spectrum estimate and reduces spectral leakage. Feature selection with LASSO and BSS together improved classification performance while using fewer features than selection with LASSO alone.

voice analysis, generative method, autoregression, machine learning, spectral features, classification
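The two-stage selection described in the abstract (LASSO first, then backward stepwise selection on the surviving features) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the synthetic data, the function name `lasso_then_bss`, the penalty `alpha`, the target count `n_final`, and the choice of linear discriminant analysis as the classifier scored during BSS are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler


def lasso_then_bss(X, y, alpha=0.05, n_final=3, cv=5):
    """Hypothetical helper: LASSO pre-selection followed by backward stepwise selection.

    Returns (kept, final): column indices surviving LASSO, and the subset
    of those indices that remains after backward elimination.
    """
    # Standardize so the L1 penalty treats all features on the same scale.
    Xs = StandardScaler().fit_transform(X)

    # Stage 1: LASSO regression on the class labels; features whose
    # coefficient is shrunk exactly to zero are discarded.
    lasso = Lasso(alpha=alpha).fit(Xs, y.astype(float))
    kept = np.flatnonzero(lasso.coef_)
    if kept.size <= n_final:
        # Nothing left to eliminate.
        return kept, kept

    # Stage 2: backward stepwise selection. Starting from all kept features,
    # repeatedly drop the one whose removal hurts cross-validated accuracy least.
    bss = SequentialFeatureSelector(
        LinearDiscriminantAnalysis(),  # assumed classifier, for illustration only
        n_features_to_select=n_final,
        direction="backward",
        cv=cv,
    )
    bss.fit(Xs[:, kept], y)
    return kept, kept[bss.get_support()]


# Synthetic stand-in for a small voice-feature dataset (assumption).
X, y = make_classification(
    n_samples=80, n_features=20, n_informative=4, n_redundant=0, random_state=0
)
kept, final = lasso_then_bss(X, y)
print("features kept by LASSO:", kept)
print("features kept after BSS:", final)
```

For a real evaluation on a small clinical dataset, the scaling and both selection stages would need to be fitted inside each cross-validation fold, so that the held-out subject never influences which features are chosen.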

References

Rabiner L. R., Juang B. H. (1993) Fundamentals of Speech Recognition. Pearson Education.

Benba A., Jilbab A., Hammouch A. (2016) Discriminating between Patients with Parkinson’s and Neurological Diseases Using Cepstral Analysis. IEEE Transactions on Neural Systems and Rehabilitation Engineering. 24 (10), 1100–1108.

Vashkevich M., Rushkevich Y. (2021) Classification of ALS Patients Based on Acoustic Analysis of Sustained Vowel Phonations. Biomedical Signal Processing and Control. 65, 1–14.

Kiernan M. C., et al. (2011) Amyotrophic Lateral Sclerosis. Lancet. 377 (9769), 942–955.

Yunusova Y., et al. (2013) Detection of Bulbar ALS Using a Comprehensive Speech Assessment Battery. Proceedings of the International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications. 217–220.

Spangler T., et al. (2017) Fractal Features for Automatic Detection of Dysarthria. IEEE EMBS International Conference on Biomedical Health Informatics. 437–440.

Likhachov D. S., Vashkevich M. I., Petrovsky N. A., Azarov E. S. (2023) Small-Size Spectral Features for Machine Learning in Voice Signal Analysis and Classification Tasks. Informatics. 20 (1), 102–112. DOI: 10.37661/1816-0301-2023-20-1-102-112 (in Russian).

Markel J. D., Gray A. H. (1976) Linear Prediction of Speech. Berlin, New York, Springer-Verlag. 290.

Vashkevich M. I., Likhachov D. S., Azarov E. S. (2022) Voice Analysis and Classification System Based on Perturbation Parameters and Cepstral Presentation in Psychoacoustic Scales. Doklady BGUIR. 20 (1), 73–82. DOI: 10.35596/1729-7648-2022-20-1-73-82 (in Russian).

Vashkevich M. I., Burak A. A., Kanoika N. S., Daldova V. S. (2020) Analysis of Acoustic Voice Parameters for Larynx Pathology Detection. Informatics. 17 (1), 78–86 (in Russian).

Flach P. (2012) Machine Learning: the Art and Science of Algorithms that Make Sense of Data. Great Britain, Cambridge University Press Publ. 416.

James G., Witten D., Hastie T., Tibshirani R. (2013) An Introduction to Statistical Learning with Applications in R. Springer Publ. 440.

Kotu V., Deshpande B. (2019) Data Science: Concepts and Practice. 2 ed. USA, Morgan Kaufmann Publishers an Imprint of Elsevier.

Voice Database Used in the Article Classification of ALS Patients Based on Acoustic Analysis of Sustained Vowel Phonations. Available: https://github.com/Mak-Sim/Minsk2020_ALS_database (Accessed 12 May 2023).

Kunjan S., Grummett T. S., Pope K. J., Powers D. M. W., Fitzgibbon S. P., Lewis T. W. (2021) The Necessity of Leave One Subject Out (LOSO) Cross Validation for EEG Disease Diagnosis. Brain Informatics. Springer Publ. 558–567.

The authors declare that there are no conflicts of interest present.