<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">bsuir</journal-id><journal-title-group><journal-title xml:lang="ru">Доклады БГУИР</journal-title><trans-title-group xml:lang="en"><trans-title>Doklady BGUIR</trans-title></trans-title-group></journal-title-group><issn pub-type="ppub">1729-7648</issn><issn pub-type="epub">2708-0382</issn><publisher><publisher-name>БГУИР</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.35596/1729-7648-2022-20-1-73-82</article-id><article-id custom-type="elpub" pub-id-type="custom">bsuir-3287</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>ЭЛЕКТРОНИКА, РАДИОФИЗИКА, РАДИОТЕХНИКА, ИНФОРМАТИКА</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="en"><subject>ELECTRONICS, RADIOPHYSICS, RADIOENGINEERING, INFORMATICS</subject></subj-group></article-categories><title-group><article-title>Система анализа и классификации голосового сигнала на основе пертурбационных параметров и кепстрального представления в психоакустических шкалах</article-title><trans-title-group xml:lang="en"><trans-title>Voice Analysis and Classification System Based on Perturbation Parameters and Cepstral Presentation in Psychoacoustic Scales</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Вашкевич</surname><given-names>М. И.</given-names></name><name name-style="western" xml:lang="en"><surname>Vashkevich</surname><given-names>M. 
I.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Вашкевич Максим Иосифович - кандидат технических наук, доцент кафедры электронных вычислительных средств.</p><p>220013, Минск, ул. П. Бровки, 6, тел. +375-17-293-84-78</p></bio><bio xml:lang="en"><p>Vashkevich Maksim Iosifovich - Cand. of Sci., Associate Professor at the Computer Engineering Department.</p><p>220013, Minsk, P. Brovki st., 6, tel. +375-17-293-84-78</p></bio><email xlink:type="simple">vashkevich@bsuir.by</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Лихачёв</surname><given-names>Д. С.</given-names></name><name name-style="western" xml:lang="en"><surname>Likhachov</surname><given-names>D. S.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Кандидат технических наук, доцент кафедры электронных вычислительных средств.</p><p>Минск</p></bio><bio xml:lang="en"><p>Cand. of Sci., Associate Professor at the Computer Engineering Department.</p><p>Minsk</p></bio><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Азаров</surname><given-names>И. С.</given-names></name><name name-style="western" xml:lang="en"><surname>Azarov</surname><given-names>E. S.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Доктор технических наук, заведующий кафедрой вычислительных средств.</p><p>Минск</p></bio><bio xml:lang="en"><p>Dr. 
of Sci., Head of the Computer Engineering Department.</p><p>Minsk</p></bio><xref ref-type="aff" rid="aff-1"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru"><institution>Белорусский государственный университет информатики и радиоэлектроники</institution></aff><aff xml:lang="en"><institution>Belarusian State University of Informatics and Radioelectronics</institution></aff></aff-alternatives><pub-date pub-type="collection"><year>2022</year></pub-date><pub-date pub-type="epub"><day>01</day><month>03</month><year>2022</year></pub-date><volume>20</volume><issue>1</issue><fpage>73</fpage><lpage>82</lpage><permissions><copyright-statement>Copyright &#x00A9; Вашкевич М.И., Лихачёв Д.С., Азаров И.С., 2022</copyright-statement><copyright-year>2022</copyright-year><copyright-holder xml:lang="ru">Вашкевич М.И., Лихачёв Д.С., Азаров И.С.</copyright-holder><copyright-holder xml:lang="en">Vashkevich M.I., Likhachov D.S., Azarov E.S.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri xlink:href="https://doklady.bsuir.by/jour/article/view/3287">https://doklady.bsuir.by/jour/article/view/3287</self-uri><abstract><p>Описан подход к построению системы анализа и классификации голосового сигнала на основе пертурбационных параметров и кепстрального представления. Рассмотрены два варианта кепстрального представления голосового сигнала: при помощи мел-частотных кепстральных коэффициентов (МЧКК) и при помощи барк-частотных кепстральных коэффициентов (БЧКК). 
В работе использовался общепринятый подход к вычислению МЧКК на основе частотно-временного анализа методом дискретного преобразования Фурье (ДПФ) с объединением энергии в субполосах. Данный метод аппроксимирует частотное разрешение слуха человека, но имеет фиксированное временное разрешение. В качестве альтернативы предложен вариант кепстрального представления на основе БЧКК. При расчете БЧКК использовался неравнополосный ДПФ-модулированный банк фильтров, аппроксимирующий частотную и временную разрешающую способность слуха. Целью работы ставилось сравнение эффективности применения признаков на основе МЧКК и БЧКК для построения систем анализа и классификации голосового сигнала. Результаты эксперимента показали, что в случае использования акустических признаков на основе МЧКК можно получить систему классификации голоса со средней полнотой классификации 80,6 %, а в случае использования признаков на основе БЧКК этот показатель равен 83,7 %. При дополнении набора МЧКК признаков пертурбационными параметрами голоса средняя полнота классификации повысилась до 94,1 %, при аналогичном дополнении набора БЧКК признаков средняя полнота классификации увеличилась до 96,7 %.</p></abstract><trans-abstract xml:lang="en"><p>The paper describes an approach to designing a system for the analysis and classification of voice signals based on perturbation parameters and cepstral representation. Two variants of the cepstral representation of the voice signal are considered: one based on mel-frequency cepstral coefficients (MFCC) and one based on bark-frequency cepstral coefficients (BFCC). The MFCC were computed with the generally accepted approach: time-frequency analysis by the discrete Fourier transform (DFT) followed by summation of energy in subbands. This method approximates the frequency resolution of human hearing but has a fixed temporal resolution. As an alternative, a cepstral representation based on the BFCC is proposed. 
The BFCC were computed with a warped DFT-modulated filter bank, which approximates both the frequency and the temporal resolution of hearing. The aim of the work was to compare the effectiveness of MFCC-based and BFCC-based features for building voice signal analysis and classification systems. The experimental results showed that acoustic features based on the MFCC yield a voice classification system with an average recall of 80.6 %, while features based on the BFCC yield an average recall of 83.7 %. Supplementing the MFCC feature set with voice perturbation parameters raised the average recall to 94.1 %; the same supplement to the BFCC feature set raised it to 96.7 %.</p></trans-abstract><kwd-group xml:lang="ru"><kwd>голосовой сигнал</kwd><kwd>МЧКК</kwd><kwd>БЧКК</kwd><kwd>патология голоса</kwd></kwd-group><kwd-group xml:lang="en"><kwd>voice signal</kwd><kwd>MFCC</kwd><kwd>BFCC</kwd><kwd>vocal pathology</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Harar P., Galaz Z., Alonso-Hernandez J.B., Mekyska J., Burget R., Smekal Z. Towards robust voice pathology detection. Neural Computing and Applications. 2020;32(20):15747-15757.</mixed-citation><mixed-citation xml:lang="en">Harar P., Galaz Z., Alonso-Hernandez J.B., Mekyska J., Burget R., Smekal Z. Towards robust voice pathology detection. Neural Computing and Applications. 2020;32(20):15747-15757.</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Likhachov D., Vashkevich M., Azarov E., Malhina K., Rushkevich Y. A mobile application for detection of amyotrophic lateral sclerosis via voice analysis. 
International Conference on Speech and Computer, 2021. Springer, Cham; 2021:372-383.</mixed-citation><mixed-citation xml:lang="en">Likhachov D., Vashkevich M., Azarov E., Malhina K., Rushkevich Y. A mobile application for detection of amyotrophic lateral sclerosis via voice analysis. International Conference on Speech and Computer, 2021. Springer, Cham; 2021:372-383.</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Benba A., Jilbab A., Hammouch A. Discriminating between patients with Parkinson’s and neurological diseases using cepstral analysis. IEEE Transactions on Neural Systems and Rehabilitation Engineering. 2016;24(10):1100–1108.</mixed-citation><mixed-citation xml:lang="en">Benba A., Jilbab A., Hammouch A. Discriminating between patients with Parkinson’s and neurological diseases using cepstral analysis. IEEE Transactions on Neural Systems and Rehabilitation Engineering. 2016;24(10):1100–1108.</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Tsanas A., Little M.A., McSharry P.E., Spielman J., Ramig L.O. Novel speech signal processing algorithms for high-accuracy classification of Parkinson's disease. IEEE Transactions on Biomedical Engineering. 2012;59(5):1264-1271.</mixed-citation><mixed-citation xml:lang="en">Tsanas A., Little M.A., McSharry P.E., Spielman J., Ramig L.O. Novel speech signal processing algorithms for high-accuracy classification of Parkinson's disease. IEEE Transactions on Biomedical Engineering. 2012;59(5):1264-1271.</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Vashkevich M., Rushkevich Y. Classification of ALS patients based on acoustic analysis of sustained vowel phonations. Biomedical Signal Processing and Control. 
2021;65:1-14.</mixed-citation><mixed-citation xml:lang="en">Vashkevich M., Rushkevich Y. Classification of ALS patients based on acoustic analysis of sustained vowel phonations. Biomedical Signal Processing and Control. 2021;65:1-14.</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Huang X., Acero A., Hon H.-W. Spoken language processing: A guide to theory, algorithm, and system development. Prentice Hall PTR; 2001: 980.</mixed-citation><mixed-citation xml:lang="en">Huang X., Acero A., Hon H.-W. Spoken language processing: A guide to theory, algorithm, and system development. Prentice Hall PTR; 2001: 980.</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Bielawski K., Petrovsky A. Proposition of minimum bands multirate noise reduction system which exploits properties of the human auditory system and all-pass transformed filter bank. IEEE Workshop Signal Processing. 2001:65-70.</mixed-citation><mixed-citation xml:lang="en">Bielawski K., Petrovsky A. Proposition of minimum bands multirate noise reduction system which exploits properties of the human auditory system and all-pass transformed filter bank. IEEE Workshop Signal Processing. 2001:65-70.</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">James G., Witten D., Hastie T., Tibshirani R. An introduction to statistical learning with applications in R. New York: Springer; 2013.</mixed-citation><mixed-citation xml:lang="en">James G., Witten D., Hastie T., Tibshirani R. An introduction to statistical learning with applications in R. New York: Springer; 2013.</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Vashkevich M., Petrovsky A., Rushkevich Y. 
Bulbar ALS detection based on analysis of voice perturbation and vibrato. IEEE International Conference on Signal Processing: Algorithms, Architectures, Arrangements, and Applications. 2019:267-272.</mixed-citation><mixed-citation xml:lang="en">Vashkevich M., Petrovsky A., Rushkevich Y. Bulbar ALS detection based on analysis of voice perturbation and vibrato. IEEE International Conference on Signal Processing: Algorithms, Architectures, Arrangements, and Applications. 2019:267-272.</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare no conflicts of interest.</p></fn></fn-group></back></article>
