References

bsuir

Doklady BGUIR

1729-7648, 2708-0382

BSUIR

10.35596/1729-7648-2020-18-1-43-51

bsuir-2591

Research Article

ELECTRONICS, RADIOPHYSICS, RADIOENGINEERING, INFORMATICS

HEARING CORRECTION METHOD BASED ON PSYCHOACOUSTICALLY MOTIVATED FREQUENCY TRANSPOSITION IN A SPEECH SIGNAL

Porhun

M. I.

Porhun Maxim Igorevich, Assistant Lecturer, Computer Engineering Department

220013, Minsk, P. Brovki str., 6, tel. +375-17-293-84-20

porhun@bsuir.by

Vashkevich

M. I.

PhD, Associate Professor, Associate Professor of the Computer Engineering Department

Belarusian State University of Informatics and Radioelectronics

2020

06.03.2020

Vol. 18, No. 1, pp. 43-51

2020

Porhun M.I., Vashkevich M.I.

This work is licensed under a Creative Commons Attribution 4.0 License.

https://doklady.bsuir.by/jour/article/view/2591

The aim of this work was to develop a speech signal processing method for correcting hearing impairments, based on psychoacoustically motivated transposition of the high-frequency components of the signal spectrum to the low-frequency region followed by frequency-dependent amplification. To achieve this aim, several tasks related to developing the principles of frequency transposition in a speech signal were solved. The developed method is adaptive: it is tuned according to the audiogram of the hearing-impaired person. Two frequency bands are selected for transposition: a source band (from which components are moved) and a target band (to which they are moved). The width of the source band is fixed, while the width of the target band is chosen adaptively. Spectrum transposition is performed only for consonants, which are difficult for hearing-impaired people to perceive. Sounds are classified as vowel, consonant, or pause using a single-layer neural network. The feature vector consists of the zero-crossing rate, short-term energy, short-term magnitude, normalized autocorrelation function, and first spectral moment. To keep the transposed sounds as natural as possible, the concept of equal loudness is used. To compensate for the hearing-impaired person's attenuated perception of sound, frequency-dependent amplification based on the audiogram is applied. The effectiveness of the proposed method was verified experimentally using a simulation of hearing loss. The experiment involved 10 participants who listened to recordings passed through the hearing loss model, as well as recordings passed through the hearing loss model with subsequent correction (using the proposed method). The results showed that the proposed hearing correction method improves speech intelligibility by 6 % on average.
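The five features named in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the function name `frame_features`, the per-sample normalizations, the use of lag 1 for the normalized autocorrelation, and the power-weighted definition of the first spectral moment are all assumptions made for illustration, and a naive DFT is used only to keep the example free of external libraries.

```python
import cmath
import math


def frame_features(frame, sample_rate):
    """Compute the five per-frame features listed in the abstract.

    Hypothetical helper: normalizations, the autocorrelation lag (1) and the
    power-weighted spectral moment are illustrative assumptions.
    """
    n = len(frame)
    if n < 2:
        raise ValueError("frame must contain at least two samples")

    # Zero-crossing rate: fraction of adjacent sample pairs whose signs differ.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zcr = crossings / (n - 1)

    # Short-term energy and magnitude, averaged per sample.
    energy = sum(s * s for s in frame) / n
    magnitude = sum(abs(s) for s in frame) / n

    # Normalized autocorrelation at lag 1 (assumed lag), bounded by [-1, 1].
    num = sum(a * b for a, b in zip(frame, frame[1:]))
    den = math.sqrt(sum(s * s for s in frame[:-1]) * sum(s * s for s in frame[1:]))
    autocorr = num / den if den > 0 else 0.0

    # First spectral moment (spectral centroid, Hz) over DFT bins 0..n//2.
    power = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(acc) ** 2)
    total = sum(power)
    centroid = (
        sum(k * sample_rate / n * p for k, p in enumerate(power)) / total
        if total > 0
        else 0.0
    )

    return {
        "zcr": zcr,
        "energy": energy,
        "magnitude": magnitude,
        "autocorr": autocorr,
        "centroid_hz": centroid,
    }


# Two synthetic 64-sample frames at an assumed 8 kHz sampling rate:
# a 250 Hz tone (vowel-like) and a sign-alternating sequence (fricative-like).
RATE = 8000
tone = [math.sin(2 * math.pi * 250 * t / RATE) for t in range(64)]
hiss = [(-1.0) ** t for t in range(64)]

tone_features = frame_features(tone, RATE)
hiss_features = frame_features(hiss, RATE)
```

On these synthetic frames the low-frequency tone yields a low zero-crossing rate, a lag-1 autocorrelation close to 1, and a spectral moment near 250 Hz, while the alternating sequence yields the opposite pattern. That contrast is what makes such features plausible inputs for a vowel/consonant/pause classifier.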

hearing correction, hearing impairments, hearing loss simulation

References

Simpson A. Frequency-lowering devices for managing high-frequency hearing loss: a review. Trends in Amplification. 2009;13(2):87-106. DOI: 10.1177/1084713809336421.

Alexander J.M. Individual variability in recognition of frequency-lowered speech. Seminars in Hearing. 2013;34(2):86-109. DOI: 10.1055/s-0033-1341346.

Robinson J.D., Baer T., Moore B. Using transposition to improve consonant discrimination and detection for listeners with severe high-frequency hearing loss. International Journal of Audiology. 2007;46(6):293-308. DOI: 10.1080/14992020601188591.

Hogan C.A., Turner C.W. High-frequency audibility: Benefits for hearing-impaired listeners. The Journal of the Acoustical Society of America. 1998;104:432-441. DOI: 10.1121/1.423247.

Korolyova I.V. [Introduction to Audiology and Hearing Prosthetics]. SPb: KARO; 2012. (In Russ.)

Vonlanthen A., Horst A. [Hearing Aids]. Rostov n/D: Phoenix; 2009. (In Russ.)

Traunmüller H. Analytical expressions for the tonotopic sensory scale. The Journal of the Acoustical Society of America. 1990;88(1):97-100. DOI: 10.1121/1.399849.

Liu Y.-T., Chang R.Y., Tsao Y., Chang Y.-P. A new frequency lowering technique for Mandarin-speaking hearing aid users. IEEE Global Conference on Signal and Information Processing (GlobalSIP), Orlando, FL. 2015;722-726. DOI: 10.1109/GlobalSIP.2015.7418291.


Nikolenko S.I., Arhangel'skaya E.V., Kadurin A.A. [Deep Learning. Immersion in the World of Neural Networks]. SPb.: Piter; 2019. (In Russ.)

The authors declare that there are no conflicts of interest present.