14th International Conference on Computer and Knowledge Engineering
Leveraging Self-Supervised Models for Automatic Whispered Speech Recognition
Authors :
Aref Farhadipour (1), Homa Asadi (2), Volker Dellwo (3)
1- University of Tehran
2- University of Isfahan
3- Zurich University of Applied Sciences
Keywords :
Automatic speech recognition, whisper speech to text, self-supervised learning, speech processing, deep learning, transformers, WavLM model, dialect variation, Whisper model
Abstract :
In automatic speech recognition, any factor that alters the acoustic properties of speech can degrade system performance. This paper presents a novel approach to automatic whispered speech recognition in the Irish dialect using the self-supervised WavLM model. Conventional automatic speech recognition systems often fail to recognize whispered speech accurately because of its distinct acoustic properties and the scarcity of relevant training data. To address this challenge, we used a pre-trained WavLM model, fine-tuned on a combination of whispered and normal speech from the wTIMIT and CHAINS datasets, which contain English speech in Singaporean and Irish dialects, respectively. Our baseline evaluation with the OpenAI Whisper model highlighted its limitations: it achieved a Word Error Rate (WER) of 18.8% and a Character Error Rate (CER) of 4.24% on whispered speech. In contrast, the proposed WavLM-based system achieved a WER of 9.22% and a CER of 2.59%. These results demonstrate the efficacy of our approach in recognizing whispered speech and underscore the importance of tailored acoustic modeling for robust automatic speech recognition systems. This study provides insights into developing effective automatic speech recognition solutions for challenging speech affected by whisper and dialect. The source code for this paper is freely available.
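For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how WER and CER are commonly computed: the Levenshtein (edit) distance between reference and hypothesis, divided by the reference length, taken over words for WER and over characters for CER. The abstract does not describe the scoring setup, so the function names, the whitespace tokenization, the absence of text normalization, and the counting of spaces as characters are all assumptions made for illustration, not the authors' evaluation code. The helper `relative_reduction` is likewise hypothetical; applied to the figures in the abstract it gives a relative WER reduction of about 51% and a relative CER reduction of about 39%.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (substitutions, insertions and deletions each cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words.
    Assumes plain whitespace tokenization and no text normalization."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length.
    Assumes spaces count as characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


def relative_reduction(baseline, improved):
    """Relative error reduction of `improved` with respect to `baseline`."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(wer("the cat sat on the mat", "the cat sat on mat"))
    # Figures reported in the abstract (in percent).
    print(round(relative_reduction(18.8, 9.22), 3))
    print(round(relative_reduction(4.24, 2.59), 3))
```

Published systems usually normalize case and punctuation before scoring, which can change the resulting numbers noticeably, so values from a sketch like this are only comparable when the same normalization is applied to both systems.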