13th International Conference on Computer and Knowledge Engineering
AgeNet-AT: An End-to-End Model for Robust Joint Speaker Age Estimation and Gender Recognition Based on Attention Mechanism and Titanet
Authors :
Mahsa Zamani Tarashandeh (1), Amirhossein Torkanloo (2), Mohammad Hossein Moattar (3)
1- Department of Electrical Engineering, Faculty of Engineering, Ferdowsi University of Mashhad
2- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
3- Department of Computer Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran
Keywords :
Age estimation, Gender classification, Multi-task learning, Attention mechanism, Titanet
Abstract :
Speaker age estimation has become popular in recent years due to its potential applications in various fields, including forensics and human-computer interaction. However, robustness to noise and to utterance length is a key factor in the performance of existing approaches. In this work, a robust age estimation and gender recognition model named AgeNet-AT is proposed, based on an attention mechanism and the Titanet model. The proposed approach uses Titanet as the embedding extractor and an attention mechanism to create an end-to-end architecture for age estimation. Since Titanet is designed to distinguish between speaker identities, it is hypothesized that some of its extracted features carry properties that can also differentiate speakers' age and gender. Titanet is therefore chosen as the embedding approach in this study. An attention layer is used to focus on the features most valuable for age estimation, and an auxiliary gender classification task is added to the model to improve estimation performance. The experiments are conducted on the TIMIT dataset under different evaluation conditions, including various utterance lengths and noise levels. The experimental results indicate the robustness of the AgeNet-AT model. The model outperforms the state-of-the-art age estimation results on the TIMIT dataset, with a Root Mean Square Error (RMSE) of 5.92 and 6.85 and a Mean Absolute Error (MAE) of 4.30 and 4.73 for male and female speakers, respectively.
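The two metrics reported in the abstract, RMSE and MAE, can be computed separately for male and female speakers as in the minimal sketch below. It uses the standard definitions of both metrics; the function names, the record layout, and the sample ages are illustrative assumptions and do not come from the paper or from TIMIT.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error between true and predicted ages."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean Absolute Error between true and predicted ages."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def metrics_by_gender(records):
    """Group (gender, true_age, predicted_age) records and score each group.

    The record layout is a hypothetical convenience for this sketch.
    """
    grouped = {}
    for gender, true_age, pred_age in records:
        trues, preds = grouped.setdefault(gender, ([], []))
        trues.append(true_age)
        preds.append(pred_age)
    return {
        gender: {"rmse": rmse(trues, preds), "mae": mae(trues, preds)}
        for gender, (trues, preds) in grouped.items()
    }


if __name__ == "__main__":
    # Made-up ages, for illustration only.
    sample = [("male", 20, 23), ("male", 30, 26), ("female", 40, 40)]
    for gender, scores in metrics_by_gender(sample).items():
        print(gender, scores)
```

Because RMSE squares each error before averaging, it penalizes large age errors more heavily than MAE does, which is why the two values are usually reported together.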