14th International Conference on Computer and Knowledge Engineering
AvashoG2P: A multi-module G2P Converter for Persian
Authors :
Ali Moghadaszadeh (1)
Fatemeh Pasban (2)
Mohsen Mahmoudzadeh (3)
Maryam Vatanparast (4)
Amirmohammad Salehoof (5)
1- Part AI Research Center
2- Part AI Research Center
3- Ferdowsi University of Mashhad
4- Part AI Research Center
5- Part AI Research Center
Keywords :
TTS, G2P
Abstract :
Grapheme-to-phoneme (G2P) conversion is a fundamental task in text-to-speech (TTS) and automatic speech recognition (ASR) systems. Over the years, G2P systems have evolved from rule-based and statistical methods to neural network-based approaches. Despite these advances, G2P conversion for Persian remains challenging because of the complex relationship between spelling and pronunciation and the scarcity of high-quality datasets. This paper introduces AvashoG2P, a novel multi-module solution for Persian G2P conversion. AvashoG2P uses a sequence-to-sequence (seq2seq) model built on GRU recurrent units with an attention mechanism. The model is trained on both diacritized and non-diacritized words, which improves its understanding of phonemes and their relationships. The system achieves a Word Error Rate (WER) of 15% and a Phoneme Error Rate (PER) of 5%. A critical component of AvashoG2P is its homograph disambiguation module, which uses a single model for all homographs and thereby addresses a significant challenge in Persian text processing. The module treats disambiguation as a classification problem, assigning a phoneme label to the entire input window. The system achieves high accuracy while optimizing for latency and memory consumption. Using transformer-based models and machine learning classifiers, we obtain significant improvements in accuracy and F1 scores. Among transformer models, XLM-RoBERTa performs best, with a weighted F1 score of 94.7; among machine learning classifiers, the SVC model performs best, with a weighted F1 score of 89.96. Additionally, we present the AvashoG2P-Benchmark, a comprehensive test dataset designed to facilitate future research and benchmarking in Persian G2P tasks (available at: https://huggingface.co/datasets/PartAI/AvashoG2P-Benchmark).
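As an illustration of the two metrics reported above, the following is a minimal sketch of how WER and PER are commonly computed for G2P output. The abstract does not define the metrics, so this assumes the usual G2P convention: PER is the total phoneme-level edit distance divided by the number of reference phonemes, and WER is the fraction of words whose predicted phoneme sequence differs from the reference in any way. The function names and the phoneme symbols in the example are hypothetical placeholders and are not taken from the paper or the benchmark.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: List[List[str]], hyps: List[List[str]]) -> float:
    """Total phoneme edits divided by total reference phonemes (assumed definition)."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_phonemes = sum(len(r) for r in refs)
    return total_edits / total_phonemes


def word_error_rate(refs: List[List[str]], hyps: List[List[str]]) -> float:
    """Fraction of words with at least one phoneme error (assumed definition)."""
    wrong = sum(1 for r, h in zip(refs, hyps) if list(r) != list(h))
    return wrong / len(refs)


# Illustrative placeholder data: two words, one with a single wrong phoneme.
refs = [["k", "e", "t", "A", "b"], ["m", "a", "n"]]
hyps = [["k", "e", "t", "A", "b"], ["m", "e", "n"]]

per = phoneme_error_rate(refs, hyps)  # 1 edit over 8 reference phonemes
wer = word_error_rate(refs, hyps)     # 1 wrong word out of 2
```

Under these definitions a single wrong phoneme makes the whole word count as an error, which is consistent with the reported WER (15%) being considerably higher than the reported PER (5%).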