12th International Conference on Computer and Knowledge Engineering
Word-level Persian Lipreading Dataset
Authors :
Javad Peymanfard (1), Ali Lashini (2), Samin Heydarian (3), Hossein Zeinali (4), Nasser Mozayani (5)
1- School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
2- School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
3- School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
4- Department of Computer Engineering, Amirkabir University of Technology, Tehran, Iran
5- School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
Keywords :
Lip-reading, Persian dataset, audio-visual speech recognition
Abstract :
Lip-reading has made impressive progress in recent years, driven by advances in deep learning. Nonetheless, a prerequisite for such advances is a suitable dataset. This paper provides a new in-the-wild dataset for Persian word-level lip-reading containing 244,000 videos from approximately 1,800 speakers. We evaluated the state-of-the-art method in this field and also used a novel approach for word-level lip-reading. In this method, we used the AV-HuBERT model for feature extraction and obtained significantly better performance on our dataset.