0% Complete
Home
/
12th International Conference on Computer and Knowledge Engineering
Word-level Persian Lipreading Dataset
Authors :
Javad Peymanfard
1
Ali Lashini
2
Samin Heydarian
3
Hossein Zeinali
4
Nasser Mozayani
5
1- School of Computer Engineering Iran University of Science and Technology, Tehran, Iran
2- School of Computer Engineering Iran University of Science and Technology, Tehran, Iran
3- School of Computer Engineering Iran University of Science and Technology, Tehran, Iran
4- Department of Computer Engineering Amirkabir University of Technology, Tehran, Iran
5- School of Computer Engineering Iran University of Science and Technology, Tehran, Iran
Keywords :
Lip-reading،Persian dataset،audio-visual speech recognition
Abstract :
Lip-reading has made impressive progress in recent years, driven by advances in deep learning. Nonetheless, the prerequisite such advances is a suitable dataset. This paper provides a new in-the-wild dataset for Persian word-level lip reading containing 244,000 videos from approximately 1,800 speakers. We evaluated the state-of-the-art method in this field and used a novel approach for word-level lip reading. In this method, we used the AV-Hubert model for feature extraction and obtained significantly better performance on our dataset.
Papers List
List of archived papers
Islamic Geometric algorithms: A survey
Elham Akbari - Azam Bastanfard
Improving performance of multi-label classification using ensemble of feature selection and outlier detection
Mohammad Ali Zarif - Javad Hamidzadeh
Automatic Infrared-Based Volume and Mass Estimation System for Agricultural Products
Seyed Muhammad Hossein Mousavi - S. Muhammad Hassan Mosavi
Two-step thermal-aware routing algorithm in 3D NoC
Majid Nezarat - Masoume Momeni
A Survey on Semi-Automated and Automated Approaches for Video Annotation
Samin Zare - Mehran Yazdi
Information Theoretic Learning-based Deep Embedded Clustering (ITL-DEC)
Hoda Shad - Mona Zamiri - Tahereh Bahreini - Reza Monsefi - Ghoshe Abed Hodtani
Enhanced Principal-curve based Classifiers for Time-series Label Prediction
Seyed Aref Hakimzadeh - Koorush Ziarati
Reversible Data Insertion in Encryption Domain Based on Reduced Quad Difference Expansion
Alireza Ghaemi - Mohammad Zare Ehteshami - Amirhossein Ghaemi
Spatio-Temporal Graph Neural Networks for Accurate Crime Prediction
Rojan Roshankar - Mohammad Reza Keyvanpour
PowerLinear Activation Functions with application to the first layer of CNNs
Kamyar Nasiri - Kamaledin Ghiasi-Shirazi
more
Samin Hamayesh - Version 41.7.6