12th International Conference on Computer and Knowledge Engineering
Word-level Persian Lipreading Dataset
Authors :
Javad Peymanfard (1), Ali Lashini (2), Samin Heydarian (3), Hossein Zeinali (4), Nasser Mozayani (5)
1- School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
2- School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
3- School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
4- Department of Computer Engineering, Amirkabir University of Technology, Tehran, Iran
5- School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
Keywords :
Lip-reading, Persian dataset, audio-visual speech recognition
Abstract :
Lip-reading has made impressive progress in recent years, driven by advances in deep learning. Nonetheless, a prerequisite for such advances is a suitable dataset. This paper presents a new in-the-wild dataset for Persian word-level lip-reading, containing 244,000 videos from approximately 1,800 speakers. We evaluated the state-of-the-art method in this field and also applied a novel approach to word-level lip-reading: using the AV-HuBERT model for feature extraction, we obtained significantly better performance on our dataset.