12th International Conference on Computer and Knowledge Engineering
PeQa: a Massive Persian Question-Answering and Chatbot Dataset
Authors :
Fatemeh Zahra Arshia (1), Mohammad Ali Keyvanrad (2), Saeedeh Sadat Sadidpour (3), Sayyid Mohammad Reza Mohammadi (4)
1- Faculty of Electrical & Computer Engineering, Malek-Ashtar University of Technology, Tehran, Iran
2- Faculty of Electrical & Computer Engineering, Malek-Ashtar University of Technology, Tehran, Iran
3- Faculty of Electrical & Computer Engineering, Malek-Ashtar University of Technology, Tehran, Iran
4- Faculty of Electrical & Computer Engineering, Malek-Ashtar University of Technology, Tehran, Iran
Keywords :
Question-Answering System, Twitter Dataset, Persian QA, Chatbot
Abstract :
A question-answering (QA) system is an application that communicates with humans using natural language processing. Modelling a dialogue between humans and machines is considered one of the most important tasks of Artificial Intelligence (AI), and building a Chatbot that performs well in human-machine conversation remains an unsolved challenge in this field. Whatever their application, Chatbots should understand what users mean from their words and provide relevant answers. In the past, Chatbot architectures relied mainly on rules or statistical methods. With the advent of deep learning, trainable neural networks soon replaced these traditional models. Such deep models depend heavily on the dataset they are trained on, and no sufficiently large dataset is available for the Persian language. We present a dataset of 14 million Persian tweets collected from Twitter, carefully processed to yield a collection of 420,000 question-answer pairs. We also report modelling results with Transformers, including Sensibleness and Specificity Average (SSA) and the BLEU metric. We will release our dataset, modelling code, and models publicly.
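The abstract names SSA without describing how it was computed for this work. As a rough illustration only, the sketch below follows the commonly used definition of the metric: each response receives a human "sensible" label and a "specific" label, a response that is not sensible is not counted as specific, and SSA is the mean of the two resulting rates. The `Rating` class, the `ssa` function, and the sample labels are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Rating:
    """One human judgment of a single chatbot response (hypothetical schema)."""
    sensible: bool
    specific: bool


def ssa(ratings):
    """Return (sensibleness, specificity, SSA) for a list of ratings.

    Assumption: a response judged not sensible never counts as specific,
    as in the usual definition of the metric.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    n = len(ratings)
    sensibleness = sum(r.sensible for r in ratings) / n
    specificity = sum(r.sensible and r.specific for r in ratings) / n
    return sensibleness, specificity, (sensibleness + specificity) / 2


# Made-up labels for four responses, for illustration only.
example = [
    Rating(sensible=True, specific=True),
    Rating(sensible=True, specific=False),
    Rating(sensible=False, specific=True),   # ignored for specificity
    Rating(sensible=False, specific=False),
]
sens, spec, score = ssa(example)
print(sens, spec, score)
```

The evaluation protocol actually used for PeQa may differ, for example in how many raters label each response or how disagreements are resolved.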