15th International Conference on Computer and Knowledge Engineering
A Hybrid Architecture to Optimize Persian FAQ Retrieval using Semantic Similarity Search
Authors :
Seyed Amir Mohammad Hosseini (1)
Fatemeh Dehbashi (2)
Setare Kahnemuee (3)
Mohsen Kahani (4)
Morteza Fardin (5)
1- Computer Engineering Department, Ferdowsi University of Mashhad
2- Computer Engineering Department, Ferdowsi University of Mashhad
3- Computer Engineering Department, Ferdowsi University of Mashhad
4- Computer Engineering Department, Ferdowsi University of Mashhad
5- Baran Software Group
Keywords :
Persian Language Processing, FAQ Retrieval, Hybrid Clustering, Sentence Embeddings, Semantic Search, Evaluation Framework
Abstract :
Traditional FAQ-based Question Answering (QA) systems rely on lexical matching and often fail to capture the semantic nuances of morphologically rich languages such as Persian. Furthermore, their performance is typically measured with rigid metrics that underestimate a model's true capabilities. This paper addresses both challenges by proposing a novel two-stage Hybrid Clustering-Based Search architecture designed to improve the accuracy and efficiency of semantic retrieval. We introduce an iterative "Hierarchical DBSCAN" method to cluster a real-world Persian knowledge base, which enables a focused, coarse-to-fine search pipeline. To evaluate the system robustly, we created a new multi-faceted dataset containing formal, informal, and challenging queries. Our experiments show that the proposed hybrid architecture achieves a Top-1 accuracy of 79% and a Top-3 accuracy of 86%, with an answering delay of only 0.115 seconds. This is an improvement in both accuracy and speed over the best-performing E5-Base model under the standard Direct Semantic Search baseline. Our work provides an efficient and validated blueprint for developing practical semantic QA systems for the Persian language.
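The coarse-to-fine idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the cluster labels are supplied by hand rather than produced by their iterative "Hierarchical DBSCAN" method, the 2-D vectors stand in for real sentence embeddings (such as those from an E5 model), and every function name here (`build_index`, `hybrid_search`, `top_k_accuracy`) is hypothetical. It only shows the general shape of a two-stage search: first pick the closest cluster centroid, then rank the FAQ entries inside that cluster.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def build_index(embeddings, labels):
    """Group FAQ indices by cluster label and compute one centroid per cluster."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    centroids = {}
    for label, members in clusters.items():
        vectors = [embeddings[i] for i in members]
        centroids[label] = [sum(col) / len(vectors) for col in zip(*vectors)]
    return clusters, centroids

def hybrid_search(query, embeddings, clusters, centroids, n_clusters=1, top_k=3):
    """Stage 1 (coarse): keep the n_clusters nearest centroids.
    Stage 2 (fine): rank only the FAQ entries inside those clusters."""
    nearest = sorted(centroids, key=lambda c: cosine(query, centroids[c]), reverse=True)
    candidates = [i for c in nearest[:n_clusters] for i in clusters[c]]
    ranked = sorted(candidates, key=lambda i: cosine(query, embeddings[i]), reverse=True)
    return ranked[:top_k]

def top_k_accuracy(results, gold, k):
    """Fraction of queries whose gold FAQ index appears in the first k results."""
    hits = sum(1 for ranked, g in zip(results, gold) if g in ranked[:k])
    return hits / len(gold)

# Toy data (assumed for illustration): four FAQ entries in two hand-labelled clusters.
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
labels = [0, 0, 1, 1]
clusters, centroids = build_index(embeddings, labels)

queries = [[1.0, 0.0], [0.0, 1.0]]
gold = [0, 3]  # index of the correct FAQ entry for each query
results = [hybrid_search(q, embeddings, clusters, centroids) for q in queries]

print(results)
print(top_k_accuracy(results, gold, 1), top_k_accuracy(results, gold, 3))
```

Restricting the fine-grained ranking to one or a few clusters reduces the number of similarity computations per query, which is the kind of speed gain the abstract refers to. The trade-off is that a wrong cluster choice in stage 1 cannot be recovered in stage 2, so `n_clusters` would need tuning in any real system.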