15th International Conference on Computer and Knowledge Engineering
A Hybrid Architecture to Optimize Persian FAQ Retrieval using Semantic Similarity Search
Authors :
Seyed Amir Mohammad Hosseini (1)
Fatemeh Dehbashi (2)
Setare Kahnemuee (3)
Mohsen Kahani (4)
Morteza Fardin (5)
1- Computer Engineering Department, Ferdowsi University of Mashhad
2- Computer Engineering Department, Ferdowsi University of Mashhad
3- Computer Engineering Department, Ferdowsi University of Mashhad
4- Computer Engineering Department, Ferdowsi University of Mashhad
5- Baran Software Group
Keywords :
Persian Language Processing, FAQ Retrieval, Hybrid Clustering, Sentence Embeddings, Semantic Search, Evaluation Framework
Abstract :
Traditional FAQ-based Question Answering (QA) systems, which rely on lexical matching, often fail to capture the semantic nuances of morphologically rich languages like Persian. Furthermore, their performance is typically measured with rigid metrics that underestimate a model's true capabilities. This paper addresses these challenges by proposing a novel two-stage Hybrid Clustering-Based Search architecture designed to improve both the accuracy and efficiency of semantic retrieval. We introduce an iterative "Hierarchical DBSCAN" method to cluster a real-world Persian knowledge base, enabling a focused, coarse-to-fine search pipeline. To evaluate our system robustly, we created a new multi-faceted dataset containing formal, informal, and challenging queries. Our experiments show that the proposed hybrid architecture achieves a Top-1 accuracy of 79% and a Top-3 accuracy of 86%, with a response latency of only 0.115 seconds. This represents an improvement in both accuracy and speed over E5-Base, the best-performing model in the standard Direct Semantic Search baseline. Our work provides an efficient and validated blueprint for developing practical semantic QA systems for the Persian language.
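The coarse-to-fine idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the FAQ embeddings and their cluster labels have already been computed (for example by a sentence-embedding model and a DBSCAN-style clustering step), and the function names `build_centroids` and `coarse_to_fine_search`, the parameters `n_clusters` and `top_k`, and the toy 2-D vectors are all hypothetical. Stage one compares the query against cluster centroids; stage two ranks only the FAQs inside the selected clusters.

```python
import numpy as np


def normalize(x):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def build_centroids(embeddings, labels):
    """Return the cluster ids and one unit-length centroid per cluster.

    Assumption: `labels` holds a precomputed, non-negative cluster id
    for every FAQ embedding (noise points are not handled here).
    """
    emb = normalize(embeddings)
    labels = np.asarray(labels)
    cluster_ids = np.unique(labels)
    centroids = np.stack([emb[labels == c].mean(axis=0) for c in cluster_ids])
    return cluster_ids, normalize(centroids)


def coarse_to_fine_search(query, embeddings, labels, cluster_ids, centroids,
                          n_clusters=1, top_k=3):
    """Two-stage search: pick the nearest clusters, then rank their members."""
    q = normalize(query)
    emb = normalize(embeddings)
    labels = np.asarray(labels)

    # Stage 1 (coarse): score the query against every cluster centroid.
    centroid_scores = centroids @ q
    best_clusters = cluster_ids[np.argsort(-centroid_scores)[:n_clusters]]

    # Stage 2 (fine): score only the FAQs that belong to the chosen clusters.
    candidates = np.flatnonzero(np.isin(labels, best_clusters))
    scores = emb[candidates] @ q
    order = np.argsort(-scores)[:top_k]
    return [(int(candidates[i]), float(scores[i])) for i in order]


# Toy data: four FAQ "embeddings" in 2-D, grouped into two clusters.
faq_embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
faq_labels = np.array([0, 0, 1, 1])

cluster_ids, centroids = build_centroids(faq_embeddings, faq_labels)
results = coarse_to_fine_search(np.array([0.0, 1.0]), faq_embeddings,
                                faq_labels, cluster_ids, centroids)
print(results)  # list of (faq_index, cosine_similarity), best match first
```

Because the fine stage scores only members of the selected clusters, the number of similarity computations per query shrinks as the clusters get smaller, which is one plausible source of the kind of latency gain the abstract reports. Raising `n_clusters` trades some of that speed for a lower risk of missing an answer that sits in a neighbouring cluster.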