14th International Conference on Computer and Knowledge Engineering
Enhancing Persian Word Sense Disambiguation with Large Language Models: Techniques and Applications
Authors :
Fatemeh Zahra Arshia (1)
Saeedeh Sadat Sadidpour (2)
1- Faculty of Electronic & Computer Engineering, Malek Ashtar University of Technology
2- Faculty of Electronic & Computer Engineering, Malek Ashtar University of Technology
Keywords :
Word Sense Disambiguation (WSD), Large Language Models (LLMs), Persian Disambiguation
Abstract :
Word sense disambiguation (WSD) is a core task in natural language processing (NLP): it assigns to each ambiguous word in a text the sense that fits its context. WSD is therefore essential for effective NLP in Persian, a morphologically rich and highly polysemous language. Recent advances in large language models (LLMs) have greatly extended NLP capabilities and opened new avenues for improving WSD performance. This paper presents the integration of LLMs to improve WSD in Persian, taking into account the linguistic challenges of the language. We consider four Persian language models: FaBERT, AriaBERT, GPT-2 Persian, and PersianMind-v1.0, and apply supervised fine-tuning on the SBU-WSD-Corpus. Our methodology consists of preprocessing the Persian WSD corpus, fine-tuning the models on it, and measuring their performance. The results indicate that LLM-based methods significantly improve WSD accuracy over traditional methods, with FaBERT achieving the best accuracy. We also discuss real-world applications, such as sentiment analysis, to show the effect of this advance on general NLP tasks. The study concludes with insights into future research directions, underlining the potential role of LLMs in further transforming WSD and related fields.
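As an illustration of how WSD can be framed as supervised classification and scored by accuracy, the following is a minimal sketch only. It is not the authors' fine-tuning pipeline and does not use the SBU-WSD-Corpus: the toy examples, the transliterated word "shir" (a Persian word that can mean milk, lion, or tap), and the function names `train_mfs`, `train_overlap`, and `accuracy` are all assumptions made for illustration. It compares a most-frequent-sense baseline with a simple context-overlap classifier.

```python
from collections import Counter, defaultdict

# Toy, hypothetical sense-annotated examples (not from any real corpus):
# (target word, context tokens, gold sense label).
TRAIN = [
    ("shir", ["drink", "cold", "glass"], "milk"),
    ("shir", ["drink", "breakfast"], "milk"),
    ("shir", ["jungle", "roar"], "lion"),
    ("shir", ["water", "kitchen", "leak"], "tap"),
]
TEST = [
    ("shir", ["glass", "breakfast"], "milk"),
    ("shir", ["roar", "zoo"], "lion"),
    ("shir", ["leak", "water"], "tap"),
]


def train_mfs(examples):
    """Most-frequent-sense baseline: store the commonest sense per word."""
    counts = defaultdict(Counter)
    for word, _context, sense in examples:
        counts[word][sense] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def train_overlap(examples):
    """Collect, per (word, sense), the set of context tokens seen in training."""
    vocab = defaultdict(lambda: defaultdict(set))
    for word, context, sense in examples:
        vocab[word][sense].update(context)
    return vocab


def predict_overlap(vocab, word, context):
    """Pick the sense whose training contexts share the most tokens with this one."""
    senses = vocab[word]
    return max(senses, key=lambda s: len(senses[s] & set(context)))


def accuracy(predict, examples):
    """Fraction of examples whose predicted sense equals the gold sense."""
    correct = sum(predict(w, ctx) == sense for w, ctx, sense in examples)
    return correct / len(examples)


mfs = train_mfs(TRAIN)
vocab = train_overlap(TRAIN)

mfs_accuracy = accuracy(lambda w, ctx: mfs[w], TEST)
overlap_accuracy = accuracy(lambda w, ctx: predict_overlap(vocab, w, ctx), TEST)

print(f"most-frequent-sense accuracy: {mfs_accuracy:.2f}")
print(f"context-overlap accuracy:     {overlap_accuracy:.2f}")
```

On this toy data the context-aware classifier outperforms the baseline, which is the general intuition behind using contextual models for WSD; a fine-tuned language model would replace the overlap scorer with learned contextual representations.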