15th International Conference on Computer and Knowledge Engineering
Persian Legal Text Simplification Leveraging Transformer-Based Models
Authors :
Mohammadreza Joneidi Jafari (1), Saedeh Tahery (2), Amirhossein Nikoofard (3)
1- Department of Electronics, Faculty of Electrical Engineering
2- Department of Artificial Intelligence, Faculty of Computer Engineering
3- Department of Systems and Control, Faculty of Electrical Engineering
Keywords :
Text Simplification, Persian Legal Documents, ChatGPT, Large Language Models, Transformer-Based Models
Abstract :
Legal documents often use complex, domain-specific language, which limits their accessibility to the general public. Despite growing interest in text simplification within natural language processing, the Persian legal domain remains largely unexplored because annotated corpora and domain-adapted models are lacking. This study presents a practical approach to Persian legal text simplification that combines synthetic supervision with fine-tuned transformer-based models. We create a new labeled dataset by generating simplified versions of legal rulings with ChatGPT, and we validate a subset of the outputs through expert review to ensure data quality. To address the resource constraints common in real-world applications, we fine-tune lightweight encoder-decoder models, enabling efficient deployment without large-scale annotation or extensive inference infrastructure. Our results show that a compact fine-tuned model such as ParsT5 outperforms zero-shot large language models such as PersianLLaMA. The model is also enhanced with an existing attention extension, which enables efficient processing of long inputs without truncation. This work thus introduces the first benchmark for Persian legal text simplification and demonstrates that well-adapted, efficient models can achieve high performance in low-resource, domain-specific scenarios. This first step paves the way for future research and development in natural language processing for Persian legal texts. The dataset and code are publicly available at https://github.com/mrjoneidi/Simplification-Legal-Texts.