15th International Conference on Computer and Knowledge Engineering
Persian Legal Text Simplification Leveraging Transformer-Based Models
Authors :
Mohammadreza Joneidi Jafari (1), Saedeh Tahery (2), Amirhossein Nikoofard (3)
1- Department of Electronics, Faculty of Electrical Engineering
2- Department of Artificial Intelligence, Faculty of Computer Engineering
3- Department of Systems and Control, Faculty of Electrical Engineering
Keywords :
Text Simplification, Persian Legal Documents, ChatGPT, Large Language Models, Transformer-Based Models
Abstract :
Legal documents often use complex, domain-specific language, which limits their accessibility to the general public. Despite growing interest in text simplification within natural language processing, the legal domain in Persian remains largely unexplored because annotated corpora and domain-adapted models are lacking. This study presents a practical approach to Persian legal text simplification that combines synthetic supervision with fine-tuned transformer-based models. We create a new labeled dataset by generating simplified versions of legal rulings with ChatGPT, and we validate a subset of the outputs through expert review to ensure data quality. To address the resource constraints common in real-world applications, we fine-tune lightweight encoder-decoder models, enabling efficient deployment without large-scale annotation or extensive inference infrastructure. Our results show that a compact model such as ParsT5 outperforms zero-shot large language models like PersianLLaMA. The model is also enhanced with an existing attention extension, which enables efficient processing of long inputs without truncation. This work thus introduces the first benchmark for Persian legal text simplification and demonstrates that well-adapted, efficient models can achieve high performance in low-resource, domain-specific scenarios. This first step paves the way for future research and development in natural language processing for Persian legal texts. The dataset and code are publicly available at https://github.com/mrjoneidi/Simplification-Legal-Texts.
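The expert-review step described in the abstract can be illustrated with a small sketch. The abstract does not say how the reviewed subset was chosen or how rejected outputs were handled, so everything here is an assumption made for illustration: the `SimplificationPair` record, the `select_for_review` and `apply_review` helpers, the review fraction, and the fixed random seed are all hypothetical names and choices, not the authors' code.

```python
import random
from dataclasses import dataclass


@dataclass
class SimplificationPair:
    source: str             # original legal ruling text
    simplified: str         # machine-generated simplification (synthetic label)
    reviewed: bool = False  # True once an expert has accepted this pair


def select_for_review(pairs, fraction, seed=0):
    """Return sorted indices of a reproducible random subset for expert review.

    Assumption: uniform random sampling; the source does not state the method.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    k = round(len(pairs) * fraction)
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    return sorted(rng.sample(range(len(pairs)), k))


def apply_review(pairs, reviewed_indices, accepted_indices):
    """Mark accepted pairs as reviewed and drop pairs the expert rejected.

    Pairs that were never sent to review are kept unchanged.
    """
    accepted = set(accepted_indices)
    rejected = set(reviewed_indices) - accepted
    kept = []
    for i, pair in enumerate(pairs):
        if i in rejected:
            continue
        if i in accepted:
            pair.reviewed = True
        kept.append(pair)
    return kept
```

A typical use would be to call `select_for_review` on the full list of generated pairs, hand the selected pairs to a legal expert, and pass the accepted indices to `apply_review` before fine-tuning.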