15th International Conference on Computer and Knowledge Engineering
PersianILP: Construction and Evaluation of a Standard Persian Dataset for Inductive Link Prediction
Authors:
Mohammad Rahimi (1), Afsaneh Fatemi (2), Ahmad Baraani (3)
1- Dept. of Software Engineering
2- Dept. of Software Engineering
3- Dept. of Software Engineering
Keywords:
Inductive Link Prediction, Knowledge Graph Completion, PersianILP Dataset
Abstract:
Link prediction in knowledge graphs is a key task aimed at addressing the challenge of graph sparsity. In inductive link prediction, a model is trained on one graph and evaluated on another containing unseen entities. While twelve inductive datasets have been introduced for English to benchmark models in this domain, no such dataset exists for Persian. This study introduces PersianILP, the first Persian dataset designed for inductive link prediction. PersianILP is constructed through a purposeful combination of real-world data extracted from the FarsiBase knowledge graph and synthetic data generated using the DeepSeek language model. To evaluate PersianILP, key criteria such as structural and semantic diversity, statistical alignment between synthetic and real data, and adherence to inductive evaluation principles were considered. The dataset is compared with twelve benchmark datasets, including WN18RR, FB237, and NELL995. PersianILP contains 16,306 semantic triples, 10,693 entities, and 432 unique relations, exhibiting a highly sparse structure with a sparsity rate of 0.99. Evaluation using a baseline inductive link prediction model confirms the dataset’s high quality and effectiveness. Statistical analyses further demonstrate that PersianILP meets all essential requirements for research in inductive link prediction and can serve as a standard resource for studies in Persian language processing, semantic web, and recommender systems.
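The inductive setting described above, training on one graph and evaluating on another whose entities are unseen, can be illustrated with a small check on a train/test split. This is a minimal sketch and not the authors' evaluation code: the function names, the toy triples, and the additional requirement that test relations already appear in training are assumptions made for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def entities(triples: Iterable[Triple]) -> Set[str]:
    """Collect every entity that appears as a head or a tail."""
    found: Set[str] = set()
    for head, _, tail in triples:
        found.add(head)
        found.add(tail)
    return found


def relations(triples: Iterable[Triple]) -> Set[str]:
    """Collect every relation label used in the triples."""
    return {rel for _, rel, _ in triples}


def check_inductive_split(train: Iterable[Triple],
                          test: Iterable[Triple]) -> Dict[str, object]:
    """Report whether a split is inductive under the assumptions above.

    Assumed criteria (hypothetical, not taken from the paper):
    no entity is shared between the two graphs, and every test
    relation was already seen in the training graph.
    """
    train, test = list(train), list(test)
    shared_entities = entities(train) & entities(test)
    unseen_relations = relations(test) - relations(train)
    return {
        "shared_entities": shared_entities,
        "unseen_relations": unseen_relations,
        "is_inductive": not shared_entities and not unseen_relations,
    }


# Toy triples, invented for illustration only.
toy_train = [("Tehran", "capital_of", "Iran"),
             ("Isfahan", "located_in", "Iran")]
toy_test = [("Paris", "capital_of", "France"),
            ("Lyon", "located_in", "France")]

report = check_inductive_split(toy_train, toy_test)
print(report["is_inductive"])
```

A split that reuses a training entity in the test graph would be flagged through `shared_entities`, which makes leakage between the two graphs easy to spot before benchmarking a model.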