11th International Conference on Computer and Knowledge Engineering
FarSick: A Persian Semantic Textual Similarity And Natural Language Inference Dataset
Authors :
Zahra Ghasemi (1), Mohammad Ali Keyvanrad (2)
1- Original author
2- Co-author
Keywords :
Persian dataset, Semantic Textual Similarity, Natural Language Inference, paraphrase expressions, plagiarism detection, deep learning, Natural Language Processing
Abstract :
Semantic textual similarity (STS) and natural language inference (NLI) are important natural language processing (NLP) tasks that underpin applications such as information retrieval, text classification, subject extraction, text summarization, machine translation, and plagiarism detection. The lack of appropriate Persian datasets is a major obstacle to progress in this area. In this paper, we therefore present a new dataset for the STS and NLI tasks in Persian. It contains 9804 pairs of Persian sentences, each pair labeled for both similarity and inference. The dataset was built by translating and editing the sentences of the SICK dataset. We also measured the performance of traditional, statistical, and deep learning models on it, including transformers, convolutional neural networks, bidirectional LSTMs, and a weighted average of word vectors. We used several pre-trained embeddings: word2vec, GloVe, fastText, and a BERT sentence transformer. NLI performance is measured with accuracy and STS performance with the Pearson correlation coefficient.
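As a rough illustration of the evaluation described in the abstract, the sketch below scores an STS baseline with the Pearson correlation coefficient and an NLI classifier with accuracy. It is not the authors' code: the function names, the uniform weighting, the toy two-dimensional vectors, and the example labels are all hypothetical, and the sentence representation is a simple weighted average of word vectors compared by cosine similarity.

```python
import math


def weighted_average(vectors, weights):
    """Combine word vectors into one sentence vector using per-word weights."""
    total = sum(weights)
    dim = len(vectors[0])
    return [
        sum(w * v[i] for v, w in zip(vectors, weights)) / total
        for i in range(dim)
    ]


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation between predicted and gold similarity scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def accuracy(gold, predicted):
    """Fraction of sentence pairs whose predicted NLI label matches gold."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


if __name__ == "__main__":
    # Hypothetical word vectors for one sentence pair (assumption: 2-d toys).
    sent_a = weighted_average([[1.0, 0.0], [0.5, 0.5]], [1.0, 1.0])
    sent_b = weighted_average([[0.9, 0.1], [0.4, 0.6]], [1.0, 1.0])
    print("cosine similarity:", cosine(sent_a, sent_b))

    # Hypothetical gold and predicted scores and labels, for illustration only.
    print("Pearson:", pearson([4.5, 3.0, 1.2], [4.0, 3.2, 1.5]))
    print("accuracy:", accuracy(
        ["ENTAILMENT", "NEUTRAL", "CONTRADICTION"],
        ["ENTAILMENT", "NEUTRAL", "NEUTRAL"],
    ))
```

In practice the word vectors would come from one of the pre-trained embeddings named in the abstract, and the weights from a scheme such as term frequency; both choices are left open here because the abstract does not specify them.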