0% Complete
Home
/
11th International Conference on Computer and Knowledge Engineering
FarSick: A Persian Semantic Textual Similarity And Natural Language Inference Dataset
Authors :
Zahra Ghasemi
1
Mohammad Ali Keyvanrad
2
1- Original author
2- coauthored
Keywords :
Persian dataset, Semantic Textual Similarity, Natural Language Inference, paraphrase expressions, plagiarism detection, deep learning, Natural Language Processing
Abstract :
Semantic textual similarity(STS) and natural language inference(NLI) are important tasks in natural language processing(NLP) such as information retrieval, text classification, subject extraction, text summarization, machine translation and plagiarism detection. Lack of appropriate datasets in Persian is a major obstacle to progress in this area. Therefore, in this paper, we present a new dataset for STS and NLI tasks in the Persian language. It includes 9804 pairs of Persian sentences with labels for similarity and inference for each pair of sentences. This dataset is collected by translating and editing the sentences of SICK dataset. We also measured the performance of traditional, statistical and deep learning models on it, e.g. transformers, Convolution Neural Networks, Bidirectional LSTMs, weighted average of word vectors, etc. We used different pre-trained embeddings, word2vec, glove, fastText and Bert sentence transformer. We used accuracy metric to test NLI tasks and Pearson metric to test STS tasks.
Papers List
List of archived papers
Optimizing Foreign Exchange Trading Performance Through Reinforcement Machine Learning Framework
Ervin Gubin Moung - Hani Yasmin Binti Murnizam - Maisarah Mohd Sufian - Valentino Liaw - Ali Farzamnia - Lorita Angeline
Span-prediction of Unknown Values for Long-sequence Dialogue State Tracking
Marzieh Naghdi Dorabati - Reza Ramezani - Mohammad Ali Nematbakhsh
An effective hybrid algorithm for locating splicing forgery image
Seyed Hesamoddin Hosseini - Amene Vatanparast - Amir Hossein Taherinia
A Deep CNN Model Based Ensemble Approach for Semantic and Instance Segmentation of Indoor Environment
Sajad Rezaei - Jafar Tanha - Zahra Jafari - SeyedEhsan Roshan - Mohammad-Amin Memar Kochebagh
Enhanced Atrial Fibrillation (AF) Detection via Data Augmentation with Diffusion Model
Arash Vashagh - Amirhossein Akhoondkazemi - Sayed Jalal Zahabi - Davood Shafie
Ramp Progressive Secret Image Sharing using Ensemble of Simple Methods
Atieh Mokhtari - Mohammad Taheri
The Internet of Things-Enabled Smart City: An In-Depth Review of Its Domains and Applications
Amir Meydani - Ali Ramezani - Alireza Meidani
Robust Learning to Learn Graph Topologies
Navid Akhavan Attar - Ali Fahim
Automatic Detection and Risk Assessment of Session Management Vulnerabilities in Web Applications
Nasrin Garmabi - Mohammad Ali Hadavi
Standardized ReACT Logits: An Effective Approach for Anomaly Segmentation in Self-driving Cars
Mahdi Farhadi - Seyede Mahya Hazavei - Shahriar Baradaran Shokouhi
more
Samin Hamayesh - Version 42.2.1