0% Complete
Home
/
11th International Conference on Computer and Knowledge Engineering
FarSick: A Persian Semantic Textual Similarity And Natural Language Inference Dataset
Authors :
Zahra Ghasemi
1
Mohammad Ali Keyvanrad
2
1- Original author
2- coauthored
Keywords :
Persian dataset, Semantic Textual Similarity, Natural Language Inference, paraphrase expressions, plagiarism detection, deep learning, Natural Language Processing
Abstract :
Semantic textual similarity(STS) and natural language inference(NLI) are important tasks in natural language processing(NLP) such as information retrieval, text classification, subject extraction, text summarization, machine translation and plagiarism detection. Lack of appropriate datasets in Persian is a major obstacle to progress in this area. Therefore, in this paper, we present a new dataset for STS and NLI tasks in the Persian language. It includes 9804 pairs of Persian sentences with labels for similarity and inference for each pair of sentences. This dataset is collected by translating and editing the sentences of SICK dataset. We also measured the performance of traditional, statistical and deep learning models on it, e.g. transformers, Convolution Neural Networks, Bidirectional LSTMs, weighted average of word vectors, etc. We used different pre-trained embeddings, word2vec, glove, fastText and Bert sentence transformer. We used accuracy metric to test NLI tasks and Pearson metric to test STS tasks.
Papers List
List of archived papers
Data Clustering using Chimp Optimization Algorithm
SAYED PEDRAM HAERI BOROUJENI - ELNAZ PASHAEI
Adaptive Channel Estimation for MIMO-OFDM Systems in Impulsive Noise Environments
Mojtaba Hajiabadi
Camouflage Object Segmentation with Attention-Guided Pix2Pix and Boundary Awareness
Erfan Akbarnezhad Sany - Fatemeh Naserizadeh - Parsa Sinichi - Seyyed Abed Hosseini
Trust Management Enhancement for the Internet of Things: a Smart Contract Approach
Amin Rouzbahani - Fattaneh Taghiyareh
A Graph-based Feature Selection using Class-Feature Association Map (CFAM)
Motahare Akhavan - Seyed Mohammad Hossein Hasheminejad
Multi-Task Transformer for Stock Market Trend Prediction
Seyed Morteza Mirjebreili - Ata Solouki - Hamidreza Soltanalizadeh - Mohammad Sabokrou
Robustness Scan of Digital Circuits Using Convolutional Neural Networks
Mobin Vaziri - Mohammad Mehdi Rahimifar - Hadi Jahanirad
A Hybrid Echo State Network for Hypercomplex Pattern Recognition, Classification, and Big Data Analysis
Mohammad Jamshidi - Fatemeh Daneshfar
GroupRec: Group Recommendation by Numerical Characteristics of Groups in Telegram
Davod Karimpour - Mohammad Ali Zare Chahooki - Ali Hashemi
Optimizing the controller placement problem in SDN with uncertain parameters with robust optimization
Mohammad Kazemi - AhmadReza Montazerolghaem
more
Samin Hamayesh - Version 42.2.1