11th International Conference on Computer and Knowledge Engineering
FarSick: A Persian Semantic Textual Similarity And Natural Language Inference Dataset
Authors :
Zahra Ghasemi (1), Mohammad Ali Keyvanrad (2)
1- Original author
2- Co-author
Keywords :
Persian dataset, Semantic Textual Similarity, Natural Language Inference, paraphrase expressions, plagiarism detection, deep learning, Natural Language Processing
Abstract :
Semantic textual similarity (STS) and natural language inference (NLI) are important tasks in natural language processing (NLP), with applications such as information retrieval, text classification, subject extraction, text summarization, machine translation and plagiarism detection. The lack of appropriate datasets in Persian is a major obstacle to progress in this area. In this paper, we therefore present a new dataset for the STS and NLI tasks in the Persian language. It contains 9804 pairs of Persian sentences, each pair labeled for both similarity and inference. The dataset was built by translating and editing the sentences of the SICK dataset. We also measured the performance of traditional, statistical and deep learning models on it, including transformers, Convolutional Neural Networks, Bidirectional LSTMs and a weighted average of word vectors. We used several pre-trained embeddings: word2vec, GloVe, fastText and a BERT sentence transformer. NLI performance is reported as accuracy, and STS performance as the Pearson correlation coefficient.
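The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. The label strings and scores below are toy values invented for illustration, not data from FarSick, and the function names `accuracy` and `pearson` are hypothetical helpers rather than the authors' evaluation code.

```python
import math


def accuracy(gold, pred):
    """Fraction of sentence pairs whose predicted NLI label matches the gold label."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def pearson(x, y):
    """Pearson correlation between gold and predicted similarity scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and standard deviations share the same 1/n factor, so it cancels.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / (dev_x * dev_y)


# Toy NLI labels (assumed label names, for illustration only).
gold_labels = ["ENTAILMENT", "NEUTRAL", "CONTRADICTION", "NEUTRAL"]
pred_labels = ["ENTAILMENT", "NEUTRAL", "NEUTRAL", "NEUTRAL"]
nli_accuracy = accuracy(gold_labels, pred_labels)

# Toy STS scores (assumed scale, for illustration only).
gold_scores = [4.5, 1.2, 3.0, 2.1]
pred_scores = [4.1, 1.8, 3.3, 2.0]
sts_pearson = pearson(gold_scores, pred_scores)
```

Accuracy suits NLI because each pair receives one discrete label, whereas Pearson correlation suits STS because it rewards predictions that rise and fall with the gold scores regardless of their absolute scale.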