11th International Conference on Computer and Knowledge Engineering
A Weighted TF-IDF-based Approach for Authorship Attribution
Authors :
Ali Abedzadeh (1), Reza Ramezani (2), Afsaneh Fatemi (3)
1- University of Isfahan
2- University of Isfahan
3- University of Isfahan
Keywords :
Authorship Attribution, Author Identification, Information Retrieval, Term Frequency, TF-IDF
Abstract :
Authorship Attribution (AA) is the task of automatically assigning a disputed text to one author chosen from a list of candidate authors. To this end, a model is trained on a dataset of textual documents with known authors, which makes AA a multi-class, single-label classification task. In this paper, we approach the task differently by extending information retrieval techniques to train an AA model. The proposed solution weights the AARR technique, presented in our previous study, to relax the value of term frequency. Its efficiency has been evaluated through several experiments on six datasets. The results show that the proposed solution improves accuracy on the IMDB, Gutenberg books, Poetry, Blogs, PAN2011, and Twitter datasets by 33%, 31%, 31%, 19%, 6%, and 1%, respectively, for an average improvement of 19.94% over all datasets. The best accuracies on these datasets are 88%, 82%, 67%, 90%, 65%, and 81%, respectively. In addition, compared to the baseline system, the computation time of the proposed solution has been improved significantly (by a factor of 21.44) by employing a dictionary-based indexing technique.
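As a rough illustration of the general idea only, the sketch below treats attribution as retrieval: each candidate author gets a TF-IDF profile stored in a dictionary-based inverted index, and a disputed text is assigned to the author with the highest summed score. This is not the AARR method or its weighted variant, whose details the abstract does not give. The sublinear scaling `1 + log(tf)` is one assumed way to "relax" term frequency, and the function names `tokenize`, `build_index`, and `attribute` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercased whitespace tokens; a real system would use richer features.
    return text.lower().split()


def build_index(docs_by_author):
    """Build a dictionary-based inverted index: term -> {author: weight}."""
    # One term-count profile per candidate author.
    profiles = {
        author: Counter(tok for doc in docs for tok in tokenize(doc))
        for author, docs in docs_by_author.items()
    }
    n_authors = len(profiles)
    # Number of author profiles that contain each term.
    df = Counter(term for counts in profiles.values() for term in counts)

    index = defaultdict(dict)
    for author, counts in profiles.items():
        for term, tf in counts.items():
            relaxed_tf = 1.0 + math.log(tf)  # assumption: sublinear TF scaling
            idf = math.log((1 + n_authors) / (1 + df[term])) + 1.0  # smoothed IDF
            index[term][author] = relaxed_tf * idf
    return index


def attribute(text, index):
    """Return the candidate author with the highest summed term score."""
    scores = defaultdict(float)
    for term, tf in Counter(tokenize(text)).items():
        # Dictionary lookup touches only authors who actually used the term.
        for author, weight in index.get(term, {}).items():
            scores[author] += (1.0 + math.log(tf)) * weight
    return max(scores, key=scores.get) if scores else None
```

With toy data such as `build_index({"a": ["call me ishmael"], "b": ["it is a truth"]})`, calling `attribute("ishmael", index)` scores only author `"a"`, because the index lookup skips every author whose profile lacks the query term. That lookup pattern is the kind of saving a dictionary-based index can offer over scanning every profile.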