11th International Conference on Computer and Knowledge Engineering
A Weighted TF-IDF-based Approach for Authorship Attribution
Authors :
Ali Abedzadeh (1), Reza Ramezani (2), Afsaneh Fatemi (3)
1- University of Isfahan
2- University of Isfahan
3- University of Isfahan
Keywords :
Authorship Attribution, Author Identification, Information Retrieval, Term Frequency, TF-IDF
Abstract :
Authorship Attribution (AA) is the task of automatically assigning a disputed text to one author chosen from a list of candidate authors. It is usually treated as a multi-class, single-label classification problem in which a model is trained on a dataset of textual documents with known authors. In this paper, we approach the task differently by extending information retrieval techniques to train an AA model. The proposed solution adds weighting to the AARR technique, presented in our previous study, in order to relax the value of term frequency. Its efficiency has been evaluated through several experiments on six datasets. The results show that the proposed solution improves accuracy on the IMDB, Gutenberg books, Poetry, Blogs, PAN2011, and Twitter datasets by 33%, 31%, 31%, 19%, 6%, and 1%, respectively, for an average improvement of 19.94% over all datasets. The best accuracy on these datasets is 88%, 82%, 67%, 90%, 65%, and 81%, respectively. In addition, a dictionary-based indexing technique makes the proposed solution significantly faster than the baseline system (21.44X).
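The general idea of retrieval-style attribution with a dictionary-based index can be illustrated with a minimal Python sketch. It is not the AARR technique or the weighting proposed in the paper, neither of which is specified in this abstract. As assumptions made only for illustration, it uses generic TF-IDF with logarithmic damping of term frequency, a smoothed IDF, whitespace tokenization, and hypothetical function names (`tokenize`, `build_index`, `attribute`).

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, chosen only for brevity.
    return text.lower().split()


def build_index(docs_by_author):
    """Build a dictionary-based index: term -> {author: normalized weight}.

    docs_by_author maps each candidate author to a list of known texts.
    """
    # One term-frequency profile per author (all known texts merged).
    tf = {
        author: Counter(tok for doc in docs for tok in tokenize(doc))
        for author, docs in docs_by_author.items()
    }
    n_authors = len(tf)
    # Number of author profiles that contain each term.
    df = Counter(term for counts in tf.values() for term in counts)

    index = defaultdict(dict)
    norms = defaultdict(float)
    for author, counts in tf.items():
        for term, count in counts.items():
            # Assumption: log-damped TF and smoothed IDF, a generic choice.
            idf = math.log((1 + n_authors) / (1 + df[term])) + 1.0
            weight = (1.0 + math.log(count)) * idf
            index[term][author] = weight
            norms[author] += weight * weight

    # Length-normalize so that prolific authors are not favored.
    for postings in index.values():
        for author in postings:
            postings[author] /= math.sqrt(norms[author])
    return dict(index)


def attribute(index, text):
    """Return the best-scoring author for a disputed text, or None."""
    scores = defaultdict(float)
    for term, count in Counter(tokenize(text)).items():
        # Only authors who used the term are visited (dictionary lookup).
        for author, weight in index.get(term, {}).items():
            scores[author] += (1.0 + math.log(count)) * weight
    return max(scores, key=scores.get) if scores else None


# Toy corpus, invented for this example.
corpus = {
    "author_a": ["the ball at the manor was elegant",
                 "elegant manners at the ball"],
    "author_b": ["the detective examined the clue",
                 "a clue led the detective to the crime"],
}
index = build_index(corpus)
predicted = attribute(index, "the detective found a clue")
```

Because scoring looks up only the terms that occur in the disputed text, the cost depends on the query length and the length of each term's posting list rather than on the size of the full vocabulary.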