Manuscript Title:

EXPLORING AUTHOR PROFILING FOR PLAGIARISM DETECTION: LEVERAGING PERSONALITY TRAITS AND ENSEMBLE METHODS

Author:

VRUSHALI BHUYAR, SACHIN DESHMUKH

DOI Number:

DOI:10.5281/zenodo.12198393

Published : 2024-06-23

About the author(s)

1. VRUSHALI BHUYAR - Department of Computer Applications, Maharashtra Institute of Technology, Dr. Babasaheb Ambedkar Marathwada University, Chh. Sambhajinagar, India.
2. SACHIN DESHMUKH - Department of Computer Science and IT, Dr. Babasaheb Ambedkar Marathwada University, Chh. Sambhajinagar, India.


Abstract

Plagiarism poses a significant challenge in academic circles, as individuals frequently pass off internet content as their own without proper attribution. Traditional detection methods rely on established databases and falter when the source material is absent from them. Author profiling emerges as a crucial tool, analyzing language patterns to discern traits such as gender, age, native language, and personality. This paper focuses on leveraging personality traits for both plagiarism detection and author profiling, using machine learning, particularly ensemble methods. A dataset of 67 technical research papers, annotated with OCEAN personality traits (openness, conscientiousness, extraversion, agreeableness, neuroticism) and plagiarism percentages, underwent preprocessing that included outlier detection and normalization. Extended Gradient Boosting Regressor, Bagging Regressor, Gradient Regressor, and AdaBoost Regressor were applied as base models, with Random Forest Regressor serving as the meta model. The ensembles yielded RMSE values of 0.29 for stacking, 0.93 for averaging, and 2.39 for max voting. Comparison with non-ensemble methods underscores the effectiveness of ensemble learning, notably with Random Forest Regressor achieving an RMSE of 0.29 post-training. The novelty lies in integrating plagiarism detection with personality-based author profiling, providing a comprehensive approach for tackling academic misconduct. Combining machine learning with personality insights opens new avenues for improving detection accuracy, and ensemble methods enhance the robustness of the approach. These findings promise to enhance academic integrity and scholarly conduct, offering valuable insights for refining detection tools and informing decision-making in diverse domains.
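As a rough illustration of the evaluation described in the abstract, the following Python sketch shows how RMSE is computed and how an averaging ensemble combines the predictions of several base regressors. It is a minimal sketch, not the authors' implementation: the function names `rmse` and `average_ensemble`, the model labels, and all numbers are hypothetical and chosen only for illustration. Stacking and max voting are not shown.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def average_ensemble(predictions):
    """Average the per-sample predictions of several base models."""
    return [sum(column) / len(column) for column in zip(*predictions)]


# Illustrative values only (NOT from the paper): plagiarism percentages
# for four documents and the outputs of three hypothetical base models.
y_true = [10.0, 20.0, 30.0, 40.0]
base_predictions = {
    "model_a": [12.0, 18.0, 33.0, 41.0],
    "model_b": [8.0, 23.0, 28.0, 37.0],
    "model_c": [11.0, 19.0, 29.0, 42.0],
}

averaged = average_ensemble(list(base_predictions.values()))

for name, preds in base_predictions.items():
    print(f"{name}: RMSE = {rmse(y_true, preds):.3f}")
print(f"averaging ensemble: RMSE = {rmse(y_true, averaged):.3f}")
```

In this toy data the errors of the individual models partly cancel, so the averaged prediction has a lower RMSE than any single model. That is the intuition behind averaging; it is not guaranteed on real data.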


Keywords

Plagiarism Detection, Author Profiling, Personality Traits, Machine Learning, Ensemble Methods.