Manuscript Title:

ADVANCING DATA ANALYSIS THROUGH FUZZY PRINCIPAL COMPONENT ANALYSIS: A COMPREHENSIVE EXPLORATION IN CANCER DATA INSIGHTS

Author:

HASSANIA HAMZAOUI, BOUCHRA DAOUDI, MOUNIR GOUIOUEZ

DOI Number:

DOI:10.5281/zenodo.10301651

Published : 2023-12-10

About the author(s)

1. HASSANIA HAMZAOUI - LPAIS, Faculty of Sciences, Sidi Mohamed Ben Abdellah University, Fez, Morocco.
2. BOUCHRA DAOUDI - LPAIS, Faculty of Sciences, Sidi Mohamed Ben Abdellah University, Fez, Morocco.
3. MOUNIR GOUIOUEZ - LPAIS, Faculty of Sciences, Sidi Mohamed Ben Abdellah University, Fez, Morocco.


Abstract

This article proposes an algorithm that integrates fuzzy set theory with the established Principal Component Analysis (PCA) methodology for quantitative data analysis. The integration is designed to improve PCA's ability to capture and interpret complex patterns in real-world quantitative datasets that exhibit varying degrees of fuzziness and imprecision. Fuzzy set theory, introduced by Zadeh in 1965, provides a formalism for representing and manipulating uncertainty and imprecision in data. By incorporating this theory into the PCA algorithm, we seek to overcome the practical limitations posed by the deterministic nature of traditional PCA. The proposed algorithm's efficacy is evaluated on a real-world cancer dataset, with results systematically compared against those obtained with standard PCA. The assessment is extended to two additional datasets, on cardiovascular disease and diabetes, to assess the generalizability and robustness of the findings. The empirical results indicate that the Fuzzy Principal Component Analysis algorithm outperforms the PCA algorithm in terms of efficiency and performance, and that this advantage persists on datasets of higher dimensionality. This research thus extends the capacity of data analysis to handle imprecise and uncertain data and supports these improvements with empirical validation on diverse datasets.


Keywords

Fuzzy Set Theory, Fuzzy C-Means Clustering Algorithm, Membership Function, Fuzzy Principal Component Analysis.
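
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common way to combine membership degrees with PCA, not the authors' method. It assumes a single fuzzy cluster, a hypothetical distance-based membership function (`fuzzy_memberships`), and a fuzzifier `m`; the membership degrees then weight the mean and covariance before the usual eigendecomposition. All function names and parameters are illustrative assumptions.

```python
import numpy as np


def weighted_center(X, u, m):
    """Membership-weighted mean of the rows of X (weights are u**m)."""
    w = u ** m
    return (w[:, None] * X).sum(axis=0) / w.sum()


def fuzzy_memberships(X, m=2.0, n_iter=20):
    """Hypothetical membership function (an assumption, not from the paper).

    Points far from the weighted center receive memberships close to 0,
    points near it receive memberships close to 1.
    """
    u = np.ones(X.shape[0])
    for _ in range(n_iter):
        center = weighted_center(X, u, m)
        d2 = ((X - center) ** 2).sum(axis=1)   # squared distances to the center
        scale = np.median(d2) + 1e-12          # robust scale; avoids division by zero
        u = 1.0 / (1.0 + (d2 / scale) ** (1.0 / (m - 1.0)))
    return u


def fuzzy_pca(X, n_components=2, m=2.0):
    """PCA on a membership-weighted covariance matrix (illustrative sketch)."""
    u = fuzzy_memberships(X, m=m)
    w = u ** m
    Xc = X - weighted_center(X, u, m)
    cov = (w[:, None] * Xc).T @ Xc / w.sum()   # fuzzy (weighted) covariance
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]
    return Xc @ components, components, eigvals[order], u
```

With these assumptions, a call such as `scores, components, variances, u = fuzzy_pca(X, n_components=2)` returns the projected data along with the membership degrees, so that low-membership observations (for example, outliers) can be inspected separately. Setting every membership to 1 reduces the weighted covariance to the ordinary one (up to the normalizing constant), which recovers the principal directions of standard PCA.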