A composite Feature Selection Method to improve Classifying Imbalanced Big Data

Ghayda A.A. Al-Talib; Shaymaa Ahmed Razoqi

Section: Research Paper

Issue

Vol. 18 No. 2 (2024): Volume 18 Issue 2

Published

Jun 25, 2025

Pages

70-81

Abstract

Feature selection is one of the methods used to improve the performance of machine learning algorithms, especially when classifying big data. The need for new methods grows when dealing with imbalanced big data. Imbalance arises when the sample distribution differs markedly between the two classes in the training set. Several methods address the imbalance problem: some redistribute the data, and others improve the classification algorithm itself. Feature selection can also improve the classification of imbalanced data when the features are chosen carefully. Therefore, this research proposes a composite feature selection method that combines a filter feature selection technique with permutation-based feature importance computed using an ensemble learning method. Three classifiers were evaluated with three performance metrics (AUC, G-mean, and F-score) to show the effect of the proposed feature selection method on imbalanced big data. The proposed method improved classification on five standard imbalanced data sets.
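
The two-stage idea described in the abstract can be sketched with scikit-learn as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not name the filter criterion, the ensemble learner, or the data sets, so mutual information, a random forest, and a synthetic 90/10 imbalanced data set are assumptions made here, and the helper names `composite_select` and `evaluate` are hypothetical.

```python
from functools import partial

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.inspection import permutation_importance
from sklearn.metrics import f1_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split


def composite_select(X_tr, y_tr, X_val, y_val, k_filter=10, k_final=5, seed=0):
    """Return sorted original-column indices chosen by the two stages."""
    # Stage 1 (filter, assumed criterion): keep the k_filter features
    # with the highest mutual information with the class label.
    score_func = partial(mutual_info_classif, random_state=seed)
    filt = SelectKBest(score_func, k=k_filter).fit(X_tr, y_tr)
    kept = filt.get_support(indices=True)

    # Stage 2 (assumed ensemble): fit a random forest on the filtered
    # features and rank them by permutation importance on held-out data.
    model = RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=seed
    ).fit(X_tr[:, kept], y_tr)
    result = permutation_importance(
        model, X_val[:, kept], y_val,
        scoring="roc_auc", n_repeats=5, random_state=seed,
    )
    top = np.argsort(result.importances_mean)[::-1][:k_final]
    return np.sort(kept[top])


def evaluate(model, X, y):
    """Compute the three metrics named in the abstract."""
    pred = model.predict(X)
    prob = model.predict_proba(X)[:, 1]
    sensitivity = recall_score(y, pred, pos_label=1)
    specificity = recall_score(y, pred, pos_label=0)
    return {
        "auc": float(roc_auc_score(y, prob)),
        "gmean": float(np.sqrt(sensitivity * specificity)),
        "f1": float(f1_score(y, pred)),
    }


# Synthetic stand-in for an imbalanced data set (about 10% minority class).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=5, n_redundant=2,
    weights=[0.9, 0.1], random_state=0,
)
X_tr, X_rest, y_tr, y_rest = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0
)
X_val, X_te, y_val, y_te = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=0
)

selected = composite_select(X_tr, y_tr, X_val, y_val)

# Train one classifier on the selected features and score it on the test split.
clf = RandomForestClassifier(
    n_estimators=100, class_weight="balanced", random_state=0
).fit(X_tr[:, selected], y_tr)
scores = evaluate(clf, X_te[:, selected], y_te)
print("selected features:", selected)
print(scores)
```

Permutation importance is measured on a validation split rather than the training split so that features the ensemble merely memorized are not favored. The G-mean is computed here as the square root of sensitivity times specificity, the usual definition for two-class imbalanced problems.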

Authors

Ghayda A.A. Al-Talib

Department of Computer Science, College of Computer Science and Mathematics, University of Mosul, Mosul, IRAQ

Shaymaa Ahmed Razoqi

How to Cite

Al-Talib, G. A. A., & Razoqi, S. A. (2025). A composite Feature Selection Method to improve Classifying Imbalanced Big Data. AL-Rafidain Journal of Computer Sciences and Mathematics, 18(2), 70–81. Retrieved from https://rjps.uomosul.edu.iq/index.php/csmj/article/view/19700