Hybrid efficient genetic algorithm for big data feature selection problems

[ X ]

Tarih

2020

Dergi Başlığı

Dergi ISSN

Cilt Başlığı

Yayıncı

Springer

Erişim Hakkı

info:eu-repo/semantics/closedAccess

Özet

Due to the huge amount of data being generating from different sources, the analyzing and extracting of useful information from these data becomes a very complex task. The difficulty of dealing with big data optimization problems comes from many factors such as the high number of features, and the existing of lost data. The feature selection process becomes an important step in many data mining and machine learning algorithms to reduce the dimensionality of the optimization problems and increase the performance of the classification or clustering algorithms. In this paper, a set of hybrid and efficient genetic algorithms are proposed to solve feature selection problem, when the handled data has a large feature size. The proposed algorithms use a new gene-weighted mechanism that can adaptively classify the features into strong relative features, weak or redundant features, and unstable features during the evolution of the algorithm. Based on this classification, the proposed algorithm gives the strong features high priority and the weak features less priority when generating new candidate solutions. In the same time, the proposed algorithm tries to more concentrate on unstable features that sometimes appear and sometimes disappear from the best solutions of the population. The performance of proposed algorithms is investigated by using different datasets and feature selection algorithms. The results show that our proposed algorithms can outperform the other feature selection algorithms and effectively enhance the classification performance over the tested datasets.

Açıklama

Anahtar Kelimeler

Feature Selection, Evolutionary Algorithms, Big Data Analyzing, Artificial Neural Networks

Kaynak

Foundations of Science

WoS Q Değeri

Q2

Scopus Q Değeri

Q1

Cilt

25

Sayı

4

Künye