Comparison of Sequential Feature Selection Performance with Various Dimensional Data to Produce Optimal Classification

Rahajoe, Ani Dijah and Salim, Agus and Setyaningsih, Emy and Mandyartha, Eka Prakarsa and Akbar, Fawwaz and Muttaqin, Faisal (2022) Comparison of Sequential Feature Selection Performance with Various Dimensional Data to Produce Optimal Classification. In: Prosiding.

Abstract

Feature selection is a preprocessing step that reduces the dimensionality of data. This study compares feature selection methods on low- and high-dimensional data to identify the one that gives the best classification performance. The methods compared are Sequential Forward Selection (SFwS) and Sequential Backward Selection (SBwS). The selected features are evaluated with three classifiers: Logistic Regression, Linear Regression, and k-Nearest Neighbor (kNN). On the low-dimensional data, SFwS with the kNN classifier and with the regression group has the best average score. On the high-dimensional data, SFwS with Logistic Regression has the best average score. SFwS also yields fewer selected features than SBwS. Although the two algorithms would be expected to reach the same accuracy or number of selected features, in this study the results differed except for SFwS-Linear Regression. The highest average accuracy score for low-dimensional data is 0.994 on the Wine dataset (SFwS-kNN), and for high-dimensional data it is 1 on the Parkinson's disease dataset (SFwS-LGR). The smallest selection, a single feature, comes from SFwS-Logistic Regression, with an accuracy score of 0.8083. The Sequential Backward Selection algorithm generally has a longer running time than Sequential Forward Selection.
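The two greedy search strategies named in the abstract can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function names, the leave-one-out 1-nearest-neighbour scorer, and the eight-row toy dataset are all assumptions made for this example. The study's actual datasets, classifier settings, and validation scheme are not stated on this page.

```python
def loo_1nn_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the given feature indices (hypothetical scorer)."""
    if not features:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        best_j, best_d = None, float("inf")
        for j, xj in enumerate(X):
            if i == j:
                continue
            d = sum((xi[f] - xj[f]) ** 2 for f in features)
            if d < best_d:  # ties keep the earlier row
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)


def sequential_forward(X, y, k, score=loo_1nn_accuracy):
    """Start from no features; repeatedly add the feature whose
    addition gives the best score, until k features are selected."""
    selected, remaining = [], list(range(len(X[0])))
    while len(selected) < k:
        best = max(remaining, key=lambda f: score(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return sorted(selected)


def sequential_backward(X, y, k, score=loo_1nn_accuracy):
    """Start from all features; repeatedly drop the feature whose
    removal leaves the best score, until k features remain."""
    selected = list(range(len(X[0])))
    while len(selected) > k:
        drop = max(
            selected,
            key=lambda f: score(X, y, [g for g in selected if g != f]),
        )
        selected.remove(drop)
    return sorted(selected)


# Toy data (made up): feature 0 separates the classes, features 1 and 2 are noise.
X = [
    [0.0, 5.0, 1.0], [0.1, 1.0, 9.0], [0.2, 7.0, 3.0], [0.3, 2.0, 6.0],
    [1.0, 6.0, 2.0], [1.1, 1.5, 8.0], [1.2, 7.5, 4.0], [1.3, 2.5, 7.0],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

forward_choice = sequential_forward(X, y, k=1)
backward_choice = sequential_backward(X, y, k=1)
print(forward_choice, backward_choice)
```

With d features and a target of k, forward selection scores subsets of size 1 up to k, while backward selection scores subsets of size d-1 down to k. When k is small relative to d, the backward search therefore evaluates more and larger subsets, which is consistent with the longer running time the abstract reports for Sequential Backward Selection. Both searches are greedy, so on real data they need not end at the same subset.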

Item Type: Conference or Workshop Item (Other)
Subjects: Z Bibliography. Library Science. Information Resources > ZA Information resources
Divisions: Fakultas Teknologi Informasi & Bisnis > Rekayasa Sistem Komputer (S1)
Depositing User: Emy Setyaningsih
Date Deposited: 21 Mar 2023 06:38
Last Modified: 01 Nov 2023 02:57
URI: http://eprints.akprind.ac.id/id/eprint/1748
