Investigation of Extending Feature Selection Algorithms to Explicit Feature Selection in Kernel Space
Feature selection methods play an important role in machine learning. As a preprocessing step, feature selection extracts useful information from raw data, and a good feature selection method can significantly improve the performance of a prediction model. However, most feature selection methods only work well with linear data. Although nonlinear data can be made linearly separable by projecting it into a high-dimensional space, the computational cost of operating in that space is high. The Kernel Trick greatly reduces the cost of computing inner products of high-dimensional data and is therefore widely used to solve nonlinear problems. However, the features of the high-dimensional space induced by the Kernel Trick, the so-called Kernel Space, are usually implicit, so most feature selection methods cannot exploit the Kernel Trick to process nonlinear data. Cao et al. proposed a method that explicitly selects features in Kernel Space with limited additional cost by extending Relief, a well-known margin-based feature selection method. Inspired by their results, we propose a method that transforms the original space into a so-called Explicit Kernel Space (EKS). Our method extends and generalizes the idea of Cao et al.: with EKS, most traditional feature selection algorithms can use the Kernel Trick to handle real-world data. Experiments with several types of data and algorithms verify the usefulness of EKS; some of its properties are also presented and discussed.
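To make the abstract's central point concrete, the sketch below (an illustrative example not taken from the paper; function names are our own) shows the identity the Kernel Trick exploits: a degree-2 polynomial kernel evaluated in the original low-dimensional space equals an inner product of explicit feature vectors in a higher-dimensional space.

```python
import numpy as np

def poly2_kernel(x, y):
    # Implicit computation via the Kernel Trick:
    # k(x, y) = (x . y)^2, evaluated in the original (low-dimensional) space.
    return float(np.dot(x, y)) ** 2

def explicit_map(x):
    # Explicit feature map phi for the same kernel (2-D input):
    # phi(x) = [x1^2, sqrt(2)*x1*x2, x2^2], so <phi(x), phi(y)> = (x . y)^2.
    # This is the kind of explicit representation that EKS-style methods
    # need in order to select individual features in Kernel Space.
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

x = np.array([1.0, 2.0])
y = np.array([3.0, 0.5])

k_implicit = poly2_kernel(x, y)                               # cheap, O(d)
k_explicit = float(np.dot(explicit_map(x), explicit_map(y)))  # explicit high-dim space
print(np.isclose(k_implicit, k_explicit))  # the two computations agree
```

The implicit form is cheap but hides the individual kernel-space features; an explicit map exposes them to traditional feature selection algorithms, at the cost of materializing the high-dimensional vectors.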