Variable Selection using Python — Vote based approach

May 8, 2018

Variable selection is one of the key steps in the predictive modeling process. It is an art. To put it in simple terms, variable selection is like picking a soccer team to win the World Cup. You need the best player in each position, and you don’t want two or more players who play the same position.

In Python, we have different techniques to select variables. Some of them include recursive feature elimination, tree-based selection, and L1-based feature selection. The scikit-learn documentation on feature selection walks through these techniques.
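Here is a rough sketch of those three techniques in scikit-learn. It is not the code behind this article: the synthetic dataset, the feature names `x0` to `x9`, the helper `selected`, and every hyperparameter are assumptions picked only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import RFE, SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic data (assumption): 10 features, only some of them informative.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=4, n_redundant=2, random_state=0
)
X = StandardScaler().fit_transform(X)
names = [f"x{i}" for i in range(X.shape[1])]


def selected(mask):
    """Map a boolean support mask back to feature names (hypothetical helper)."""
    return [name for name, keep in zip(names, mask) if keep]


# Recursive feature elimination: refit and drop the weakest feature until 5 remain.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=5).fit(X, y)
rfe_vars = selected(rfe.get_support())

# Tree-based selection: keep features whose importance is at least the mean importance.
tree = SelectFromModel(
    ExtraTreesClassifier(n_estimators=100, random_state=0), threshold="mean"
).fit(X, y)
tree_vars = selected(tree.get_support())

# L1-based selection: the L1 penalty pushes some coefficients to zero; keep the rest.
l1 = SelectFromModel(
    LinearSVC(C=0.05, penalty="l1", dual=False, max_iter=5000)
).fit(X, y)
l1_vars = selected(l1.get_support())

print("RFE:", rfe_vars)
print("Tree-based:", tree_vars)
print("L1-based:", l1_vars)
```

The three lists will usually overlap without being identical, which is what makes combining them worthwhile.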

Vote based approach for variable selection

The idea here is to apply a variety of techniques to select variables. When an algorithm picks a variable, we give that variable a vote. At the end, we calculate the total votes for each variable and pick the best ones based on the vote count. This way, we end up picking the best variables with minimum effort in the variable selection process.
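The voting step itself can be sketched in a few lines of plain Python. The method names, the variable names, and the `min_votes` cutoff below are made up for illustration. They are not taken from the original code.

```python
from collections import Counter

# Hypothetical picks: each selection method maps to the variables it chose.
picks = {
    "random_forest": ["age", "income", "tenure"],
    "rfe": ["age", "income", "balance"],
    "l1": ["age", "tenure", "region"],
}


def tally_votes(picks):
    """Count how many methods selected each variable."""
    votes = Counter()
    for variables in picks.values():
        votes.update(set(variables))  # a method votes at most once per variable
    return votes


def top_variables(votes, min_votes):
    """Return variables with at least `min_votes` votes, most-voted first."""
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, count in ranked if count >= min_votes]


votes = tally_votes(picks)
winners = top_variables(votes, min_votes=2)
print(winners)  # ['age', 'income', 'tenure']
```

How many votes a variable needs is a judgment call. A stricter cutoff gives a smaller, more conservative variable set.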

Github Code

The following steps run during variable selection:

  1. Information Value using Weight of Evidence
  2. Variable Importance using Random Forest
  3. Recursive Feature Elimination
  4. Variable Importance using Extra Trees classifier
  5. Chi-square best variables
  6. L1-based feature selection
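Information Value is the one step in this list that scikit-learn does not provide directly, so here is a minimal sketch of how it could be computed with NumPy. Treat the details as assumptions: the function name `information_value`, quantile binning into 5 bins, the `eps` smoothing that avoids taking the log of zero, and the sign convention WOE = ln(% non-events / % events) are illustrative choices, not necessarily what the linked code does.

```python
import numpy as np


def information_value(x, y, bins=5, eps=0.5):
    """Information Value of a numeric feature x for a binary target y (1 = event)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=int)

    # Quantile bin edges; np.unique drops duplicate edges caused by tied values.
    edges = np.unique(np.quantile(x, np.linspace(0, 1, bins + 1)))
    if len(edges) < 2:
        return 0.0  # a constant feature carries no information
    n_bins = len(edges) - 1

    # Interior edges assign every value a bin index in 0 .. n_bins - 1.
    idx = np.digitize(x, edges[1:-1], right=True)

    totals = np.bincount(idx, minlength=n_bins)
    events = np.bincount(idx, weights=y, minlength=n_bins)
    non_events = totals - events

    # Smoothed share of all events / non-events that falls in each bin.
    p_event = (events + eps) / (events.sum() + eps * n_bins)
    p_non_event = (non_events + eps) / (non_events.sum() + eps * n_bins)

    woe = np.log(p_non_event / p_event)  # Weight of Evidence per bin
    return float(np.sum((p_non_event - p_event) * woe))


# Synthetic check (assumption): one feature related to the target, one pure noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=2000)
signal = y + rng.normal(size=2000)
noise = rng.normal(size=2000)

iv_signal = information_value(signal, y)
iv_noise = information_value(noise, y)
print(iv_signal, iv_noise)
```

A variable would then earn a vote when its Information Value clears whatever cutoff you choose.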

Once these steps are complete, we pick the best variables selected by each algorithm (one vote each). We count the total number of votes and then perform a multicollinearity check on the selected variables. We can extend the process further to include other variable selection techniques before we count the votes.
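The article does not say which multicollinearity check is used, so the sketch below assumes one common choice: the variance inflation factor (VIF), computed here as the diagonal of the inverse correlation matrix. The helper names, the cutoff of 10, and the synthetic columns are illustrative assumptions. The approach also assumes that no two columns are exact copies of each other, since that would make the correlation matrix singular.

```python
import numpy as np


def vif(X):
    """Variance inflation factor for each column of X (rows are observations)."""
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    # VIF_j = 1 / (1 - R_j^2), which equals the j-th diagonal entry of inv(corr).
    return np.diag(np.linalg.inv(corr))


def drop_collinear(X, names, threshold=10.0):
    """Drop the highest-VIF variable repeatedly until every VIF is <= threshold."""
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        scores = vif(X)
        worst = int(np.argmax(scores))
        if scores[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return names


# Synthetic example (assumption): column "c" is nearly a copy of column "a".
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + 0.05 * rng.normal(size=500)

kept = drop_collinear(np.column_stack([a, b, c]), ["a", "b", "c"])
print(kept)
```

Only one of the two near-duplicate columns survives, in the same way that you would not field two players in the same position.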

Have fun!

I released a Python package that does variable selection using the vote-based approach. If you are interested in using the package version, read the article below.

Sundar Krishnan

Written by Sundar Krishnan

I am passionate about Artificial Intelligence and Data Science. I focus on 360 degree customer analytics models and machine learning workflow automation.
