{ "cells": [ { "cell_type": "markdown", "id": "1c2d7e40", "metadata": {}, "source": [
"(chapter6_part2)=\n",
"\n",
"# Search Methods\n",
"\n",
"- This is supplementary material for the [Machine Learning Simplified](https://themlsbook.com) book. It focuses on Python implementations of the topics discussed; the detailed explanations can be found in the book.\n",
"- I assume you are familiar with Python syntax. If you are not, I highly recommend getting acquainted with the language before working through the code.\n",
"- This material can be downloaded as a Jupyter notebook (Download button in the upper-right corner -> `.ipynb`) so that you can reproduce the code and experiment with it.\n",
"\n",
"\n",
"## 1. Required Libraries, Data & Variables\n",
"\n",
"Let's import the data and have a look at it:" ] },
{ "cell_type": "code", "execution_count": 1, "id": "d2e19676", "metadata": {}, "outputs": [], "source": [
"import pandas as pd\n",
"\n",
"data = pd.read_csv('https://github.com/5x12/themlsbook/raw/master/supplements/data/car_price.csv', delimiter=',', header=0)" ] },
{ "cell_type": "code", "execution_count": 2, "id": "7d790fe7", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
5 rows × 26 columns" ], "text/plain": [
"   car_ID  symboling                   CarName  fueltype  aspiration  doornumber  \\\n",
"0       1          3        alfa-romero giulia       gas         std         two\n",
"1       2          3       alfa-romero stelvio       gas         std         two\n",
"2       3          1  alfa-romero Quadrifoglio       gas         std         two\n",
"3       4          2               audi 100 ls       gas         std        four\n",
"4       5          2                audi 100ls       gas         std        four\n",
"\n",
"       carbody  drivewheel  enginelocation  wheelbase  ...  enginesize  \\\n",
"0  convertible         rwd           front       88.6  ...         130\n",
"1  convertible         rwd           front       88.6  ...         130\n",
"2    hatchback         rwd           front       94.5  ...         152\n",
"3        sedan         fwd           front       99.8  ...         109\n",
"4        sedan         4wd           front       99.4  ...         136\n",
"\n",
"   fuelsystem  boreratio  stroke  compressionratio  horsepower  peakrpm  citympg  \\\n",
"0        mpfi       3.47    2.68               9.0         111     5000       21\n",
"1        mpfi       3.47    2.68               9.0         111     5000       21\n",
"2        mpfi       2.68    3.47               9.0         154     5000       19\n",
"3        mpfi       3.19    3.40              10.0         102     5500       24\n",
"4        mpfi       3.19    3.40               8.0         115     5500       18\n",
"\n",
"   highwaympg    price\n",
"0          27  13495.0\n",
"1          27  16500.0\n",
"2          26  16500.0\n",
"3          30  13950.0\n",
"4          22  17450.0\n",
"\n",
"[5 rows x 26 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data.head()" ] },
{ "cell_type": "code", "execution_count": 3, "id": "21c7f273", "metadata": {}, "outputs": [ { "data": { "text/plain": [
"Index(['car_ID', 'symboling', 'CarName', 'fueltype', 'aspiration',\n",
"       'doornumber', 'carbody', 'drivewheel', 'enginelocation', 'wheelbase',\n",
"       'carlength', 'carwidth', 'carheight', 'curbweight', 'enginetype',\n",
"       'cylindernumber', 'enginesize', 'fuelsystem', 'boreratio', 'stroke',\n",
"       'compressionratio', 'horsepower', 'peakrpm', 'citympg', 'highwaympg',\n",
"       'price'],\n",
"      dtype='object')" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data.columns" ] },
{ "cell_type": "markdown", "id": "eeed5a35", "metadata": {}, "source": [ "Let's define the features $X$ and the target variable $y$:" ] },
{ "cell_type": "code", "execution_count": 4, "id": "11ce8e44", "metadata": {}, "outputs": [], "source": [
"# Cast the target to integers\n",
"data['price'] = data['price'].astype('int')\n",
"\n",
"# Numeric columns used as candidate features\n",
"X = data[['wheelbase',\n",
"          'carlength',\n",
"          'carwidth',\n",
"          'carheight',\n",
"          'curbweight',\n",
"          'enginesize',\n",
"          'boreratio',\n",
"          'stroke',\n",
"          'compressionratio',\n",
"          'horsepower',\n",
"          'peakrpm',\n",
"          'citympg',\n",
"          'highwaympg']]\n",
"\n",
"y = data['price']\n" ] },
{ "cell_type": "markdown", "id": "af901f02", "metadata": {}, "source": [ "Let's split the data into training (70%) and test (30%) sets:" ] },
{ "cell_type": "code", "execution_count": 5, "id": "f7b6974d", "metadata": {}, "outputs": [], "source": [
"from sklearn.model_selection import train_test_split\n",
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)" ] },
{ "cell_type": "markdown", "id": "525a0946", "metadata": {}, "source": [
"## 2. Wrapper Methods\n",
"\n",
"The following search methods are examined:\n",
"\n",
"   1. **Step Forward** Feature Selection method\n",
"   2. **Step Backward** Feature Selection method\n",
"   3. **Recursive Feature Elimination** method\n",
"\n",
"### 2.1. 
Step Forward Feature Selection" ] },
{ "cell_type": "code", "execution_count": 6, "id": "cd926675", "metadata": {}, "outputs": [], "source": [
"# Importing required libraries\n",
"from mlxtend.feature_selection import SequentialFeatureSelector as sfs\n",
"from sklearn.ensemble import RandomForestClassifier" ] },
{ "cell_type": "code", "execution_count": 7, "id": "32456401", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [
"[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n",
"/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n",
"  warnings.warn(\n",
"[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.1s remaining: 0.0s\n",
"[Parallel(n_jobs=1)]: Done 13 out of 13 | elapsed: 1.2s finished\n",
"\n",
"[2022-06-06 14:14:08] Features: 1/4 -- score: 0.021028951486697964[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n",
"[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.1s remaining: 0.0s\n",
"[Parallel(n_jobs=1)]: Done 12 out of 12 | elapsed: 1.1s finished\n",
"\n",
"[2022-06-06 14:14:09] Features: 2/4 -- score: 0.03501564945226917[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n",
"[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.1s remaining: 0.0s\n",
"[Parallel(n_jobs=1)]: Done 11 out of 11 | elapsed: 1.0s finished\n",
"\n",
"[2022-06-06 14:14:11] Features: 3/4 -- score: 0.04196009389671361[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n",
"[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.1s remaining: 0.0s\n",
"[Parallel(n_jobs=1)]: Done 10 out of 10 | elapsed: 1.0s finished\n",
"\n",
"[2022-06-06 14:14:12] Features: 4/4 -- score: 0.04890453834115806" ] } ], "source": [
"# Set the model (a random forest classifier) used to score candidate feature subsets.\n",
"# Note: price is a continuous target, so almost every value forms its own class;\n",
"# this is what triggers the 'least populated class' warnings and the low accuracy\n",
"# scores. A regressor (e.g. RandomForestRegressor with scoring='r2') is the more\n",
"# usual choice for such a target.\n",
"model = RandomForestClassifier(n_estimators=100)\n",
"\n",
"# Configure step forward feature selection.\n",
"# The result is stored under a new name so that the imported `sfs` class is not overwritten.\n",
"sfs_forward = sfs(model,               # model (defined above) used to evaluate feature subsets\n",
"                  k_features=4,        # stop once 4 features from X have been selected\n",
"                  forward=True,        # True for step forward, False for step backward (see Section 2.2)\n",
"                  floating=False,      # plain sequential search, without the floating variant\n",
"                  verbose=2,           # log progress after each step\n",
"                  scoring='accuracy',  # metric used to estimate the model's performance\n",
"                  cv=2)                # 2-fold cross-validation\n",
"\n",
"# Perform step forward feature selection by fitting on the training data\n",
"sfs_forward = sfs_forward.fit(X_train, y_train)" ] },
{ "cell_type": "code", "execution_count": 8, "id": "2387eba3", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2, 4, 11, 12)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [
"# Return the column indices of the 4 selected features\n",
"sfs_forward.k_feature_idx_" ] },
{ "cell_type": "code", "execution_count": 9, "id": "0c73306b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['carwidth', 'curbweight', 'citympg', 'highwaympg'], dtype='object')" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [
"# Return the labels of the 4 selected features\n",
"top_forward = X.columns[list(sfs_forward.k_feature_idx_)]\n",
"top_forward" ] },
{ "cell_type": "markdown", "id": "edcd5b22", "metadata": {}, "source": [ "### 2.2. Step Backward Feature Selection" ] },
{ "cell_type": "code", "execution_count": 10, "id": "8a06d4b3", "metadata": {}, "outputs": [], "source": [
"# Importing required libraries\n",
"from mlxtend.feature_selection import SequentialFeatureSelector as sfs\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"import numpy as np" ] },
{ "cell_type": "code", "execution_count": 11, "id": "35db4020", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [
"[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n",
"[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.1s remaining: 0.0s\n",
"/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The
least populated class in y has only 1 members, which is less than n_splits=2.\n",
"  warnings.warn(\n",
"[Parallel(n_jobs=1)]: Done 13 out of 13 | elapsed: 1.5s finished\n",
"\n",
"[2022-06-06 14:14:13] Features: 12/4 -- score: 0.7854271943189328[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n",
"[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.1s remaining: 0.0s\n",
"[Parallel(n_jobs=1)]: Done 12 out of 12 | elapsed: 1.3s finished\n",
"\n",
"[2022-06-06 14:14:14] Features: 11/4 -- score: 0.794197087848638[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n",
"[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.1s remaining: 0.0s\n",
"[Parallel(n_jobs=1)]: Done 11 out of
11 | elapsed: 1.2s finished\n", "\n", "[2022-06-06 14:14:16] Features: 10/4 -- score: 0.7841936036713933[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.1s remaining: 0.0s\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ 
"/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[Parallel(n_jobs=1)]: Done 10 out of 10 | elapsed: 1.1s finished\n", "\n", "[2022-06-06 14:14:17] Features: 9/4 -- score: 0.7622686858340305[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n", 
"/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.1s remaining: 0.0s\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": 
"stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[Parallel(n_jobs=1)]: Done 9 out of 9 | elapsed: 1.0s finished\n", "\n", "[2022-06-06 14:14:18] Features: 8/4 -- score: 0.7873086384904935[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.1s remaining: 0.0s\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ 
"/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n", "[Parallel(n_jobs=1)]: Done 8 out of 8 | elapsed: 0.8s finished\n", "\n", "[2022-06-06 14:14:19] Features: 7/4 -- score: 0.7837431774909533[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n", 
"/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n", "[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.1s remaining: 0.0s\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has 
only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[Parallel(n_jobs=1)]: Done 7 out of 7 | elapsed: 0.7s finished\n", "\n", "[2022-06-06 14:14:19] Features: 6/4 -- score: 0.7913352479930384[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n", "[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.1s remaining: 0.0s\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ 
"/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n", "[Parallel(n_jobs=1)]: Done 6 out of 6 | elapsed: 0.6s finished\n", "\n", "[2022-06-06 14:14:20] Features: 5/4 -- score: 0.7720578065755039[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n", "[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.1s remaining: 0.0s\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n", "/Users/andrewwolf/Library/Caches/pypoetry/virtualenvs/themlsbook-8peXrHpY-py3.9/lib/python3.9/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=2.\n", " warnings.warn(\n" ] }, { "name": "stderr", 
"output_type": "stream", "text": [ "[Parallel(n_jobs=1)]: Done 5 out of 5 | elapsed: 0.5s finished\n", "\n", "[2022-06-06 14:14:20] Features: 4/4 -- score: 0.770432785371282" ] } ], "source": [ "# Set a model (Random Forest Classifier) to use in SBFS\n", "model = RandomForestClassifier(n_estimators=100)\n", "\n", "# Set step backward feature selection\n", "sfs = sfs(model, # model (defined above) to use in SBFS\n", " k_features=4, # return bottom 4 features from the feature set X\n", " forward=False, # False for SBFS, True for SFFS (explained above)\n", " floating=False, \n", " verbose=2,\n", " scoring='r2', # metrics to use to estimate model's performance (here: R-squared)\n", " cv=2) #cross-validation=2\n", "\n", "# Perform Step Backward Feature Selection by fitting X and y\n", "sfs1 = sfs.fit(np.array(X_train), y_train)" ] }, { "cell_type": "code", "execution_count": 12, "id": "84124580", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['carwidth', 'curbweight', 'enginesize', 'stroke'], dtype='object')" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Return the labels of the bottom 4 selected features\n", "top_backward = X.columns[list(sfs.k_feature_idx_)]\n", "top_backward" ] }, { "cell_type": "markdown", "id": "765427b9", "metadata": {}, "source": [ "### 2.3. 
Recursive Feature Elimination Method" ] }, { "cell_type": "code", "execution_count": 13, "id": "2a991ddc", "metadata": {}, "outputs": [], "source": [ "# Importing required libraries\n", "from sklearn.feature_selection import RFE\n", "from sklearn.linear_model import LinearRegression" ] }, { "cell_type": "code", "execution_count": 14, "id": "97a10f54", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "RFE(estimator=LinearRegression(), n_features_to_select=4)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Set a model (Linear Regression) to use in RFE\n", "model = LinearRegression()\n", "\n", "# Set up recursive feature elimination\n", "rfe = RFE(model,\n", " n_features_to_select=4, # keep the 4 highest-ranked features\n", " step=1) # remove one feature per iteration\n", "\n", "# Perform recursive feature elimination by fitting X and y\n", "rfe.fit(X, y)" ] }, { "cell_type": "code", "execution_count": 15, "id": "03614afc", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Index(['carwidth', 'boreratio', 'stroke', 'citympg'], dtype='object')\n" ] } ], "source": [ "# Return labels of the top 4 selected features\n", "top_recursive = X.columns[rfe.support_]\n", "print(top_recursive)" ] }, { "cell_type": "code", "execution_count": 16, "id": "d2e65c95", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'wheelbase': 7, 'carlength': 6, 'carwidth': 1, 'carheight': 5, 'curbweight': 10, 'enginesize': 4, 'boreratio': 1, 'stroke': 1, 'compressionratio': 2, 'horsepower': 8, 'peakrpm': 9, 'citympg': 1, 'highwaympg': 3}\n" ] } ], "source": [ "# Return all feature labels with their RFE rankings (rank 1 = selected)\n", "print(dict(zip(X.columns, rfe.ranking_)))" ] }, { "cell_type": "markdown", "id": "9ea84d34", "metadata": {}, "source": [ "## 3. 
Comparing Three Methods" ] }, { "cell_type": "code", "execution_count": 17, "id": "6dde4377", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The features selected by Step Forward Feature Selection are: \n", " \n", " \t Index(['carwidth', 'curbweight', 'citympg', 'highwaympg'], dtype='object') \n", " \n", " \n", " The features selected by Step Backward Feature Selection are: \n", " \n", " \t Index(['carwidth', 'curbweight', 'enginesize', 'stroke'], dtype='object') \n", " \n", " \n", " The features selected by Recursive Feature Elimination are: \n", " \n", " \t Index(['carwidth', 'boreratio', 'stroke', 'citympg'], dtype='object')\n" ] } ], "source": [ "print('The features selected by Step Forward Feature Selection are: \\n \\n \\t {} \\n \\n \\n The features selected by Step Backward Feature Selection are: \\n \\n \\t {} \\n \\n \\n The features selected by Recursive Feature Elimination are: \\n \\n \\t {}'.format(top_forward, top_backward, top_recursive))\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "62248a26", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "jupytext": { "formats": "md:myst", "text_representation": { "extension": ".md", "format_name": "myst" } }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.12" }, "source_map": [ 11, 27, 34, 39, 41, 46, 65, 70, 73, 86, 93, 111, 117, 121, 126, 134, 152, 156, 161, 168, 182, 189, 192, 197, 204 ] }, "nbformat": 4, "nbformat_minor": 5 }