A Complete Guide to Stepwise Regression in R (2024)

Stepwise regression is a procedure for building a regression model from a set of candidate predictor variables by adding and removing predictors one step at a time until no further addition or removal improves the model according to a chosen criterion.

The goal of stepwise regression is to arrive at a model that keeps the predictors that are most useful for explaining the response variable and leaves out those that add little. The step() function used in this tutorial judges each candidate model by its AIC.

This tutorial explains how to perform the following stepwise regression procedures in R:

  • Forward Stepwise Selection
  • Backward Stepwise Selection
  • Both-Direction Stepwise Selection

For each example we’ll use the built-in mtcars dataset:

#view first six rows of mtcars
head(mtcars)

                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

We will fit a multiple linear regression model using mpg (miles per gallon) as our response variable and the other 10 variables in the dataset as potential predictor variables.

For each example we’ll use the built-in step() function from the stats package to perform stepwise selection, which uses the following syntax:

step(initial model, direction, scope)

where:

  • initial model: the fitted model at which the search starts (the intercept-only model for forward and both-direction selection, the model with all predictors for backward selection)
  • direction: the mode of stepwise search, which can be “both”, “backward”, or “forward”
  • scope: a formula that specifies which predictors we’d like to attempt to enter into the model
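
As a minimal sketch of the syntax above, the following compares two ways of passing scope. The step() documentation describes scope as either a single formula (taken as the upper limit of the search) or a list with lower and upper components; the object names fit_formula, fit_list, and same_terms are only illustrative.

```r
# Intercept-only model: the starting point of the search
intercept_only <- lm(mpg ~ 1, data = mtcars)

# Full model: its formula defines the largest model step() may reach
all <- lm(mpg ~ ., data = mtcars)

# scope given as a single formula (treated as the upper limit)
fit_formula <- step(intercept_only, direction = "forward",
                    scope = formula(all), trace = 0)

# scope given as a list with explicit lower and upper limits
fit_list <- step(intercept_only, direction = "forward",
                 scope = list(lower = ~ 1, upper = formula(all)),
                 trace = 0)

# Both calls search the same space, so they should select the same predictors
same_terms <- setequal(attr(terms(fit_formula), "term.labels"),
                       attr(terms(fit_list), "term.labels"))
print(same_terms)
```

The list form is mainly useful when some predictors should always stay in the model, in which case they would be placed in the lower formula.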

Example 1: Forward Stepwise Selection

The following code shows how to perform forward stepwise selection:

#define intercept-only model
intercept_only <- lm(mpg ~ 1, data=mtcars)

#define model with all predictors
all <- lm(mpg ~ ., data=mtcars)

#perform forward stepwise regression
forward <- step(intercept_only, direction='forward', scope=formula(all), trace=0)

#view results of forward stepwise regression
forward$anova

   Step Df  Deviance Resid. Df Resid. Dev       AIC
1       NA        NA        31  1126.0472 115.94345
2  + wt -1 847.72525        30   278.3219  73.21736
3 + cyl -1  87.14997        29   191.1720  63.19800
4  + hp -1  14.55145        28   176.6205  62.66456

#view final model
forward$coefficients

(Intercept)          wt         cyl          hp 
 38.7517874  -3.1669731  -0.9416168  -0.0180381 

Note: The argument trace=0 tells R not to display the full results of the stepwise selection, which can take up quite a bit of space when there are many predictor variables.

Here is how to interpret the results:

  • First, we fit the intercept-only model. This model had an AIC of 115.94345.
  • Next, we fit every possible one-predictor model. The model with the lowest AIC used the predictor wt. Its AIC of 73.21736 was lower than that of the intercept-only model, so wt was added.
  • Next, we fit every two-predictor model that adds one more predictor to wt. The model with the lowest AIC added the predictor cyl, which lowered the AIC to 63.19800.
  • Next, we fit every three-predictor model that adds one more predictor to wt and cyl. The model with the lowest AIC added the predictor hp, which lowered the AIC to 62.66456.
  • Next, we fit every four-predictor model that adds one more predictor to wt, cyl, and hp. None of these models had a lower AIC than the three-predictor model, so the procedure stopped.
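
The first step described above can be reproduced by hand with add1() from the stats package, which reports the AIC of the current model (the row labeled <none>) and of each model with one candidate predictor added. This is only a sketch of a single step; the names candidates and best_first are illustrative.

```r
intercept_only <- lm(mpg ~ 1, data = mtcars)
all <- lm(mpg ~ ., data = mtcars)

# AIC of the current model (<none>) and of each one-predictor model
candidates <- add1(intercept_only, scope = formula(all))

# Sort so that the lowest-AIC candidate appears first
print(candidates[order(candidates$AIC), ])

# The row with the lowest AIC is the predictor added at the first step
best_first <- rownames(candidates)[which.min(candidates$AIC)]
print(best_first)
```

Repeating the same call on the updated model (for example, after adding the selected predictor) would show the candidates for the next step.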

The final model turns out to be:

mpg ~ 38.75 – 3.17*wt – 0.94*cyl – 0.02*hp
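
The AIC values in the table come from extractAIC(), which for a linear model computes n*log(RSS/n) + 2*edf, where edf is the number of estimated coefficients. The sketch below refits the final model directly and checks that value by hand; the variable names are illustrative.

```r
# Refit the model selected by forward stepwise selection
final <- lm(mpg ~ wt + cyl + hp, data = mtcars)

n   <- nrow(mtcars)               # number of observations
rss <- sum(residuals(final)^2)    # residual sum of squares
p   <- length(coef(final))        # estimated coefficients, including the intercept

# AIC as used by step(): n * log(RSS / n) + 2 * edf
aic_by_hand <- n * log(rss / n) + 2 * p

# extractAIC() returns c(edf, AIC); the second element is the AIC
aic_step <- extractAIC(final)[2]

print(c(by_hand = aic_by_hand, extractAIC = aic_step))

# AIC() includes additional constant terms, so its value differs
print(AIC(final))
```

Because the difference between AIC() and extractAIC() is a constant for models fit to the same data, both give the same ranking of candidate models.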

Example 2: Backward Stepwise Selection

The following code shows how to perform backward stepwise selection:

#define intercept-only model
intercept_only <- lm(mpg ~ 1, data=mtcars)

#define model with all predictors
all <- lm(mpg ~ ., data=mtcars)

#perform backward stepwise regression
backward <- step(all, direction='backward', scope=formula(all), trace=0)

#view results of backward stepwise regression
backward$anova

    Step Df   Deviance Resid. Df Resid. Dev      AIC
1        NA         NA        21   147.4944 70.89774
2  - cyl  1 0.07987121        22   147.5743 68.91507
3   - vs  1 0.26852280        23   147.8428 66.97324
4 - carb  1 0.68546077        24   148.5283 65.12126
5 - gear  1 1.56497053        25   150.0933 63.45667
6 - drat  1 3.34455117        26   153.4378 62.16190
7 - disp  1 6.62865369        27   160.0665 61.51530
8   - hp  1 9.21946935        28   169.2859 61.30730

#view final model
backward$coefficients

(Intercept)          wt        qsec          am 
   9.617781   -3.916504    1.225886    2.935837 

Here is how to interpret the results:

  • First, we fit a model using all 10 predictors. This model had an AIC of 70.89774.
  • Next, at each step we fit every model that drops exactly one predictor from the current model and kept the one with the lowest AIC, provided it was lower than the AIC of the current model. This removed cyl, vs, carb, gear, drat, disp, and hp in turn.
  • Lastly, the procedure stopped once removing any of the remaining predictors would no longer lower the AIC. The final model had an AIC of 61.30730.
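
The first removal described above can be inspected with drop1() from the stats package, which reports the AIC of the current model (the row labeled <none>) and of each model with one predictor dropped. This is a sketch of a single backward step; the names removals and first_removed are illustrative.

```r
all <- lm(mpg ~ ., data = mtcars)

# AIC of the full model (<none>) and of each model with one predictor dropped
removals <- drop1(all)

# Sort so that the lowest-AIC candidate appears first
print(removals[order(removals$AIC), ])

# The row with the lowest AIC is the predictor removed at the first step
first_removed <- rownames(removals)[which.min(removals$AIC)]
print(first_removed)
```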

The final model turns out to be:

mpg ~ 9.62 – 3.92*wt + 1.23*qsec + 2.94*am

Example 3: Both-Direction Stepwise Selection

The following code shows how to perform both-direction stepwise selection:

#define intercept-only model
intercept_only <- lm(mpg ~ 1, data=mtcars)

#define model with all predictors
all <- lm(mpg ~ ., data=mtcars)

#perform both-direction stepwise regression
both <- step(intercept_only, direction='both', scope=formula(all), trace=0)

#view results of both-direction stepwise regression
both$anova

   Step Df  Deviance Resid. Df Resid. Dev       AIC
1       NA        NA        31  1126.0472 115.94345
2  + wt -1 847.72525        30   278.3219  73.21736
3 + cyl -1  87.14997        29   191.1720  63.19800
4  + hp -1  14.55145        28   176.6205  62.66456

#view final model
both$coefficients

(Intercept)          wt         cyl          hp 
 38.7517874  -3.1669731  -0.9416168  -0.0180381 

Here is how to interpret the results:

  • First, we fit the intercept-only model.
  • Next, we added predictors to the model sequentially just like we did in forward stepwise selection. However, at each step we also considered removing any predictor already in the model whose removal would lower the AIC.
  • We repeated this process until we reached a final model.
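
To see which moves the both-direction search actually made, the Step column of the anova table can be split into additions (prefixed with +) and removals (prefixed with -). The following is a small sketch; the names steps, additions, and removals are illustrative.

```r
intercept_only <- lm(mpg ~ 1, data = mtcars)
all <- lm(mpg ~ ., data = mtcars)

both <- step(intercept_only, direction = "both",
             scope = formula(all), trace = 0)

# Each entry of Step is "" (starting model), "+ term", or "- term"
steps <- as.character(both$anova$Step)

additions <- steps[grepl("^\\+", steps)]  # predictors that were added
removals  <- steps[grepl("^-", steps)]    # predictors that were removed

print(additions)
print(removals)
```

For this dataset the table contains only additions, which is why the path matches the one taken by forward stepwise selection.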

The final model turns out to be:

mpg ~ 38.75 – 3.17*wt – 0.94*cyl – 0.02*hp

Note that forward stepwise selection and both-direction stepwise selection produced the same final model while backward stepwise selection produced a different model.
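
The comparison can be made explicit by fitting all three searches and placing their selected predictors and AIC values side by side. This is a sketch; the helper selected() and the data frame comparison are illustrative names, not part of the stats package.

```r
intercept_only <- lm(mpg ~ 1, data = mtcars)
all <- lm(mpg ~ ., data = mtcars)

forward  <- step(intercept_only, direction = "forward",
                 scope = formula(all), trace = 0)
backward <- step(all, direction = "backward",
                 scope = formula(all), trace = 0)
both     <- step(intercept_only, direction = "both",
                 scope = formula(all), trace = 0)

# Illustrative helper: sorted names of the predictors in a fitted model
selected <- function(fit) sort(attr(terms(fit), "term.labels"))

comparison <- data.frame(
  direction  = c("forward", "backward", "both"),
  predictors = c(paste(selected(forward),  collapse = " + "),
                 paste(selected(backward), collapse = " + "),
                 paste(selected(both),     collapse = " + ")),
  AIC        = c(extractAIC(forward)[2],
                 extractAIC(backward)[2],
                 extractAIC(both)[2])
)
print(comparison)
```

The result depends on the starting point and direction of the search, so it can be worth running more than one direction and comparing the AIC of the final models.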


Author: Barbera Armstrong
