Multiple Linear Regression

Robin Donatello

2024-11-04

Mathematical Model

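The mathematical model for multiple linear regression equates the value of the continuous outcome $y_{i}$ to a linear combination of multiple predictors $x_{1} \ldots x_{p}$, each with its own slope coefficient $\beta_{1} \ldots \beta_{p}$.

$$ y_{i} = \beta_{0} + \beta_{1}x_{1i} + \ldots + \beta_{p}x_{pi} + \epsilon_{i} $$

where $i$ indexes the observations, $i = 1 \ldots n$, and $j$ indexes the slope parameters, $j = 1 \ldots p$.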

Characteristic	Beta (95% CI)¹	p-value
(Intercept)	-2.8 (-5.0 to -0.51)	0.016
FAGE	-0.03 (-0.04 to -0.01)	<0.001
FHEIGHT	0.11 (0.08 to 0.15)	<0.001
Adjusted R²	0.325
No. Obs.	150
¹ CI = Confidence Interval
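
As a rough illustration of how a fitted model produces a prediction, the sketch below plugs the rounded coefficients from the table above into the linear combination. The helper name `predict` and the example values FAGE = 40 and FHEIGHT = 70 are hypothetical, chosen only for illustration. Because the coefficients are rounded for display, the result is approximate.

```python
# Rounded coefficients copied from the table above.
coefs = {"(Intercept)": -2.8, "FAGE": -0.03, "FHEIGHT": 0.11}

def predict(coefs, values):
    """Intercept plus the sum of each slope times its predictor value."""
    total = coefs["(Intercept)"]
    for name, x in values.items():
        total += coefs[name] * x
    return total

# Hypothetical person: these values are illustrative, not taken from the data.
person = {"FAGE": 40, "FHEIGHT": 70}
y_hat = predict(coefs, person)
print(round(y_hat, 2))  # -2.8 - 1.2 + 7.7, about 3.7
```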

Characteristic	Beta (95% CI)¹	p-value
(Intercept)	-2.2 (-3.7 to -0.76)	0.003
AGE	-0.02 (-0.03 to -0.02)	<0.001
HEIGHT	0.11 (0.08 to 0.13)	<0.001
BIOL.SEX
Male	—
Female	-0.64 (-0.79 to -0.48)	<0.001
Adjusted R²	0.646
No. Obs.	300
¹ CI = Confidence Interval
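
One way to see what the Female coefficient means under reference coding: treat BIOL.SEX as an indicator that is 1 for Female and 0 for Male (the reference level), then compare two predictions that differ only in that indicator. The sketch below uses the rounded coefficients from the table above. The function name `predict` and the AGE and HEIGHT values are hypothetical.

```python
# Rounded coefficients from the table above; Male is the reference level.
b0, b_age, b_height, b_female = -2.2, -0.02, 0.11, -0.64

def predict(age, height, sex):
    female = 1 if sex == "Female" else 0  # indicator (dummy) variable
    return b0 + b_age * age + b_height * height + b_female * female

# Same hypothetical age and height, different sex.
diff = predict(30, 65, "Female") - predict(30, 65, "Male")
print(round(diff, 2))  # -0.64, the Female coefficient
```

The difference equals the Female coefficient whichever age and height are chosen, which is what "holding the other predictors constant" means in an additive model.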

Characteristic	N = 300
AREA, n (%)
Burbank	48 (16)
Lancaster	98 (33)
Long Beach	38 (13)
Glendora	116 (39)

Characteristic	Beta (95% CI)¹	p-value
(Intercept)	-2.3 (-3.7 to -0.77)	0.003
AGE	-0.02 (-0.03 to -0.01)	<0.001
HEIGHT	0.10 (0.08 to 0.12)	<0.001
BIOL.SEX
Male	—
Female	-0.64 (-0.80 to -0.49)	<0.001
AREA
Burbank	—
Lancaster	0.03 (-0.14 to 0.20)	0.71
Long Beach	0.06 (-0.14 to 0.27)	0.55
Glendora	0.12 (-0.04 to 0.28)	0.14
Adjusted R²	0.646
No. Obs.	300
¹ CI = Confidence Interval
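
A quick reading aid for the AREA rows, offered as a sketch rather than a formal test: each coefficient compares that area with Burbank, and a 95% interval that includes 0 is consistent with no difference from the reference level. The helper `spans_zero` is a hypothetical name; the intervals are copied from the table above.

```python
# 95% confidence intervals for the AREA rows in the table above.
area_ci = {
    "Lancaster": (-0.14, 0.20),
    "Long Beach": (-0.14, 0.27),
    "Glendora": (-0.04, 0.28),
}

def spans_zero(ci):
    """True when the interval contains 0."""
    low, high = ci
    return low <= 0 <= high

for level, ci in area_ci.items():
    print(level, "interval includes 0:", spans_zero(ci))
```

All three intervals include 0, which agrees with the p-values above 0.05 reported for those rows.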

Mathematical model

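$$ y_{i} = \beta_{0} + \beta_{1}x_{1i} + \beta_{2}x_{2i} + \beta_{3}x_{3i} + \beta_{4}x_{4i} + \beta_{5}x_{5i} + \beta_{6}x_{6i} + \epsilon_{i} $$

where

$x_{1}$ is AGE, $x_{2}$ is HEIGHT, and $x_{3}=1$ when BIOL.SEX='Female', and 0 otherwise
$x_{4}=1$ when AREA='Lancaster', and 0 otherwise
$x_{5}=1$ when AREA='Long Beach', and 0 otherwise
$x_{6}=1$ when AREA='Glendora', and 0 otherwise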

The coefficient for each non-reference level of a categorical variable is interpreted as the effect of being in that level on the outcome, compared to the reference level (here, Burbank) and holding the other predictors constant.
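
The reference coding described above can be sketched in a few lines. This shows the logic only; fitting software normally builds these indicators itself when given a categorical variable. The names `dummy_code` and `area_contribution` are hypothetical, and the coefficients are the rounded AREA values from the regression table.

```python
# Levels of AREA in table order; the first level is treated as the reference.
levels = ["Burbank", "Lancaster", "Long Beach", "Glendora"]

# Rounded AREA coefficients for the non-reference levels, in the same order.
area_coefs = [0.03, 0.06, 0.12]

def dummy_code(area):
    """Return one 0/1 indicator per non-reference level."""
    if area not in levels:
        raise ValueError(f"unknown level: {area}")
    return [1 if area == level else 0 for level in levels[1:]]

def area_contribution(area):
    """Amount the AREA terms add to the prediction for this level."""
    return sum(b * x for b, x in zip(area_coefs, dummy_code(area)))

print(dummy_code("Burbank"))          # [0, 0, 0]: reference level, all indicators off
print(dummy_code("Glendora"))         # [0, 0, 1]
print(area_contribution("Glendora"))  # 0.12, relative to Burbank
```

The reference level contributes nothing through the indicators, so its effect is absorbed into the intercept, and every other coefficient is read as a difference from it.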
