Simple Linear Regression - Model
Reminder
The process of statistical analysis:
- Identify research question and the corresponding population and parameter you are interested in.
- Collect data.
- Posit a statistical model based on information in the sample.
- Draw inference about the population using your model.
Research Objective
Research Question: Is the adult height of a child determined by the height of the mother? In other words, what is the relationship between a student’s height and their mother’s height for all BYU students?
Population: All BYU students.
Parameter of Interest:
- Some number measurement of the “relationship” between student’s height and mother’s height.
- For this subunit we are going to focus on what a “relationship” means.
Sample: A convenience sample of 1727 BYU students who are in Stat 121.
Are there any issues with this study setup?
Simple Linear Regression Model
- Main goal: Specify the population relationship between student’s height and mother’s height.
Review: Equation of a Line
Equation you are probably used to: \[ y = mx + b\] where:
- \(m\) = slope
- \(b\) = intercept
Review: Equation of a Line
We are going to change notation to: \[ y = \beta_0 + \beta_1 x\] where \(\beta\) is pronounced “beta”,
- \(\beta_0\) = intercept
- \(\beta_1\) = slope
Why? Remember that Greek letters here always represent population parameters, so this notation is more consistent with that standard (and it is the notation that everyone uses for regression).
Review: Equation of a line
Equation of a line: \[ y = \beta_0 + \beta_1 x\]
Interpretations:
- Slope (\(\beta_1 =\) “rise over run”): As \(x\) increases/decreases by 1, \(y\) increases/decreases by \(\beta_1\). If \(x\) “runs” by 1, then \(y\) “rises” by \(\beta_1\).
- Intercept (\(\beta_0\)): If \(x\) is 0, then \(y\) is \(\beta_0\).
Review: Equation of a line
Practice: Height vs. Mother’s Height
How would you interpret the intercept?
- If the mother is zero inches tall (\(x=0\)), then the student height is 35.653 inches.
How would you interpret the slope?
- If the mother’s height increases by 1 inch, then the student height goes up by 0.503 inches.
What is \(y\) when \(x=64\)?
- Plug in \(y = 35.653 + 0.503\times 64 = 67.845\)
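The three answers above all come from evaluating the same line. A minimal Python sketch, using the intercept 35.653 and slope 0.503 quoted in the answers (the function name `predict_height` is made up for illustration and is not part of the course materials):

```python
def predict_height(mother_height, intercept=35.653, slope=0.503):
    """Return the height on the line for a given mother's height (inches)."""
    return intercept + slope * mother_height

# Intercept: the value on the line when x = 0.
at_zero = predict_height(0)

# Slope: the change in y when x goes up by 1.
change_per_inch = predict_height(65) - predict_height(64)

# Value on the line at x = 64.
at_64 = predict_height(64)

print(round(at_zero, 3), round(change_per_inch, 3), round(at_64, 3))
```

Up to rounding, the three printed values should match the intercept, the slope, and the 67.845 worked out by hand.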
Review: Equation of a line
Practice: Possum lengths
How would you interpret the intercept?
- If total length is zero (\(x=0\)) then the head length is 42.71.
How would you interpret the slope?
- If the total length goes up by 1, then the head length goes up by 0.573.
What is head length when total length \(=95\)?
- Plug in \(y = 42.71 + 0.573\times 95 = 97.145\)
Practice 6.2 Question 1
How would you interpret the intercept in this example (note that the mortality is per 10 million people)?
- As the latitude increases by 1, the mortality decreases by 5.978 people per 10 million.
- As the mortality increases by 1, the latitude decreases by 5.978 degrees.
- If the latitude is 0, the mortality would be 389.189 people per 10 million.
- If the mortality is 0, the latitude would be 389.189 degrees.
Practice 6.2 Question 1 Answer
How would you interpret the intercept in this example (note that the mortality is per 10 million people)?
- As the latitude increases by 1, the mortality decreases by 5.978 people per 10 million.
- As the mortality increases by 1, the latitude decreases by 5.978 degrees.
- If the latitude is 0, the mortality would be 389.189 people per 10 million. (correct answer)
- If the mortality is 0, the latitude would be 389.189 degrees.
Practice 6.2 Question 2
How would you interpret the slope in this example (note that the mortality is per 10 million people)?
- As the latitude increases by 1, the mortality decreases by 5.978 people per 10 million.
- As the mortality increases by 1, the latitude decreases by 5.978 degrees.
- If the latitude is 0, the mortality would be 389.189 people per 10 million.
- If the mortality is 0, the latitude would be 389.189 degrees.
Practice 6.2 Question 2 Answer
How would you interpret the slope in this example (note that the mortality is per 10 million people)?
- As the latitude increases by 1, the mortality decreases by 5.978 people per 10 million. (correct answer)
- As the mortality increases by 1, the latitude decreases by 5.978 degrees.
- If the latitude is 0, the mortality would be 389.189 people per 10 million.
- If the mortality is 0, the latitude would be 389.189 degrees.
Practice 6.2 Question 3
What is mortality when latitude \(x=40.23\) (about the latitude of Provo)?
Practice 6.2 Question 3 Answer
What is mortality when latitude \(x=40.23\) (about the latitude of Provo)?
\(389.189 - 5.978 \times 40.23 = 148.694\)
Simple Linear Regression Model
Issue: When specifying a model for the relationship, the data do not perfectly follow a line:
Simple Linear Regression Model
Residuals
\[
\begin{aligned}
\text{Residual} = \epsilon_i &= \text{Observation} - \text{Predicted Value} \\
&= Y_i - (\beta_0 + \beta_1X_i)
\end{aligned}
\]
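As a sketch of the residual calculation, the Python snippet below uses the height line from the earlier practice (intercept 35.653, slope 0.503) as a stand-in for the population line. The student (mother 64 inches, student 70 inches) is made up and is not from the course data.

```python
# Line from the height practice, used here as a stand-in for the population line.
beta0, beta1 = 35.653, 0.503

# Hypothetical observation (not from the course data).
mother_height = 64.0   # X_i
student_height = 70.0  # Y_i

predicted = beta0 + beta1 * mother_height  # value on the line
residual = student_height - predicted      # observation - predicted value

# A positive residual means the point sits above the line.
print(round(predicted, 3), round(residual, 3))
```

A student shorter than the line predicts would have a negative residual instead.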
Visualizing the SLR Model
- \(\sigma\) is the standard deviation and controls the spread of the dots about the regression line. The bigger the \(\sigma\), the farther the dots from the line.
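A small simulation can make this concrete. The sketch below assumes the usual form of the SLR model, in which each point's vertical deviation from the line is drawn from a normal distribution with mean 0 and standard deviation \(\sigma\). The two values of \(\sigma\) are arbitrary choices for illustration, not estimates from any data set.

```python
import random
import statistics

def simulate_errors(sigma, n=1000, seed=1):
    """Draw n vertical deviations from the line, each Normal(0, sigma)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n)]

# Two illustrative values of sigma (assumed, not estimated from data).
small = simulate_errors(sigma=1.0)
large = simulate_errors(sigma=5.0)

# The typical distance from the line grows with sigma.
spread_small = statistics.pstdev(small)
spread_large = statistics.pstdev(large)
print(round(spread_small, 2), round(spread_large, 2))
```

The second printed value should be roughly five times the first, mirroring the ratio of the two \(\sigma\) values.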
Interpreting the SLR Model
Slight change in interpretation:
- Intercept (\(\beta_0\)): If \(X=0\), we expect \(Y\) to be \(\beta_0\).
- Slope (\(\beta_1\)): If \(X\) goes up by 1, we expect \(Y\) to go up by \(\beta_1\).
Assumptions of the SLR Model
Easy way to remember what we are assuming about the population in a simple linear regression model:
- L - Linear relationship between \(x\) and \(y\)
- I - Independence (one obs. doesn’t impact the other)
- N - Normal residuals (distance from line is normal)
- E - Equal spread of residuals around the line
More on why these assumptions are important and how to check these in the next subunit.
Parameter Estimation
Parameters we want to estimate: \(\beta_0\) & \(\beta_1\) (which define the line) and \(\sigma\) (so we know how spread out things are)
Goal: Find the line that goes “closest” to the data points.
Parameter Estimation
What do we mean by “line closest to points”? We want to find \(\hat{\beta}_0\) and \(\hat{\beta}_1\) so that: \[
\sum_{i=1}^n (Y_i - (\hat{\beta}_0+\hat{\beta}_1X_i))^2
\] is as small as possible. This is called the least squares regression line.
A few notes:
- We “square” distances so that, for example, a 5 “above” and 5 “below” the line are the same “distance”.
- We sum squared residuals because we look at all the data.
- We use “hats” to denote estimates from sample (for example, \(\hat{\beta}_1\) is our estimate of \(\beta_1\))
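To illustrate the criterion, the following Python sketch computes the sum of squared residuals for two candidate lines. The five data points and both candidate lines are made up for illustration; they are not the course data.

```python
def sum_squared_residuals(xs, ys, b0, b1):
    """Sum over all points of (observed y - value on the line)^2."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data for illustration (not the course data).
mother = [61, 62, 63, 64, 65]
student = [66, 68, 69, 68, 69]

# Two candidate lines, each given as (intercept, slope).
sse_a = sum_squared_residuals(mother, student, 5.0, 1.0)
sse_b = sum_squared_residuals(mother, student, 30.2, 0.6)

# The candidate with the smaller sum is "closer" to the points.
print(round(sse_a, 2), round(sse_b, 2))
```

Here the second candidate has the smaller sum, so by this criterion it is the better of the two lines.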
Parameter Estimation
How do we find \(\hat{\beta}_0\) and \(\hat{\beta}_1\) that minimize \[
\sum_{i=1}^n (Y_i - (\hat{\beta}_0+\hat{\beta}_1X_i))^2?
\]
- Guess and check
- Use calculus
- In either case, we’ll let the computer do the hard work for us
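For the curious, the calculus route leads to well-known closed-form formulas: the slope is the sum of cross-deviations divided by the sum of squared \(x\)-deviations, and the intercept makes the line pass through the point of means. A minimal Python sketch on made-up data (the function name and data are illustrative, and this is not necessarily how the course software computes the estimates):

```python
def least_squares(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # line passes through (x_bar, y_bar)
    return intercept, slope

# Made-up data for illustration (not the course data).
mother = [61, 62, 63, 64, 65]
student = [66, 68, 69, 68, 69]

b0_hat, b1_hat = least_squares(mother, student)
print(round(b0_hat, 3), round(b1_hat, 3))
```

For this toy data the estimates work out to an intercept of about 30.2 and a slope of about 0.6.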
The Fitted SLR Model
Fitted Regression Line Equation: \[
\hat{y} = 35.653 + 0.503\times x
\] where:
- \(\hat{y}\) is the fitted height value (the height value on the line)
- \(\hat{y}\) is generally not equal to \(y_i\), because \(y_i\) is an observed height and observed heights rarely fall exactly on the line
The Fitted SLR Model
An interesting point: The sign (positive/negative) of the correlation will always match the sign of the slope (positive/negative). Not the same number but the same sign.
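One way to see why: for the least squares line, the estimated slope can be written in terms of the correlation \(r\) and the sample standard deviations of \(x\) and \(y\) (the symbols \(r\), \(s_x\) and \(s_y\) are introduced here for illustration):

\[
\hat{\beta}_1 = r \times \frac{s_y}{s_x}
\]

Because \(s_x\) and \(s_y\) are both positive, \(\hat{\beta}_1\) and \(r\) must have the same sign, even though they are usually different numbers.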
Parameter Estimation
An estimate of \(\sigma\) is more complicated to explain (take more stats courses), so for purposes of this class, the computer estimates it for us.
How do we interpret \(\hat{\sigma}\)?
- On average, students’ actual heights are about 3.776 inches away from the estimated heights.
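For reference, a common estimate of \(\sigma\) is the square root of the sum of squared residuals divided by \(n - 2\). The Python sketch below applies that formula to made-up data with an assumed fitted line (intercept 30.2, slope 0.6). Whether the course software uses exactly this formula is an assumption.

```python
import math

def residual_sd(xs, ys, b0_hat, b1_hat):
    """Estimate sigma as sqrt(sum of squared residuals / (n - 2))."""
    sse = sum((y - (b0_hat + b1_hat * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (len(xs) - 2))

# Made-up data and an assumed fitted line (not the course data).
mother = [61, 62, 63, 64, 65]
student = [66, 68, 69, 68, 69]

sigma_hat = residual_sd(mother, student, 30.2, 0.6)
print(round(sigma_hat, 3))
```

The result is read the same way as the 3.776 above: a typical distance, in inches, between an observed height and the line.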
Practice 6.2 Question 4
Which of the following is the fitted regression line (i.e. the equation of the line of best fit) for the melanoma example?
- \(\hat{y} = - 5.9776 + 389.1894 \times x_i\)
- \(\hat{y} = 389.1894 - 5.9776\times x_i\)
- \(\hat{y} = 389.1894 + 19.115\times x_i\)
- \(\hat{y} = 19.115 -5.9776\times x_i\)
Practice 6.2 Question 4 Answer
Which of the following is the fitted regression line (i.e. the equation of the line of best fit) for the melanoma example?
- \(\hat{y} = - 5.9776 + 389.1894 \times x_i\)
- \(\hat{y} = 389.1894 - 5.9776\times x_i\) (correct answer)
- \(\hat{y} = 389.1894 + 19.115\times x_i\)
- \(\hat{y} = 19.115 -5.9776\times x_i\)
Practice 6.2 Question 5
How would you interpret the estimated slope in the melanoma example?
- As mortality increases by 1, the latitude is expected to decrease by 5.9776.
- As latitude increases by 1, the mortality is expected to decrease by 5.9776.
- As latitude increases by 1, the mortality will decrease by 5.9776.
- As mortality increases by 1, the latitude is expected to increase by 389.1894.
- As latitude increases by 1, the mortality is expected to increase by 389.1894.
- The average distance from the observed mortality to the expected (predicted) mortality is 19.115.
Practice 6.2 Question 5 Answer
How would you interpret the estimated slope in the melanoma example?
- As mortality increases by 1, the latitude is expected to decrease by 5.9776.
- As latitude increases by 1, the mortality is expected to decrease by 5.9776. (correct answer)
- As latitude increases by 1, the mortality will decrease by 5.9776.
- As mortality increases by 1, the latitude is expected to increase by 389.1894.
- As latitude increases by 1, the mortality is expected to increase by 389.1894.
- The average distance from the observed mortality to the expected (predicted) mortality is 19.115.
Assessing Model Fit
Coming back to the student height example, we had \(\hat{\sigma} =\) 3.776, which we interpret as the typical distance between the actual heights and the predicted heights. Does \(\hat{\sigma} =\) 3.776 mean that the observations are “close” to the line or not?
- It’s hard to tell just from \(\hat{\sigma}\) if this is “good” or “bad” because it depends on the problem. A better measure would be a standardized measure that can be used for all regression problems.
Assessing Model Fit
Mathematical formula: \[
R^2 = 1 - \frac{\sum_{i=1}^n (Y_i - (\hat{\beta}_0 + \hat{\beta}_1X_i))^2}{\sum_{i=1}^n (Y_i - \bar{y})^2} = 0.12802
\]
Interpretation:
- Formal interpretation: The proportion of variability in \(Y\) that is explained by \(X\) (often reported as a percent).
- \(R^2\) is between 0 and 1 with 1 meaning the data perfectly follow a line and 0 meaning the data don’t follow the line at all.
- Intuition: you can think of \(R^2\) as a “grade” for your regression line where \(R^2 = 1\) is a perfect line and \(R^2 = 0\) is a terrible line.
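The formula can be sketched directly in Python. The data and the fitted line (intercept 30.2, slope 0.6) below are made up for illustration and are not the course data.

```python
def r_squared(xs, ys, b0_hat, b1_hat):
    """1 - (squared distances to the line) / (squared distances to the mean)."""
    y_bar = sum(ys) / len(ys)
    sse = sum((y - (b0_hat + b1_hat * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst

# Made-up data and an assumed fitted line (not the course data).
mother = [61, 62, 63, 64, 65]
student = [66, 68, 69, 68, 69]

r2 = r_squared(mother, student, 30.2, 0.6)
print(round(r2, 3))
```

For this toy data \(R^2\) comes out to about 0.6, meaning roughly 60% of the variability in the made-up student heights is explained by mother's height.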
Practice 6.2 Question 6
What is the correct interpretation of \(R^2 = 0.6798\) in the Melanoma example?
- As latitude increases by 1, mortality increases by 0.6798 on average.
- If latitude is zero (on the equator), mortality is expected to be 0.6798.
- Observed mortality is about 0.6798 away from the expected (predicted) mortality.
- About 67.98% of the variation in mortality is explained by latitude.
- About 67.98% of the variation in latitude is explained by mortality.
Practice 6.2 Question 6 Answer
What is the correct interpretation of \(R^2 = 0.6798\) in the Melanoma example?
- As latitude increases by 1, mortality increases by 0.6798 on average.
- If latitude is zero (on the equator), mortality is expected to be 0.6798.
- Observed mortality is about 0.6798 away from the expected (predicted) mortality.
- About 67.98% of the variation in mortality is explained by latitude. (correct answer)
- About 67.98% of the variation in latitude is explained by mortality.
Additional SLR Practice
Does a higher GPA lead to better pay? Use the salary data and a simple linear regression model to answer the following questions:
- What is the estimated pay for someone who completely fails college (0.0 GPA)?
- For two people who differ by 1.0 GPA, how much higher (or lower) should the pay be for the person with the higher GPA on average?
- On average, how far away are pay amounts from estimated pay amounts?
- How well does the GPA explain pay?
Additional SLR Practice Answers
Does a higher GPA lead to better pay? Use a simple linear regression model (and the course app) to answer the following questions (Salary dataset):
- What is the estimated pay for someone who completely fails college (0.0 GPA)?
  - \(\hat{\beta}_0 = 51135.68\)
- For two people who differ by 1.0 GPA, how much higher (or lower) should the pay be for the person with the higher GPA on average?
  - \(\hat{\beta}_1 = 6510.04\)
- On average, how far away are pay amounts from estimated pay amounts?
  - \(\hat{\sigma} = 10353.03\)
- How well does the GPA explain pay?
Key Terminology
- Least squares
- Simple linear regression model
- Slope
- Intercept
- \(R^2\)
- Relationship between correlation and slope
- Spread about regression line (\(\sigma\))