Step 0: Substantive Meaning
This section shows key components involved in the substantive meaning of a regression: the equation, the variables, and the unit of observation.
The example below shows the relationship between each county’s shift towards Donald Trump in 2016 and its median income. The simple linear regression function used to estimate this relationship can be expressed as
\[\begin{align} y &= \beta x + \beta_0 + \epsilon \\ \text{shift to Trump} & = \beta * \text{median income (\$1,000s)} + \beta_0+ \epsilon, \end{align}\]where
- the unit of observation is one county,
- \(y\) represents the shift to Trump,
- How much more the county voted for Trump in 2016 than it voted for Romney in 2012.
- \(x\) represents median income ($1,000s).
- County median income is expressed in units of $1,000. That is, a county with a $50,000 median income will be coded as 50. This is done to improve your ability to read the regression results.
The interactive graphic below shows each county as well as the best fitting regression line. The gray area around the regression line represents an idea of the statistical significance of the effect of county median income.