A multiple linear regression
A multiple linear regression extends simple linear regression to more than one \(x\) variable. It produces a flat regression surface that has as many dimensions as there are \(x\) variables in the equation. A regression surface with two \(x\) variables can be visualized in a three-dimensional image, as shown below. Because we live in a three-dimensional world, visualizations of regression surfaces with more than two \(x\) variables tend to be difficult or impossible to interpret.
In a multiple linear regression function,
- \(y\) is the outcome/dependent variable,
- \(n\) is the number of \(x\) variables,
- \(x_i\), where \(i \in \{1, \dots, n\}\), is an explanatory/independent variable,
- \(\beta_i\) is a coefficient for the corresponding \(x_i\) variable,
- \(\beta_0\) is the \(y\)-intercept, and
- \(\epsilon\) is the error term, which captures the variation in \(y\) that the \(x\) variables do not explain and therefore our uncertainty over the estimated values.
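Assembled from the definitions above, the general function with \(n\) explanatory variables can be written as follows. This is the standard textbook form, arranged with the intercept after the slope terms to match the ordering used in this section.

\[\begin{align} y &= \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \beta_0 + \epsilon. \end{align}\]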
In the case of two \(x\) variables, as in the example shown next, the function can be expressed as
\[\begin{align} y &= \beta_1 x_1 + \beta_2 x_2 + \beta_0 + \epsilon. \end{align}\]
The regression surface for the example used in this section is shown in the interactive graphic below.
- Hover over any point to see information about the county that point represents.
- Click and drag on the graphic to rotate it.
- To reset the image to its starting point, hover over the image and click the house symbol at the top right, above the graphic title. Alternatively, reload the page.
- The regression surface is the flat dark blue surface.
- The gray curved surfaces above and below the regression surface are the bounds of the 95% confidence interval.
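To make the flat surface concrete, here is a minimal sketch of how a two-variable regression plane could be estimated and evaluated. It assumes ordinary least squares as the estimation method and uses synthetic data. The county data behind the graphic is not reproduced here, so the sample size, the "true" coefficients, and the helper name `fit_plane` are all hypothetical.

```python
import numpy as np


def fit_plane(x1, x2, y):
    """Estimate (beta_0, beta_1, beta_2) by ordinary least squares."""
    # Design matrix: a column of ones for the intercept, then x1 and x2.
    X = np.column_stack([np.ones_like(x1), x1, x2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Synthetic data (hypothetical values, chosen only for illustration).
rng = np.random.default_rng(0)
x1 = rng.uniform(0, 10, size=200)
x2 = rng.uniform(0, 10, size=200)
epsilon = rng.normal(0, 0.1, size=200)          # error term
y = 2.0 * x1 - 0.5 * x2 + 1.5 + epsilon         # beta_1, beta_2, beta_0

beta_0, beta_1, beta_2 = fit_plane(x1, x2, y)

# Evaluate the fitted surface on a grid of (x1, x2) values.
# Because the function is linear in x1 and x2, this grid is a flat plane.
grid_x1, grid_x2 = np.meshgrid(np.linspace(0, 10, 5), np.linspace(0, 10, 5))
surface = beta_1 * grid_x1 + beta_2 * grid_x2 + beta_0

print(beta_0, beta_1, beta_2)
```

The estimated coefficients should land close to the values used to generate the data, and `surface` holds the heights of the regression plane that a 3D plot would draw.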
For those primarily interested in a visualization of marginal effects, jump to Step 2: Direction of Each Marginal Effect.
Note that nonlinear regression functions also exist; their regression surfaces are curved rather than a straight line or flat surface. The substantive interpretation of a nonlinear regression surface is more involved than that of a flat regression surface or line, but the interpretation of statistical significance remains the same.