Fitting a linear model
In the real world, it is often infeasible to measure every observation in a population. Instead, we can approximate the population model by fitting a model to a sample drawn from the population.
The following notation distinguishes the population model from the sample model:
$$y = \beta_0 + \beta_1 x + \epsilon \quad \text{(population)}$$
$$\hat{y} = b_0 + b_1 x \quad \text{(sample)}$$
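Coefficient formulas
For a single predictor, the least-squares estimates of the slope $b_1$ and intercept $b_0$ are
$$b_1 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2}, \qquad b_0 = \bar{y} - b_1\bar{x}$$
or, in matrix form,
$$\mathbf{b} = (X^{\top}X)^{-1}X^{\top}\mathbf{y}$$
where $X$ is the design matrix with a leading column of ones.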
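As a minimal sketch, the least-squares coefficients of a single-predictor model can be computed both from the scalar formulas and from the matrix form. The toy data below are hypothetical (not from the text), and NumPy is an assumed choice of library.

```python
import numpy as np

# Hypothetical toy sample, made up for illustration.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])

# Scalar formulas: slope from centered cross-products, intercept from the means.
x_bar, y_bar = x.mean(), y.mean()
b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
b0 = y_bar - b1 * x_bar

# Matrix form: solve the normal equations (X^T X) b = X^T y.
# The design matrix X has a leading column of ones for the intercept.
X = np.column_stack([np.ones_like(x), x])
b = np.linalg.solve(X.T @ X, X.T @ y)

print("scalar formulas:", b0, b1)
print("matrix form:    ", b)
```

Both routes give the same intercept and slope. Solving the normal equations with `np.linalg.solve` avoids forming the explicit inverse, which is usually the more numerically stable choice.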
An estimator for error variance
The best estimate of the error variance $\sigma^2$ is the variance of the residuals $e_i = y_i - \hat{y}_i$, which are the estimates of the errors $\epsilon_i$.


Comparison of regression lines fitted to population data and sample data
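| Regression equation | Parameters (estimates) | Data | Notes |
|---|---|---|---|
| $y = \beta_0 + \beta_1 x + \epsilon$ | $\beta_0, \beta_1$ | Population | Rarely done, since population data is not always available |
| $\hat{y} = b_0 + b_1 x$ | $b_0, b_1$ | Sample | $b_0, b_1$ are estimators of $\beta_0, \beta_1$ |

| Term | Symbol | Data | Notes |
|---|---|---|---|
| Error | $\epsilon$ | Population | Not known |
| Residual | $e$ | Sample | Estimator of $\epsilon$ |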
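The contrast between the population fit and the sample fit can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population (true intercept 2, true slope 3, standard normal errors), the sample size of 50, and the use of `np.polyfit` are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: y = 2 + 3x + error, with standard normal errors.
N = 100_000
x_pop = rng.uniform(0.0, 10.0, size=N)
eps = rng.normal(0.0, 1.0, size=N)
y_pop = 2.0 + 3.0 * x_pop + eps

# Fit on the whole population (rarely possible in practice).
# np.polyfit returns coefficients from highest degree down: slope, then intercept.
beta1, beta0 = np.polyfit(x_pop, y_pop, deg=1)

# Fit on a random sample of 50 observations drawn without replacement.
idx = rng.choice(N, size=50, replace=False)
x_s, y_s = x_pop[idx], y_pop[idx]
b1, b0 = np.polyfit(x_s, y_s, deg=1)

# Residuals can be computed from the sample alone. The errors are visible
# here only because the population was simulated.
residuals = y_s - (b0 + b1 * x_s)
errors = eps[idx]

print(f"population fit: y = {beta0:.3f} + {beta1:.3f} x")
print(f"sample fit:     y = {b0:.3f} + {b1:.3f} x")
print(f"correlation of residuals with errors: {np.corrcoef(residuals, errors)[0, 1]:.3f}")
```

The sample coefficients land near the population coefficients without matching them exactly, and the residuals track the errors closely without being identical to them.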
Formulas
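Theoretically,
$$\sigma^2 = \frac{\text{SSE}}{N}$$
where $N$ is the population size and SSE is the sum of squared errors, $\sum_{i=1}^{N} \epsilon_i^2$.
Empirically,
$$s^2 = \frac{\text{SSR}}{n - k - 1}$$
where $n$ is the sample size, $k$ is the number of predictor variables, and SSR is the sum of squared residuals, $\sum_{i=1}^{n} e_i^2$.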
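A minimal sketch of the empirical estimator: fit the model, sum the squared residuals, and divide by the sample size minus the number of predictors minus one. The helper name `residual_variance` and the toy data are hypothetical, introduced only for illustration.

```python
import numpy as np

def residual_variance(X, y):
    """Estimate the error variance as (sum of squared residuals) / (n - k - 1).

    X holds one column per predictor (a 1-D input is treated as one predictor).
    An intercept column is added internally.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n, k = X.shape  # n observations, k predictor variables

    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)

    residuals = y - design @ coef
    ss_resid = np.sum(residuals ** 2)  # sum of squared residuals
    return ss_resid / (n - k - 1)

# Hypothetical toy sample with a single predictor.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.0]

s2 = residual_variance(x, y)
print("estimated error variance:", s2)
print("estimated standard deviation:", np.sqrt(s2))
```

The divisor removes one degree of freedom for each estimated coefficient, including the intercept, which is why it is smaller than the sample size.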
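Interpretation of $s$
Although a predicted value $\hat{y}$ generally differs from the actual value $y$, we can expect about 95% of the actual values of $y$ to lie within $\hat{y} \pm 2s$, provided the errors are roughly normally distributed. A good model should give a small $s$.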
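The 95% rule of thumb can be checked on simulated data. The sketch below assumes normally distributed errors and a hypothetical true model (intercept 2, slope 3, error standard deviation 1.5). With strongly non-normal errors the coverage could differ.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sample: y = 2 + 3x + error, with error standard deviation 1.5.
n = 10_000
x = rng.uniform(0.0, 10.0, size=n)
y = 2.0 + 3.0 * x + rng.normal(0.0, 1.5, size=n)

# Fit the line and compute the residuals.
b1, b0 = np.polyfit(x, y, deg=1)
y_hat = b0 + b1 * x
residuals = y - y_hat

# Residual standard deviation with k = 1 predictor.
k = 1
s = np.sqrt(np.sum(residuals ** 2) / (n - k - 1))

# Fraction of actual values lying within two residual standard deviations
# of the predicted value.
coverage = np.mean(np.abs(residuals) <= 2 * s)

print(f"s = {s:.3f}")
print(f"share of y within 2s of the prediction: {coverage:.3f}")
```

With normal errors the share comes out close to 95%, and the estimated standard deviation comes out close to the simulated 1.5.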