# Mean and predicted response

In linear regression, mean response and predicted response are values of the dependent variable calculated from the regression parameters and a given value of the independent variable. The values of these two responses are the same, but their calculated variances are different.

## Background

In straight line fitting, the model is

$y_{i}=\alpha +\beta x_{i}+\varepsilon _{i},$

where $y_{i}$ is the response variable, $x_{i}$ is the explanatory variable, $\varepsilon _{i}$ is the random error, and $\alpha$ and $\beta$ are parameters. The mean response and the predicted response for a given explanatory value $x_{d}$ are both given by

${\hat {y}}_{d}={\hat {\alpha }}+{\hat {\beta }}x_{d},$

while the actual response would be

$y_{d}=\alpha +\beta x_{d}+\varepsilon _{d}.$

Expressions for the values and variances of ${\hat {\alpha }}$ and ${\hat {\beta }}$ are given in linear regression.
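As a concrete sketch, the least-squares estimates for this straight-line model can be computed directly. The function names below are illustrative, not from any particular library:

```python
def fit_line(x, y):
    """Ordinary least-squares estimates (alpha_hat, beta_hat) for
    y_i = alpha + beta * x_i + eps_i."""
    m = len(x)
    x_bar = sum(x) / m
    y_bar = sum(y) / m
    # beta_hat = sum((x_i - x_bar)(y_i - y_bar)) / sum((x_i - x_bar)^2)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_hat = sxy / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat

def response_estimate(alpha_hat, beta_hat, x_d):
    """Point estimate y_hat_d; identical for mean and predicted response."""
    return alpha_hat + beta_hat * x_d
```

Note that `response_estimate` returns a single number: the distinction between the two responses lies entirely in the variance attached to it, developed below.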

## Mean response

Since the data in this context consist of an (x, y) pair for every observation, the mean response at a given value of x, say $x_{d}$, is an estimate of the mean of the y values in the population at that x value, that is ${\hat {E}}(y\mid x_{d})\equiv {\hat {y}}_{d}\!$ . The variance of the mean response is given by

$\operatorname {Var} \left({\hat {\alpha }}+{\hat {\beta }}x_{d}\right)=\operatorname {Var} \left({\hat {\alpha }}\right)+\left(\operatorname {Var} {\hat {\beta }}\right)x_{d}^{2}+2x_{d}\operatorname {Cov} \left({\hat {\alpha }},{\hat {\beta }}\right).$

This expression can be simplified to

$\operatorname {Var} \left({\hat {\alpha }}+{\hat {\beta }}x_{d}\right)=\sigma ^{2}\left({\frac {1}{m}}+{\frac {\left(x_{d}-{\bar {x}}\right)^{2}}{\sum (x_{i}-{\bar {x}})^{2}}}\right),$

where m is the number of data points.
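The simplified expression translates directly into code. This sketch (with an illustrative function name) takes the error variance $\sigma^{2}$ as a known input; in practice it would be estimated from the residuals:

```python
def mean_response_variance(x, x_d, sigma2):
    """Var(alpha_hat + beta_hat * x_d)
    = sigma^2 * (1/m + (x_d - x_bar)^2 / sum((x_i - x_bar)^2))."""
    m = len(x)
    x_bar = sum(x) / m
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sigma2 * (1.0 / m + (x_d - x_bar) ** 2 / sxx)
```

The formula makes the behaviour easy to read off: the variance is smallest when $x_{d}={\bar {x}}$, where it reduces to $\sigma^{2}/m$, and grows quadratically as $x_{d}$ moves away from ${\bar {x}}$.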

To demonstrate this simplification, one can make use of the identity

$\sum (x_{i}-{\bar {x}})^{2}=\sum x_{i}^{2}-{\frac {1}{m}}\left(\sum x_{i}\right)^{2}.$

## Predicted response

The predicted response distribution is the predicted distribution of the residuals at the given point $x_{d}$. So the variance is given by

$\operatorname {Var} \left(y_{d}-\left[{\hat {\alpha }}+{\hat {\beta }}x_{d}\right]\right)=\operatorname {Var} (y_{d})+\operatorname {Var} \left({\hat {\alpha }}+{\hat {\beta }}x_{d}\right).$

The second part of this expression was already calculated for the mean response. Since $\operatorname {Var} (y_{d})=\sigma ^{2}$ (a fixed but unknown parameter that can be estimated), the variance of the predicted response is given by

$${\begin{aligned}\operatorname {Var} \left(y_{d}-\left[{\hat {\alpha }}+{\hat {\beta }}x_{d}\right]\right)&=\sigma ^{2}+\sigma ^{2}\left({\frac {1}{m}}+{\frac {\left(x_{d}-{\bar {x}}\right)^{2}}{\sum (x_{i}-{\bar {x}})^{2}}}\right)\\[4pt]&=\sigma ^{2}\left(1+{\frac {1}{m}}+{\frac {(x_{d}-{\bar {x}})^{2}}{\sum (x_{i}-{\bar {x}})^{2}}}\right).\end{aligned}}$$

## Confidence intervals

The $100(1-\alpha )\%$ confidence intervals are computed as ${\hat {y}}_{d}\pm t_{{\frac {\alpha }{2}},m-n-1}{\sqrt {\operatorname {Var} }}$ , where $\operatorname {Var}$ is the corresponding variance (mean or predicted response). Thus, the confidence interval for the predicted response is wider than the interval for the mean response. This is expected intuitively: the variance of the population of $y$ values does not shrink when one samples from it, because the random variable $\varepsilon _{i}$ does not decrease, but the variance of the mean of the $y$ values does shrink with increased sampling, because the variances of ${\hat {\alpha }}$ and ${\hat {\beta }}$ decrease, so the mean response estimate ${\hat {y}}_{d}$ becomes closer to $\alpha +\beta x_{d}$ .

This is analogous to the difference between the variance of a population and the variance of the sample mean of a population: the variance of a population is a parameter and does not change, but the variance of the sample mean decreases with increased samples.

## General linear regression

The general linear model can be written as

$y_{i}=\sum _{j=1}^{n}X_{ij}\beta _{j}+\varepsilon _{i}.$

Therefore, since ${\hat {y}}_{d}=\sum _{j=1}^{n}X_{dj}{\hat {\beta }}_{j},$ the general expression for the variance of the mean response is

$\operatorname {Var} \left(\sum _{j=1}^{n}X_{dj}{\hat {\beta }}_{j}\right)=\sum _{i=1}^{n}\sum _{j=1}^{n}X_{di}S_{ij}X_{dj},$

where S is the covariance matrix of the parameters, given by

$\mathbf {S} =\sigma ^{2}\left(\mathbf {X^{\mathsf {T}}X} \right)^{-1}.$ 
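In matrix form the double sum is the quadratic form $\mathbf {x} _{d}^{\mathsf {T}}\mathbf {S} \mathbf {x} _{d}$, which a numerical sketch makes concrete. The function name is illustrative, and as before $\sigma^{2}$ is taken as given:

```python
import numpy as np

def mean_response_variance_general(X, x_d, sigma2):
    """Variance of the mean response in the general linear model:
    x_d^T S x_d with S = sigma^2 * (X^T X)^{-1}.
    X is the (m, n) design matrix; x_d the length-n regressor row
    at the new point."""
    S = sigma2 * np.linalg.inv(X.T @ X)  # covariance matrix of beta_hat
    return float(x_d @ S @ x_d)
```

Taking $\mathbf {X}$ with a column of ones and a column of $x_{i}$ values, and $\mathbf {x} _{d}=(1,x_{d})$, recovers the straight-line result $\sigma ^{2}\left(1/m+(x_{d}-{\bar {x}})^{2}/\sum (x_{i}-{\bar {x}})^{2}\right)$ as a special case.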