Least Squares in R
Our target is the average outcome among units with treatment level \(w\) and covariate value \(x\). Equivalently, it is a conditional expectation when units are drawn uniformly at random.
\[ \begin{aligned} \mu(w,x) &= \frac{1}{m_{w,x}}\sum_{j:w_j=w,x_j=x} y_j \qfor m_{w,x} = \sum_{j: w_j=w, x_j=x} 1 \\ &= \mathop{\mathrm{E}}[Y_i \mid W_i=w, X_i=x] \qfor (W_i,X_i,Y_i) \qqtext{sampled uniformly-at-random} \end{aligned} \]
We estimate it by least squares: \(\hat\mu\) is the function in our model \(\mathcal{M}\) that best fits the sample. When the model contains all functions of \((w,x)\), this is just the subsample mean.
\[ \begin{aligned} \hat\mu(w,x) &= \mathop{\mathrm{argmin}}_{m \in \mathcal{M}} \sum_{i=1}^n \qty{Y_i - m(W_i,X_i)}^2 \\ &= \frac{1}{N_{w,x}}\sum_{i:W_i=w,X_i=x} Y_i \qqtext{ when } \mathcal{M}= \{ \text{all functions of } (w,x) \} \end{aligned} \]
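To make that formula concrete, here is a small sketch that computes the subsample-mean estimator directly in base R. The data frame `sam` and its columns are simulated stand-ins for illustration, not the course data:

```r
# Simulated sample: binary treatment w, discrete covariate x, outcome y.
set.seed(1)
sam <- data.frame(w = rbinom(100, 1, 0.5),
                  x = sample(c(8, 10, 12), 100, replace = TRUE))
sam$y <- 2 * sam$w + sam$x + rnorm(100)

# muhat(w, x) for the all-functions model: the mean of y within each (w, x) cell.
aggregate(y ~ w + x, data = sam, FUN = mean)
```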
The model \(\mathcal{M}\) encodes what we are willing to assume about the shape of \(\mu\). For example:
\[ \begin{aligned} \mathcal{M}&= \{ \text{all functions } \ m(x) \} && \text{all functions} \\ \mathcal{M}&= \{ \text{all functions } \ m(c(x)) \} && \text{all functions of $c(x)$, a coarsened version of $x$} \\ \mathcal{M}&= \{ m(x) = a + bx \ : \ a,b \in \mathbb{R} \} && \text{all lines} \end{aligned} \]
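Each of these models has a natural expression in R's formula language. A sketch, reusing the simulated `sam` from above; the coarsening function `coarsen` is a made-up example, and any many-to-one function of `x` would do:

```r
# All functions of x: factor(x) fits one mean per observed level of x.
all.functions <- lm(y ~ factor(x), data = sam)

# All functions of a coarsened x.
coarsen <- function(x) x >= 10
coarsened <- lm(y ~ factor(coarsen(x)), data = sam)

# All lines: an intercept a and a slope b.
line <- lm(y ~ 1 + x, data = sam)
```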
We use the R function `lm`[^1] to do least squares with linear models. For example, `lm(y ~ 1 + x, data=unisam)` fits a line. In the call above, `y ~ 1 + x` is the formula and `unisam` is the data: a data frame whose columns we access as `unisam$x[i]` and `unisam$y[i]` for \(i=1 \ldots n\). What `lm` gives you is not really the function \(\hat\mu\).
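A minimal runnable version of that call. Since all we know about `unisam` is that it has columns `x` and `y`, the data here is simulated for illustration:

```r
# Stand-in for unisam: n observations of (x, y) with a roughly linear relationship.
set.seed(1)
unisam <- data.frame(x = runif(100, 6, 14))
unisam$y <- 3 + 2 * unisam$x + rnorm(100)

fitted.model <- lm(y ~ 1 + x, data = unisam)
fitted.model   # prints the call and the fitted intercept and slope
```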
To evaluate it, we use `predict`. Here we use `predict` to calculate \(\hat\mu(8)\), \(\hat\mu(10)\), and \(\hat\mu(12)\), wrapping it in a function `muhat` that calls `predict` on its argument, as in the sketch below.
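One way to write that wrapper, continuing from the fit above:

```r
# muhat(x) evaluates the fitted function at new x values via predict.
muhat <- function(x) predict(fitted.model, newdata = data.frame(x = x))

muhat(c(8, 10, 12))   # hat-mu(8), hat-mu(10), hat-mu(12)
```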
`lm` returns an object, which we call `fitted.model`; `predict` is one of its methods. And we are not limited to lines. Below, I define functions `b.0` and `b.1` to use as my basis. Try replacing `b.1` in the formula with `sin`.
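A sketch of a user-defined basis. The particular forms of `b.0` and `b.1` below (constant and identity) are assumptions for illustration; the point is just that any R function can appear in a formula:

```r
# Basis functions: the fitted curve is a linear combination of these.
b.0 <- function(x) rep(1, length(x))   # constant function (assumed for illustration)
b.1 <- function(x) x                   # identity function (assumed for illustration)

basis.fit <- lm(y ~ b.0(x) + b.1(x), data = unisam)
# Because R adds its own intercept, the constant b.0(x) column is redundant
# here and its coefficient is reported as NA; more on intercepts below.

# Swapping b.1 for sin fits y = a + b*sin(x) instead of a line.
sin.fit <- lm(y ~ b.0(x) + sin(x), data = unisam)
```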
Try adding `b.2(x)` to the formula `y ~ b.0(x) + b.1(x)` above. You'll need to define `b.2`, just as we defined `b.0(x)` and `b.1(x)`. Note that `x^2` won't do in place of `b.2(x)`: inside a formula, the caret has a special meaning, so `y ~ x^2` fits the same model as `y ~ x`. Use `I(x^2)` instead; `I` says ‘interpret this the normal way.’ As an exercise, try fitting the model `y ~ b.0(x) + b.1(x) + b.2(x)` without using `b.0`, `b.1`, or `b.2`. One way to do it is sketched below.
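A sketch of the quadratic versions, continuing with the assumed basis above:

```r
b.2 <- function(x) x^2                 # assumed form, for illustration

quad.fit  <- lm(y ~ b.0(x) + b.1(x) + b.2(x), data = unisam)

# The same fit without defining b.2: wrapping the square in I() makes
# the formula machinery interpret ^ as ordinary arithmetic.
quad.fit2 <- lm(y ~ b.0(x) + b.1(x) + I(x^2), data = unisam)

# One answer to the exercise, with no user-defined basis at all
# (assuming b.0 and b.1 are the constant and identity, as above).
quad.fit3 <- lm(y ~ 1 + x + I(x^2), data = unisam)
```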
What if you drop `b.0(x)` from the formula `y ~ b.0(x) + b.1(x)` above? You still get a constant term, because R adds an intercept to every formula unless you ask it not to. To fit without one, use `y ~ 0 + b.1(x)` instead.
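A quick check of that behavior, again with the illustrative basis:

```r
# Dropping b.0(x) changes nothing visible: R supplies an intercept anyway.
with.const <- lm(y ~ b.1(x), data = unisam)

# The 0 (equivalently, -1) removes the automatic intercept.
no.const <- lm(y ~ 0 + b.1(x), data = unisam)

coef(with.const)   # (Intercept) and b.1(x)
coef(no.const)     # b.1(x) only
```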
[^1]: ‘lm’ stands for ‘linear model’.