StoneTemplePython said:
If you want to think clearly, you need to write the math clearly.
Fair enough.
What I'm trying to do is estimate a linear fit through all of my data. Fitting through the means is appropriate for normally distributed data. That is, if we assume the entries of ##\mathbf{y}_i## (the ##i##th row of the matrix ##\mathbf{Y}##) are distributed as ##N(\mu_y(x_i),\sigma_y)##, then we can model the mean as ##\mu_y(x_i) = a + bx_i##. The fit coefficients would then be found by the process laid out earlier:
$$\mathbf{a} = \begin{bmatrix}a\\b\end{bmatrix}=\mathbf{R}^{-1}\mathbf{Q}^{T}\boldsymbol{\mu}_y$$
where ##\boldsymbol{\mu}_y = \begin{bmatrix}\mu_y(x_1) & \mu_y(x_2) & \cdots & \mu_y(x_n)\end{bmatrix}^T## and ##\mathbf{X}=\mathbf{QR}## is the ##\mathit{QR}## factorization of ##\mathbf{X}##. This would be the best-fit line through the means of the data, and I suspect this also minimizes the L2 norm over all the data, since we are assuming a symmetric distribution at each ##x_i## (though I've not attempted to work through the math).
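As a numerical check on that suspicion, here is a minimal NumPy sketch. The data are synthetic and deliberately skewed, and the names `fit_line_qr`, `coef_means`, and `coef_pooled` are just illustrative. It fits a line through the row means using the QR route above, then fits a line to all ##n \times m## points pooled together, and compares the two.

```python
import numpy as np

def fit_line_qr(x, y):
    """Least-squares coefficients [a, b] for y ~ a + b*x via reduced QR."""
    X = np.column_stack([np.ones_like(x, dtype=float), x])  # n x 2 design matrix
    Q, R = np.linalg.qr(X)               # reduced QR: Q is n x 2, R is 2 x 2
    return np.linalg.solve(R, Q.T @ y)   # solve R a = Q^T y instead of forming R^{-1}

# Synthetic example (assumption): n values of x, m samples of y at each x,
# with skewed (exponential) noise rather than normal noise.
rng = np.random.default_rng(0)
n, m = 8, 50
x = np.linspace(0.0, 7.0, n)
Y = 1.5 + 0.8 * x[:, None] + rng.exponential(scale=2.0, size=(n, m))

# Fit through the row means.
coef_means = fit_line_qr(x, Y.mean(axis=1))

# Fit through all n*m points at once: each x_i is repeated m times,
# matching the row-major order of Y.ravel().
coef_pooled = fit_line_qr(np.repeat(x, m), Y.ravel())

print(coef_means, coef_pooled)
```

The two coefficient vectors agree to rounding error. When every ##x_i## has the same number of samples ##m##, the normal equations for the pooled fit are just ##m## times those for the fit through the means, so the agreement does not depend on the noise being symmetric.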
In the case of the data that I am working with, I cannot assume a normal distribution, nor can I necessarily assume a symmetric distribution. So, I am a little hesitant to simply fit through the means or medians of the data. Furthermore, I cannot blindly extend the approach outlined above to my matrix of dependent variables. If I tried to do so, this would result in ##m## estimates for ##a## and ##m## estimates for ##b##. That is, if
$$\mathbf{Y} = \begin{bmatrix}y_{11} & y_{12} & \cdots & y_{1m} \\ y_{21} & y_{22} & \cdots & y_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ y_{n1} & y_{n2} & \cdots & y_{nm}\end{bmatrix}$$
then
$$\mathbf{R}^{-1}\mathbf{Q}^{T}\mathbf{Y} = \begin{bmatrix} a_1 & a_2 & \cdots & a_m\\ b_1 & b_2 & \cdots & b_m\end{bmatrix}$$
This is clearly undesirable because it raises the question: Which coefficient values should I use? Furthermore, it treats each column of the ##\mathbf{Y}## matrix as a single sequence, with each pair ##\left\{a_j, b_j\right\}## fitted to the ##j##th column alone.
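To make the column-wise behavior concrete, here is a small NumPy sketch on synthetic data (the sizes, seed, and variable names are assumptions for illustration only). It computes the ##2 \times m## coefficient matrix, then shuffles the samples within each row to show that the individual column fits depend on an ordering that may be arbitrary, while the average over columns does not.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 4
x = np.arange(n, dtype=float)
Y = 2.0 + 0.5 * x[:, None] + rng.normal(size=(n, m))  # synthetic n x m data

X = np.column_stack([np.ones(n), x])   # n x 2 design matrix
Q, R = np.linalg.qr(X)                 # reduced QR factorization

# One [a_j, b_j] pair per column of Y: C has shape (2, m).
C = np.linalg.solve(R, Q.T @ Y)

# Shuffle the samples within each row independently, then refit.
Y_shuffled = rng.permuted(Y, axis=1)
C_shuffled = np.linalg.solve(R, Q.T @ Y_shuffled)

# Averaging the column-wise coefficients is the same as fitting the row means,
# because the map Y -> R^{-1} Q^T Y is linear.
coef_of_means = np.linalg.solve(R, Q.T @ Y.mean(axis=1))
mean_of_coefs = C.mean(axis=1)

print(C)
print(C_shuffled)
print(coef_of_means, mean_of_coefs, C_shuffled.mean(axis=1))
```

So if the columns have no meaning as sequences, the only ordering-independent summary of the ##m## pairs obtained this way is their average, which is just the fit through the row means again.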
StoneTemplePython said:
It's not clear what robust ##a## and ##b## actually means -- maybe you meant stable not robust here? They are different
Agreed; I was not clear on what I wanted. I am hoping to get a few more options for fitting a line through the data. One obvious option is what I mentioned: Fit through the means and/or medians of each ##\mathbf{y}_i## (the ##i##th row of the data matrix ##\mathbf{Y}##). Perhaps another option would be drawing a random subsample from each row, fitting through the means and/or medians of the row subsamples, and repeating this many times until I get a distribution for the linear fit coefficients. Are there other options? I would like something that I can defend in terms of either the L1 or L2 norm (if this is even possible).
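For concreteness, here is a rough sketch of the resampling idea in NumPy. It is only one way to set it up: I am assuming resampling with replacement within each row (a bootstrap), and the function name `bootstrap_line_fits`, its parameters, and the synthetic data are all hypothetical.

```python
import numpy as np

def bootstrap_line_fits(x, Y, n_boot=500, stat=np.median, seed=0):
    """Resample each row of Y with replacement, summarize each row with `stat`,
    and least-squares fit a line through the summaries. Returns (n_boot, 2)."""
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    X = np.column_stack([np.ones(n), x])   # n x 2 design matrix
    Q, R = np.linalg.qr(X)                 # factor once, reuse for every resample
    rows = np.arange(n)[:, None]           # broadcasts against the column indices
    coefs = np.empty((n_boot, 2))
    for k in range(n_boot):
        idx = rng.integers(0, m, size=(n, m))   # column indices, drawn per row
        summary = stat(Y[rows, idx], axis=1)    # one mean/median per row
        coefs[k] = np.linalg.solve(R, Q.T @ summary)
    return coefs

# Synthetic skewed data (assumption), only to exercise the function.
rng = np.random.default_rng(2)
n, m = 10, 40
x = np.linspace(0.0, 9.0, n)
Y = 1.0 + 0.3 * x[:, None] + rng.exponential(scale=1.5, size=(n, m))

coefs = bootstrap_line_fits(x, Y, n_boot=500, stat=np.median)
ci = np.percentile(coefs, [2.5, 97.5], axis=0)   # row 0: lower, row 1: upper; columns: a, b
print(coefs.mean(axis=0))
print(ci)
```

Passing `stat=np.mean` gives the mean-based version. Note that this still uses a least-squares (L2) fit through the row summaries, so fitting through medians this way is not the same thing as minimizing the L1 norm over all the data.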
Thanks again for all the help.