Poly Regression Matrix

by jjj888
Tags: matrix, poly, regression
jjj888
#1
Dec27-12, 12:04 PM
P: 23
I'm trying to understand the derivation of polynomial regression.

Given the data points (-1,-1), (2,-1), (6,-2), a 2nd-degree fit should be a concave-downward parabola. My calculator produces the equation -0.0357x^2 + 0.0357x - 0.9285, which fits the data well. But when I try to do it manually in matrix form I get a different answer. My summed-up matrix looks like this:

| 3 7 39| |a| = 0.3436
| 7 39 233| |b| = 0.1237
|39 233 1311| |c| = -0.1512

Which leads me to think the equation should be 0.3436x^2 + 0.1237x - 0.1512. Of course this is wrong, so I must have set the matrix up incorrectly. Can anyone offer any clarification?

Thanks
jjj888
#2
Dec27-12, 12:11 PM
P: 23
I wrote the matrix down wrong. Here is what I had.

| 3 7 39| |a| = -4
| 7 39 233| |b| = -28
|39 233 1311| |c| = -156

a = 0.3436
b = 0.1237
c = -0.1512
TheoMcCloskey
#3
Dec27-12, 12:32 PM
P: 182
I think you may want to calculate those 'normal-equation' elements again. Can you show more detail?

jjj888
#4
Dec27-12, 01:29 PM
P: 23
| n     Σx    Σx^2 | |a|   | Σy    |
| Σx    Σx^2  Σx^3 | |b| = | Σxy   |
| Σx^2  Σx^3  Σx^4 | |c|   | Σx^2y |

This was the base matrix I used, obtained from the partial derivatives of the squared errors.
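
For reference, here is a sketch of where that base matrix comes from. It assumes the model is written as y = a + bx + cx^2, so that a is the constant term and c multiplies x^2; that labeling is an assumption chosen to match the row and column order of the matrix. The sum of squared errors is

[tex]S(a,b,c) = \sum_{i=1}^{n} \left( y_i - a - b\,x_i - c\,x_i^2 \right)^2[/tex]

Setting the three partial derivatives [itex]\partial S/\partial a[/itex], [itex]\partial S/\partial b[/itex], [itex]\partial S/\partial c[/itex] to zero and dividing by -2 gives

[tex]\begin{aligned}
a\,n + b\,\Sigma x + c\,\Sigma x^2 &= \Sigma y \\
a\,\Sigma x + b\,\Sigma x^2 + c\,\Sigma x^3 &= \Sigma xy \\
a\,\Sigma x^2 + b\,\Sigma x^3 + c\,\Sigma x^4 &= \Sigma x^2 y
\end{aligned}[/tex]

With this ordering the first unknown of the solved system is the constant term and the last is the coefficient of x^2.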
TheoMcCloskey
#5
Jan2-13, 08:20 AM
P: 182
Your structure is correct, but I don't get the matrix elements you have for the given data points. I do agree with your calculator results.

Data points:
(-1,-1)
(2,-1)
(6,-2)

[tex]n = 3 \quad \Sigma \,x = 7 \quad \Sigma \,x^2 = 41[/tex]
[tex]\quad \Sigma \, x^3 = 223 \quad \Sigma \, x^4 = 1313[/tex]

etc ... Recompute your elements (both left and right hand side) and solve system again.
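
As a check on the recomputation suggested above, here is a minimal Python sketch that forms the sums and solves the 3x3 system in exact fractions. The helper names `power_sum` and `solve` are made up for this illustration, and the model is assumed to be y = c0 + c1*x + c2*x^2.

```python
from fractions import Fraction

# Data points from the thread.
points = [(-1, -1), (2, -1), (6, -2)]

def power_sum(p, q=0):
    """Return the sum of x**p * y**q over the data points (hypothetical helper)."""
    return sum(Fraction(x) ** p * Fraction(y) ** q for x, y in points)

# Normal-equation matrix and right-hand side for y = c0 + c1*x + c2*x**2.
A = [[power_sum(i + j) for j in range(3)] for i in range(3)]
rhs = [power_sum(i, 1) for i in range(3)]

def solve(A, b):
    """Gauss-Jordan elimination on exact fractions (hypothetical helper)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Normalize the pivot row, then clear the column in all other rows.
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

coeffs = solve(A, rhs)
print([[str(v) for v in row] for row in A])  # sums n, Σx, ..., Σx^4
print([str(v) for v in rhs])                 # Σy, Σxy, Σx^2y
print([str(c) for c in coeffs])              # ['-13/14', '1/28', '-1/28']
```

The matrix rows come out as (3, 7, 41), (7, 41, 223), (41, 223, 1313) and the right-hand side as (-4, -13, -77), which differ from the values posted earlier in the thread on both sides.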
jjj888
#6
Jan2-13, 08:25 AM
P: 23
Thanks.

I did discover my x sums were wrong. I also noticed that the matrix values and the calculator values differ slightly in the later decimal places. Should I assume the matrix result is more accurate because the calculator uses some other algorithm? Or did your values match exactly?
TheoMcCloskey
#7
Jan2-13, 08:52 AM
P: 182
Well, I did my calculation via Excel to get a quick check on your results - that is, I actually formed the normal equations (A'A c = A'y) and solved by computing the inverse (yech!) of A'A.

I would not usually do this as the normal equations can tend to be ill-conditioned, but this is a small enough system such that the results shouldn't be impacted too much. Certainly, however, a Cholesky Decomposition would typically be preferred on these types of problems instead of computing/using Inverse. For larger problems, more stable methods are usually employed.
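
To illustrate the Cholesky route mentioned above, here is a minimal sketch in plain Python. It assumes the normal-equation matrix and right-hand side for this thread's three data points (with the model y = c0 + c1*x + c2*x^2); the function names `cholesky` and `cholesky_solve` are hypothetical, not from any library.

```python
import math

# Normal equations A c = b for the thread's data, y = c0 + c1*x + c2*x**2.
A = [[3.0, 7.0, 41.0],
     [7.0, 41.0, 223.0],
     [41.0, 223.0, 1313.0]]
b = [-4.0, -13.0, -77.0]

def cholesky(A):
    """Return lower-triangular L with L * L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def cholesky_solve(A, b):
    """Solve A c = b using the factorization A = L * L^T."""
    n = len(b)
    L = cholesky(A)
    # Forward substitution: L z = b.
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T c = z.
    c = [0.0] * n
    for i in reversed(range(n)):
        c[i] = (z[i] - sum(L[k][i] * c[k] for k in range(i + 1, n))) / L[i][i]
    return c

coeffs = cholesky_solve(A, b)
print(coeffs)  # approximately -0.928571, 0.035714, -0.035714
```

No explicit inverse is formed: the factorization is followed by two triangular solves.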

The values I get for the coefficients are as follows:
c0 = -0.928571429
c1 = 0.035714286
c2 = -0.035714286

These results also agree (to the decimal places listed) with Excel's 'Regression' solver (from its 'Data Analysis' package).

I'm not sure what method your calculator is using. Older calculators would typically use a very similar technique to 'computing the inverse'. Newer, more advanced calculators may perform Cholesky. Also, there may be a significant difference in the computational arithmetic precision between calculator and, say, a computer (eg, 10 decimal digits in calculator vs 16 digits in computer). It all depends on calculator and method.

Try displaying additional decimal places on calculator.

Lastly, note that you are fitting three points to a quadratic, which requires three coefficients. Hence, you're really computing an interpolation in the end, and there are other, more direct methods to come up with these coefficients in that case.
jjj888
#8
Jan2-13, 09:08 AM
P: 23
I know this is going to get a little beyond my grasp, but what causes the matrix to be ill-conditioned? I understand that in the end you are still trying to fit something to something else and there is no exact definition.

What are these more direct methods for finding coefficients for a quadratic? You're saying that with a small number of points a different method might be easier?
TheoMcCloskey
#9
Jan2-13, 10:29 AM
P: 182
A system can become ill-conditioned if its columns start to become linearly dependent. That is, the solver cannot distinguish between two columns within the available numerical precision. As a consequence, a small change in the coefficients can yield very large changes in the solution. See for example http://engrwww.usask.ca/classes/EE/8...%20Systems.pdf

In the case of least squares solutions, using the monomials [itex]x, x^2, x^3, x^4 , \dots[/itex] as a basis for the regression can lead to ill-conditioned systems that are hard to solve with any meaningful (accurate) results. One way to visualize this trouble is to consider the case for [itex]x \in (0,1)[/itex] and how much the monomials look like each other as [itex]x\rightarrow1[/itex].
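
One rough way to put numbers on "look like each other" is a minimal sketch that samples neighboring monomials on (0,1) and measures the cosine of the angle between the resulting columns. The grid of 200 midpoints and the helper `cosine` are assumptions made for this illustration; a value near 1 means the two columns are nearly parallel, i.e. close to linearly dependent.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two vectors (hypothetical helper)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))

# Sample points strictly inside (0, 1): midpoints of 200 equal subintervals.
xs = [(i + 0.5) / 200 for i in range(200)]

similarities = []
for k in range(1, 6):
    col_k = [x ** k for x in xs]           # column for x^k
    col_next = [x ** (k + 1) for x in xs]  # column for x^(k+1)
    similarities.append(cosine(col_k, col_next))
    print(f"x^{k} vs x^{k + 1}: cosine = {similarities[-1]:.4f}")
```

The printed cosines are all close to 1 and move closer as the degree grows, which is the near-dependence that makes the normal equations hard to solve accurately.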

Your particular case is small enough to reasonably avoid these problems. However, if you had additional data and tried higher-degree fits, say of degree 6 or more, you may find trouble numerically solving via the normal equations.

I said more direct methods are available for your particular problem because you have three coefficients to solve for and three data points. For example, just write out the quadratic expression for each of the data points in terms of the unknown coefficients. This yields three linear equations in three unknowns (the coefficients). Solve the system for the coefficients. You should obtain the following:
c0 = -13/14
c1 = 1/28
c2 = -c1 = -1/28

Search the topic "interpolation" for more info/methods.
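
The three-equations-in-three-unknowns approach described above can be sketched as follows. This is only an illustration: it solves the system with Cramer's rule in exact fractions, which is one of several ways to do it, and the helpers `det3` and `replace_column` are hypothetical names.

```python
from fractions import Fraction

# Data points from the thread.
points = [(-1, -1), (2, -1), (6, -2)]

# One equation per point: c0 + c1*x + c2*x**2 = y.
V = [[Fraction(x) ** k for k in range(3)] for x, _ in points]
y = [Fraction(v) for _, v in points]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def replace_column(m, col, values):
    """Return a copy of m with column `col` replaced by `values`."""
    return [[values[r] if c == col else m[r][c] for c in range(3)]
            for r in range(3)]

# Cramer's rule: c_k = det(V with column k replaced by y) / det(V).
d = det3(V)
coeffs = [det3(replace_column(V, k, y)) / d for k in range(3)]
print([str(c) for c in coeffs])  # ['-13/14', '1/28', '-1/28']
```

Because the parabola passes through all three points exactly, these are the same coefficients the least squares fit returns.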
jjj888
#10
Jan8-13, 01:57 PM
P: 23
What if I have a large number of data points, but they all lie roughly in the shape of a parabola? Will I have the same problem with conditioning / dependency?
SteamKing
#11
Jan8-13, 02:58 PM
Emeritus
Sci Advisor
HW Helper
Thanks
PF Gold
P: 6,329
Not necessarily.

Normal equations can blow up, but sometimes they can be made to work, e.g., by scaling the data.

If you don't want to risk problems with the normal equations, polynomial regression can be carried out directly on the original data: build the rectangular matrix whose columns are the powers of x required by the desired polynomial model, and compute the regression coefficients by applying a QR factorization to that matrix rather than forming A'A.
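
A minimal sketch of that idea in plain Python, assuming a made-up data set of 21 points that lie roughly on the parabola y = 2 - 0.5x + 0.25x^2 (the data, the small alternating perturbation, and the function name `qr_least_squares` are all hypothetical). The factorization here uses modified Gram-Schmidt for brevity; library routines typically use Householder reflections instead.

```python
def qr_least_squares(xs, ys, degree):
    """Fit a polynomial by QR (modified Gram-Schmidt) on the design matrix.

    Returns coefficients [c0, c1, ..., c_degree] for c0 + c1*x + ...
    """
    n = degree + 1
    # Columns of the rectangular design matrix: 1, x, x**2, ...
    cols = [[x ** k for x in xs] for k in range(n)]
    Q = []                              # orthonormal columns
    R = [[0.0] * n for _ in range(n)]   # upper-triangular factor
    for j in range(n):
        v = cols[j][:]
        for i in range(j):
            # Remove the component of v along each earlier q_i.
            R[i][j] = sum(q * a for q, a in zip(Q[i], v))
            v = [a - R[i][j] * q for a, q in zip(v, Q[i])]
        R[j][j] = sum(a * a for a in v) ** 0.5
        Q.append([a / R[j][j] for a in v])
    # Solve R c = Q^T y by back substitution.
    qty = [sum(q * y for q, y in zip(Q[j], ys)) for j in range(n)]
    c = [0.0] * n
    for i in reversed(range(n)):
        c[i] = (qty[i] - sum(R[i][k] * c[k] for k in range(i + 1, n))) / R[i][i]
    return c

# Hypothetical data: 21 points near y = 2 - 0.5x + 0.25x^2.
xs = [i * 0.5 for i in range(-10, 11)]
ys = [2.0 - 0.5 * x + 0.25 * x * x + 0.01 * (-1) ** i for i, x in enumerate(xs)]

coeffs = qr_least_squares(xs, ys, 2)
print(coeffs)  # close to [2.0, -0.5, 0.25]
```

The sums of powers of x are never formed, so the conditioning of the problem is that of the data matrix itself rather than of A'A.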