Minimizing square of deviation / curve fitting

In summary, the thread discusses fitting a data set to the curve y = bx^2 + a by minimizing the sum of the squared deviations, preferably using matrices. The squared deviation and the 2x2 system of equations that locates the minimum are given. Cramer's Rule, LU decomposition, and Gaussian elimination (GE) are mentioned for solving the system, and Cramer's Rule is recommended because a 2x2 system is so simple.
  • #1
exmachina

Homework Statement



Given some data set (x_i, y_i), fit it to the curve [tex]y=bx^2+a[/tex] by minimizing the sum of the squared deviations. I would prefer to use matrices.

Homework Equations


The squared deviation for the ith data point is simply:

[tex]d_i^2=(y_i-y(x_i))^2=(y_i-bx_i^2-a)^2[/tex]

The Attempt at a Solution



If I understand correctly, I want to minimize [tex]\sum{d_i^2}[/tex] by differentiating with respect to a and b and setting both derivatives to zero to find the minimum. So far I have differentiated to yield the following system of equations:

[tex]\sum_{i=1}^n y_i = an + b\sum_{i=1}^n x_i^2 [/tex]

[tex]\sum_{i=1}^n{x_i^2y_i}=a\sum_{i=1}^n x_i^2 + b \sum_{i=1}^n x_i^4[/tex]

So I'm stuck choosing between Cramer's Rule, LU, and GE. Cramer's Rule is the easiest to implement, but I don't know how much slower it will be compared to LU/GE. I have around 300 data points.
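
For reference, a sketch of where this system comes from and how it looks in matrix form. The symbol S for the sum of squared deviations is shorthand introduced here, not part of the problem statement.

[tex]S(a,b)=\sum_{i=1}^n (y_i-bx_i^2-a)^2[/tex]

[tex]\frac{\partial S}{\partial a}=-2\sum_{i=1}^n (y_i-bx_i^2-a)=0, \qquad \frac{\partial S}{\partial b}=-2\sum_{i=1}^n x_i^2(y_i-bx_i^2-a)=0[/tex]

Rearranging the two conditions gives a 2x2 linear system:

[tex]\begin{pmatrix} n & \sum x_i^2 \\ \sum x_i^2 & \sum x_i^4 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sum y_i \\ \sum x_i^2 y_i \end{pmatrix}[/tex]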
 
  • #2
It doesn't really matter which method you use: there are only two unknowns, a and b, so it's a 2x2 system. Once you have computed the sums of the various powers of your data points, the rest is easy.
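
To illustrate the point, here is a minimal Python sketch, assuming the data are held in two plain lists. The function name `fit_quadratic_no_linear_term` and the sample data are made up for this example. It accumulates the four sums in one pass over the data and then solves the 2x2 system with a single Gaussian elimination step.

```python
def fit_quadratic_no_linear_term(xs, ys):
    """Least-squares fit of y = b*x**2 + a; returns (a, b)."""
    n = len(xs)
    # One pass over the data to accumulate the four sums in the system.
    s_x2 = s_x4 = s_y = s_x2y = 0.0
    for x, y in zip(xs, ys):
        x2 = x * x
        s_x2 += x2
        s_x4 += x2 * x2
        s_y += y
        s_x2y += x2 * y

    # System to solve:
    #   n*a    + s_x2*b = s_y
    #   s_x2*a + s_x4*b = s_x2y
    # Gaussian elimination: subtract (s_x2 / n) times row 1 from row 2.
    factor = s_x2 / n
    pivot = s_x4 - factor * s_x2
    if pivot == 0.0:
        raise ValueError("all x**2 values are equal; a and b are not unique")
    b = (s_x2y - factor * s_y) / pivot
    a = (s_y - s_x2 * b) / n  # back-substitution into row 1
    return a, b


# Hypothetical sample data generated exactly from a = 1, b = 2.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x * x + 1.0 for x in xs]
a, b = fit_quadratic_no_linear_term(xs, ys)
```

The loop over the data points is the only part whose cost grows with the size of the data set; the solve itself is a fixed handful of arithmetic operations whichever method is chosen.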
 
  • #3
Solving the system using Cramer's rule is very simple.

[tex] a = \frac { \sum x^{4}_{i} \sum y_{i} - \sum x^{2}_{i} \sum x^{2}_{i} y_{i}} {n \sum x^{4}_{i }- ( \sum x^{2}_{i})^2} [/tex]


[tex] b = \frac { n \sum x^{2}_{i} y_{i} - \sum x^{2}_{i} \sum y_{i}} {n \sum x^{4}_{i }- ( \sum x^{2}_{i})^2} [/tex]

Then calculate the sums using a spreadsheet.
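
If a script is more convenient than a spreadsheet, the two formulas translate almost directly. The following is a rough Python sketch; the function name `fit_cramer` and the sample data are assumptions made for illustration.

```python
def fit_cramer(xs, ys):
    """Fit y = b*x**2 + a by least squares using Cramer's rule; returns (a, b)."""
    n = len(xs)
    s_x2 = sum(x ** 2 for x in xs)
    s_x4 = sum(x ** 4 for x in xs)
    s_y = sum(ys)
    s_x2y = sum(x ** 2 * y for x, y in zip(xs, ys))

    # Shared denominator: determinant of the 2x2 coefficient matrix.
    det = n * s_x4 - s_x2 ** 2
    if det == 0:
        raise ValueError("determinant is zero; a and b are not unique")

    a = (s_x4 * s_y - s_x2 * s_x2y) / det
    b = (n * s_x2y - s_x2 * s_y) / det
    return a, b


# Hypothetical sample data generated exactly from a = -3, b = 0.5.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [0.5 * x ** 2 - 3.0 for x in xs]
a_fit, b_fit = fit_cramer(xs, ys)
```

The determinant check guards against the degenerate case where every x_i^2 is the same, in which case a and b cannot be separated.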
 

1. What is the purpose of minimizing the square of deviation in curve fitting?

The purpose of minimizing the square of deviation in curve fitting is to find the line or curve that best represents the relationship between two variables in a dataset. "Best" here means the curve with the smallest sum of squared vertical deviations between the data points and the curve.

2. How is the square of deviation calculated in curve fitting?

The square of deviation is calculated by taking the difference between the observed data point and the corresponding predicted value on the fitted line or curve, squaring this difference, and then summing up these squared differences for all data points. This gives us the total squared deviation for the fitted line or curve.
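
As a small sketch of that calculation for the model y = bx^2 + a discussed in this thread (the function name `sum_squared_deviation` and the numbers are hypothetical):

```python
def sum_squared_deviation(xs, ys, a, b):
    """Total squared deviation of the data from the curve y = b*x**2 + a."""
    total = 0.0
    for x, y in zip(xs, ys):
        predicted = b * x * x + a      # value on the fitted curve
        total += (y - predicted) ** 2  # squared deviation for this point
    return total


# Hypothetical data: the curve with a = 1, b = 2 predicts 1, 3 and 9,
# so the deviations are 0, 0.5 and -0.5.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.5, 8.5]
total = sum_squared_deviation(xs, ys, a=1.0, b=2.0)
```

Comparing this total for different choices of a and b is a simple way to check that a fitted pair really does give a smaller value than nearby alternatives.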

3. What is the relationship between minimizing square of deviation and finding the best fit line or curve?

Minimizing the square of deviation is directly related to finding the best fit line or curve. The line or curve with the smallest overall squared deviation is considered the best fit, as it has the least amount of error or distance from the data points. By minimizing the squared deviation, we are essentially finding the line or curve that best represents the relationship between the variables in the dataset.

4. Can minimizing the square of deviation result in overfitting?

It can, when the model has too many adjustable parameters relative to the amount of data. Overfitting occurs when the fitted curve follows the individual data points, including their noise, so closely that it no longer represents the overall trend. Techniques such as cross-validation help to detect this by checking how well the fit predicts data that were not used to compute it.

5. Are there any limitations to using minimizing square of deviation in curve fitting?

Yes, there are some limitations. Solving a linear system for the coefficients only works when the model is linear in its parameters, as y = bx^2 + a is in a and b even though it is not linear in x; models that are non-linear in their parameters need iterative methods instead. Because the deviations are squared, outliers have a disproportionately large influence on the fitted curve. For data with strong outliers, other techniques such as least absolute deviations may be more suitable.
