How do you create a best fit line?

  • Context: Undergrad 
  • Thread starter: edmondng
  • Tags: Fit Line

Discussion Overview

The discussion revolves around the methods for creating a best fit line for a set of data points, particularly focusing on polynomial fitting, such as second-order polynomials. Participants explore mathematical approaches, including least squares fitting and cubic spline interpolation, while also expressing interest in foundational methods beyond software solutions like Excel.

Discussion Character

  • Exploratory
  • Technical explanation
  • Mathematical reasoning

Main Points Raised

  • One participant inquires about the mathematical process for determining a best fit line when given multiple data points, particularly for polynomial fits rather than simple linear equations.
  • Another participant explains that the choice of curve type (e.g., straight line, parabola, polynomial) is crucial and that the fitting process involves minimizing the sum of squares of the differences between the data points and the curve.
  • A later reply clarifies that the inquiry may specifically pertain to least squares fitting for parabolas rather than linear fits.
  • Another participant introduces cubic spline interpolation as an alternative method for fitting data, describing how it involves fitting cubic polynomials to intervals defined by x-coordinates while ensuring continuity at the endpoints.

Areas of Agreement / Disagreement

Participants present various methods for fitting data, with no consensus on a single approach. Multiple competing views on the best fitting techniques remain, including least squares and cubic spline interpolation.

Contextual Notes

The discussion does not resolve the complexities of choosing fitting methods, such as the assumptions regarding measurement errors in data pairs or the appropriateness of different polynomial degrees.

edmondng
Like the title says: if you have a bunch of data, how do you create a best fit line?
For example, back in school, for a straight line, y = mx + c, you just need 2 points to get the line. But what if you have more than 2 points, say 10? What is the best fit line equation, and how do you find it mathematically?

I'm sure Excel can do it through the trendline method, but back to basics: how do people do it? I'm looking more for a polynomial, 2nd order. And the data is not like 1, 2, 4, 8, 16, from which you can deduce y = 2^x, but more like 'double' or 'float' values, so getting a best fit line and predicting future values is a lot harder.

Just wondering if anyone has ever looked at something like this, or how people find coefficients from the data they have accumulated (which is how it happens in real life: you collect data and then deduce your own equation to reflect how it changes).

Is there a book or other resource I could look at? That would be helpful too.

Thanks
 
There are lots and lots of books on this topic.

First you will have to decide what kind of curve you want to fit to your data, say a straight line or a parabola or a polynomial of degree 27 or some exponential function (though this might be harder). Then each function from the "pool" you decided to choose from (for example the straight lines) is determined by a certain number of parameters (for straight lines there are two of them), and the goal is to find the "best" values for these parameters.

For this you have to think about which parameters are "good" and which are "bad", that is, you have to define some measure of how "well" a given curve (corresponding to a certain function in your "pool") approximates your data. One way to do that is to interpret your data pairs (it should be pairs) as measurements (x, m(x)). Certainly, if a function f is to approximate these data well, f(x) should be about equal to m(x), so one very common measure is the sum of the squares (f(x)-m(x))^2 (summed over all your data points). You then want to find the function which minimizes this sum, which is why the method is called a "least squares fit".
In the case where you're fitting a straight line there is a rather easy general formula giving you the best values for the two parameters; if you consider other families of approximating functions, such formulas might be long or might not exist at all.
Note that in this method you do not treat the two components of your data pairs the same way; rather, it inherently assumes that one coordinate is the measurement and thus has an error, while the other coordinate does not have an error. This is often a reasonable assumption; sometimes it is not, in which case you will have to modify the procedure.
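
As a rough illustration of the straight-line case described above, here is a minimal Python sketch of the usual closed-form least-squares formulas (slope = sum of cross deviations divided by sum of squared x deviations, intercept from the means). The function name fit_line and the sample data are made up for this example, and the sketch assumes, as noted above, that only the y values carry error.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = m*x + c through the points (xs[i], ys[i])."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared x deviations and sum of cross deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    m = sxy / sxx            # slope that minimizes the sum of squared residuals
    c = mean_y - m * mean_x  # the best line passes through (mean_x, mean_y)
    return m, c


# Made-up sample data, roughly following y = 2x + 1 with some noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
m, c = fit_line(xs, ys)
print(f"y = {m:.3f} x + {c:.3f}")
```

With exactly two points this reduces to the line through both of them; with more points it gives the line with the smallest total squared vertical distance to the data.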

So mathematically it is all about minimization and best approximation in function spaces endowed with some topology, which is why the methods used in the theoretical analysis of your problem are typically those of functional analysis.
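
For the second-order polynomial asked about in the original post, the same least-squares idea leads to a small linear system (the "normal equations") for the coefficients. The sketch below is one possible way to set that up and solve it in plain Python; fit_polynomial and the sample data are hypothetical names and values chosen for illustration. It is fine for low degrees, but the normal equations tend to become numerically ill-conditioned for high-degree polynomials, where a library routine would be a safer choice.

```python
def fit_polynomial(xs, ys, degree=2):
    """Least-squares coefficients [a0, a1, ..., ad] of y = a0 + a1*x + ... + ad*x^d."""
    n = degree + 1  # number of unknown coefficients
    if len(xs) != len(ys) or len(xs) < n:
        raise ValueError("need at least degree + 1 (x, y) pairs")
    # Normal equations: sum_j (sum_k x_k^(i+j)) * a_j = sum_k x_k^i * y_k, i = 0..degree.
    power_sums = [sum(x ** p for x in xs) for p in range(2 * degree + 1)]
    # Augmented matrix [A | b] of the normal equations.
    aug = [[power_sums[i + j] for j in range(n)]
           + [sum((x ** i) * y for x, y in zip(xs, ys))]
           for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system; not enough distinct x values")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (aug[i][n] - tail) / aug[i][i]
    return coeffs


# Made-up floating-point data with a roughly parabolic trend.
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [1.2, 1.9, 3.1, 5.2, 7.8, 11.1, 15.2]
coeffs = fit_polynomial(xs, ys, degree=2)
a0, a1, a2 = coeffs
print(f"y = {a0:.3f} + {a1:.3f} x + {a2:.3f} x^2")
print("prediction at x = 3.5:", a0 + a1 * 3.5 + a2 * 3.5 ** 2)
```

The last line shows how the fitted coefficients can be used to predict a value outside the measured range, keeping in mind that extrapolating a polynomial far beyond the data is often unreliable.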
 
Thanks for the explanation and the link. Will look into it.
 
Least squares can be generalized for polynomials. But a common approach to fitting data that you might wish to use is cubic spline interpolation:

http://www.physics.utah.edu/~detar/phycs6720/handouts/cubic_spline/cubic_spline/node1.html

The idea is that you use your x-coordinates to create intervals, and you fit a cubic polynomial to each interval such that the function and its derivatives (the first and, for the usual cubic spline, the second) are continuous at the end points of each interval.
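
The interval-by-interval construction described above can be sketched in plain Python as follows. This is only one variant, the "natural" cubic spline, which additionally assumes the second derivative is zero at the two outermost points; that boundary condition, the function name natural_cubic_spline, and the sample data are assumptions made for this example and are not taken from the linked handout. Note that, unlike a least-squares fit, the spline passes exactly through every data point, so it interpolates rather than smooths noisy data.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a function S(x) interpolating the points with a natural cubic spline.

    xs must be strictly increasing. "Natural" means S'' = 0 at both ends,
    which is an assumed boundary condition (other choices exist).
    """
    n = len(xs) - 1  # number of intervals
    if n < 2 or len(ys) != len(xs):
        raise ValueError("need at least three (x, y) pairs")
    h = [xs[i + 1] - xs[i] for i in range(n)]
    if any(step <= 0 for step in h):
        raise ValueError("xs must be strictly increasing")

    # Tridiagonal system for the second derivatives M[1..n-1]; M[0] = M[n] = 0.
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
    rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
           for i in range(1, n)]
    # Thomas algorithm: forward elimination ...
    for k in range(1, n - 1):
        w = h[k] / diag[k - 1]
        diag[k] -= w * h[k]
        rhs[k] -= w * rhs[k - 1]
    # ... then back substitution into m, the list of second derivatives.
    m = [0.0] * (n + 1)
    for k in range(n - 2, -1, -1):
        i = k + 1
        m[i] = (rhs[k] - h[i] * m[i + 1]) / diag[k]

    def spline(x):
        # Pick the interval [xs[i], xs[i+1]] containing x (clamped at the ends).
        i = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        a = xs[i + 1] - x
        b = x - xs[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * a
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b)

    return spline


# Made-up sample points.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 0.8, 0.9, 0.1, -0.8]
s = natural_cubic_spline(xs, ys)
print("S(2.5) =", s(2.5))
```

Calling s(x) for an x between two data points evaluates the cubic piece for that interval; at the data points themselves it returns the original y values.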
 
