Equation of a graph, determined by (x,y) coordinates

  • Thread starter superduke1200
  • Tags Coordinates Graph
AI Thread Summary
Fitting a trendline in Excel (polynomial, logarithmic, and so on) to a dataset of (x,y) coordinates gives an approximate equation, but not necessarily the "true" one. Because many curves can fit a finite set of points, accurate modeling requires a physical model that incorporates the relevant variables, such as heart pressures. Polynomial fits can be problematic, especially with large datasets, and may not capture the expected behavior of the data. Alternative approaches include a Fourier series for periodic data or a cubic spline fit for smoother transitions between points. Without a physical model, any fit is merely a smoothing of the data rather than an accurate representation of the underlying relationship.
superduke1200
Dear all,

given a dataset of (x,y) coordinates, how can someone determine the equation of a plot created with Origin or Excel that passes through all these points?

Depending on the dataset, one can use a trendline (e.g. logarithmic or polynomial) in Excel. The problem is that the resulting curve gives an equation that is close to the one determined by the dataset, but not the "real" one.

Many people, including me, have this question, and I would be grateful for any recommendations.
 
Excel's linear fit is a least-squares fit: it picks ##m## and ##c## in ##y=mx+c## to minimise the sum of squared differences between your y values and your modelled y values, ##\epsilon=\sum_i(y_i-(mx_i+c))^2##, where your data points are the ##x_i## and ##y_i##. I don't know for certain, but I expect its log fit works by replacing ##x_i## with ##\log(x_i)## in the sum and otherwise doing the same thing.

There's loads online about least squares linear fits. If you can do calculus, all you do is solve ##\partial\epsilon/\partial m=0## and ##\partial\epsilon/\partial c=0## simultaneously.

Edit: I've seen Excel get the fits spectacularly wrong with large amounts of data. Some care (which Excel does not take) is needed computationally in that case.
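
As a rough sketch of the recipe above: setting ##\partial\epsilon/\partial m=0## and ##\partial\epsilon/\partial c=0## and solving gives a closed form for ##m## and ##c##. The function name `linear_least_squares` is made up for illustration. Centring the data on its means before summing is one common way to limit round-off; it is not a statement about what Excel does internally.

```python
import math


def linear_least_squares(xs, ys):
    """Return (m, c) minimising sum_i (y_i - (m*x_i + c))**2."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    x_mean = sum(xs) / n
    y_mean = sum(ys) / n

    # Sums of centred products; these follow from the two normal equations.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    m = sxy / sxx
    c = y_mean - m * x_mean
    return m, c


# Straight-line fit to a small made-up dataset.
print(linear_least_squares([0, 1, 2, 3], [1, 3, 5, 7]))

# The guessed "log fit": the same sum with x_i replaced by log(x_i), x_i > 0.
xs_positive = [1.0, 2.0, 4.0, 8.0]
print(linear_least_squares([math.log(x) for x in xs_positive], [0.1, 0.9, 2.1, 2.9]))
```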
 
0 0,4
0,05 0,43
0,08 0,62
0,12 0,68
0,13 0,79
0,2 0,83
0,25 0,8
0,3 0,76
0,32 0,69
0,34 0,63
0,35 0,56
0,4 0,6
0,43 0,62
0,45 0,55
0,5 0,53
0,55 0,46
0,58 0,48
0,6 0,47
0,65 0,45
0,8 0,4

This is the dataset I was talking about earlier. I have included it in case someone wishes to plot it using Excel, Origin, etc.

I have also included the graph that I plotted using Excel. It is the velocity profile of a carotid artery bifurcation, and I want its exact equation.

My best attempt so far is a 6th-order polynomial fit, but ideally I would have the exact equation of this curve.
Καταγραφή.PNG
 

Καταγραφή.PNG


My attempt using a 6th order polynomial fit
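
For anyone who wants to reproduce this kind of fit outside Excel, here is a minimal sketch using NumPy's `polyfit` on the data posted above, with the decimal commas read as decimal points. It is not necessarily the same computation Excel or Origin performs, and the variable names are only illustrative.

```python
import numpy as np

# The 20 (x, y) points posted above, decimal commas converted to points.
x = np.array([0, 0.05, 0.08, 0.12, 0.13, 0.2, 0.25, 0.3, 0.32, 0.34,
              0.35, 0.4, 0.43, 0.45, 0.5, 0.55, 0.58, 0.6, 0.65, 0.8])
y = np.array([0.4, 0.43, 0.62, 0.68, 0.79, 0.83, 0.8, 0.76, 0.69, 0.63,
              0.56, 0.6, 0.62, 0.55, 0.53, 0.46, 0.48, 0.47, 0.45, 0.4])

# Least-squares fit of a degree-6 polynomial; coefficients come highest power first.
coeffs = np.polyfit(x, y, deg=6)
poly = np.poly1d(coeffs)

# Sum of squared residuals at the data points, as a crude measure of the misfit.
ss_res = float(np.sum((y - poly(x)) ** 2))

print("coefficients (x^6 ... x^0):", coeffs)
print("sum of squared residuals:", ss_res)
```

The fitted curve does not pass through every point; with 20 points and only 7 coefficients it can only approximate them.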
 

superduke1200 said:
My best attempt so far is a 6th-order polynomial fit, but ideally I would have the exact equation of this curve.
In that case, you're going to need a physical model. There are infinitely many curves that will fit a finite set of points like this. You'll need to come up with a model that says "given heart systolic and diastolic pressures, I expect the velocity to be...", then find the pressures (and whatever else goes into your model: arterial diameters, height, etc.) that best fit your data.

If you don't have a model, all you are doing is smoothing your data somehow.

Note that your fit clearly has problems - presumably we're expecting this to cycle, and the slope at the right hand end is very different from the left hand end.
 
Svein said:
It looks like a Poisson distribution...

The adjusted R² when fitting a Poisson distribution is 0.79.

With the polynomial fit it is 0.92.
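
For reference, a sketch of how an adjusted R² is commonly computed from observed and fitted values. The function name and the convention that `n_params` counts the fitted parameters excluding the intercept are assumptions; the fitting software used for the figures above may define it slightly differently.

```python
def adjusted_r_squared(y_obs, y_fit, n_params):
    """Adjusted R^2 for a fit with n_params parameters (intercept not counted)."""
    n = len(y_obs)
    if n != len(y_fit):
        raise ValueError("observed and fitted values must have the same length")
    if n - n_params - 1 <= 0:
        raise ValueError("too few points for this many parameters")

    mean = sum(y_obs) / n
    ss_res = sum((o - f) ** 2 for o, f in zip(y_obs, y_fit))  # misfit of the model
    ss_tot = sum((o - mean) ** 2 for o in y_obs)               # spread of the data
    if ss_tot == 0:
        raise ValueError("observed values are all identical")

    r_squared = 1.0 - ss_res / ss_tot
    # Penalise extra parameters: more parameters always raise the plain R^2.
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - n_params - 1)


# Made-up values, purely to show the call.
print(adjusted_r_squared([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.8, 5.0], n_params=1))
```

A higher value for the polynomial mostly reflects its extra freedom to follow the points; it says nothing about whether the curve is the "real" one.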
Ibix said:
In that case, you're going to need a physical model. There are infinitely many curves that will fit a finite set of points like this. You'll need to come up with a model that says "given heart systolic and diastolic pressures, I expect the velocity to be...", then find the pressures (and whatever else goes into your model: arterial diameters, height, etc.) that best fit your data.

If you don't have a model, all you are doing is smoothing your data somehow.

Note that your fit clearly has problems - presumably we're expecting this to cycle, and the slope at the right hand end is very different from the left hand end.

Perhaps if I can get more data points, I could get a better result with the polynomial fit. A physical model is not available. I obtained these data by estimating the points from a graph.

Another approach could be to split the dataset into pieces according to its shape: that is, a polynomial fit for the first, say, 10 data points, a linear fit for the next three data points, etc.
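
A minimal sketch of that piecewise idea, using NumPy on the data posted earlier in the thread. The segmentation (10 points with a cubic, 3 points with a line, the remaining 7 with a quadratic) is only an illustrative guess, and each piece is fitted independently.

```python
import numpy as np

x = np.array([0, 0.05, 0.08, 0.12, 0.13, 0.2, 0.25, 0.3, 0.32, 0.34,
              0.35, 0.4, 0.43, 0.45, 0.5, 0.55, 0.58, 0.6, 0.65, 0.8])
y = np.array([0.4, 0.43, 0.62, 0.68, 0.79, 0.83, 0.8, 0.76, 0.69, 0.63,
              0.56, 0.6, 0.62, 0.55, 0.53, 0.46, 0.48, 0.47, 0.45, 0.4])

# Hypothetical segmentation: (start index, stop index, polynomial degree).
segments = [(0, 10, 3), (10, 13, 1), (13, 20, 2)]

# Fit each segment on its own points only.
polys = [np.poly1d(np.polyfit(x[start:stop], y[start:stop], degree))
         for start, stop, degree in segments]


def piecewise(t):
    """Evaluate the piece whose range contains t; a piece ends where the next begins."""
    for (start, stop, _), poly in zip(segments, polys):
        if stop == len(x) or t < x[stop]:
            return float(poly(t))


# Size of the jump where the first piece hands over to the second.
jump = float(polys[1](x[10]) - polys[0](x[10]))
print("value at t = 0.1:", piecewise(0.1))
print("jump at the first breakpoint:", jump)
```

Because the pieces are fitted separately, nothing forces them to agree in value or slope at the breakpoints, so the resulting curve generally has jumps and kinks there.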
 
superduke1200 said:
A physical model is not available.
Then you're only guessing what the data in between the points looks like. That's fine - if the data's what you've got, it's what you've got. Just don't kid yourself that you know the "true form" of the graph.

Presuming that this data covers one complete heartbeat, I'd investigate modelling this with a Fourier series. That will guarantee that your result is periodic, and you can try dropping frequency components with low amplitudes if you want.
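
A rough sketch of that suggestion, using the data posted earlier in the thread. Since the points are not evenly spaced, it fits a truncated Fourier series by least squares rather than using an FFT. Taking the period as the full x-range (0.8) and keeping 4 harmonics are both assumptions made for illustration.

```python
import numpy as np

x = np.array([0, 0.05, 0.08, 0.12, 0.13, 0.2, 0.25, 0.3, 0.32, 0.34,
              0.35, 0.4, 0.43, 0.45, 0.5, 0.55, 0.58, 0.6, 0.65, 0.8])
y = np.array([0.4, 0.43, 0.62, 0.68, 0.79, 0.83, 0.8, 0.76, 0.69, 0.63,
              0.56, 0.6, 0.62, 0.55, 0.53, 0.46, 0.48, 0.47, 0.45, 0.4])

period = x[-1] - x[0]  # assumption: the data span exactly one heartbeat
n_harmonics = 4        # assumption: an arbitrary truncation


def design_matrix(t):
    """Columns: 1, cos(w1 t), sin(w1 t), cos(w2 t), sin(w2 t), ..."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    columns = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k / period
        columns.append(np.cos(w * t))
        columns.append(np.sin(w * t))
    return np.column_stack(columns)


# Least-squares solution for the series coefficients.
coeffs, *_ = np.linalg.lstsq(design_matrix(x), y, rcond=None)


def fourier_model(t):
    """Evaluate the fitted series; periodic in t by construction."""
    return design_matrix(t) @ coeffs


# Amplitude of each harmonic, useful for deciding which ones to drop.
amplitudes = np.hypot(coeffs[1::2], coeffs[2::2])
print("mean level:", coeffs[0])
print("harmonic amplitudes:", amplitudes)
```

Harmonics with small amplitudes are the candidates for dropping; after removing any, the remaining coefficients should be refitted, because the basis functions are not orthogonal on unevenly spaced points.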
 
Thank you both for your willingness to share your knowledge!
 
You should consider a cubic spline fit through the points rather than a high-degree polynomial.
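
A minimal sketch of a natural cubic spline through the points posted earlier in the thread, written out with NumPy so the construction is visible. The function name `natural_cubic_spline` is made up, and "natural" end conditions (zero second derivative at both ends) are an assumption; other end conditions are equally possible.

```python
import numpy as np


def natural_cubic_spline(xk, yk):
    """Return a function evaluating the natural cubic spline through (xk, yk)."""
    xk = np.asarray(xk, dtype=float)
    yk = np.asarray(yk, dtype=float)
    n = len(xk)
    h = np.diff(xk)
    if n < 3 or len(yk) != n or np.any(h <= 0):
        raise ValueError("need at least 3 points with strictly increasing x")

    # Linear system for the second derivatives m_i at the knots.
    a = np.zeros((n, n))
    rhs = np.zeros(n)
    a[0, 0] = a[-1, -1] = 1.0  # natural ends: m_0 = m_last = 0
    slopes = np.diff(yk) / h
    for i in range(1, n - 1):
        a[i, i - 1] = h[i - 1]
        a[i, i] = 2.0 * (h[i - 1] + h[i])
        a[i, i + 1] = h[i]
        rhs[i] = 6.0 * (slopes[i] - slopes[i - 1])
    m = np.linalg.solve(a, rhs)  # dense solve; fine for a few dozen points

    def spline(t):
        t = np.atleast_1d(np.asarray(t, dtype=float))
        # Index of the interval [xk[i], xk[i+1]] containing each t.
        i = np.clip(np.searchsorted(xk, t, side="right") - 1, 0, n - 2)
        left = t - xk[i]
        right = xk[i + 1] - t
        return ((m[i] * right**3 + m[i + 1] * left**3) / (6.0 * h[i])
                + (yk[i] / h[i] - m[i] * h[i] / 6.0) * right
                + (yk[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * left)

    return spline


x = np.array([0, 0.05, 0.08, 0.12, 0.13, 0.2, 0.25, 0.3, 0.32, 0.34,
              0.35, 0.4, 0.43, 0.45, 0.5, 0.55, 0.58, 0.6, 0.65, 0.8])
y = np.array([0.4, 0.43, 0.62, 0.68, 0.79, 0.83, 0.8, 0.76, 0.69, 0.63,
              0.56, 0.6, 0.62, 0.55, 0.53, 0.46, 0.48, 0.47, 0.45, 0.4])

spline = natural_cubic_spline(x, y)
print(spline([0.1, 0.225, 0.7]))
```

Unlike a least-squares polynomial, the spline passes through every data point, with one cubic per interval joined so that the first and second derivatives are continuous. It is still an interpolation of the data rather than a physical model. Libraries such as SciPy provide a ready-made version (`scipy.interpolate.CubicSpline`).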
 