Equation of a graph, determined by (x,y) coordinates

  • Thread starter: superduke1200
  • Tags: Coordinates, Graph

Discussion Overview

The discussion revolves around determining the equation of a graph based on a given dataset of (x,y) coordinates, specifically in the context of fitting models to data representing a velocity profile of a carotid bifurcation artery. Participants explore various fitting techniques and the implications of using different models.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant suggests using trendlines in Excel, such as logarithmic or polynomial fits, but notes that these may not yield the "real" equation of the dataset.
  • Another participant explains how Excel's linear fit minimizes the least-square residuals and discusses the mathematical approach to finding the best fit.
  • A dataset of (x,y) coordinates is provided for plotting and analysis, with a participant expressing a preference for a 6th order polynomial fit.
  • Concerns are raised about the limitations of polynomial fits, with a participant emphasizing the need for a physical model to accurately describe the data rather than merely smoothing it.
  • One participant mentions the potential for a Poisson distribution fit and compares its adjusted R-squared value to that of the polynomial fit.
  • Suggestions are made to split the dataset into segments for different fitting approaches based on the shape of the data.
  • A later reply proposes investigating a Fourier series model to ensure periodicity in the data representation.
  • Another participant recommends considering a cubic spline fit instead of a high-degree polynomial to avoid overfitting issues.

Areas of Agreement / Disagreement

Participants express differing opinions on the best fitting approach, with no consensus on a single method. Some advocate for polynomial fits, while others suggest physical models or alternative fitting techniques like cubic splines or Fourier series.

Contextual Notes

Participants acknowledge the limitations of their approaches, including the absence of a physical model and the potential inaccuracies of high-degree polynomial fits. The discussion highlights the complexity of accurately modeling the data without a clear underlying theory.

Who May Find This Useful

This discussion may be useful for individuals interested in data fitting techniques, particularly in the context of biomedical data analysis and modeling approaches in physics and engineering.

superduke1200
Dear all,

Given a dataset of (x,y) coordinates, how can one determine the equation of a curve, plotted with Origin or Excel, that passes through all of these points?

Depending on the dataset, one can use a trendline (e.g. logarithmic or polynomial) in Excel. The problem is that the resulting curve gives an equation that is close to the one determined by the dataset, but not the "real" one.

Many people, including me, have this question, and I would be grateful for any recommendations.
 
Excel's linear fit minimises the sum of the squared residuals - that is, it picks ##m## and ##c## in ##y=mx+c## to minimise the squared difference between your y values and your modelled y values, ##\epsilon=\sum_i(y_i-(mx_i+c))^2##, where your data points are the ##x_i## and ##y_i##. I don't know for certain, but I expect its log fit works by replacing ##x_i## with ##\log(x_i)## in the sum and otherwise doing the same thing.

There's loads online about least squares linear fits. If you can do calculus, all you do is solve ##\partial\epsilon/\partial m=0## and ##\partial\epsilon/\partial c=0## simultaneously.
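
Solving those two equations gives ##m=S_{xy}/S_{xx}## and ##c=\bar{y}-m\bar{x}##, where ##S_{xy}=\sum_i(x_i-\bar{x})(y_i-\bar{y})## and ##S_{xx}=\sum_i(x_i-\bar{x})^2##. A minimal sketch of that in Python follows; the function name `linear_fit` and the four test points are made up for illustration, and this is not a claim about how Excel implements its fit.

```python
def linear_fit(xs, ys):
    """Return (m, c) minimising sum((y - (m*x + c))**2) over the data."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Centred sums S_xx and S_xy from the closed-form solution
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    m = sxy / sxx
    c = y_mean - m * x_mean
    return m, c


# Illustrative points lying exactly on y = 2x + 1
m, c = linear_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(m, c)
```

Working with deviations from the means, rather than raw sums of ##x_i^2## and ##x_iy_i##, tends to be better behaved numerically.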

Edit: I've seen Excel get the fits spectacularly wrong with large amounts of data. Some care (which Excel does not take) is needed computationally in that case.
 
x y
0 0.4
0.05 0.43
0.08 0.62
0.12 0.68
0.13 0.79
0.2 0.83
0.25 0.8
0.3 0.76
0.32 0.69
0.34 0.63
0.35 0.56
0.4 0.6
0.43 0.62
0.45 0.55
0.5 0.53
0.55 0.46
0.58 0.48
0.6 0.47
0.65 0.45
0.8 0.4

That is the data set I was talking about earlier. I have included it in case someone wishes to plot it using Excel, Origin, etc.

I have also attached the graph that I plotted in Excel. It is the velocity profile of a carotid bifurcation artery, and I want its exact equation.

My best attempt so far is a 6th-order polynomial fit, but the ideal would be the exact equation of this curve.
Καταγραφή.PNG
 

Καταγραφή.PNG


My attempt using a 6th order polynomial fit
 

superduke1200 said:
My best attempt so far is a 6th-order polynomial fit, but the ideal would be the exact equation of this curve.
In that case, you're going to need a physical model. There are infinitely many curves that will fit a finite set of points like this. You'll need to come up with a model that says "given heart systolic and diastolic pressures, I expect the velocity to be...", then find the pressures (and whatever else goes into your model - arterial diameters, height, etc.) that best fit your data.

If you don't have a model, all you are doing is smoothing your data somehow.

Note that your fit clearly has problems - presumably we're expecting this to cycle, and the slope at the right hand end is very different from the left hand end.
 
Svein said:
It looks like a Poisson distribution...

The adjusted R-squared when applying a Poisson distribution is 0.79.

With the polynomial fit it is 0.92.
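
For anyone who wants to reproduce this kind of comparison outside Excel or Origin, here is a rough sketch of a 6th-order polynomial fit and its adjusted R-squared using NumPy. The helper name `adjusted_r2` is my own, the data are the points posted above with decimal points instead of commas, and I make no claim that the number printed matches the 0.92 quoted, since the fitting software may differ in detail.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Data set posted earlier in the thread
x = np.array([0, 0.05, 0.08, 0.12, 0.13, 0.2, 0.25, 0.3, 0.32, 0.34,
              0.35, 0.4, 0.43, 0.45, 0.5, 0.55, 0.58, 0.6, 0.65, 0.8])
y = np.array([0.4, 0.43, 0.62, 0.68, 0.79, 0.83, 0.8, 0.76, 0.69, 0.63,
              0.56, 0.6, 0.62, 0.55, 0.53, 0.46, 0.48, 0.47, 0.45, 0.4])


def adjusted_r2(y_obs, y_fit, n_predictors):
    """Adjusted R^2 for a model with n_predictors terms besides the intercept."""
    n = len(y_obs)
    ss_res = np.sum((y_obs - y_fit) ** 2)           # residual sum of squares
    ss_tot = np.sum((y_obs - np.mean(y_obs)) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


degree = 6
p = Polynomial.fit(x, y, degree)   # least-squares fit on a rescaled x for stability
y_fit = p(x)
coeffs = p.convert().coef          # coefficients in the original x, lowest order first

print("coefficients:", coeffs)
print("adjusted R^2:", adjusted_r2(y, y_fit, degree))
```

The adjustment penalises extra parameters, which is why it is a fairer way than plain R-squared to compare models with different numbers of terms.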
Ibix said:
In that case, you're going to need a physical model. There are infinitely many curves that will fit a finite set of points like this. You'll need to come up with a model that says "given heart systolic and diastolic pressures, I expect the velocity to be...", then find the pressures (and whatever else goes into your model - arterial diameters, height, etc.) that best fit your data.

If you don't have a model, all you are doing is smoothing your data somehow.

Note that your fit clearly has problems - presumably we're expecting this to cycle, and the slope at the right hand end is very different from the left hand end.

Perhaps if I can use more data points, I could get a better result with the polynomial fit. A physical model is not available. I just got these data by estimating the points from the graph.

Another approach could be to split the data set into pieces according to its shape: for example, a polynomial fit for the first, say, 10 data points, a linear fit for the next three data points, and so on.
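
As a rough illustration of that splitting idea, the sketch below fits each piece separately with NumPy. The break points and degrees in `segments` (cubic for the first 10 points, linear for the next three, quadratic for the rest) are guesses chosen only for the example, and `piecewise` is a hypothetical helper name.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Data set posted earlier in the thread
x = np.array([0, 0.05, 0.08, 0.12, 0.13, 0.2, 0.25, 0.3, 0.32, 0.34,
              0.35, 0.4, 0.43, 0.45, 0.5, 0.55, 0.58, 0.6, 0.65, 0.8])
y = np.array([0.4, 0.43, 0.62, 0.68, 0.79, 0.83, 0.8, 0.76, 0.69, 0.63,
              0.56, 0.6, 0.62, 0.55, 0.53, 0.46, 0.48, 0.47, 0.45, 0.4])

# (start index, stop index, polynomial degree) for each piece; illustrative guesses
segments = [(0, 10, 3), (10, 13, 1), (13, 20, 2)]

fits = []
for start, stop, degree in segments:
    p = Polynomial.fit(x[start:stop], y[start:stop], degree)
    fits.append((x[start], x[stop - 1], p))  # keep each piece's x-range with its fit


def piecewise(x_new):
    """Evaluate the first piece whose upper x limit is at or above x_new."""
    for lo, hi, p in fits:
        if x_new <= hi:
            return float(p(x_new))
    return float(fits[-1][2](x_new))  # beyond the data: fall back to the last piece


print(piecewise(0.2), piecewise(0.41), piecewise(0.6))
```

Note that nothing here forces neighbouring pieces to agree where they meet, so the combined curve will generally have small jumps at the break points.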
 
superduke1200 said:
A physical model is not available.
Then you're only guessing what the data in between the points looks like. That's fine - if the data's what you've got, it's what you've got. Just don't kid yourself that you know the "true form" of the graph.

Presuming that this data covers one complete heartbeat, I'd investigate modelling this with a Fourier series. That will guarantee that your result is periodic, and you can try dropping frequency components with low amplitudes if you want.
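
A least-squares Fourier fit along those lines might look like the sketch below. It assumes the period is 0.8 (the span of the posted x values) and keeps three harmonics; both numbers are guesses for illustration, and `fourier_design` and `model` are hypothetical helper names. Because the points are not evenly spaced, the coefficients are found by least squares rather than with an FFT.

```python
import numpy as np

# Data set posted earlier in the thread
x = np.array([0, 0.05, 0.08, 0.12, 0.13, 0.2, 0.25, 0.3, 0.32, 0.34,
              0.35, 0.4, 0.43, 0.45, 0.5, 0.55, 0.58, 0.6, 0.65, 0.8])
y = np.array([0.4, 0.43, 0.62, 0.68, 0.79, 0.83, 0.8, 0.76, 0.69, 0.63,
              0.56, 0.6, 0.62, 0.55, 0.53, 0.46, 0.48, 0.47, 0.45, 0.4])


def fourier_design(x_vals, period, n_harmonics):
    """Design matrix with columns 1, cos(k*w*x), sin(k*w*x) for k = 1..n_harmonics."""
    w = 2.0 * np.pi / period
    cols = [np.ones_like(x_vals)]
    for k in range(1, n_harmonics + 1):
        cols.append(np.cos(k * w * x_vals))
        cols.append(np.sin(k * w * x_vals))
    return np.column_stack(cols)


period = 0.8      # assumption: the data cover exactly one heartbeat
n_harmonics = 3   # assumption: arbitrary cut-off for the example

A = fourier_design(x, period, n_harmonics)
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)  # [a0, a1, b1, a2, b2, a3, b3]


def model(x_new):
    """Evaluate the fitted series at one or more x values."""
    x_arr = np.atleast_1d(np.asarray(x_new, dtype=float))
    return fourier_design(x_arr, period, n_harmonics) @ coeffs


# Amplitude of each harmonic; small ones are candidates for dropping
amplitudes = np.hypot(coeffs[1::2], coeffs[2::2])
print("amplitudes:", amplitudes)
print("periodic check:", model(0.0), model(period))
```

By construction the fitted curve takes the same value and slope at both ends of the period, which addresses the mismatch in end slopes noted for the polynomial fit.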
 
Thank you both for your willingness to share your knowledge!
 
You should consider a cubic spline fit through the points rather than a high-degree polynomial.
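
To make that concrete, here is a minimal sketch of a natural cubic spline written with NumPy only. The "natural" end condition (zero second derivative at both ends) is my assumption, as the suggestion above does not specify one, and `natural_cubic_spline` is a hypothetical helper; a library routine such as `scipy.interpolate.CubicSpline` would normally be used instead. Unlike a least-squares fit, the spline passes through every data point.

```python
import numpy as np


def natural_cubic_spline(xk, yk):
    """Return a function evaluating the natural cubic spline through (xk, yk)."""
    xk = np.asarray(xk, dtype=float)
    yk = np.asarray(yk, dtype=float)
    n = len(xk) - 1          # number of intervals
    h = np.diff(xk)          # interval widths

    # Linear system for the second derivatives M at the knots
    A = np.zeros((n + 1, n + 1))
    rhs = np.zeros(n + 1)
    A[0, 0] = A[n, n] = 1.0  # natural end conditions: M[0] = M[n] = 0
    for i in range(1, n):
        A[i, i - 1] = h[i - 1]
        A[i, i] = 2.0 * (h[i - 1] + h[i])
        A[i, i + 1] = h[i]
        rhs[i] = 6.0 * ((yk[i + 1] - yk[i]) / h[i] - (yk[i] - yk[i - 1]) / h[i - 1])
    M = np.linalg.solve(A, rhs)

    def evaluate(t):
        t = np.atleast_1d(np.asarray(t, dtype=float))
        # Index of the interval containing each t, clipped to the valid range
        i = np.clip(np.searchsorted(xk, t, side="right") - 1, 0, n - 1)
        left = t - xk[i]
        right = xk[i + 1] - t
        return ((M[i] * right**3 + M[i + 1] * left**3) / (6.0 * h[i])
                + (yk[i] / h[i] - M[i] * h[i] / 6.0) * right
                + (yk[i + 1] / h[i] - M[i + 1] * h[i] / 6.0) * left)

    return evaluate


# Data set posted earlier in the thread
x = np.array([0, 0.05, 0.08, 0.12, 0.13, 0.2, 0.25, 0.3, 0.32, 0.34,
              0.35, 0.4, 0.43, 0.45, 0.5, 0.55, 0.58, 0.6, 0.65, 0.8])
y = np.array([0.4, 0.43, 0.62, 0.68, 0.79, 0.83, 0.8, 0.76, 0.69, 0.63,
              0.56, 0.6, 0.62, 0.55, 0.53, 0.46, 0.48, 0.47, 0.45, 0.4])

spline = natural_cubic_spline(x, y)
print(spline(np.linspace(0.0, 0.8, 9)))
```

Each interval gets its own cubic, so the "equation" is a set of 19 low-order pieces rather than one high-degree polynomial, which is what keeps it from oscillating wildly between points.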
 
