Regression Analysis on Theoretical Model

Discussion Overview

The discussion revolves around performing regression analysis on a theoretical model to compare it with experimental data. Participants explore methods for assessing the fit of a model represented by a specific equation, addressing challenges related to error estimation and statistical evaluation.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant seeks guidance on how to perform regression analysis for a model defined as y=a/(1+b*x) and expresses difficulty in finding resources for non-linear regression.
  • Another participant suggests calculating χ² to assess the fit of the model, emphasizing the importance of estimating parameters that minimize χ².
  • A different participant questions whether random errors are considered in the model and whether errors in measurement affect both x and y variables.
  • Concerns are raised about the lack of error analysis in the published data being compared, leading to uncertainty about how to proceed with regression analysis.
  • One participant proposes fitting a linear regression to transformed data (1/y, x) to facilitate comparison with the model, while also highlighting the need for clarity on error handling.
  • Another participant describes their complex model relating mechanical performance to porosity and expresses confusion about the statistical evaluation requested by their advisor, particularly regarding the interpretation of the χ² value obtained.
  • Discussion includes a mention of the challenges associated with applying goodness of fit tests to continuous data and the need for appropriate binning of results.
  • Participants suggest looking at existing literature for examples of how to compare theoretical and experimental results and consider reaching out to authors of the experimental data for clarification on their methodology.

Areas of Agreement / Disagreement

Participants express various viewpoints on how to approach regression analysis, with no consensus on the best method to evaluate the fit of the model to the experimental data. The discussion remains unresolved regarding the handling of errors and the interpretation of statistical results.

Contextual Notes

Participants note the absence of error estimates in the experimental data, which complicates the regression analysis. There is also uncertainty about the appropriate statistical methods to apply given the nature of the data.

samee
Hi everyone. I'm a graduate student and am struggling with something that may possibly be trivial. So, my research is creating a mathematical model to represent a real system. I have data points from my real system that I want to compare my model to. How do I do a regression analysis and get an r^2 value for the data points fit to my model?

Excel wants to fit it to a line and then do a regression analysis. I tried to figure out how to do it by hand, but am only finding linear regression analyses...

The model is something like this;

y=a/(1+b*x) Where a and b are material constants.

I went ahead and plugged the data points that I have from the experimental publication into my model. I know a regression analysis says something about how close my data is to my model, but for the life of me I cannot find anything useful online to help me calculate this.

Can anyone help me out here?
 
You simply need to assume values for your parameters
and calculate χ², which is the sum of (y(xi)-yi)²/σi² over your observations.
Here, xi and yi are the observations, y(xi) is the model prediction at xi, and σi are the uncertainties on (y(xi)-yi).
You can then find the values of the parameters that minimize χ²
(which you can do with the Excel Solver).

The value of χ² at the minimum and how it behaves near the minimum will allow you to estimate the precision of your parameters.

Have a look at "Numerical Recipes": http://www.nr.com and http://apps.nrbook.com/c/index.html (chap 15) .
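
A minimal sketch of the χ² minimization described above, in Python rather than the Excel Solver. The data points are made up for illustration, and setting every σi to 1 is an assumption for the case where no uncertainties are published; χ² then reduces to a plain sum of squared residuals, and its absolute value has no probabilistic meaning. The helper names (`model`, `chi2`, `best_a`, `fit`) and the search interval for b are hypothetical. The sketch relies on the model being linear in a for fixed b, so only b needs a one-dimensional search.

```python
def model(x, a, b):
    """Model from the thread: y = a / (1 + b*x)."""
    return a / (1.0 + b * x)


def chi2(a, b, xs, ys, sigmas):
    """Sum of squared, uncertainty-weighted residuals."""
    return sum(((model(x, a, b) - y) / s) ** 2
               for x, y, s in zip(xs, ys, sigmas))


def best_a(b, xs, ys, sigmas):
    """For fixed b the model is a * g(x), so the minimizing a is closed-form."""
    g = [1.0 / (1.0 + b * x) for x in xs]
    num = sum(y * gi / s ** 2 for y, gi, s in zip(ys, g, sigmas))
    den = sum(gi ** 2 / s ** 2 for gi, s in zip(g, sigmas))
    return num / den


def fit(xs, ys, sigmas, b_lo=0.0, b_hi=10.0, tol=1e-9):
    """Golden-section search over b (assumes a single minimum in [b_lo, b_hi])."""
    ratio = (5 ** 0.5 - 1) / 2

    def profile(b):
        return chi2(best_a(b, xs, ys, sigmas), b, xs, ys, sigmas)

    while b_hi - b_lo > tol:
        c = b_hi - ratio * (b_hi - b_lo)
        d = b_lo + ratio * (b_hi - b_lo)
        if profile(c) < profile(d):
            b_hi = d  # minimum lies in [b_lo, d]
        else:
            b_lo = c  # minimum lies in [c, b_hi]
    b = 0.5 * (b_lo + b_hi)
    a = best_a(b, xs, ys, sigmas)
    return a, b, chi2(a, b, xs, ys, sigmas)


# Made-up data, NOT from any publication; sigma = 1 is an assumption.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.05, 1.30, 1.02, 0.79, 0.68]
sigmas = [1.0] * len(xs)

a_fit, b_fit, chi2_min = fit(xs, ys, sigmas)
print("a =", a_fit, "b =", b_fit, "chi2 at minimum =", chi2_min)
```

If real uncertainties become available, they replace the placeholder `sigmas`, and χ² at the minimum can then be compared with the number of data points minus the number of fitted parameters.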
 
samee said:
my research is creating a mathematical model to represent a real system. I have data points from my real system that I want to compare my model to.

The model is something like this;

y=a/(1+b*x) Where a and b are material constants.

Are you modeling any random errors? Do errors in measurement occur only for y or do they also occur for x?
 
My work is only the model. I'm comparing the model to a set of published data from another research group. Aside from reading the paper, I really have no idea how they did their experiment. What I mean is, they didn't publish any error analysis for their data points, so I don't know what to do about that.

I remember working with things like this back when I was an undergrad, but it's been years. Can I do a regression analysis without an error estimation on the physical data?
 
Fit a linear regression to the values of your (1/y, x) data. That will give a model z = A*x + B, where z = 1/y and A and B are the slope and intercept from the linear regression. Those results can be compared to your model. As others have stated, you should be more specific about if and how you are inserting random errors into your model.
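
The link between the straight line and the original model can be made explicit: taking the reciprocal of y = a/(1+b*x) gives 1/y = 1/a + (b/a)*x, so the slope A equals b/a and the intercept B equals 1/a. Below is a minimal sketch of that transformation using ordinary least squares. The data are made up and the function names are hypothetical. Note that transforming y also changes how errors are weighted, so the recovered a and b will generally differ somewhat from a direct least-squares fit in y.

```python
def linear_fit(xs, zs):
    """Ordinary least-squares slope and intercept for z = A*x + B."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    return slope, intercept


def fit_reciprocal(xs, ys):
    """Fit 1/y against x, then map (A, B) back to (a, b)."""
    zs = [1.0 / y for y in ys]
    A, B = linear_fit(xs, zs)
    a = 1.0 / B   # intercept B = 1/a
    b = A / B     # slope A = b/a, so b = A * a = A / B
    return a, b


# Made-up data for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.05, 1.30, 1.02, 0.79, 0.68]
a_est, b_est = fit_reciprocal(xs, ys)
print("a =", a_est, "b =", b_est)
```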
 
Ok, I've been working on this and am still confused. I think that I'm missing some key information at a very basic level and that's what's killing me. So basically, I'm modeling the mechanical performance of a material in terms of its elastic and shear response in relation to its porosity. I built a model that accounts for 3 different sets of inclusions within the material (using Hill's tensor) and determines the relation between porosity and each component of the mechanical performance tensors. Specifically, I'm interested in the transverse and longitudinal elastic moduli, E1 and E3, and the transverse and longitudinal shear moduli, G13 and G12. Without getting into too many details, I came up with E1=E0/(1+P(H)E0), where E0 is the elastic response if there were no porosity, H is the sum of the Hill's tensors for the different pore types, P is the porosity of the material, and E1 is the longitudinal elastic modulus. I have similar expressions with different constants for E3, G12, and G13.

The model is done, the paper is written, and at the last minute before we had to submit it, my adviser asked me to add a statistical evaluation of how good a fit our model is to the experimental data we were comparing it to. He does not have experience with statistical analysis and neither do I. He told me to do an r^2 regression analysis.

After looking at it more in depth, my understanding of the r^2 regression analysis is that it relates the x and y variables to see if there's a correlation. What I want is to see how well my model fits some data points. So I'm pretty sure he must have been mistaken when he asked me for a regression analysis r^2 value. u/maajdl suggested that I need chi^2 and I think he's completely right, I do. His formula involved x and y, but I only have one input, and his equation involved error, which I have no idea what to say about. I did not include error in my model, and the published experimental data points I'm comparing my model to also seem to have no estimate of error.

SO- Wikipedia gave me this;

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$

And I tried to do that with the latex, but I suck at it, so in case it's illegible, chi2=sum(from 1-->n)(Oi-Ei)^2/Ei

Where E is the theoretical output and O is the experimental output. Basically, this is looking only at the experimental and theoretical values so I thought it was perfect, right? And I got a value of 2.007. Yay!

But wait. What is this 2.007? What in the world do I do with it? There are some graphs with some lines on Wikipedia that talk about degrees of freedom and are very confusing... and I don't know how that relates to this at all.
http://en.wikipedia.org/wiki/Pearson's_chi-squared_test#Goodness_of_fit
I know a few things from my undergrad days about chi2 in general... I know that it represents the amount of dispersion that the experimental has from the theoretical. But I don't know what this says about anything or what to do with this. I'm just really lost on all of this. Any help, explained to me on a fundamental level like I'm an undergrad or a high-school student, would be really appreciated.

u/FactChecker; I'm missing how the linear model compares to my model, if you could explain a little more I would really appreciate it. I'm not sure where to go with your comment.
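
One way to put a number on the agreement between model and data without any uncertainty estimates is the coefficient of determination computed directly from the model predictions, R^2 = 1 - SS_res/SS_tot, reported alongside the root-mean-square error. The following is a minimal sketch under stated assumptions: the porosity and modulus values are invented, the constants `E0` and `H` are hypothetical, and P(H)E0 is read as the product P*H*E0. For a nonlinear model that was not fitted by least squares, this R^2 can be negative, so it should not be read as a squared correlation.

```python
import math


def modulus(porosity, e0, h):
    """E1 = E0 / (1 + P*H*E0); reading P(H)E0 as a product is an assumption."""
    return e0 / (1.0 + porosity * h * e0)


def r_squared(observed, predicted):
    """1 - SS_res/SS_tot, using the model predictions as given."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the data."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Invented numbers for illustration; not from any publication.
E0, H = 10.0, 0.5
porosities = [0.0, 0.1, 0.2, 0.3]
observed = [10.1, 6.5, 5.1, 3.9]
predicted = [modulus(p, E0, H) for p in porosities]

print("R^2  =", r_squared(observed, predicted))
print("RMSE =", rmse(observed, predicted))
```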
 
samee said:
http://en.wikipedia.org/wiki/Pearson's_chi-squared_test#Goodness_of_fit
I know a few things from my undergrad days about chi2 in general... I know that it represents the amount of dispersion that the experimental has from the theoretical. But I don't know what this says about anything or what to do with this.

That goodness of fit test assumes you have divided the data into discrete "bins". Since you have continuous variates, to define a bin you'd have to specify each bin by giving intervals for the data. For example, a bin might be the set { 1.7 < x < 2.9, 0.38 < y < 9.6 }. So unless there is some "natural" way to bin your results, that method doesn't apply. (You should note the comments in the article that say not to have too few counts in bins.)

You might need a "sociological" approach to this statistical problem. Browse the journals where your paper will be submitted and see what authors do when they compare theoretical to experimental curves, especially in papers where the authors don't do an elaborate job. Whatever words they use will give you a hint about what to do.

In academia, authors of papers usually respond to questions about their work from other academics. Consider contacting the authors of the experimental results and asking them about the precision of the equipment they used.

Is it correct that you have already determined values for 'a' and 'b' in the equation? If so, what method did you use to do that? Are 'a' and 'b' given from theory?
 
