Judgement of good fit: where to stop?

  • Thread starter ssd
  • Tags
    Fit
In summary, the thread concerns fitting a curve to predict Y from X, given 50 pairs of (x,y) data with a 1:1 relationship between Y and X. The behaviour of Y is composite, involving terms such as x^a, sin(c+bx), and d^x together with several constants. The poster starts with arbitrary values for the constants inside those terms, finds the remaining coefficients by least squares, and then varies one constant at a time to reduce the residual sum of squares (rss). However, there is no specific cutoff point for when the fit can be considered satisfactory, and it is important to have
  • #1
ssd
I have to fit a curve to predict Y on the basis of X. 50 pairs of (x,y) are given. From a freehand plot of the curve it is seen (and it is otherwise known as well) that Y and X have a 1:1 relationship. The basic behaviour of Y is quite composite and is found to involve x^a, sin(c+bx), d^x, one additive constant, and a different constant multiplier for each of the first 3 terms. All constants are real valued.
That is,
Y = p + q.x^a + r.sin(c+bx) + s.d^x
What I did was to start with arbitrary values of a, b, c, d, find p, q, r, s by least squares, and calculate the residual sum of squares (rss).
I then varied 'a' and compared the rss until it was at a minimum. I repeated the same with the others, one at a time, then came back to 'a', and so on. This resulted in a nice fit, but the rss value never seems to stabilize: it keeps decreasing (though of course it does not reach 0).

My question is: when should I stop? Is there any objective method, or a value of rss (or of R^2), that can be used as a cutoff at which I can say the fit is satisfactory?
PS: Frequency distributions cannot be formed to test goodness of fit.

Any idea is appreciated.
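
For readers who want to try the procedure described in this post, here is a minimal Python sketch of it. Several things are assumptions and not from the thread: the real data are not available, so `x_demo` and `y_demo` are synthetic stand-ins; the function names, the step size, and the starting values are hypothetical; and the rule "stop when a full sweep lowers the rss by less than a relative tolerance" is only one common numerical convention for ending the iteration, not an answer to whether the fit is good enough.

```python
import numpy as np

def design_matrix(x, a, b, c, d):
    """Columns for the terms 1, x^a, sin(c + b*x), d^x (assumes x > 0 and d > 0)."""
    return np.column_stack([np.ones_like(x), x**a, np.sin(c + b * x), d**x])

def fit_linear_part(x, y, a, b, c, d):
    """For fixed a, b, c, d, find p, q, r, s by linear least squares."""
    A = design_matrix(x, a, b, c, d)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return coef, float(resid @ resid)  # (p, q, r, s) and the rss

def coordinate_search(x, y, start, step=0.05, rel_tol=1e-6, max_sweeps=100):
    """Vary one of a, b, c, d at a time, keeping any change that lowers the rss."""
    params = dict(start)
    _, best = fit_linear_part(x, y, **params)
    for _ in range(max_sweeps):
        before = best
        for name in ("a", "b", "c", "d"):
            for delta in (step, -step):
                trial = dict(params, **{name: params[name] + delta})
                if trial["d"] <= 0:
                    continue  # d^x is only defined here for d > 0
                _, rss = fit_linear_part(x, y, **trial)
                if rss < best:
                    params, best = trial, rss
        # Assumed stopping rule: the whole sweep barely improved the rss.
        if before - best <= rel_tol * before:
            break
    return params, best

# Synthetic stand-in data (assumption: the thread's data are private).
rng = np.random.default_rng(0)
x_demo = np.linspace(1.0, 10.0, 50)
true = dict(a=1.5, b=0.8, c=0.3, d=1.2)
y_demo = (design_matrix(x_demo, **true) @ np.array([2.0, 0.5, 3.0, 1.0])
          + rng.normal(0.0, 0.1, x_demo.size))

start = dict(a=1.0, b=1.0, c=0.0, d=1.5)
params, rss = coordinate_search(x_demo, y_demo, start)
print(params, rss)
```

A search like this can end in a local minimum, and with a fixed step it only locates each constant to within that step, so the final rss depends on the starting values and the step size.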
 
  • #2
A good article relating to your question is at http://www.aip.org/tip/INPHFA/vol-9/iss-2/p24.html (see esp. the sections "Nonlinear Models" and "Interpreting Results").

BTW - there are a few pretty good [free] shareware packages that can do nonlinear curve fitting. See, for example, http://www.prz.rzeszow.pl/~janand/

jf
 
  • #3
BTW - just for fun, could you provide the (x,y) data? This would be an interesting and educational (for me, at least) exercise for playing with and tweaking Mathematica's "Non-linear Regression" package. I'm kind of curious to fool around with it in OriginPro, too. I've never messed with their non-linear stuff.

jf
 
  • #4
You have a number of plausible explanations of the variance in the data. The difficulties you are facing are (1) that these explanations are not independent of one another, and (2) the models are non-linear.

One approach to the problem is to start with nothing (i.e., the data are just random numbers). One of the plausible explanations will most likely do a better job than any of the other explanations at reducing the residual variance. Keep repeating this step -- i.e., add one explanation at a time to your overall model. Stop when the residual variance can be attributed to measurement noise with a high degree of confidence. Note well: This means you need some kind of model of the measurement process.

Another approach is to start with a full model. Throw the kitchen sink at the problem. The first approach worked by adding terms step-by-step. This approach works by subtracting terms. Find the model term such that removing it does the least damage to the residual variance. Suppose that the residual variance can still be attributed with high confidence to measurement noise after removing this least-powerful term. That means that this term is not significant. Throw it out. Repeat throwing out terms until you finally come up against a term that is significant.
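
The first approach above (add one term at a time, stop once the residual looks like measurement noise) could be sketched roughly as follows. This is an illustration under stated assumptions, not the poster's own procedure: the noise standard deviation `sigma` is taken as known, the constants inside each candidate term are held at fixed, hypothetical values so that each step is a linear least-squares fit, the data are synthetic, and the "attributable to noise" check is a simple normal approximation to the reduced chi-square (which has mean 1 and variance 2/dof under the noise model), not an exact test.

```python
import numpy as np

def rss_of(columns, y):
    """Residual sum of squares after a linear least-squares fit on the given columns."""
    A = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def consistent_with_noise(rss, n_points, n_params, sigma, z=2.0):
    """True if the reduced chi-square is within z standard deviations above 1."""
    dof = n_points - n_params
    reduced_chi2 = rss / (sigma**2 * dof)
    return bool(reduced_chi2 <= 1.0 + z * np.sqrt(2.0 / dof))

def forward_select(candidates, y, sigma, z=2.0):
    """Greedily add the candidate term that lowers the rss the most."""
    chosen = {}
    remaining = dict(candidates)
    rss = float(y @ y)  # empty model: nothing has been explained yet
    while remaining and not consistent_with_noise(rss, len(y), len(chosen), sigma, z):
        name = min(remaining,
                   key=lambda k: rss_of(list(chosen.values()) + [remaining[k]], y))
        chosen[name] = remaining.pop(name)
        rss = rss_of(list(chosen.values()), y)
    return list(chosen), rss

# Synthetic example (assumption): only the constant and power terms are really present.
rng = np.random.default_rng(1)
x_demo = np.linspace(1.0, 10.0, 50)
sigma = 0.1
y_demo = 2.0 + 0.5 * x_demo**1.5 + rng.normal(0.0, sigma, x_demo.size)

candidates = {
    "const": np.ones_like(x_demo),
    "power": x_demo**1.5,              # exponent fixed for illustration
    "sine": np.sin(0.3 + 0.8 * x_demo),
    "exp": 1.2**x_demo,
}
names, rss_sel = forward_select(candidates, y_demo, sigma)
print(names, rss_sel)
```

The second approach would run the same check in reverse: start from all candidate terms and drop the one whose removal raises the rss the least, for as long as the residual stays consistent with the noise model.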
 
  • #5
jackiefrost said:
BTW - just for fun, could you provide the (x,y) data? This would be an interesting and educational (for me, at least) exercise for playing with and tweaking Mathematica's "Non-linear Regression" package. I'm kind of curious to fool around with it in OriginPro, too. I've never messed with their non-linear stuff.

jf

Sorry, that cannot be done: the data are not public and are kept confidential from anyone who is not permitted to use them.
 
  • #6
D H said:
You have a number of plausible explanations of the variance in the data. The difficulties you are facing are (1) that these explanations are not independent of one another, and (2) the models are non-linear.

One approach to the problem is to start with nothing (i.e., the data are just random numbers). One of the plausible explanations will most likely do a better job than any of the other explanations at reducing the residual variance. Keep repeating this step -- i.e., add one explanation at a time to your overall model. Stop when the residual variance can be attributed to measurement noise with a high degree of confidence. Note well: This means you need some kind of model of the measurement process.

Another approach is to start with a full model. Throw the kitchen sink at the problem. The first approach worked by adding terms step-by-step. This approach works by subtracting terms. Find the model term such that removing it does the least damage to the residual variance. Suppose that the residual variance can still be attributed with high confidence to measurement noise after removing this least-powerful term. That means that this term is not significant. Throw it out. Repeat throwing out terms until you finally come up against a term that is significant.

Thanks for your opinion, but an approach like orthogonal polynomials does not work here. The basic shape has already been identified, in the form I mentioned. The question is only about a satisfactory (objective) degree of accuracy.
 

Related to Judgement of good fit: where to stop?

1. What is the "Judgement of Good Fit" in scientific research?

The judgement of good fit refers to the process of determining whether a statistical model adequately represents the data being analyzed. It involves evaluating the level of agreement between the observed data and the predicted values from the model.

2. How is the judgement of good fit determined?

The judgement of good fit is determined by comparing various statistical measures, such as the coefficient of determination (R-squared), p-values, and residual plots. These measures help to assess the level of fit between the model and the data.
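
As a small illustration of the first of these measures, the sketch below computes R-squared and adjusted R-squared from observed and predicted values. The function names and the four data points are made up for illustration. Adjusted R-squared is included because plain R-squared cannot decrease when parameters are added to a least-squares model, so on its own it cannot signal when to stop adding them.

```python
def r_squared(y, y_hat):
    """1 - (residual sum of squares) / (total sum of squares about the mean)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(y, y_hat, n_params):
    """Like R-squared, but penalised for the number of fitted constants.

    n_params counts every fitted constant, including the additive one.
    """
    n = len(y)
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - (ss_res / (n - n_params)) / (ss_tot / (n - 1))

# Made-up numbers, purely to show the calls.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(r_squared(observed, predicted))              # approximately 0.98
print(adjusted_r_squared(observed, predicted, 2))  # approximately 0.97
```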

3. Why is the judgement of good fit important in scientific research?

The judgement of good fit is important because it helps to determine the validity and reliability of the statistical model being used. If a model has a poor fit to the data, it may not accurately represent the relationship between the variables being studied and can lead to incorrect conclusions.

4. Where should one stop when evaluating the judgement of good fit?

There is no universal cutoff value of rss or R-squared at which a fit becomes satisfactory. It depends on the purpose of the research and on the specific statistical model being used. A common practical guideline is to stop when the residuals are no larger than the expected measurement noise and further changes to the model no longer produce a meaningful reduction in the residual variance.

5. How can one improve the judgement of good fit?

One can improve the judgement of good fit by using more advanced statistical techniques, such as adding additional variables to the model or using different model specifications. It is also important to carefully examine the data and make any necessary adjustments, such as removing outliers or transforming the variables, to improve the fit of the model.
