Parametric versus nonparametric distributions

bradyj7
Hi there,

I'm working on a simulation of the travel patterns of cars. There are many variables and conditional probabilities in the model.

My question is: is there anything wrong with fitting nonparametric distributions to all of the variables (both continuous and discrete)? The software I'm using fits many (50+) different parametric distributions to the data and ranks them by goodness of fit. Some fit very well, but others do not. I can't check every single distribution in the simulation, so would it be reasonable to fit nonparametric distributions to everything?

I believe this is called fitting an ogive distribution: http://www.vosesoftware.com/ModelRi...ntinuous_distributions/Ogive_distribution.htm

Thanks
 
bradyj7 said:
My question is: is there anything wrong with fitting nonparametric distributions to all of the variables (both continuous and discrete)?

would it be reasonable to fit nonparametric distributions to everything?
There are no theorems in mathematics that answer those questions, so there is no proof that a given distribution is wrong or that it is reasonable. Applying math to moderately complicated real-world problems almost always involves making assumptions. Some people make these assumptions in an organized manner and use mathematics to deduce the proper method from them. Other people simply make assumptions as they go along; they make a long sequence of arbitrary decisions about which methods to use. I can't take the latter kind of analysis seriously unless the person doing the work can show that the method, applied to one set of data, successfully predicted another set that wasn't involved in the original analysis.

There are several approaches you can take to investigate your question in a practical manner.

The first thing you should ask is whether there is any reasonable physical model for what causes a distribution and, if so, what parameters are involved in that model.

You can try a "bootstrap" approach. Pick an important bottom-line result of your project: for example, the total yearly usage of electricity by electric cars, or the distribution of miles per year driven by drivers of electric cars. Take the data you have, repeatedly pick a smaller subset of it at random, and apply your methods to each subset. See how sensitive your bottom-line result is to this random variation in the data used. If your bottom line is extremely sensitive to the random selection of input data, then I'd suspect that either your methods are impractical or you are dealing with a problem that is too sensitive to its inputs to be reliably analyzed. (Drawing quantitative conclusions from "bootstrap" methods isn't straightforward mathematically, but you don't have to know how to do that to get a "feel" for how sensitive your methods are to the input data.)
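A minimal Python sketch of that subsampling check might look like the following. The data are synthetic, and `bottom_line`, `subsample_results`, the 50% subset fraction, and the 200 repeats are all hypothetical choices made for illustration, not anything taken from the original project. It draws random subsets without replacement, as described above; the classical bootstrap resamples with replacement instead.

```python
import random
import statistics


def bottom_line(daily_km):
    """Hypothetical bottom-line result: estimated km driven per car per year."""
    return 365 * statistics.fmean(daily_km)


def subsample_results(data, fraction=0.5, n_repeats=200, seed=0):
    """Recompute the bottom line on many random subsets of the data."""
    rng = random.Random(seed)
    k = max(2, int(len(data) * fraction))
    # rng.sample draws k distinct observations (no replacement).
    return [bottom_line(rng.sample(data, k)) for _ in range(n_repeats)]


# Synthetic stand-in for observed daily distances (assumption, not real data).
data_rng = random.Random(42)
daily_km = [data_rng.lognormvariate(3.0, 0.6) for _ in range(500)]

full_result = bottom_line(daily_km)
results = subsample_results(daily_km)

# Spread of the subset results relative to the full-data result.
relative_spread = statistics.stdev(results) / full_result

print(f"bottom line on all data: {full_result:.0f} km/year")
print(f"relative spread across subsets: {relative_spread:.1%}")
```

A small relative spread suggests the result is not hostage to which observations happened to be collected; a large one is the warning sign described above. In a real project, `bottom_line` would be replaced by the full fit-and-simulate pipeline.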

You can also see how sensitive your bottom-line results are to the method used to fit distributions to all of the data. For example, would you get a drastically different number for the total yearly usage of electricity by electric cars if you used an ogive for a particular distribution rather than a lognormal distribution?
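As a rough illustration of that comparison, the sketch below fits the same synthetic data two ways and compares a hypothetical bottom line (yearly distance per car). It makes several simplifying assumptions: the ogive is taken to be a piecewise-linear interpolation between the sorted observations, with the sample minimum and maximum as its endpoints; the lognormal is fitted by taking the mean and standard deviation of the logged data; and every function name here is invented for the example.

```python
import math
import random
import statistics


def sample_ogive(sorted_data, rng):
    """Draw from a piecewise-linear CDF through the sorted observations."""
    u = rng.random() * (len(sorted_data) - 1)
    i = min(int(u), len(sorted_data) - 2)  # guard against rounding at the top end
    frac = u - i
    return sorted_data[i] + frac * (sorted_data[i + 1] - sorted_data[i])


def fit_lognormal(data):
    """Maximum-likelihood lognormal parameters: mean and std of log(data)."""
    logs = [math.log(x) for x in data]
    return statistics.fmean(logs), statistics.pstdev(logs)


def yearly_total(draws):
    """Hypothetical bottom line: mean simulated daily km times 365."""
    return 365 * statistics.fmean(draws)


# Synthetic stand-in for observed daily distances (assumption, not real data).
data_rng = random.Random(42)
daily_km = sorted(data_rng.lognormvariate(3.0, 0.6) for _ in range(500))

sim_rng = random.Random(1)
n_draws = 20_000

ogive_draws = [sample_ogive(daily_km, sim_rng) for _ in range(n_draws)]
mu, sigma = fit_lognormal(daily_km)
lognormal_draws = [sim_rng.lognormvariate(mu, sigma) for _ in range(n_draws)]

total_ogive = yearly_total(ogive_draws)
total_lognormal = yearly_total(lognormal_draws)
relative_difference = abs(total_ogive - total_lognormal) / total_lognormal

print(f"ogive:     {total_ogive:.0f} km/year")
print(f"lognormal: {total_lognormal:.0f} km/year")
print(f"relative difference: {relative_difference:.1%}")
```

Because the synthetic data here really are lognormal, the two fits should agree closely. With real data the interesting cases are the variables where they disagree, and results that depend on the tails: an ogive defined this way can never produce a value outside the observed range, while a lognormal can.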
 
Thank you, Stephen
 