serbring said:
I got it. Is there any way to have something similar for n different distributions?
Is your goal to do hypothesis testing or to do estimation?
A "hypothesis test" involves some yes-or-no question like: "Do all the drivers have the same probability distribution for selecting torques when they drive the same course, yes or no?" or "Does driver_A use the same probability distribution for selecting torques when he drives the same course twice, yes or no?"
An "estimation" involves estimating the parameters of a probability distribution or a statistical model. For example, if we assume each driver selects torques from a probability distribution of a certain type, we could try to estimate the parameters (e.g. mean, standard deviation) of the distribution that apply to each particular driver. This approach takes for granted that distributions for different drivers may have different parameters.
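For illustration, here is a minimal sketch of the estimation approach, assuming (purely for the sake of the example) that each driver selects torques from a normal distribution; the driver names and samples are invented, not taken from the actual data:

```python
# Estimate per-driver distribution parameters, assuming normality.
from statistics import mean, stdev

# Invented torque samples (Nm) for two drivers on the same course.
torques = {
    "driver_A": [120.0, 135.0, 128.0, 142.0, 131.0],
    "driver_B": [140.0, 155.0, 149.0, 160.0, 151.0],
}

# One (mean, standard deviation) estimate per driver: the model
# allows different parameters for different drivers.
estimates = {d: (mean(x), stdev(x)) for d, x in torques.items()}
for driver, (mu, sigma) in estimates.items():
    print(f"{driver}: mean={mu:.1f}, sd={sigma:.1f}")
```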
Applying statistics is subjective, in spite of the impressive terminology (e.g. "significance", "confidence") that it uses. Typical textbook problems involving hypothesis testing compare two distributions (i.e. testing the hypothesis that distribution_1 is the same as distribution_2). As far as I know, there is no "standard" way to solve problems that test a grand hypothesis like "All 100 distributions are the same distribution". People can tackle such a hypothesis by first doing pairwise hypothesis tests and using the results to group the distributions into pairs that test as being the same. Then, assuming those pairs really are the same, they compare one pair against another, and so on.
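The pairwise step of that procedure can be sketched in Python. The two-sample Kolmogorov-Smirnov test is my choice of test for the sketch (the thread doesn't specify one), the asymptotic critical value is a crude approximation for samples this small, and all driver names and numbers are invented:

```python
import math
from bisect import bisect_right
from itertools import combinations

def ks_statistic(x, y):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    ecdf = lambda s, v: bisect_right(s, v) / len(s)  # fraction of s <= v
    return max(abs(ecdf(xs, v) - ecdf(ys, v)) for v in xs + ys)

def same_distribution(x, y, c_alpha=1.358):
    """Crude yes/no verdict at roughly alpha = 0.05, using the
    asymptotic KS critical value (a rough approximation here)."""
    n, m = len(x), len(y)
    return ks_statistic(x, y) < c_alpha * math.sqrt((n + m) / (n * m))

# Invented torque samples (Nm) for three drivers on the same course.
drivers = {
    "A": [1.00, 1.20, 0.90, 1.10, 1.05, 0.95],
    "B": [1.02, 1.15, 0.92, 1.08, 1.00, 0.98],
    "C": [3.00, 3.20, 2.90, 3.10, 3.05, 2.95],
}

# Pairwise tests: the first step of the grouping procedure described above.
for a, b in combinations(drivers, 2):
    verdict = "same" if same_distribution(drivers[a], drivers[b]) else "different"
    print(f"{a} vs {b}: {verdict}")
```

Drivers that test "the same" would then be grouped, and the groups compared against each other in the same way.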
Very dignified statistical analyses do the hypothesis-testing work first and then apply estimation to each group of distributions judged to be identical, using the combined data from the whole group to do the estimation.
What do you mean by model? Weibull, Lognormal and so on, or a kind of physical model with random parameters?
Your data doesn't resemble distributions that have such orderly shapes.
I mean something more complicated. I don't claim to know much about the specifics of your problem, but I'll speculate.
A simple model is that the goal of a driver is to attain a certain speed at certain times on the course. The desired speed may be motivated by complicated sub-goals (e.g. go fast, don't wreck, enjoy the scenery, arrive exactly at 3 PM, arrive before 3 PM, etc.).
To further simplify the model, we could assume the desired speed is a function only of the location of the car on the course. (e.g. We could neglect the difference between a driver reaching mile post 2 and knowing he was "behind schedule" and the same driver reaching mile post 2 and knowing he was "making good time").
Part of a driver's "style" might consist of things that could be classified as "skill" (e.g. reaction time to change gears, once he decides to change gears). To further simplify the model, let's assume all drivers have about the same reaction times.
The model is not a specific mathematical model yet, but the conceptual form of the model says that drivers don't randomly select a torque from a probability distribution. And they don't implement a "continuous time Markov process" where each level of torque has some probability distribution of times for changing to another level of torque.
This model suggests that the way to compare drivers is to compare how they select speeds at the same places on a course if you have such data.
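As a sketch of that kind of comparison, assuming each run is logged as (position, speed) pairs, one could resample every driver's speed trace onto a common set of course positions and difference them location by location; the traces below are invented (position in km, speed in km/h), not real logged data:

```python
def interp(positions, speeds, x):
    """Linear interpolation of speed at course position x."""
    segments = zip(zip(positions, speeds), zip(positions[1:], speeds[1:]))
    for (p0, s0), (p1, s1) in segments:
        if p0 <= x <= p1:
            t = (x - p0) / (p1 - p0)
            return s0 + t * (s1 - s0)
    raise ValueError("x is outside the recorded course positions")

# Invented (positions, speeds) traces for two runs of the same course.
run_A = ([0.0, 1.0, 2.0, 3.0], [50.0, 80.0, 60.0, 70.0])
run_B = ([0.0, 0.5, 1.5, 3.0], [48.0, 70.0, 75.0, 68.0])

# Resample both runs onto a common grid of course positions and
# compare the speeds place by place, as the model suggests.
grid = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
diffs = [interp(*run_A, x) - interp(*run_B, x) for x in grid]
print([round(d, 1) for d in diffs])
```

A summary of those per-location differences (rather than a frequency distribution of torques pooled over the whole run) would then be the basis for saying two drivers are alike or different.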
There can be simpler models and more complicated models. What kind of conceptual model can be invented to justify comparing frequency distributions of times-of-use of torques? (I'm assuming the labels indicating "% time" on your graphs refer to data like "183 seconds out of a total of 1236 seconds" - i.e. time as measured by the clock as opposed to "number of times a particular gear was used out of the total number of gear uses").