How Do I Determine the Probability Distribution from Time and Size Data?

  • Context: Undergrad
  • Thread starter: serdayne
  • Tags: Data

Discussion Overview

The discussion revolves around determining the probability distribution from a dataset containing time and size measurements. Participants explore how to analyze the relationship between these two variables, considering both theoretical and practical aspects of probability distribution analysis.

Discussion Character

  • Exploratory
  • Technical explanation
  • Mathematical reasoning

Main Points Raised

  • One participant seeks clarification on how to find the probability distribution for time and size data, expressing uncertainty about their adviser's suggestion.
  • Another participant interprets the adviser's request as needing to determine the probability that a given time corresponds to a specific size.
  • A different participant suggests calculating the average time for each size and plotting these averages against size, questioning the role of probability in this approach.
  • One participant proposes plotting the probability distribution for each size and combining these to create a two-axis plot that represents average probability against average time for each size, acknowledging that this may not be the most accurate method.

Areas of Agreement / Disagreement

Participants express differing interpretations of how to incorporate size into the probability distribution analysis, with no consensus on the best approach to visualize or calculate the relationship between time and size.

Contextual Notes

Participants mention various methods and interpretations without resolving the underlying assumptions about the data distribution or the relationship between time and size.

serdayne
I have a program for which I am trying to analyze its performance. My adviser recommended that I find the probability distribution for the data I have. However, I am not quite sure how to do this.

The data is something like:

Code:
Time        Size
2.10 ms     2  
2.30 ms     2 
2.90 ms     3
3.10 ms     2
3.30 ms     4
4.10 ms     4
4.30 ms     4
5.30 ms     5
5.50 ms     6

..etc

He suggested I find the probability distribution between the times and the size. I am not really sure what he means by this.

What I've tried: I found the average and the standard deviation of the times. I then, in Excel, used the function:

Code:
NORMDIST(Time[x], avg, std dev, true)

Where x is a Time on the above list. I do this for every single time.

I then plot the distributions (on the Y) vs. the Times (on the X). With this, I get a plot that resembles the one I've attached to this post.
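For readers who want to reproduce the spreadsheet steps above outside Excel, here is a minimal Python sketch. It uses only the nine sample rows quoted in the post, and it assumes the standard deviation is the sample standard deviation (as Excel's STDEV would give); the post does not say which one was used. The names `times_ms`, `fitted`, `cdf_values` and `pdf_values` are illustrative. Note that with the last argument set to true, NORMDIST returns the cumulative distribution, so the plotted values rise from near 0 to near 1 in an S-shaped curve; the familiar bell curve corresponds to the density, i.e. the last argument set to false.

```python
from statistics import NormalDist, mean, stdev

# The nine sample times (ms) quoted in the post; the real data set is longer.
times_ms = [2.10, 2.30, 2.90, 3.10, 3.30, 4.10, 4.30, 5.30, 5.50]

avg = mean(times_ms)
sd = stdev(times_ms)  # assumption: sample standard deviation, like Excel's STDEV

# One normal distribution fitted to all times, ignoring Size entirely.
fitted = NormalDist(mu=avg, sigma=sd)

# Equivalent of NORMDIST(t, avg, sd, TRUE): cumulative probability P(T <= t).
cdf_values = [fitted.cdf(t) for t in times_ms]

# Equivalent of NORMDIST(t, avg, sd, FALSE): density (bell curve height) at t.
pdf_values = [fitted.pdf(t) for t in times_ms]

for t, c, p in zip(times_ms, cdf_values, pdf_values):
    print(f"{t:.2f} ms  cdf={c:.3f}  pdf={p:.3f}")
```

Plotting `cdf_values` against `times_ms` should give the same kind of curve as the Excel approach described above; neither column uses Size.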

The question is: is the plot of distributions for each value vs. Time a meaningful plot?

Also, I do not have Size factored in. What plot would allow me to compare Time vs. Size?

Thank you.
 

Attachments

  • example.jpg (20.2 KB)
Welcome to PF :P

My interpretation is that he wants you to tell him what is the chance that a given time belongs to a given size.
 
martix said:
Welcome to PF :P

My interpretation is that he wants you to tell him what is the chance that a given time belongs to a given size.

Thanks!

However, how can I show that? The above chart does not factor in Size at all. It is NORMDIST(TIME) vs. TIME.

Thanks.
 
Also, I should mention that, for me, the best way to interpret this data would be to figure out the average Time for each Size and compare it against the Size. That way, for every sample that is Size 2 I'd have an Average Time, for Size 3 an Average Time, etc.

I would plot the Average Time for Each Size vs. Each Size. That way I'd know how long, on average, each Size sample took.
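The averaging described here can be sketched in a few lines of Python. This is only an illustration using the nine rows quoted earlier in the thread; the names `samples`, `times_by_size` and `avg_time_by_size` are made up for the example.

```python
from collections import defaultdict
from statistics import mean

# (time in ms, size) pairs quoted in the thread; the real data set is longer.
samples = [
    (2.10, 2), (2.30, 2), (2.90, 3), (3.10, 2), (3.30, 4),
    (4.10, 4), (4.30, 4), (5.30, 5), (5.50, 6),
]

# Collect all times observed for each size.
times_by_size = defaultdict(list)
for time_ms, size in samples:
    times_by_size[size].append(time_ms)

# Average time per size, ordered by size so it can be plotted directly.
avg_time_by_size = {
    size: mean(times) for size, times in sorted(times_by_size.items())
}

for size, avg in avg_time_by_size.items():
    count = len(times_by_size[size])
    print(f"size {size}: average {avg:.2f} ms over {count} sample(s)")
```

With so few rows, several sizes have a single sample, so their "average" is just that one measurement; the full data set would give more reliable points.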

I'm not sure where the probability comes into play here.
 
Yes indeed.
Since size is a discrete variable, and assuming the times for each size are normally distributed, you can plot the probability distribution of time for each size separately and then combine these plots. The combined curve gives the average probability that a time at a given offset from a size's average time belongs to that size. In other words, on Y you have the average probability, and on X you have the time offset, with the average time for the given size at the center (for different sizes you put different average times there).
It may not be the most accurate approach, but it does condense all the information you have into one 2-axis plot.
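One possible way to put numbers on "the chance that a given time belongs to a given size" is sketched below. It is not necessarily the combined plot described above: it fits a normal distribution to the times of each size and then applies Bayes' rule, with each size weighted by how often it occurs. Everything here is an assumption made for illustration: the per-size normality, the use of sample frequencies as priors, the function name `size_probabilities`, and the choice to skip sizes with fewer than two samples (a standard deviation cannot be estimated from one point). Only the nine rows quoted in the thread are used.

```python
from collections import defaultdict
from statistics import NormalDist, mean, stdev

# (time in ms, size) pairs quoted in the thread; the real data set is longer.
samples = [
    (2.10, 2), (2.30, 2), (2.90, 3), (3.10, 2), (3.30, 4),
    (4.10, 4), (4.30, 4), (5.30, 5), (5.50, 6),
]

times_by_size = defaultdict(list)
for time_ms, size in samples:
    times_by_size[size].append(time_ms)

# Fit one normal distribution per size (assumption: times are normal per size).
# Sizes with a single sample are skipped because stdev needs two or more points.
fits = {
    size: NormalDist(mean(times), stdev(times))
    for size, times in times_by_size.items()
    if len(times) >= 2
}

# Prior weight of each fitted size: its share of the samples that were fitted.
n_fitted = sum(len(times_by_size[size]) for size in fits)
priors = {size: len(times_by_size[size]) / n_fitted for size in fits}


def size_probabilities(time_ms):
    """Return P(size | time) for every fitted size, via Bayes' rule."""
    weighted = {s: priors[s] * fits[s].pdf(time_ms) for s in fits}
    total = sum(weighted.values())
    if total == 0.0:
        # The time is so far from every fitted size that all densities underflow.
        raise ValueError("time is too far from all fitted sizes")
    return {s: w / total for s, w in weighted.items()}


print(size_probabilities(2.5))
print(size_probabilities(3.9))
```

With the quoted rows, only sizes 2 and 4 have enough samples to be fitted, so the returned probabilities compare just those two; on the full data set more sizes would take part.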
 
