How Do I Determine the Probability Distribution from Time and Size Data?

serdayne · Apr 1, 2009

I have a program whose performance I am trying to analyze. My adviser recommended that I find the probability distribution for the data I have. However, I am not quite sure how to do this.

The data is something like:

Code:

Time        Size
2.10 ms     2  
2.30 ms     2 
2.90 ms     3
3.10 ms     2
3.30 ms     4
4.10 ms     4
4.30 ms     4
5.30 ms     5
5.50 ms     6

..etc

He suggested I find the probability distribution between the times and the size. I am not really sure what he means by this.

What I've tried: I found the average and the standard deviation of the times. I then, in Excel, used the function:

Code:

NORMDIST(Time[x], avg, std dev, true)

Where x is a Time on the above list. I do this for every single time.

I then plot the distributions (on the Y) vs. the Times (on the X). With this, I get a plot that resembles the one I've attached to this post.
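For reference, the Excel step described above can be reproduced as a minimal Python sketch. It assumes the nine sample rows shown earlier are the whole data set and that the standard deviation is the sample one (Excel's STDEV); neither is stated in the post. With the last argument set to true, NORMDIST returns the cumulative distribution function (CDF), so the plotted values rise from near 0 to near 1 rather than forming a bell curve.

```python
from statistics import NormalDist, mean, stdev

# Times in ms, copied from the sample table in the post.
times = [2.10, 2.30, 2.90, 3.10, 3.30, 4.10, 4.30, 5.30, 5.50]

# Assumption: the sample standard deviation (Excel's STDEV) was used.
fitted = NormalDist(mu=mean(times), sigma=stdev(times))

# NORMDIST(x, avg, sd, TRUE) is the cumulative distribution function.
cdf_values = [fitted.cdf(t) for t in times]

# NORMDIST(x, avg, sd, FALSE) would give the density (the bell curve).
pdf_values = [fitted.pdf(t) for t in times]

for t, c, p in zip(times, cdf_values, pdf_values):
    print(f"{t:.2f} ms  cdf={c:.3f}  pdf={p:.3f}")
```

Plotting `cdf_values` against `times` gives the S-shaped curve; plotting `pdf_values` instead gives the density of the fitted normal distribution.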

The question is: is the plot of distributions for each value vs. Time a meaningful plot?

Also, I do not have Size factored in. What plot would allow me to compare Time vs. Size?

Thank you.

martix · Apr 1, 2009

Welcome to PF :P

My interpretation is that he wants you to tell him what is the chance that a given time belongs to a given size.

serdayne · Apr 1, 2009

martix said:

Welcome to PF :P

My interpretation is that he wants you to tell him what is the chance that a given time belongs to a given size.

Thanks!

However, how can I show that? The above chart does not factor in Size at all. It is NORMDIST(TIME) vs. TIME.

Thanks.

serdayne · Apr 1, 2009

Also, I should mention that, for me, the best way to interpret this data would be to figure out the average Time and measure that against the Size. That way, for every sample that is Size 2 I'd have an Average Time, for Size 3 an Average Time, etc.

I would plot the Average Time for Each Size vs. Each Size. That way I'd know how long, on average, each Size sample took.
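The average-per-size idea can be sketched in a few lines of Python. The `samples` list below holds only the nine rows quoted in the first post, and the variable names are illustrative.

```python
from collections import defaultdict
from statistics import mean

# (time in ms, size) pairs copied from the sample table in the first post.
samples = [
    (2.10, 2), (2.30, 2), (2.90, 3), (3.10, 2), (3.30, 4),
    (4.10, 4), (4.30, 4), (5.30, 5), (5.50, 6),
]

# Group the times by size.
times_by_size = defaultdict(list)
for time_ms, size in samples:
    times_by_size[size].append(time_ms)

# Average time for each size, ordered by size for plotting.
avg_time_by_size = {
    size: mean(ts) for size, ts in sorted(times_by_size.items())
}

for size, avg in avg_time_by_size.items():
    print(f"size {size}: {avg:.2f} ms over {len(times_by_size[size])} sample(s)")
```

The keys of `avg_time_by_size` go on the X axis and the values on the Y axis. Sizes with a single sample (3, 5 and 6 in this excerpt) give an average that is just that one measurement.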

I'm not sure where the probability comes into play here.

martix · Apr 2, 2009

Yes indeed.
Since size is a discrete variable, and assuming the times for each size are normally distributed, you can plot the probability distribution of the times for each size separately. You can then combine these into a single curve: center each distribution on the average time for its size and average the probabilities at each offset from that center. In other words, the Y axis shows the average probability, and the X axis shows the time offset, with the average time for a given size at the center (each size has its own average time there).
It may not be the most accurate approach, but it does condense all the information you have into one 2-axis plot.
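One possible way to make "the chance that a given time belongs to a given size" concrete is to fit a normal distribution to the times of each size and compare the fitted densities with Bayes' rule. This is only a sketch of that idea, not necessarily the exact plot described above. It assumes normally distributed times per size, uses the sample counts as prior weights, and skips sizes with fewer than two samples, since a standard deviation cannot be estimated from one point. The function name `size_probabilities` is made up for illustration.

```python
from collections import defaultdict
from statistics import NormalDist, mean, stdev

# (time in ms, size) pairs copied from the sample table in the first post.
samples = [
    (2.10, 2), (2.30, 2), (2.90, 3), (3.10, 2), (3.30, 4),
    (4.10, 4), (4.30, 4), (5.30, 5), (5.50, 6),
]

times_by_size = defaultdict(list)
for time_ms, size in samples:
    times_by_size[size].append(time_ms)

# Fit one normal distribution per size.
# Assumption: sizes with a single sample are skipped (no spread to estimate).
fits = {
    size: NormalDist(mean(ts), stdev(ts))
    for size, ts in times_by_size.items()
    if len(ts) >= 2
}
counts = {size: len(times_by_size[size]) for size in fits}


def size_probabilities(time_ms):
    """Return P(size | time) for each fitted size, using Bayes' rule."""
    # Prior (sample count) times likelihood (fitted density at this time).
    weights = {s: counts[s] * fits[s].pdf(time_ms) for s in fits}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("time is too far from every fitted distribution")
    return {s: w / total for s, w in weights.items()}


for t in (2.5, 3.2, 4.0):
    probs = size_probabilities(t)
    print(t, {s: round(p, 3) for s, p in probs.items()})
```

With only the nine rows from the excerpt, just sizes 2 and 4 have enough samples to be fitted, so the full data set would be needed for a meaningful result.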
