Trying to analyze some data

  • Thread starter serdayne
In summary, the person is trying to analyze the performance of a program and their adviser recommended finding the probability distribution for the data, which includes a list of times and sizes. The person has tried using the NORMDIST function in Excel, but it does not factor in size. The person suggests plotting the average time for each size against the size to determine the average time for each size sample. They are unsure how probability comes into play in this situation. The suggested approach is to plot the probability distribution for each size and combine them to form the average probability of a given time offset from the average time for that particular size.
  • #1
serdayne
I have a program for which I am trying to analyze its performance. My adviser recommended that I find the probability distribution for the data I have. However, I am not quite sure how to do this.

The data is something like:

Code:
Time        Size
2.10 ms     2
2.30 ms     2
2.90 ms     3
3.10 ms     2
3.30 ms     4
4.10 ms     4
4.30 ms     4
5.30 ms     5
5.50 ms     6

..etc

He suggested I find the probability distribution between the times and the size. I am not really sure what he means by this.

What I've tried: I found the average and the standard deviation of the times. I then, in Excel, used the function:

Code:
NORMDIST(Time[x], avg, std dev, true)

Where x is a Time on the above list. I do this for every single time.
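For reference, the spreadsheet step above can be sketched in Python with only the standard library. This is a rough equivalent, not the exact workbook: it assumes the nine sample rows shown earlier are the whole data set and that the standard deviation is the sample one (as Excel's STDEV would give), which the post does not state. Note that NORMDIST with its last argument set to TRUE returns the cumulative probability, while FALSE returns the bell-curve density.

```python
from statistics import NormalDist, mean, stdev

# Times in ms, copied from the sample table (the real data set is longer).
times = [2.10, 2.30, 2.90, 3.10, 3.30, 4.10, 4.30, 5.30, 5.50]

avg = mean(times)
std_dev = stdev(times)  # assumption: sample standard deviation, like STDEV
dist = NormalDist(mu=avg, sigma=std_dev)

# NORMDIST(x, avg, std_dev, TRUE): cumulative probability P(T <= x).
cumulative = [dist.cdf(t) for t in times]

# NORMDIST(x, avg, std_dev, FALSE): height of the bell curve at x.
density = [dist.pdf(t) for t in times]

for t, c, d in zip(times, cumulative, density):
    print(f"{t:.2f} ms  cumulative={c:.3f}  density={d:.3f}")
```

Plotting `cumulative` against `times` gives an S-shaped curve, and plotting `density` gives the bell shape. Neither uses Size.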

I then plot the distributions (on the Y) vs. the Times (on the X). With this, I get a plot that resembles the one I've attached to this post.

The question is: is the plot of distributions for each value vs. Time a meaningful plot?

Also, I do not have Size factored in. What plot would allow me to compare Time vs. Size?

Thank you.
 

Attachments

  • example.jpg
  • #2
Welcome to PF :P

My interpretation is that he wants you to tell him what is the chance that a given time belongs to a given size.
 
  • #3
martix said:
Welcome to PF :P

My interpretation is that he wants you to tell him what is the chance that a given time belongs to a given size.

Thanks!

However, how can I show that? The above chart does not factor in Size at all. It is NORMDIST(TIME) vs. TIME.

Thanks.
 
  • #4
Also, I should mention that, for me, the best way to interpret this data would be to find the average Time for each Size. That way, I'd have an Average Time for every sample that is Size 2, an Average Time for Size 3, etc.

I would plot the Average Time for Each Size vs. Each Size. That way I'd know how long, on average, each Size sample took.
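A minimal sketch of that grouping in Python, using only the sample rows from the first post (the variable names are made up for illustration):

```python
from collections import defaultdict
from statistics import mean

# (time in ms, size) pairs copied from the sample table.
samples = [(2.10, 2), (2.30, 2), (2.90, 3), (3.10, 2), (3.30, 4),
           (4.10, 4), (4.30, 4), (5.30, 5), (5.50, 6)]

# Collect every time measured for each size.
times_by_size = defaultdict(list)
for time_ms, size in samples:
    times_by_size[size].append(time_ms)

# Average time per size, ordered by size so it can be plotted directly.
avg_time_by_size = {size: mean(ts) for size, ts in sorted(times_by_size.items())}

for size, avg in avg_time_by_size.items():
    count = len(times_by_size[size])
    print(f"size {size}: average time {avg:.2f} ms over {count} sample(s)")
```

The keys of `avg_time_by_size` would go on the X axis and the values on the Y axis.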

I'm not sure where the probability comes into play here.
 
  • #5
Yes indeed.
Since size is a discrete variable, and assuming the times for each size are normally distributed, you can plot the probability distribution of time separately for each size. You can then combine these into one curve: the average probability that a time at a given offset from its size's average time belongs to that size. In other words, Y is the average probability, and the center of X is the average time for a given size (a different average time for each size).
It may not be the most accurate approach, but it does condense all the information you have into one 2-axis plot.
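One way to read this suggestion, sketched in Python: measure each time as an offset from the average for its own size, pool the offsets, and fit a single normal curve centered on zero. This is only an interpretation, and it rests on assumptions not confirmed in the thread: that the offsets are normally distributed, that every size has the same spread, and that sizes with a single sample can be skipped because they say nothing about spread.

```python
import math
from collections import defaultdict
from statistics import NormalDist, mean

# (time in ms, size) pairs copied from the sample table.
samples = [(2.10, 2), (2.30, 2), (2.90, 3), (3.10, 2), (3.30, 4),
           (4.10, 4), (4.30, 4), (5.30, 5), (5.50, 6)]

times_by_size = defaultdict(list)
for time_ms, size in samples:
    times_by_size[size].append(time_ms)

# Offset of each time from the average time of its own size.
offsets = []
groups_used = 0
for size, ts in times_by_size.items():
    if len(ts) < 2:
        continue  # assumption: skip sizes with one sample (no spread info)
    size_avg = mean(ts)
    offsets.extend(t - size_avg for t in ts)
    groups_used += 1

# Pooled standard deviation: one degree of freedom is lost per size average.
sigma = math.sqrt(sum(o * o for o in offsets) / (len(offsets) - groups_used))
pooled = NormalDist(mu=0.0, sigma=sigma)

# Density of a time landing a given offset (ms) from its size's average.
for offset in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(f"offset {offset:+.1f} ms: density {pooled.pdf(offset):.3f}")
```

To draw the combined plot, shift this curve so that offset 0 sits at the average time for whichever size is being examined.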
 

1. How do you choose the right statistical analysis method for your data?

The choice of statistical analysis method depends on the type of data you have, the research question you are trying to answer, and the level of measurement of your variables. It is important to carefully consider these factors before selecting a method and consult with a statistician if needed.

2. How do you ensure the accuracy and reliability of your data analysis?

To ensure accuracy and reliability, it is important to have a well-defined research question, a clear data collection process, and appropriate statistical methods. It is also important to check for outliers, conduct sensitivity analyses, and validate the results using different statistical methods if possible.

3. What are some common mistakes to avoid when analyzing data?

Some common mistakes to avoid when analyzing data include using the wrong statistical method, not checking for outliers, not considering the assumptions of the chosen method, and not validating the results using alternative methods. It is also important to properly label and organize the data to avoid confusion and errors.

4. How do you interpret the results of a statistical analysis?

The interpretation of results depends on the type of data and the chosen statistical method. It is important to carefully read the output and understand the meaning of the results. Visual aids such as graphs and charts can also help in interpreting the findings. It is also important to consider the limitations of the data and the assumptions of the chosen method when interpreting the results.

5. How do you communicate the results of data analysis effectively?

To effectively communicate the results of data analysis, it is important to use clear and concise language and avoid technical jargon. Visual aids such as graphs and charts can also help in presenting the findings in a more understandable way. It is also important to provide context and limitations of the results and to be transparent about the methods and data used in the analysis.
