Investigating Linearity of Event Occurrences Over Time

  • Thread starter: Master1022
  • Tags: Linearity, Time
AI Thread Summary
The discussion focuses on investigating whether event occurrences increase linearly over time using statistical analysis. The user has distribution data for event occurrences across various time horizons and is considering using median values for analysis due to non-normal distribution. They seek advice on appropriate statistical methods and visualization techniques, suggesting box plots with error bars and a linear trend line for clarity. The user acknowledges the limitations of proving linearity but is interested in using linear regression to assess the slope's significance. Overall, the inquiry aims to establish a scientific approach to understanding the relationship between time and event occurrences.
Master1022
Homework Statement
How can one show that the occurrences of an event do not increase linearly over time?
Relevant Equations
Mean, standard deviation
Hi,

I think this is a simple question, but I just wanted to ask how I could go about showing this in a scientific manner. I will try to use an analogy later on which is, I hope, a simple way to understand what I am doing.

What I am trying to do:
I am trying to investigate whether the occurrences of an event increase linearly with time (i.e. if it happens 10 times in 1 month, then it will happen 20 times in 2 months).

What I have:
I have distribution data about the number of event occurrences for different time horizons (1 month, 2 months, ... , 6 months). From this distribution data, I can calculate all the common stats: mean, median, standard deviation, upper quartile, lower quartile, etc.

Why is it a distribution? Here is where I draw upon an analogy.
Let us imagine we want to measure the number of times some particles, in a box, collide in a certain time-frame. We can number the particles from 1 to ##n##, and then record over, for example, 1 minute, how many times they each bump into one another. Then we have a distribution from which we can calculate statistics. (Don't worry about any double counting here, that is not an issue in my problem, and this was just a simple analogy I made up in my head). Then we can run the experiment again for 2 minutes, 3 minutes, etc. and look at the distributions for each of the time horizons.

Now, returning to my problem:

What I am confused about:
1. What statistics should I be using to investigate this claim? If the data is normally distributed, should I be using the mean? Otherwise, should I be using the median?
- My data isn't normally distributed so I am leaning towards the median
- For example, we could take the median (or mean) for 1 month and then see how the median at a ##k##-month horizon compares to ##k \times## the 1-month median.
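The comparison against ##k \times## the 1-month median can be sketched roughly as below. This is only an illustration under stated assumptions: the function name `median_ratio_ci`, the sample counts, and the choice of a percentile bootstrap are not from the thread, and the 1-month median is assumed to be positive. A ratio near 1 is consistent with linear growth, while an interval that excludes 1 would be evidence against it.

```python
import random
import statistics


def median_ratio_ci(base, later, multiple, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for median(later) / (multiple * median(base)).

    `base` holds the counts at the 1-month horizon, `later` the counts at the
    `multiple`-month horizon. Assumes median(base) > 0.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_boot):
        # Resample each horizon independently, with replacement.
        base_rs = rng.choices(base, k=len(base))
        later_rs = rng.choices(later, k=len(later))
        denom = multiple * statistics.median(base_rs)
        if denom > 0:  # skip degenerate resamples whose median is zero
            ratios.append(statistics.median(later_rs) / denom)
    ratios.sort()
    lo = ratios[int((alpha / 2) * len(ratios))]
    hi = ratios[int((1 - alpha / 2) * len(ratios)) - 1]
    point = statistics.median(later) / (multiple * statistics.median(base))
    return point, lo, hi


# Hypothetical counts, made up purely for illustration.
one_month = [8, 9, 10, 10, 11, 12, 14, 25]
three_months = [18, 20, 21, 22, 24, 25, 27, 40]
point, lo, hi = median_ratio_ci(one_month, three_months, multiple=3)
print(f"ratio = {point:.3f}, bootstrap interval = [{lo:.3f}, {hi:.3f}]")
```

The bootstrap is used here because it does not rely on the data being normally distributed, which matches the reason for preferring the median in the first place.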

2. What would be a nice way to visualize this?
- My current idea is to use box plots (or even just points with error bars) to show the median and LQ/UQ at each time horizon.
- In the background, I can draw a simple linear reference line showing what the median would look like if it were increasing linearly (basically just a line that passes through the integer multiples of the 1-month median).
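A minimal sketch of that plot is given below, assuming the data is available as a dictionary mapping each horizon in months to its list of counts. The names `horizon_summary` and `plot_horizons`, the output file name, and the sample data are all hypothetical. The summary step uses only the standard library; the plotting step assumes matplotlib is installed.

```python
import statistics


def horizon_summary(samples_by_month):
    """Return (months, LQ, median, UQ, linear reference) for each horizon."""
    months = sorted(samples_by_month)
    lq, med, uq = [], [], []
    for m in months:
        q1, q2, q3 = statistics.quantiles(
            samples_by_month[m], n=4, method="inclusive"
        )
        lq.append(q1)
        med.append(q2)
        uq.append(q3)
    # Reference line: the first-horizon median scaled linearly with time.
    per_month = med[0] / months[0]
    reference = [per_month * m for m in months]
    return months, lq, med, uq, reference


def plot_horizons(samples_by_month, path="horizons.png"):
    """Box plot per horizon with the linear reference line behind it."""
    import matplotlib
    matplotlib.use("Agg")  # assumption: render to a file, no display needed
    import matplotlib.pyplot as plt

    months, _, _, _, reference = horizon_summary(samples_by_month)
    fig, ax = plt.subplots()
    ax.boxplot([samples_by_month[m] for m in months], positions=months)
    ax.plot(months, reference, linestyle="--", label="linear reference")
    ax.set_xlabel("Time horizon (months)")
    ax.set_ylabel("Event occurrences")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)


# Hypothetical counts, made up purely for illustration.
example = {
    1: [8, 9, 10, 10, 11, 12, 14, 25],
    2: [15, 17, 18, 19, 20, 22, 24, 35],
    3: [18, 20, 21, 22, 24, 25, 27, 40],
}
months, lq, med, uq, reference = horizon_summary(example)
```

Calling `plot_horizons(example)` would then write the figure to `horizons.png`. If the boxes drift steadily away from the dashed line as the horizon grows, that is a visual hint of non-linear growth, though not a test in itself.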

I hope this makes sense and I would appreciate any insight or advice.
 
If the mean shows a clear non-linear trend, then you might be able to show that with some probability. If the trend is linear, then you can only show that there is some probability of a small slope. You cannot prove that the slope is zero, but you can use linear regression to show, with some probability, that the slope is not very large.
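One possible reading of this suggestion, which is an assumption here rather than something stated in the thread, is to regress the per-month rate (count divided by horizon) on the horizon. If occurrences grow linearly with time, that rate should be flat, so the question becomes whether the fitted slope is small. The sketch below fits an ordinary least squares slope and an approximate interval. The function name `slope_with_ci` and the data are hypothetical, and the normal quantile is a large-sample stand-in for the exact t quantile.

```python
import math
import statistics


def slope_with_ci(x, y, alpha=0.05):
    """Ordinary least squares slope of y on x with an approximate interval."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual sum of squares, with n - 2 degrees of freedom.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    # Normal quantile used as an approximation to the t quantile.
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    return slope, slope - z * se, slope + z * se


# Hypothetical observations: (horizon in months, count), made up for illustration.
horizons = [1, 1, 2, 2, 3, 3, 4, 4]
counts = [10, 12, 18, 20, 24, 27, 30, 33]
rates = [c / h for c, h in zip(counts, horizons)]  # occurrences per month
slope, lo, hi = slope_with_ci(horizons, rates)
print(f"slope of rate vs horizon = {slope:.3f}, interval = [{lo:.3f}, {hi:.3f}]")
```

An interval that sits clearly away from zero would suggest the rate changes with the horizon, meaning growth is not linear. An interval that is narrow and contains zero only bounds how large the slope could plausibly be, in line with the point above that a zero slope cannot be proven.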
 