Want to (roughly) predict future behavior of a system


Discussion Overview

The discussion revolves around predicting the future behavior of a system based on past event data collected at discrete intervals throughout a day. Participants explore the methodology for estimating probabilities of events occurring at specific times, the aggregation of data from multiple days, and the evaluation of prediction accuracy. The focus is on statistical emulation rather than precise forecasting.

Discussion Character

  • Exploratory
  • Technical explanation
  • Conceptual clarification
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant proposes a method for predicting the probability of events based on historical data, suggesting that using more days will yield a more accurate representation of typical behavior.
  • Another participant emphasizes the importance of precisely defining events and populations when applying probability to real-life scenarios, noting that vague definitions can lead to misleading conclusions.
  • A different participant agrees with the initial approach but warns about the potential pitfalls of assuming independence among events, suggesting that relationships between events should be considered to avoid unrealistic simulations.
  • Concerns are raised about how to measure the accuracy of predictions as the number of days included in the aggregate probability increases.

Areas of Agreement / Disagreement

Participants express a mix of agreement and differing views on the approach to predicting future behavior. While some find the proposed method reasonable, others highlight the need for careful consideration of event definitions and dependencies, indicating that the discussion remains unresolved on certain aspects.

Contextual Notes

Limitations include the need for precise definitions of events and populations, as well as the challenge of accounting for dependencies between events, which may affect the realism of simulations.

Who May Find This Useful

Individuals interested in statistical modeling, event prediction, and simulation methodologies may find this discussion relevant.

oneamp
Hello

I am building a set of data. It is composed of events that occur at discrete times throughout a day. There is a tolerance on the timing, e.g. 5 seconds. I want to be able to predict the probability that an event will occur at a given time on a future day, using the probability accumulated from past days. Then I want to compare that probability to a random number and simulate a future day.

So, if I use one day, my prediction will be the same as this day, since the probability of each event is one (it all occurred). If I include two days, then I can superimpose them, coming up with a new aggregate probability to compare my random numbers to. I think the more days I use, the more accurate a picture of typical behavior I will have, and the more my prediction will match a real future day.
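The superposition described above can be sketched as follows. This is only a minimal illustration, not a definitive method: the 5-minute bin width, the function names, and the sample event times are all assumptions made for the example. Each bin's probability is the fraction of recorded days with at least one event in that bin, and a simulated day is drawn by comparing a uniform random number to each bin's probability, treating bins as independent.

```python
import random

BIN_SECONDS = 300  # assumed bin width (5 minutes); choose it to suit the tolerance
BINS_PER_DAY = 24 * 60 * 60 // BIN_SECONDS


def bin_probabilities(days):
    """days: list of days, each a list of event times in seconds since midnight.

    Returns, for every bin, the fraction of days with at least one event in it.
    """
    counts = [0] * BINS_PER_DAY
    for events in days:
        occupied = {int(t) // BIN_SECONDS for t in events}  # count a bin once per day
        for b in occupied:
            counts[b] += 1
    return [c / len(days) for c in counts]


def simulate_day(probs, rng):
    """Return the bins that fire when each bin is drawn independently."""
    return [b for b, p in enumerate(probs) if rng.random() < p]


# Hypothetical event times for two recorded days (seconds since midnight).
days = [
    [7 * 3600 + 120, 12 * 3600 + 30, 18 * 3600 + 900],
    [7 * 3600 + 200, 12 * 3600 + 400, 19 * 3600],
]
probs = bin_probabilities(days)
simulated = simulate_day(probs, random.Random(0))
print(simulated)
```

With a single recorded day every occupied bin has probability one, so the simulated day reproduces that day exactly, as noted above.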

Note that I don't want to predict the future. I want to emulate a future day, which statistically seems probable, not abnormal.

The behavior is not entirely random. It is for example: 'what time I went to the kitchen that day'. It happens several times a day, and mostly happens around the same times.


1) Is my approach meaningful? If not, what direction do I need to look in instead?

2) How can I correlate the days to provide an aggregate probability that an event will happen between (time - window) and time?

3) How can I measure how accurate my prediction is, as a function of number of days included in the aggregate probability?
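One possible way to approach question 3, sketched under stated assumptions: hold out some days, estimate the per-bin probabilities from the first n remaining days, and score the estimate against the held-out days with the Brier score (mean squared difference between predicted probability and the 0/1 outcome, lower is better). The choice of the Brier score, the function names, and the synthetic data below are assumptions for illustration, not something specified in the thread.

```python
import random


def bin_probabilities(days, n_bins):
    """days: list of sets of occupied bin indices. Returns per-bin event frequency."""
    counts = [0] * n_bins
    for occupied in days:
        for b in occupied:
            counts[b] += 1
    return [c / len(days) for c in counts]


def brier_score(probs, held_out):
    """Mean squared error between predicted probabilities and one day's 0/1 outcomes."""
    total = sum((p - (1.0 if b in held_out else 0.0)) ** 2 for b, p in enumerate(probs))
    return total / len(probs)


def learning_curve(days, n_bins, n_test):
    """Score on the last n_test days after training on the first n of the other days."""
    train_pool, test = days[:-n_test], days[-n_test:]
    curve = []
    for n in range(1, len(train_pool) + 1):
        probs = bin_probabilities(train_pool[:n], n_bins)
        score = sum(brier_score(probs, d) for d in test) / len(test)
        curve.append((n, score))
    return curve


# Synthetic days (an assumption): hourly bins, events likely around hours 7, 12 and 18.
rng = random.Random(1)
true_probs = [0.8 if b in (7, 12, 18) else 0.02 for b in range(24)]
days = [{b for b, p in enumerate(true_probs) if rng.random() < p} for _ in range(60)]

curve = learning_curve(days, n_bins=24, n_test=20)
for n, score in curve[::10]:
    print(n, round(score, 4))
```

Plotting the score against n shows how quickly the estimate stops improving for a given data set; the curve depends on the data, so no particular shape is guaranteed.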

Thank you
 
A hard part of applying probability to real life is to define the events precisely. For example, if you ask "What is the probability that I go to the kitchen between 12:00 and 12:05 PM", this doesn't define a precise event because it doesn't state the population of outcomes that is involved. You'd have to say something like "On a day selected at random from all the days of the year, what is the probability that I go to the kitchen between 12:00 and 12:05 PM" or "On a weekday selected at random from all the days in the year, what is the probability that I go to the kitchen between 12:00 and 12:05 PM".
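To make the point about populations concrete, here is a small sketch in which the same interval gets a different probability depending on whether the day is drawn from all days or only from weekdays. The visit log, the dates, and the helper name `interval_probability` are hypothetical and exist only for this example.

```python
from datetime import date, time


def interval_probability(visits_by_day, start, end, include_day=lambda d: True):
    """Fraction of days in the chosen population with a visit in [start, end).

    include_day selects the population of days (all days by default).
    """
    population = [d for d in visits_by_day if include_day(d)]
    hits = sum(
        1 for d in population if any(start <= t < end for t in visits_by_day[d])
    )
    return hits / len(population)  # assumes the population is not empty


# Hypothetical log of kitchen visits; 2024-01-01 is a Monday.
visits_by_day = {
    date(2024, 1, 1): [time(12, 2)],
    date(2024, 1, 2): [time(12, 2)],
    date(2024, 1, 3): [time(12, 2)],
    date(2024, 1, 4): [time(12, 2)],
    date(2024, 1, 5): [time(12, 30)],
    date(2024, 1, 6): [time(13, 0)],
    date(2024, 1, 7): [time(12, 1)],
}

start, end = time(12, 0), time(12, 5)
p_all = interval_probability(visits_by_day, start, end)
p_weekday = interval_probability(
    visits_by_day, start, end, include_day=lambda d: d.weekday() < 5
)
print(p_all, p_weekday)
```

Here the all-days population gives 5 out of 7 days and the weekday population gives 4 out of 5, so the question "what is the probability" has no single answer until the population is fixed.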

If you are interested in anything involving the relation between two events, you must define the population so the events can have that relation. For example, if you are interested in the wear on the rug between the kitchen and the living room, you can't investigate this by simulating the events of the day by drawing an event every 5 minutes at random from a population defined by "On a day selected at random, ...". In a real day, the event at 12:00-12:05 PM and the event at 12:05-12:10 PM aren't selected at random as if they were from completely different days.
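One simple way to keep some of the relation between consecutive intervals, offered only as a sketch: make each interval's probability depend on whether the previous interval had an event (a first-order Markov chain). This model, the function names, and the toy data are assumptions for illustration; real dependencies may reach further back than one interval.

```python
import random


def fit_markov(days, n_bins):
    """days: list of 0/1 lists of length n_bins.

    Returns (marginal, p_after), where p_after[prev][b] estimates
    P(event in bin b | bin b-1 had state prev).
    """
    n_days = len(days)
    marginal = [sum(d[b] for d in days) / n_days for b in range(n_bins)]
    # Fall back to the marginal when a condition was never observed.
    p_after = {0: list(marginal), 1: list(marginal)}
    for b in range(1, n_bins):
        for prev in (0, 1):
            matching = [d for d in days if d[b - 1] == prev]
            if matching:
                p_after[prev][b] = sum(d[b] for d in matching) / len(matching)
    return marginal, p_after


def simulate_markov(marginal, p_after, rng):
    """Draw a day bin by bin, conditioning each bin on the previous one."""
    day = [1 if rng.random() < marginal[0] else 0]
    for b in range(1, len(marginal)):
        p = p_after[day[-1]][b]
        day.append(1 if rng.random() < p else 0)
    return day


# Toy data: the event happens in bin 1 or in bin 2, never in both.
days = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
marginal, p_after = fit_markov(days, n_bins=4)
print(simulate_markov(marginal, p_after, random.Random(0)))
```

In this toy data every simulated day contains exactly one event, as every recorded day does. Drawing bins 1 and 2 independently with probability 0.5 each would instead produce days with zero or two events about half the time.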

To get the best advice, you should explain what bottom line results you want to investigate with a simulation.
 
Thanks that's enough
 
Sure. Your approach is reasonable. If you have several days of statistics, you can calculate the standard deviation of the number of events that happen in a fixed time period. That will give you an idea of how much variation there is from day to day. One thing to think about is the non-random aspects of the events (e.g., does eating a late breakfast tend to imply eating a late lunch? Does the arrival of the morning paper rule out the arrival of an evening paper?). If you choose to ignore those relationships and assume that every event is independent of the others, it may make your simulated day unrealistic (eating 6 meals in a day?). But mimicking those relationships can quickly get out of hand. You will have to be judicious in what you assume. There are entire computer languages and systems that people use to simulate complicated things.
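The standard deviation mentioned above could be computed along these lines. This is a minimal sketch; the window (11:00 to 14:00), the helper names, and the three sample days are hypothetical.

```python
from statistics import mean, stdev


def count_in_window(events, start, end):
    """Number of events with start <= time < end (times in seconds since midnight)."""
    return sum(1 for t in events if start <= t < end)


def window_count_stats(days, start, end):
    """Mean and sample standard deviation of the per-day event count in the window."""
    counts = [count_in_window(events, start, end) for events in days]
    return mean(counts), stdev(counts)  # stdev needs at least two days


# Hypothetical event times for three days.
days = [
    [7 * 3600, 12 * 3600],
    [7 * 3600, 11 * 3600 + 600, 13 * 3600],
    [11 * 3600, 12 * 3600, 13 * 3600 + 1800, 19 * 3600],
]
avg, sd = window_count_stats(days, start=11 * 3600, end=14 * 3600)
print(avg, sd)  # counts are 1, 2, 3, so this prints 2 1.0
```

A large standard deviation relative to the mean suggests that many more days are needed before the aggregate looks like a typical day.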
 