I am building a data set composed of events that occur at discrete times throughout a day, recorded with some tolerance, e.g. to within 5 seconds. I want to estimate the probability that an event will occur at a given time on a future day, using the probability accumulated from past days. I then want to compare that probability to a random number in order to simulate a future day.

So, if I use one day, my prediction will be identical to that day, since the probability of each event is one (they all occurred). If I include two days, I can superimpose them to get a new aggregate probability to compare my random numbers against. I think the more days I use, the more accurate a picture of typical behavior I will have, and the more closely my simulated day will resemble a real future day.
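To make the idea concrete, here is a minimal sketch of what I have in mind. It assumes event times are stored as seconds since midnight and that the day is cut into fixed 5-second bins; the function names and the two sample days are made up for illustration, not real data.

```python
import random

BIN_SECONDS = 5                               # tolerance window (assumed)
BINS_PER_DAY = 24 * 60 * 60 // BIN_SECONDS    # 17280 bins per day


def day_to_bins(event_times):
    """Return the set of bin indices that contain at least one event."""
    return {int(t) // BIN_SECONDS for t in event_times}


def aggregate_probability(days):
    """For each bin, the fraction of past days on which an event fell in it."""
    counts = [0] * BINS_PER_DAY
    for event_times in days:
        for b in day_to_bins(event_times):
            counts[b] += 1
    return [c / len(days) for c in counts]


def simulate_day(probabilities, rng):
    """Draw one random number per bin; an event occurs if draw < probability."""
    return [b * BIN_SECONDS
            for b, p in enumerate(probabilities)
            if rng.random() < p]


# Hypothetical kitchen visits, in seconds since midnight.
day1 = [8 * 3600 + 2, 12 * 3600 + 31, 19 * 3600]
day2 = [8 * 3600 + 4, 12 * 3600 + 40, 21 * 3600 + 15]

probs = aggregate_probability([day1, day2])
simulated = simulate_day(probs, random.Random(0))
```

With a single day every occupied bin has probability one, so the simulated day reproduces that day exactly (up to the bin width). With two days, a bin hit on both days keeps probability one, while a bin hit on only one day drops to 0.5.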

Note that I don't want to predict the future exactly. I want to emulate a future day that looks statistically plausible rather than abnormal.

The behavior is not entirely random. An example is 'what time I went to the kitchen that day': it happens several times a day, mostly around the same times.

1) Is my approach meaningful? If not, what direction do I need to look in instead?

2) How can I combine the days to obtain an aggregate probability that an event will happen between (time - window) and time?

3) How can I measure how accurate my prediction is, as a function of number of days included in the aggregate probability?

Thank you

# Want to (roughly) predict future behavior of a system

