When will ball 25 drop? Predicting future observations using data

  • Context: High School 
  • Thread starter: DaveC426913
  • Tags: Data

Discussion Overview

The discussion revolves around predicting the timing of future observations based on a dataset of past events, specifically focusing on the dropping of billiard balls and later transitioning to the observation of license plates. Participants explore various methods for prediction, including averaging time intervals, linear fitting, and the implications of missing data.

Discussion Character

  • Exploratory
  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant suggests using the average time between observed events to predict future drops, while others question the accuracy of ignoring intermediate observations.
  • Another participant proposes that a linear fit might provide a better estimate for future observations, emphasizing the importance of considering the variability in data over time.
  • There is a discussion about the potential for using a cost function to determine the most accurate prediction method, with references to different statistical approaches such as mean and median.
  • Concerns are raised about the impact of missing observations on the prediction accuracy, with some participants noting that the first and last observed events can disproportionately affect the results.
  • Participants discuss the need for understanding the underlying rules governing the observations, such as whether the timing of events follows a specific distribution.
  • One participant expresses a desire for a simple algorithmic approach rather than complex formulas, indicating a preference for practical implementation over theoretical understanding.
  • There is mention of the challenges in estimating the time from the introduction of a license plate to its observation, highlighting the influence of personal observation habits on data collection.
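The two prediction methods contrasted above can be sketched side by side. This is only an illustration under stated assumptions: the function names and the `(index, time)` data layout are hypothetical, not taken from the thread. The first function uses only the first and last observations, which is why those two points can dominate its result; the second is an ordinary least-squares line through all observed points.

```python
def predict_by_mean_interval(observations, target):
    """Extrapolate using the average time per index step.

    observations: list of (index, time) pairs sorted by index (assumed layout).
    Only the first and last points enter the estimate.
    """
    (first_idx, first_time) = observations[0]
    (last_idx, last_time) = observations[-1]
    rate = (last_time - first_time) / (last_idx - first_idx)
    return last_time + (target - last_idx) * rate


def predict_by_linear_fit(observations, target):
    """Extrapolate using a least-squares line through all observed points."""
    n = len(observations)
    mean_idx = sum(i for i, _ in observations) / n
    mean_time = sum(t for _, t in observations) / n
    sxx = sum((i - mean_idx) ** 2 for i, _ in observations)
    sxy = sum((i - mean_idx) * (t - mean_time) for i, t in observations)
    slope = sxy / sxx
    intercept = mean_time - slope * mean_idx
    return intercept + slope * target


# Made-up example data: item 3 was never observed, so its index is skipped.
observations = [(1, 10.0), (2, 20.0), (4, 40.0), (5, 50.0)]
print(predict_by_mean_interval(observations, 25))
print(predict_by_linear_fit(observations, 25))
```

Both functions take the true item index as the x-value, so a missing observation leaves a gap in x instead of shifting later points.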

Areas of Agreement / Disagreement

Participants express a range of views on the best methods for prediction, with no consensus on a single approach. Some advocate for linear fitting while others prefer simpler averaging methods. The discussion remains unresolved regarding the optimal strategy for handling missing data and the implications of varying observation conditions.

Contextual Notes

Participants note the limitations of their approaches, including the potential biases introduced by missing observations and the varying rates of events over time. There are also discussions about the assumptions underlying their predictive models, which remain unverified.

Who May Find This Useful

This discussion may be of interest to those involved in data analysis, predictive modeling, or anyone seeking to understand the complexities of forecasting based on incomplete datasets.

  • #31
mfb said:
Sure. Your dataset has "plate 54: date, plate 55: date, plate 57: date, plate 58: date, ...
Right. Which is why you originally introduced a distinct index.

I was treating the array index as the x-axis.
i.e. my values would end up being:
item[23]: CDJ
item[24]: CDK
item[25]: CDM
So CDL is missing, but the array has no gap where it should be, which pushes every later item out of line.

For you:
item[23]: {index:23, plate: CDJ}
item[24]: {index:24, plate: CDK}
item[25]: {index:26, plate: CDM}
So the trend is preserved.

I see now.
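A small sketch of the difference, assuming made-up sighting days (only the plates and index values come from the post; the `day` field and the `fit_slope` helper are hypothetical). Fitting against array position hides the missing CDL and distorts the slope, while fitting against the stored index keeps the gap.

```python
def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# The "day" values are invented for illustration: one new plate every 10 days.
records = [
    {"index": 23, "plate": "CDJ", "day": 230.0},
    {"index": 24, "plate": "CDK", "day": 240.0},
    {"index": 26, "plate": "CDM", "day": 260.0},  # CDL (index 25) never seen
]

days = [r["day"] for r in records]
positional_x = [23 + pos for pos in range(len(records))]  # 23, 24, 25
explicit_x = [r["index"] for r in records]                # 23, 24, 26

print(fit_slope(positional_x, days))  # slope distorted by the hidden gap
print(fit_slope(explicit_x, days))    # slope reflects the true spacing
```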
 
  • #32
I think what you may be looking for is a Kalman filter.
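For reference, a minimal sketch of what that could look like here, not a definitive implementation. It assumes a constant-rate model with two state variables (the time at the current index and the time per index step). The function name, the `(index, time)` layout, and the noise values `meas_var` and `process_var` are all assumptions chosen for illustration.

```python
def kalman_predict_time(observations, target, meas_var=1.0, process_var=0.0):
    """Constant-rate Kalman filter over (index, time) pairs sorted by index.

    State: time at the current index, and rate (time per index step).
    meas_var and process_var are assumed noise levels, not fitted values.
    """
    idx, time = observations[0]
    rate = 0.0
    # Covariance entries: var(time), cov(time, rate), var(rate).
    # The large initial var(rate) says the rate is unknown at the start.
    p_tt, p_tr, p_rr = meas_var, 0.0, 1e6

    for next_idx, measured in observations[1:]:
        k = next_idx - idx  # k > 1 when intermediate items were never observed

        # Predict: advance k index steps at the current rate.
        time += k * rate
        p_tt = p_tt + 2 * k * p_tr + k * k * p_rr
        p_tr = p_tr + k * p_rr
        p_rr = p_rr + process_var

        # Update: blend the prediction with the measured time.
        innovation_var = p_tt + meas_var
        gain_time = p_tt / innovation_var
        gain_rate = p_tr / innovation_var
        residual = measured - time
        time += gain_time * residual
        rate += gain_rate * residual
        p_rr -= gain_rate * p_tr      # uses p_tr from before this update
        p_tr *= (1 - gain_time)
        p_tt *= (1 - gain_time)

        idx = next_idx

    # Extrapolate from the last observed index to the target index.
    return time + (target - idx) * rate


# Made-up data: item 3 missing, roughly 10 time units per item.
observations = [(1, 10.0), (2, 20.0), (4, 40.0), (5, 50.0)]
print(kalman_predict_time(observations, 25))
```

With `process_var` at zero this behaves much like a recursive straight-line fit; a positive value lets the estimated rate drift over time, which may suit data whose rate changes.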
 
