Probability observed value not in range for prediction

Math33 · Apr 21, 2017

Homework Statement

Hello all, I created a predictive model from a data set of observed values and am looking for a way to express its accuracy as a probability. Data set A (observed) and data set B (model predictions) have a correlation of 84% using linear regression. Both A and B are normally distributed, and every predicted B value is mapped to the closest A value. Example: the model produces a score of 420 for a data point, and the closest A score to that is 410. Now suppose this data point can be observed in the future (call the observed value F). What is the probability that F is between 410 and 420?

Homework Equations

P(410<F<420). A and B are two separate normal distributions with two different means and standard deviations.
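
As a reference point, and assuming F itself follows a single normal distribution with mean ##\mu## and standard deviation ##\sigma## (an assumption; which distribution F follows is not stated here), the interval probability is a difference of two values of the same CDF:
$$P(410<F<420)=\Phi\left(\frac{420-\mu}{\sigma}\right)-\Phi\left(\frac{410-\mu}{\sigma}\right)$$
where ##\Phi## is the standard normal CDF.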

The Attempt at a Solution

I found the probability for 410 in the first normal distribution (let's say P(A) = 0.56), then I found the probability for 420 in the second normal distribution (let's say P(B) = 0.67). I then subtracted P(B) - P(A) to get 0.11. Then I subtracted 1 - 0.11 to get 0.89. So the probability that F is going to be in the range of 410 to 420 is 89%. I am not sure if I'm doing this right. Thanks in advance.

andrewkirk · Apr 21, 2017

To comment usefully on this we'd need more information. A predictive model usually takes the form of an equation with an error term, like
$$X_{B,i}=X_{A,i}+\varepsilon_i$$
where ##X_{A,i}## and ##X_{B,i}## are the ##i##th ##A## and ##B## values respectively and ##\varepsilon_i## is a random variable called the 'error term', usually independent between different values of ##i##. ##\varepsilon_i## has a known distribution - usually, but not always, normal - which is usually the same for all ##i##.
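
To illustrate why the error term matters, here is a minimal Python sketch of how an interval probability could be computed under a model of this form. It assumes the error is normal with mean zero and a known standard deviation, so that the future observed value is distributed as Normal(predicted, error_sd). The function name `interval_probability` and the value `error_sd=15.0` are hypothetical and chosen only for illustration; they do not come from the original poster's data.

```python
from statistics import NormalDist


def interval_probability(predicted, lower, upper, error_sd):
    """Return P(lower < F < upper) for F ~ Normal(predicted, error_sd).

    Assumes a zero-mean, normally distributed error term, so the
    observed value F is centred on the model's prediction.
    """
    dist = NormalDist(mu=predicted, sigma=error_sd)
    # Both CDF values come from the same distribution of F.
    return dist.cdf(upper) - dist.cdf(lower)


# Hypothetical numbers: prediction 420, error standard deviation 15.
p = interval_probability(predicted=420, lower=410, upper=420, error_sd=15.0)
print(f"P(410 < F < 420) = {p:.3f}")
```

Note that the answer depends entirely on the spread of the error term, which is why the model equation and its error distribution are needed.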

What is the equation for your model?

Math33 · Apr 21, 2017

Hi Andrew, thank you for the response. I just got back from work, so I don't have the equation in front of me, but it is a non-linear model: a 5th-order polynomial. I fitted the model using an Excel trendline and actually found a correlation of 92%. Yes, I have already plotted the error term (residuals) for each predicted value; with N = 40 data points the residuals show a good random pattern with virtually no correlation.
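
A rough sketch of how the spread of those residuals could be estimated in Python, assuming paired lists of observed and predicted values are available. The function name `residual_std` and the sample numbers are hypothetical, not taken from the actual data. The divisor is the number of points minus the number of fitted coefficients, which would be 6 for a 5th-order polynomial, leaving 34 degrees of freedom with N = 40.

```python
import math


def residual_std(observed, predicted, n_params):
    """Estimate the residual standard deviation of a fitted model.

    n_params is the number of fitted coefficients (6 for a
    5th-order polynomial, including the constant term).
    """
    residuals = [o - p for o, p in zip(observed, predicted)]
    dof = len(residuals) - n_params  # degrees of freedom left after fitting
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    return math.sqrt(sum(r * r for r in residuals) / dof)


# Made-up values for illustration only (a 2-parameter fit on 4 points).
observed = [10, 12, 15, 19]
predicted = [11, 11, 16, 18]
sd = residual_std(observed, predicted, n_params=2)
print(f"residual standard deviation = {sd:.4f}")
```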

My model attempts to find the relationship between the adoption of a specific hazardous substance law that exists around the globe and the GDP of any given country. As background, I created a metric that assigns a Risk score based on how much of this hazardous substance law a country adopts. I applied it to many countries that already have this law and generated a Risk score for each of them. My 92% correlation is between the GDP and the Risk score of the countries that already have the law. What I want to do is predict the Risk score for a country that might adopt the law in the future. So let's say a country F that doesn't have this law gets a Risk score of 284 from the model, and the closest score among countries that already have the law is 270. Both the predicted and the actual data sets are normally distributed. So I am trying to find out, once country F adopts the law, what the probability is that its final Risk score will be between 270 and 284. Thank you for your time.
