Simple conditional probability question

bradyj7 · Jun 22, 2012

Hello,

I'm trying to work out a conditional probability.

I have hundreds of measurements of two variables: (1) start time and (2) journey time.

I've created a frequency table.

https://dl.dropbox.com/u/54057365/All/forum.JPG

How can I work out the Journey time given a start time?

P(JT | ST) = P(JT ∩ ST) / P(ST)

How would you work out these?

For example given 8am what would be the probable journey time?

Thanks for your help

John

viraltux · Jun 22, 2012

bradyj7 said:

How would you work out these?

For example given 8am what would be the probable journey time?

Thanks for your help

John

Hi John,

Simply gather all the data you have for 8am and check its distribution; you can then calculate a confidence interval for the expected value from that data. That's it.

bradyj7 · Jun 22, 2012

Hello,

Thanks very much for your reply and your suggestion.

I'm trying to understand how to work it out manually. Is this possible using the table?

The frequency table shows the number of occurrences for each variable pair in my data set. For example there were 5 journeys that started at 8am and lasted 5 minutes.

Is there a way to manually calculate the probable journey time given the start time?

For example 9am

P(JT | ST) = P(JT ∩ ST) / P(ST)

P( JT | 9am) = ?

Thanks for your help

bradyj7 · Jun 22, 2012

Hi Viraltux,

Just wondering if you had any further thoughts on this?

Regards

John

viraltux · Jun 22, 2012

bradyj7 said:

Hello,

Thanks very much for your reply and your suggestion.

I'm trying to understand how to work it out manually. Is this possible using the table?

The frequency table shows the number of occurrences for each variable pair in my data set. For example there were 5 journeys that started at 8am and lasted 5 minutes.

Is there a way to manually calculate the probable journey time given the start time?

For example 9am

P(JT | ST) = P(JT ∩ ST) / P(ST)

P( JT | 9am) = ?

Thanks for your help

I think you want the expected value, not the probability.

To calculate the expected value of the journey at 8am using the table simply do this calculation:

(1*2 + 2*7 + 3*6 + 4*5 + 5*5 + ... + 10*6 ) / ( 2 + 7 + 6 + 5 + 5 + ... + 6)

bradyj7 · Jun 22, 2012

Hello viraltux,

Yes it is the expected value that I was looking to calculate, thank you.

I sort of understand the calculation.

Can I expand it a little?

Say I wanted to calculate the expected journey time given Journey Start Time and Distance travelled.

P(JT | ST, D)

I've expanded the frequency table with actual measurements.

For example, there were 25 journeys that started at 8am, were 3 miles long, and had a journey time of 5 minutes.

https://dl.dropbox.com/u/54057365/All/forum.1JPG.JPG

So to calculate the expected journey time given that the start time is 8am and the distance is 3 miles would the calculation be:

(3*3 +3*25+...+3*2) / (3+25+33+7+2) = 3 minutes

Really appreciate the help

Thanks

John

viraltux · Jun 22, 2012

bradyj7 said:

For example, there were 25 journeys that started at 8am, were 3 miles long, and had a journey time of 5 minutes.

So to calculate the expected journey time given that the start time is 8am and the distance is 3 miles would the calculation be:

(3*3 +3*25+...+3*2) / (3+25+33+7+2) = 3 minutes

Really appreciate the help

Thanks

John

Hi John,

That is not right: your journey times are in 5-minute bins, so you should do

(5*3 + 10*25 + 15*33 + 20*7 + 25*2 ) / ( 3 + 25 + 33 + 7 + 2)

In general, the formula for the expected journey time is

(time1 * number1 + time2 * number2 + ... ) / (number1 + number2 + ... )
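
A minimal Python sketch of the general formula above, applied to the counts quoted for journeys starting at 8am and covering 3 miles. The function name `expected_time` is an illustrative choice, and the bin labels 5, 10, ..., 25 are taken exactly as written in the calculation above.

```python
def expected_time(times, counts):
    """Frequency-weighted mean: sum(time_i * count_i) / sum(count_i)."""
    if len(times) != len(counts):
        raise ValueError("times and counts must have the same length")
    total = sum(counts)
    if total == 0:
        raise ValueError("no observations")
    return sum(t * n for t, n in zip(times, counts)) / total


# Counts for journeys starting at 8am and covering 3 miles (from the post above).
times = [5, 10, 15, 20, 25]   # minutes, one label per 5-minute bin
counts = [3, 25, 33, 7, 2]    # number of journeys observed in each bin

print(round(expected_time(times, counts), 2))  # 13.57
```

The same function works for any start time or distance: pass in the row of counts that matches the condition you care about.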

bradyj7 · Jun 22, 2012

viraltux said:

(5*3 + 10*25 + 15*33 + 20*7 + 25*2 ) / ( 3 + 25 + 33 + 7 + 2)

In general the formula for the time expected value will be

(time1 * number1 + time2 * number2 + ... ) / (number1 + number2 + ... )

Hi viraltux,

Would it not be?

(0*3 + 5*25 + 10*33 + 15*7 + 20*2 ) / ( 3 + 25 + 33 + 7 + 2)

The journey times are rounded, so the zero refers to journeys that were less than 5 minutes.

Thanks

John

viraltux · Jun 22, 2012

bradyj7 said:

Hi viraltux,

Would it not be?

(0*3 + 5*25 + 10*33 + 15*7 + 20*2 ) / ( 3 + 25 + 33 + 7 + 2)

The journey times are rounded, so the zero refers to journeys that were less than 5 minutes.

Thanks

John

Well, then you'd better use the midpoint of each 5-minute bin to minimize the error:

(5/2*3 + (5 + 5/2)*25 + (10+5/2)*33 + (15+5/2)*7 + (20+5/2)*2 ) / ( 3 + 25 + 33 + 7 + 2)
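
To see how much the choice of representative time matters, here is a small Python sketch comparing the lower edge, the midpoint, and the upper edge of each bin. It assumes, as John describes, that each label is the lower edge of a 5-minute bin (so "0" means less than 5 minutes); the helper name `weighted_mean` is just for illustration.

```python
def weighted_mean(values, weights):
    """Mean of `values`, each weighted by the matching entry of `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


counts = [3, 25, 33, 7, 2]          # journeys per bin (8am, 3 miles)
bin_width = 5                       # minutes
lower_edges = [0, 5, 10, 15, 20]    # assumed: times are rounded down

lower = weighted_mean(lower_edges, counts)
mid = weighted_mean([e + bin_width / 2 for e in lower_edges], counts)
upper = weighted_mean([e + bin_width for e in lower_edges], counts)

print(round(lower, 2), round(mid, 2), round(upper, 2))  # 8.57 11.07 13.57
```

The midpoint estimate sits exactly half a bin width above the lower-edge estimate, so the two earlier calculations bracket it.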

bradyj7 · Jun 23, 2012

Thanks for the help and advice, I appreciate your time.

Regards

John

bradyj7 · Jun 23, 2012

Hi Viraltux,

Would you mind if I asked you one more question?

I'm just trying to understand the theory.

Is this a conditional expectation problem, as described here? http://en.wikipedia.org/wiki/Conditional_expectation

Thanks

John

viraltux · Jun 23, 2012

bradyj7 said:

Hi Viraltux,

Would you mind if I asked you one more question?

I'm just trying to understand the theory.

Is this a conditional expectation problem, as described here? http://en.wikipedia.org/wiki/Conditional_expectation

Thanks

John

You're exactly right, John.

bradyj7 · Jun 26, 2012

Hi Viraltux,

Can I ask what the practical difference is between a conditional expectation question and a conditional probability question?

Could you alternatively work this out by dividing each cell by the total cell count and then summing up certain cells? I am not entirely sure of the theory. Would that be possible, or is that just wrong?

Thanks for the help

John

haruspex · Jun 26, 2012

A couple of thoughts...
By only using the data for the specific distance you want the expected time for, you're not getting the most out of the data. You could estimate a speed for each time of day, e.g.
Presumably the distances are also rounded somehow. If they are not rounded to the nearest value, you need to make some adjustment there.

bradyj7 · Jun 27, 2012

Hi Viraltux and Haruspex,

Just wondering if you had a chance to read my question above?

Is this a conditional expectation problem or a conditional probability problem? What is the difference?

I asked this question in another thread: https://www.physicsforums.com/showthread.php?t=580126&page=2

The answer there was that it is a conditional probability problem, and the respondent describes a different way to calculate it.

I'm now confused about which method is correct, and about the theory behind them.

Maybe both methods are correct?

Thanks

John

chiro · Jun 27, 2012

bradyj7 said:

Hi Viraltux and Haruspex,

Just wondering if you had a chance to read my question above?

Is this a conditional expectation problem or a conditional probability problem? What is the difference?

I asked this question in another thread: https://www.physicsforums.com/showthread.php?t=580126&page=2

The answer there was that it is a conditional probability problem, and the respondent describes a different way to calculate it.

I'm now confused about which method is correct, and about the theory behind them.

Maybe both methods are correct?

Thanks

John

This is an expectation problem since you are trying to find an average or a mean.

Also in the other thread in case you're wondering, to find the expectation I calculated the probability distribution first and then the mean (or average) using the probability. You don't have to do it this way and you can do it as viraltux has mentioned, but they are the same thing.

The only difference is that my method converted the frequency data to proper probabilities first whereas viraltux goes straight from frequency data to the mean (or average) in one step.
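
The equivalence of the two routes can be checked with exact arithmetic. This is only a sketch: it reuses the 8am, 3-mile counts from earlier in the thread and assumes bin midpoints (2.5, 7.5, ...) as the representative times; the variable names are illustrative.

```python
from fractions import Fraction

counts = [3, 25, 33, 7, 2]                                     # journeys per bin
times = [Fraction(5, 2) + 5 * i for i in range(len(counts))]   # 2.5, 7.5, ... minutes
total = sum(counts)

# Route 1: convert frequencies to conditional probabilities, then average.
probs = [Fraction(n, total) for n in counts]
mean_via_probs = sum(t * p for t, p in zip(times, probs))

# Route 2: go straight from frequencies to the weighted mean.
mean_direct = sum(t * n for t, n in zip(times, counts)) / total

print(sum(probs))                     # 1
print(mean_via_probs, mean_direct)    # 155/14 155/14
print(float(mean_direct))             # about 11.07 minutes
```

Dividing every count by the same total before or after summing gives the same number, which is why the two methods agree.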

bradyj7 · Jun 27, 2012

Hello Chiro,

Thank you for your reply - it must be late in Australia.

I understand the two problems are related now. You worked out the probabilities first and applied them to frequency data to get the expected value.

Thanks for clearing that up for me.

John

viraltux · Jun 27, 2012

chiro said:

This is an expectation problem since you are trying to find an average or a mean.

Also in the other thread in case you're wondering, to find the expectation I calculated the probability distribution first and then the mean (or average) using the probability. You don't have to do it this way and you can do it as viraltux has mentioned, but they are the same thing.

The only difference is that my method converted the frequency data to proper probabilities first whereas viraltux goes straight from frequency data to the mean (or average) in one step.

That's exactly so; I went straight to the point to minimize the coding in John's spreadsheet. Nonetheless, John, the difficulties you are having are quite basic, so I would recommend that you follow an introductory course in statistics, where you can work through exercises and get a better understanding of the basics. I think that is a better strategy than learning random concepts here and there... If you do, you'll see how everything falls into place.

haruspex · Jun 27, 2012

bradyj7 said:

Is this a conditional expectation problem or a conditional probability problem? What is the difference?

Just to add a little to chiro's reply:
If you determine the conditional probabilities then you have everything, and conditional expectation is just one of many numbers you can derive from that. But in many problems, such as this, you only care about the expectation, and it can be much easier to go straight to that and forget about the detailed probabilities. In short, it's a shortcut.
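
As a rough illustration of "if you have the conditional probabilities then you have everything", the sketch below builds the conditional distribution from the 8am, 3-mile counts quoted earlier and reads several numbers off it. Bin midpoints are assumed as the representative times, and the 15-minute threshold is an arbitrary example.

```python
# Journeys per bin for start time 8am and distance 3 miles,
# keyed by the assumed bin midpoint in minutes.
counts = {2.5: 3, 7.5: 25, 12.5: 33, 17.5: 7, 22.5: 2}
total = sum(counts.values())

# Conditional distribution P(JT = t | ST = 8am, D = 3 miles).
cond_prob = {t: n / total for t, n in counts.items()}

# The conditional expectation is one summary of that distribution...
mean = sum(t * p for t, p in cond_prob.items())

# ...but others are available too, e.g. the most likely bin
# and the chance that the journey falls in a bin of 15 minutes or more.
mode = max(cond_prob, key=cond_prob.get)
p_15_or_more = sum(p for t, p in cond_prob.items() if t >= 15)

print(round(mean, 2), mode, round(p_15_or_more, 3))  # 11.07 12.5 0.129
```

If only the mean is needed, the shortcut of weighting the times by the raw counts gives the same value without building `cond_prob` at all.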

bradyj7 · Jul 24, 2012

Hi Guys,

Could I ask another question with regards to probability theory?

I have recorded some journey distances for cars. Here is an example. The total distance for the day is in the first column and the individual distances are in the adjacent columns. This table contains data for a total travel distance of 12 miles.

https://dl.dropbox.com/u/54057365/All/table.JPG

I have a program that simulates a total daily travel distance and the number of journeys.

For example it could simulate a total travel distance of 12 miles and 3 journeys.

I am trying to determine the individual distances and the order of the individual journeys from my observed data.

For example it could be 3 miles + 3 miles + 6 miles = 12 miles

Could you advise me how you would do this? I believe it belongs to a branch of probability called the stick-breaking construction.

I believe I would begin by determining x1 from f(x1), then x2 from f(x2|x1), and then x3 = D-x1-x2

I would be grateful if perhaps you could demonstrate a quick example from the above table?

Kind Regards

John

haruspex · Jul 24, 2012

bradyj7 said:

I have recorded some journey distances for cars. Here is an example. The total distance for the day is in the first column and the individual distances are in the adjacent columns. This table contains data for a total travel distance of 12 miles.

https://dl.dropbox.com/u/54057365/All/table.JPG

I have a program that simulates a total daily travel distance and the number of journeys.
For example it could simulate a total travel distance of 12 miles and 3 journeys.

Not sure what you're trying to do. Are you treating the program as a black box, and trying to discover the distributions of what it generates? If so, what arranged that the totals were all 12? Is that an input to the program, or is the table the set of outputs that happened to have a total of 12?

I am trying to determine the individual distances and the order of the individual journeys from my observed data.

You mean, probabilities of distances, right? Order of journeys? If they're not x1-x2-x3... then what info do you have? Or do you mean their ranking by length?

I believe I would begin by determining x1 from f(x1), then x2 from f(x2|x1), and then x3 = D-x1-x2

What do you mean by "determining x1"? x1 is a random variable, and I'm guessing f(x1) is the observed distribution.
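
One possible reading of the scheme John describes (draw x1 from f(x1), then x2 from f(x2|x1), then set x3 = D - x1 - x2) is sketched below in Python, treating f as the empirical distribution of the observed rows. The table is only available as an image, so the `observed` rows here are invented purely for illustration and are not John's data; `sample_day` is likewise a hypothetical helper name.

```python
import random

# HYPOTHETICAL observations (not the real table): each tuple is one day
# with three journeys whose distances add up to 12 miles.
observed = [(3, 3, 6), (3, 3, 6), (4, 4, 4), (2, 5, 5), (6, 3, 3), (3, 6, 3)]


def sample_day(observed, total, rng):
    """Draw (x1, x2, x3) sequentially from the empirical distributions."""
    # x1 ~ f(x1): pick a first-journey distance in proportion to its frequency.
    x1 = rng.choice([day[0] for day in observed])
    # x2 ~ f(x2 | x1): restrict to days whose first journey equals x1.
    x2 = rng.choice([day[1] for day in observed if day[0] == x1])
    # x3 is fixed by the total once x1 and x2 are known.
    x3 = total - x1 - x2
    return x1, x2, x3


rng = random.Random(0)  # seeded only so the run is repeatable
day = sample_day(observed, total=12, rng=rng)
print(day, sum(day))
```

With exactly three journeys and a fixed total, this is equivalent to picking one observed row at random; the sequential form only starts to differ if the conditional distributions are smoothed or pooled across totals.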
