## Simple conditional probability question

Hello,

I'm trying to work out a conditional probability.

I have hundreds of measurements of two variables: (1) start time and (2) journey time.

I've created a frequency table.

How can I work out the Journey time given a start time?

P(JT | ST) = P(JT ∩ ST) / P(ST)

How would you work out these?

For example given 8am what would be the probable journey time?

John


 Quote by bradyj7 How would you work out these? For example given 8am what would be the probable journey time? Thanks for your help John
Hi John,

Simply gather all the data you have for 8am and check its distribution; then you can calculate a confidence interval for the expected value from that data. That's it.

Hello,

Thanks very much for your reply and your suggestion. I'm trying to understand how to work it out manually. Is this possible using the table?

The frequency table shows the number of occurrences for each variable pair in my data set. For example, there were 5 journeys that started at 8am and lasted 5 minutes.

Is there a way to manually calculate the probable journey time given the start time? For example, for 9am:

P(JT | ST) = P(JT ∩ ST) / P(ST)

P(JT | 9am) = ?

Thanks for your help
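As a rough illustration of the manual calculation asked about here: for a frequency table, the joint probability is a cell count divided by the grand total, and the start-time probability is a row total divided by the grand total, so the grand total cancels and the conditional probability is simply cell count over row total. The sketch below assumes a made-up table (every count in `freq` is hypothetical, not the poster's data), and `conditional_distribution` is an illustrative name.

```python
# Hypothetical joint frequency table (NOT the poster's data):
# freq[start_time][journey_minutes] = number of journeys observed.
freq = {
    "8am": {5: 5, 10: 12, 15: 3},
    "9am": {5: 2, 10: 6, 15: 2},
}

def conditional_distribution(freq, start_time):
    """Return P(JT | ST = start_time) as {journey_minutes: probability}.

    P(JT and ST) / P(ST) = (cell / N) / (row_total / N) = cell / row_total,
    so only the counts in the chosen row are needed.
    """
    row = freq[start_time]
    row_total = sum(row.values())
    if row_total == 0:
        raise ValueError("no journeys recorded for this start time")
    return {jt: n / row_total for jt, n in row.items()}

dist_9am = conditional_distribution(freq, "9am")
print(dist_9am)  # {5: 0.2, 10: 0.6, 15: 0.2}
```

The result is a whole distribution over journey times rather than a single number; a single "typical" journey time comes from summarising that distribution, for example by its mean.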


Hi Viraltux,

Just wondering if you had any further thoughts on this?

Regards

John

 Quote by bradyj7 Hello, Thanks very much for your reply and your suggestion. I'm trying to understand how to work it out manually. Is this possible using the table? The frequency table shows the number of occurrences for each variable pair in my data set. For example, there were 5 journeys that started at 8am and lasted 5 minutes. Is there a way to manually calculate the probable journey time given the start time? For example, for 9am: P(JT | ST) = P(JT ∩ ST) / P(ST), P(JT | 9am) = ? Thanks for your help
I think you want the expected value, not the probability.

To calculate the expected value of the journey at 8am using the table simply do this calculation:

(1*2 + 2*7 + 3*6 + 4*5 + 5*5 + ..... + 10*6 ) / ( 2 + 7 + 6 + 5 + 5 + ... + 6)

Hello viraltux,

Yes, it is the expected value that I was looking to calculate, thank you. I sort of understand the calculation. Can I expand it a little?

Say I wanted to calculate the expected journey time given journey start time and distance travelled:

P(JT | ST, D)

I've expanded the frequency table with actual measurements. For example, there were 25 journeys that started at 8am, were 3 miles long, and had a journey time of 5 minutes.

So to calculate the expected journey time given that the start time is 8am and the distance is 3 miles, would the calculation be:

(3*3 + 3*25 + ... + 3*2) / (3 + 25 + 33 + 7 + 2) = 3 minutes

Really appreciate the help.

Thanks

John

 Quote by bradyj7 For example,there were 25 journeys that started at 8am, were 3 miles long with a journey time of 5 minutes. So to calculate the expected journey time given that the start time is 8am and the distance is 3 miles would the calculation be: (3*3 +3*25+...+3*2) / (3+25+33+7+2) = 3 minutes Really appreciate the help Thanks John
Hi John,

That is not right. Each count should be multiplied by its journey time, not by the distance. You have 5-minute time gaps, so you should do:

(5*3 + 10*25 + 15*33 + 20*7 + 25*2 ) / ( 3 + 25 + 33 + 7 + 2)

In general, the formula for the expected time is:

(time1 * number1 + time2 * number2 + ..... ) / (number1 + number2 + .... )
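The general formula is a frequency-weighted mean. A minimal Python sketch, using the times and counts from the 8am, 3-mile example in this post (the function name `expected_time` is just illustrative):

```python
def expected_time(times, counts):
    """Frequency-weighted mean: sum(time_i * count_i) / sum(count_i)."""
    if len(times) != len(counts):
        raise ValueError("times and counts must have the same length")
    total = sum(counts)
    if total == 0:
        raise ValueError("no observations")
    return sum(t * n for t, n in zip(times, counts)) / total

# Journey times (minutes) and journey counts as used in the calculation above.
times = [5, 10, 15, 20, 25]
counts = [3, 25, 33, 7, 2]

result = expected_time(times, counts)
print(round(result, 2))  # 13.57
```

The numerator is 950 and the denominator is 70, so the expected journey time under this labelling of the time bins is about 13.6 minutes.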

 Quote by viraltux so you should do (5*3 + 10*25 + 15*33 + 20*7 + 25*2 ) / ( 3 + 25 + 33 + 7 + 2) In general the formula for the time expected value will be (time1 * number1 + time2 * number2 + ..... ) / (number1 + number2 + .... )
Hi viraltux,

Would it not be?

(0*3 + 5*25 + 10*33 + 15*7 + 20*2 ) / ( 3 + 25 + 33 + 7 + 2)

The journey times are rounded, so the zero refers to journeys that were less than 5 minutes.

Thanks

John

 Quote by bradyj7 Hi viraltux, Would it not be? (0*3 + 5*25 + 10*33 + 15*7 + 20*2 ) / ( 3 + 25 + 33 + 7 + 2) The journey times are rounded, so the zero refers to journeys that were less than 5 minutes. Thanks John
Well, then you'd better use the midpoint of each gap to minimize the error:

(5/2*3 + (5 + 5/2)*25 + (10+5/2)*33 + (15+5/2)*7 + (20+5/2)*2 ) / ( 3 + 25 + 33 + 7 + 2)
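A short Python sketch of the midpoint calculation, assuming (as John describes) that each recorded time is the lower edge of a 5-minute bin, so the "0" bin covers journeys under 5 minutes. The variable names are illustrative only.

```python
BIN_WIDTH = 5  # minutes, the 5-minute gaps described in the thread

# Lower edge of each journey-time bin and the number of journeys in it.
lower_edges = [0, 5, 10, 15, 20]
counts = [3, 25, 33, 7, 2]
total = sum(counts)

# Represent each bin by its midpoint instead of its lower edge.
midpoints = [lo + BIN_WIDTH / 2 for lo in lower_edges]

expected_lower = sum(lo * n for lo, n in zip(lower_edges, counts)) / total
expected_midpoint = sum(m * n for m, n in zip(midpoints, counts)) / total

print(round(expected_lower, 2))     # 8.57
print(round(expected_midpoint, 2))  # 11.07
```

Because every bin is shifted by the same half-width, the midpoint estimate is exactly 2.5 minutes higher than the lower-edge estimate.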

 Thanks for the help and advice, I appreciate your time. Regards John
 Hi Viraltux, Would you mind if I asked you one more question? I'm just trying to understand the theory. Is this type of problem a conditional expectation problem? as described here http://en.wikipedia.org/wiki/Conditional_expectation Thanks John

 Quote by bradyj7 Hi Viraltux, Would you mind if I asked you one more question? I'm just trying to understand the theory. Is this type of problem a conditional expectation problem? as described here http://en.wikipedia.org/wiki/Conditional_expectation Thanks John
You're exactly right, John.

 Hi Viraltux, Can I ask what the practical difference is between a conditional expectation question and a conditional probability question? Could you alternatively work this question out by dividing each cell by the total cell count and then summing up certain cells? I am not entirely sure of the theory. Would that be possible, or is that just wrong? Thanks for the help John
 A couple of thoughts... By only using the data for the specific distance you want the expected time for, you're not getting the most out of the data. You could, for example, estimate a speed for each time of day. Presumably the distances are also rounded somehow. If they are not rounded to nearest, you need to make some adjustment there.
 Hi Viraltux and Haruspex, Just wondering if you had a chance to read my question above? Is this a conditional expectation problem or a conditional probability problem? What is the difference? I asked this question in another thread http://www.physicsforums.com/showthr...=580126&page=2 and the answer was that it was a conditional probability problem, and the respondent describes a different way to calculate it. I'm now confused as to which method is correct, and about the theory behind them. Maybe both methods are correct? Thanks John

 Quote by bradyj7 Hi Viraltux and Haruspex, Just wondering if you had a chance to read my question above? Is this a conditional expectation problem or a conditional probability problem? What is the difference? I asked this question in another thread http://www.physicsforums.com/showthr...=580126&page=2 and the answer was that it was a conditional probability problem, and the respondent describes a different way to calculate it. I'm now confused as to which method is correct, and about the theory behind them. Maybe both methods are correct? Thanks John
This is an expectation problem since you are trying to find an average or a mean.

Also in the other thread in case you're wondering, to find the expectation I calculated the probability distribution first and then the mean (or average) using the probability. You don't have to do it this way and you can do it as viraltux has mentioned, but they are the same thing.

The only difference is that my method converted the frequency data to proper probabilities first whereas viraltux goes straight from frequency data to the mean (or average) in one step.
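To see that the two routes agree, here is a small Python check. It reuses the times and counts from viraltux's 8am, 3-mile calculation earlier in the thread; exact fractions are used only so the comparison is not affected by floating-point rounding.

```python
from fractions import Fraction

# Times (minutes) and counts from the 8am, 3-mile calculation in this thread.
times = [5, 10, 15, 20, 25]
counts = [3, 25, 33, 7, 2]
total = sum(counts)

# Route 1: convert frequencies to probabilities first, then take the mean
# E[T] = sum(t_i * p_i).
probs = [Fraction(n, total) for n in counts]
mean_via_probs = sum(t * p for t, p in zip(times, probs))

# Route 2: go straight from frequencies to the mean,
# sum(t_i * n_i) / sum(n_i).
mean_direct = Fraction(sum(t * n for t, n in zip(times, counts)), total)

print(mean_via_probs == mean_direct)   # True
print(round(float(mean_direct), 2))    # 13.57
```

The probabilities in route 1 are also the conditional probabilities of each journey time given the start time and distance, which is why the conditional probability and conditional expectation descriptions are two views of the same table.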

 Hello Chiro, Thank you for your reply - it must be late in Australia. I understand the two problems are related now. You worked out the probabilities first and applied them to frequency data to get the expected value. Thanks for clearing that up for me. John