Simple conditional probability question

In summary: Yes, this is definitely a conditional expectation problem.
  • #1
bradyj7
Hello,

I'm trying to work out a conditional probability.

I have hundreds of measurements of two variables (1) Start Time and (2) Journey time.

I've created a frequency table.

https://dl.dropbox.com/u/54057365/All/forum.JPG

How can I work out the Journey time given a start time?

P(JT | ST) = P(JT ∩ ST) / P(ST)

How would you work out these?

For example given 8am what would be the probable journey time?

Thanks for your help

John
 
  • #2
bradyj7 said:
How would you work out these?

For example given 8am what would be the probable journey time?

Thanks for your help

John

Hi John,

Simply gather all the data you have for 8am and check its distribution; then you can calculate a confidence interval for the expected value from that data. That's it.
 
  • #3
Hello,

Thanks very much for your reply and your suggestion.

I'm trying to understand how to work it out manually. Is this possible using the table?

The frequency table shows the number of occurrences for each variable pair in my data set. For example there were 5 journeys that started at 8am and lasted 5 minutes.

Is there a way to manually calculate the probable journey time given the start time?

For example 9am

P(JT | ST) = P(JT ∩ ST) / P(ST)

P(JT | 9am) = ?

Thanks for your help
 
  • #4
Hi Viraltux,

Just wondering if you had any further thoughts on this?

Regards

John
 
  • #5
bradyj7 said:
Hello,

Thanks very much for your reply and your suggestion.

I'm trying to understand how to work it out manually. Is this possible using the table?

The frequency table shows the number of occurrences for each variable pair in my data set. For example there were 5 journeys that started at 8am and lasted 5 minutes.

Is there a way to manually calculate the probable journey time given the start time?

For example 9am

P(JT | ST) = P(JT ∩ ST) / P(ST)

P(JT | 9am) = ?

Thanks for your help

I think you want the expected value, not the probability.

To calculate the expected value of the journey at 8am using the table simply do this calculation:

(1*2 + 2*7 + 3*6 + 4*5 + 5*5 + ... + 10*6 ) / ( 2 + 7 + 6 + 5 + 5 + ... + 6)
 
  • #6
Hello viraltux,

Yes it is the expected value that I was looking to calculate, thank you.

I sort of understand the calculation.

Can I expand it a little.

Say I wanted to calculate the expected journey time given Journey Start Time and Distance travelled.

P(JT | ST, D)

I've expanded the frequency table with actual measurements.

For example, there were 25 journeys that started at 8am, were 3 miles long and had a journey time of 5 minutes.

https://dl.dropbox.com/u/54057365/All/forum.1JPG.JPG

So to calculate the expected journey time given that the start time is 8am and the distance is 3 miles would the calculation be:

(3*3 +3*25+...+3*2) / (3+25+33+7+2) = 3 minutes

Really appreciate the help

Thanks

John
 
  • #7
bradyj7 said:
For example, there were 25 journeys that started at 8am, were 3 miles long and had a journey time of 5 minutes.

So to calculate the expected journey time given that the start time is 8am and the distance is 3 miles would the calculation be:

(3*3 +3*25+...+3*2) / (3+25+33+7+2) = 3 minutes

Really appreciate the help

Thanks

John

Hi John,

That is not right. Your journey times are in 5-minute steps, so you should do

(5*3 + 10*25 + 15*33 + 20*7 + 25*2 ) / ( 3 + 25 + 33 + 7 + 2)

In general the formula for the time expected value will be

(time1 * number1 + time2 * number2 + ... ) / (number1 + number2 + ... )
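
The general formula above can be sketched in a few lines of Python. This is only an illustration: the function name `expected_time` is made up for this example, and the times and counts are the 8am / 3-mile figures quoted in this post.

```python
def expected_time(times, counts):
    """Frequency-weighted mean: sum(time * count) / sum(count)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no journeys observed for this condition")
    return sum(t * n for t, n in zip(times, counts)) / total


# Journey times (minutes) and journey counts for 8am, 3-mile trips,
# taken from the calculation quoted above.
times = [5, 10, 15, 20, 25]
counts = [3, 25, 33, 7, 2]

print(round(expected_time(times, counts), 2))  # 950 / 70 -> 13.57
```

The same function works for any other start time or distance: pass in that column of counts instead.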
 
  • #8
viraltux said:
(5*3 + 10*25 + 15*33 + 20*7 + 25*2 ) / ( 3 + 25 + 33 + 7 + 2)

In general the formula for the time expected value will be

(time1 * number1 + time2 * number2 + ... ) / (number1 + number2 + ... )

Hi viraltux,

Would it not be?

(0*3 + 5*25 + 10*33 + 15*7 + 20*2 ) / ( 3 + 25 + 33 + 7 + 2)

The journey times are rounded, so the zero refers to journeys that were less than 5 minutes.

Thanks

John
 
  • #9
bradyj7 said:
Hi viraltux,

Would it not be?

(0*3 + 5*25 + 10*33 + 15*7 + 20*2 ) / ( 3 + 25 + 33 + 7 + 2)

The journey times are rounded, so the zero refers to journeys that were less than 5 minutes.

Thanks

John

Well, then you had better use the midpoint of each 5-minute interval to minimize the error:

(5/2*3 + (5 + 5/2)*25 + (10+5/2)*33 + (15+5/2)*7 + (20+5/2)*2 ) / ( 3 + 25 + 33 + 7 + 2)
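
As a rough sketch of the midpoint version, assuming each recorded time is the lower edge of a 5-minute interval (so "0" means 0 to 5 minutes, as John describes); the function name `midpoint_expected_time` is hypothetical:

```python
def midpoint_expected_time(lower_edges, counts, width):
    """Weighted mean using the midpoint of each interval instead of its edge."""
    mids = [lo + width / 2 for lo in lower_edges]
    total = sum(counts)
    return sum(m * n for m, n in zip(mids, counts)) / total


# Lower edges of the 5-minute intervals and the counts quoted above.
lower_edges = [0, 5, 10, 15, 20]
counts = [3, 25, 33, 7, 2]

print(round(midpoint_expected_time(lower_edges, counts, 5), 2))  # 775 / 70 -> 11.07
```

Because every interval has the same width, this is just the lower-edge average (600 / 70) shifted up by half an interval, 2.5 minutes.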
 
  • #10
Thanks for the help and advice, I appreciate your time.

Regards

John
 
  • #11
  • #12
bradyj7 said:
Hi Viraltux,

Would you mind if I asked you one more question?

I'm just trying to understand the theory.

Is this type of problem a conditional expectation problem? as described here http://en.wikipedia.org/wiki/Conditional_expectation

Thanks

John

You're exactly right, John.
 
  • #13
Hi Viraltux,

Can I ask you what the difference is practically between a conditional expectation and conditional probability question?

Could you alternatively work this question out by dividing each cell by the total cell count and then summing certain cells? I am not entirely sure of the theory. Would that be possible, or is that just wrong?

Thanks for the help

John
 
  • #14
A couple of thoughts...
By only using the data for the specific distance you want the expected time for, you're not getting the most out of the data. You could estimate a speed for each time of day, e.g.
Presumably the distances are also rounded somehow. If not rounded to nearest, need to make some adjustment there.
 
  • #15
Hi Viraltux and Haruspex,

Just wondering if you had a chance to read my question above?

Is this a conditional expectation problem or a conditional probability problem? What is the difference?

I asked this question in another thread https://www.physicsforums.com/showthread.php?t=580126&page=2

and the answer was that it was a conditional probability problem, and the respondent describes a different way to calculate it.

I'm confused now as to which method is correct, and about the theory behind them.

Maybe both methods are correct?

Thanks

John
 
  • #16
bradyj7 said:
Hi Viraltux and Haruspex,

Just wondering if you had a chance to read my question above?

Is this a conditional expectation problem or a conditional probability problem? What is the difference?

I asked this question in another thread https://www.physicsforums.com/showthread.php?t=580126&page=2

and the answer was that it was a conditional probability problem, and the respondent describes a different way to calculate it.

I'm confused now as to which method is correct, and about the theory behind them.

Maybe both methods are correct?

Thanks

John

This is an expectation problem since you are trying to find an average or a mean.

Also in the other thread in case you're wondering, to find the expectation I calculated the probability distribution first and then the mean (or average) using the probability. You don't have to do it this way and you can do it as viraltux has mentioned, but they are the same thing.

The only difference is that my method converted the frequency data to proper probabilities first whereas viraltux goes straight from frequency data to the mean (or average) in one step.
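
A small sketch of why the two routes agree, using exact fractions. The counts here are the 8am / 3-mile figures from earlier in this thread, used purely as an illustration; they are not the numbers from the other thread.

```python
from fractions import Fraction

# Journey times (minutes) and counts for one fixed condition (8am, 3 miles).
times = [5, 10, 15, 20, 25]
counts = [3, 25, 33, 7, 2]
n = sum(counts)

# Route 1: convert frequencies to conditional probabilities, then take the mean.
probs = [Fraction(c, n) for c in counts]
mean_via_probs = sum(t * p for t, p in zip(times, probs))

# Route 2: go straight from frequencies to the mean in one step.
mean_direct = Fraction(sum(t * c for t, c in zip(times, counts)), n)

assert sum(probs) == 1                # the probabilities are a proper distribution
assert mean_via_probs == mean_direct  # both routes give the same number
print(mean_direct)  # 95/7
```

Dividing each count by the total and then weighting is the same arithmetic as weighting first and dividing once at the end.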
 
  • #17
Hello Chiro,

Thank you for your reply - it must be late in Australia.

I understand the two problems are related now. You worked out the probabilities first and applied them to frequency data to get the expected value.

Thanks for clearing that up for me.

John
 
  • #18
chiro said:
This is an expectation problem since you are trying to find an average or a mean.

Also in the other thread in case you're wondering, to find the expectation I calculated the probability distribution first and then the mean (or average) using the probability. You don't have to do it this way and you can do it as viraltux has mentioned, but they are the same thing.

The only difference is that my method converted the frequency data to proper probabilities first whereas viraltux goes straight from frequency data to the mean (or average) in one step.

That's exactly so. I went straight to the point to minimize the coding in John's spreadsheet. Nonetheless, John, the difficulties you have are quite basic, so I would recommend that you follow an introductory course in Statistics, so that you can work through some exercises and get a better understanding of the basics. I think that is a better strategy than learning random concepts here and there... If you do it you'll see how everything falls into place. :)
 
  • #19
bradyj7 said:
Is this a conditional expectation problem or a conditional probability problem? What is the difference?
Just to add a little to chiro's reply:
If you determine the conditional probabilities then you have everything, and conditional expectation is just one of many numbers you can derive from that. But in many problems, such as this, you only care about the expectation, and it can be much easier to go straight to that and forget about the detailed probabilities. In short, it's a shortcut.
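
To illustrate the point, here is a hedged sketch that builds the conditional distribution once and then reads several numbers off it. The counts are the 8am / 3-mile figures quoted earlier in the thread; the choice of summaries (mode, variance, a tail probability) is only an example.

```python
from fractions import Fraction

# Counts of journey times (minutes) for one condition: start 8am, distance 3 miles.
times = [5, 10, 15, 20, 25]
counts = [3, 25, 33, 7, 2]
n = sum(counts)

# Conditional distribution P(JT = t | ST = 8am, D = 3 miles).
cond_prob = {t: Fraction(c, n) for t, c in zip(times, counts)}

# The conditional expectation is one summary of that distribution...
mean = sum(t * p for t, p in cond_prob.items())

# ...but the same distribution also gives other quantities.
variance = sum((t - mean) ** 2 * p for t, p in cond_prob.items())
mode = max(cond_prob, key=cond_prob.get)
p_at_least_20 = sum(p for t, p in cond_prob.items() if t >= 20)

print(mean, variance, mode, p_at_least_20)
```

If only the mean is needed, the first two steps can be collapsed into the single weighted-average calculation used earlier in the thread.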
 
  • #20
Hi Guys,

Could I ask another question with regards to probability theory?

I have recorded some journey distances for cars. Here is an example. The total distance for the day is in the first column and the individual distances are in the adjacent columns. This table contains data for a total travel distance of 12 miles.

https://dl.dropbox.com/u/54057365/All/table.JPG

I have a program that simulates a total daily travel distance and the number of journeys.

For example it could simulate a total travel distance of 12 miles and 3 journeys.

I am trying to determine the individual distances and order of the individual journeys from my observed data.

For example it could be 3 miles + 3 miles + 6 miles = 12 miles

Could you advise me how you would do this? I believe it is a branch of probability called stick breaking construction.

I believe I would begin by determining x1 from the f(x1), and then x2 from f(x2|x1), and then x3 = D-x1-x2
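
A minimal sketch of that sequential idea, under loud assumptions: the `observed` rows below are HYPOTHETICAL (the real table is only available as an image), the function name `sample_journeys` is made up, and f(x1) and f(x2|x1) are taken to be plain empirical frequencies from rows with the same total and the same number of journeys.

```python
import random
from collections import Counter

# HYPOTHETICAL observed days: (x1, x2, x3) in miles, each summing to 12.
# These are placeholders, not the values from the real table.
observed = [
    (3, 3, 6),
    (3, 3, 6),
    (3, 5, 4),
    (2, 4, 6),
    (6, 3, 3),
    (4, 4, 4),
]


def sample_journeys(rows, total, rng):
    """Draw x1 from f(x1), x2 from f(x2 | x1), then set x3 = total - x1 - x2."""
    # Step 1: empirical distribution of the first journey distance.
    f_x1 = Counter(r[0] for r in rows)
    x1 = rng.choices(list(f_x1), weights=list(f_x1.values()))[0]

    # Step 2: empirical distribution of the second distance, given x1.
    f_x2 = Counter(r[1] for r in rows if r[0] == x1)
    x2 = rng.choices(list(f_x2), weights=list(f_x2.values()))[0]

    # Step 3: the last journey is whatever distance remains.
    x3 = total - x1 - x2
    return x1, x2, x3


rng = random.Random(0)  # fixed seed so the sketch is repeatable
print(sample_journeys(observed, 12, rng))
```

With purely empirical frequencies like these, every draw reproduces one of the observed days, so this only reshuffles the data; anything smoother would need a fitted distribution.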

I would be grateful if perhaps you could demonstrate a quick example from the above table?

Kind Regards

John
 
  • #21
bradyj7 said:
I have recorded some journey distances for cars. Here is an example. The total distance for the day is in the first column and the individual distances are in the adjacent columns. This table contains data for a total travel distance of 12 miles.

https://dl.dropbox.com/u/54057365/All/table.JPG

I have a program that simulates a total daily travel distance and the number of journeys.
For example it could simulate a total travel distance of 12 miles and 3 journeys.

Not sure what you're trying to do. Are you treating the program as a black box, and trying to discover the distributions of what it generates? If so, what arranged that the totals were all 12? Is that an input to the program, or is the table the set of outputs that happened to have a total of 12?

bradyj7 said:
I am trying to determine the individual distances and order of the individual journeys from my observed data.

You mean, probabilities of distances, right? Order of journeys? If they're not x1-x2-x3... then what info do you have? Or do you mean their ranking by length?

bradyj7 said:
I believe I would begin by determining x1 from the f(x1), and then x2 from f(x2|x1), and then x3 = D-x1-x2

What do you mean by "determining x1"? x1 is a random variable, and I'm guessing f(x1) is the observed distribution.
 

1. What is conditional probability?

Conditional probability is the likelihood of an event occurring given that another event has already occurred. It is calculated by dividing the probability of both events occurring by the probability of the conditioning event (the one known to have occurred).

2. How is conditional probability different from regular probability?

Regular probability is the likelihood of an event occurring without any prior knowledge or conditions. Conditional probability takes into account a specific condition or event that has already occurred and adjusts the probability accordingly.

3. Can you give an example of a simple conditional probability question?

An example of a simple conditional probability question is: If a fair coin is tossed twice, what is the probability of getting two heads given that the first toss resulted in a head? The answer would be 1/2, as the first toss does not affect the outcome of the second toss.

4. How is conditional probability useful in scientific research?

Conditional probability is useful in scientific research as it allows researchers to make predictions and draw conclusions based on specific conditions or events. It is often used in experiments and studies to determine the likelihood of certain outcomes.

5. What is the formula for calculating conditional probability?

The formula for conditional probability is P(A|B) = P(A and B) / P(B), where P(A|B) represents the probability of event A occurring given that event B has already occurred, P(A and B) represents the probability of both events occurring, and P(B) represents the probability of event B occurring.
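
As an illustrative check of the formula, the coin example from question 3 can be worked out by enumerating the four equally likely outcomes of two fair tosses. The helper `prob` and the event names are made up for this sketch.

```python
from fractions import Fraction
from itertools import product

# All outcomes of two fair coin tosses, each with probability 1/4.
outcomes = list(product("HT", repeat=2))
p_each = Fraction(1, len(outcomes))


def prob(event):
    """Probability of an event, given as a predicate over outcomes."""
    return sum(p_each for o in outcomes if event(o))


def first_head(o):   # event B: the first toss is a head
    return o[0] == "H"


def two_heads(o):    # event A: both tosses are heads
    return o == ("H", "H")


p_b = prob(first_head)
p_a_and_b = prob(lambda o: two_heads(o) and first_head(o))

# P(A|B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)  # 1/2
```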
