
Simple conditional probability question

  1. Jun 22, 2012 #1

    I'm trying to work out a conditional probability.

    I have hundreds of measurements of two variables (1) Start Time and (2) Journey time.

    I've created a frequency table.

    https://dl.dropbox.com/u/54057365/All/forum.JPG [Broken]

    How can I work out the Journey time given a start time?

    P(JT | ST) = P(JT ∩ ST) / P(ST)

    How would you work out these?

    For example, given a start time of 8am, what would be the probable journey time?

    Thanks for your help

    Last edited by a moderator: May 6, 2017
  3. Jun 22, 2012 #2
    Hi John,

    Simply gather all the data you have for 8am and check its distribution; you can then calculate a confidence interval for the expected value from that data. That's it.
  4. Jun 22, 2012 #3

    Thanks very much for your reply and your suggestion.

    I'm trying to understand how to work it out manually. Is this possible using the table?

    The frequency table shows the number of occurrences for each variable pair in my data set. For example there were 5 journeys that started at 8am and lasted 5 minutes.

    Is there a way to manually calculate the probable journey time given the start time?

    For example 9am

    P(JT | ST) = P(JT ∩ ST) / P(ST)

    P( JT | 9am) = ?

    Thanks for your help
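
    As a rough illustration of the formula quoted in this post, P(JT | ST) can be read straight off a frequency table: divide each cell by the grand total to get P(JT ∩ ST), then divide by the row's share of the total, P(ST). The counts below are hypothetical (the original table image is no longer available); only the "5 journeys at 8am lasting 5 minutes" cell comes from the post. The names `freq` and `conditional_distribution` are made up for this sketch.

```python
# Hypothetical frequency table: freq[start_time][journey_time_minutes] = count.
# Only the 8am / 5-minute cell (5 journeys) is taken from the post.
freq = {
    "8am": {5: 5, 10: 12, 15: 3},
    "9am": {5: 2, 10: 6, 15: 12},
}

def conditional_distribution(freq, start_time):
    """Return {journey_time: P(JT | ST = start_time)} from raw counts."""
    grand_total = sum(sum(row.values()) for row in freq.values())
    row = freq[start_time]
    p_st = sum(row.values()) / grand_total              # P(ST)
    # P(JT and ST) / P(ST) for every journey time in the row
    return {jt: (n / grand_total) / p_st for jt, n in row.items()}

dist_9am = conditional_distribution(freq, "9am")
for jt, p in dist_9am.items():
    print(f"P(JT = {jt} min | 9am) = {p:.2f}")
```

    The grand total cancels, so each conditional probability is simply the cell count divided by the row total.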
  5. Jun 22, 2012 #4
    Hi Viraltux,

    Just wondering if you had any further thoughts on this?


  6. Jun 22, 2012 #5
    I think you want the expected value, not the probability.

    To calculate the expected value of the journey at 8am using the table simply do this calculation:

    (1*2 + 2*7 + 3*6 + 4*5 + 5*5 + ..... + 10*6 ) / ( 2 + 7 + 6 + 5 + 5 + ... + 6)
  7. Jun 22, 2012 #6
    Hello viraltux,

    Yes it is the expected value that I was looking to calculate, thank you.

    I sort of understand the calculation.

    Can I expand it a little?

    Say I wanted to calculate the expected journey time given Journey Start Time and Distance travelled.

    P(JT | ST, D)

    I've expanded the frequency table with actual measurements.

    For example, there were 25 journeys that started at 8am, were 3 miles long and had a journey time of 5 minutes.

    https://dl.dropbox.com/u/54057365/All/forum.1JPG.JPG [Broken]

    So to calculate the expected journey time given that the start time is 8am and the distance is 3 miles would the calculation be:

    (3*3 +3*25+...+3*2) / (3+25+33+7+2) = 3 minutes

    Really appreciate the help


    Last edited by a moderator: May 6, 2017
  8. Jun 22, 2012 #7
    Hi John,

    That is not right. Your journey times are in 5-minute steps, so each count must be multiplied by its journey time:

    (5*3 + 10*25 + 15*33 + 20*7 + 25*2 ) / ( 3 + 25 + 33 + 7 + 2)

    In general the formula for the time expected value will be

    (time1 * number1 + time2 * number 2 + ..... ) / (number1 + number2 + .... )
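
    The general formula above is a weighted mean. A minimal Python sketch, using the times and counts quoted in this post (the helper name `weighted_mean` is just for illustration):

```python
# Journey times (minutes) and their counts for 8am, 3-mile journeys,
# as written in the calculation in this post.
times = [5, 10, 15, 20, 25]
counts = [3, 25, 33, 7, 2]

def weighted_mean(values, weights):
    """Return sum(value * weight) / sum(weight)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total

expected_time = weighted_mean(times, counts)
print(round(expected_time, 2))  # 13.57 (= 950 / 70)
```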
  9. Jun 22, 2012 #8
    Hi viraltux,

    Would it not be?

    (0*3 + 5*25 + 10*33 + 15*7 + 20*2 ) / ( 3 + 25 + 33 + 7 + 2)

    The journey times are rounded, so the zero refers to journeys that were less than 5 minutes.


  10. Jun 22, 2012 #9
    Well, then you had better use the midpoint of each 5-minute interval to minimize the error:

    (5/2*3 + (5 + 5/2)*25 + (10+5/2)*33 + (15+5/2)*7 + (20+5/2)*2 ) / ( 3 + 25 + 33 + 7 + 2)
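
    The midpoint version can be sketched the same way. This assumes, as John describes, that each recorded time is the lower edge of a 5-minute interval (0 means "less than 5 minutes"); the variable names are illustrative only.

```python
bin_width = 5                       # minutes per interval
lower_edges = [0, 5, 10, 15, 20]    # recorded (rounded-down) journey times
counts = [3, 25, 33, 7, 2]          # journeys observed in each interval

# Represent every interval by its midpoint: 2.5, 7.5, 12.5, 17.5, 22.5
midpoints = [edge + bin_width / 2 for edge in lower_edges]

expected_time = sum(m * c for m, c in zip(midpoints, counts)) / sum(counts)
print(round(expected_time, 2))  # 11.07 (= 775 / 70)
```

    Using midpoints shifts the lower-edge estimate up by exactly half an interval (2.5 minutes).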
  11. Jun 23, 2012 #10
    Thanks for the help and advice, I appreciate your time.


  12. Jun 23, 2012 #11
  13. Jun 23, 2012 #12
    You're quite right, John.
  14. Jun 26, 2012 #13
    Hi Viraltux,

    Can I ask you what the difference is practically between a conditional expectation and conditional probability question?

    Could you alternatively work this question out by dividing each cell by the total cell count and then summing up certain cells? I am not entirely sure of the theory. Would that be possible, or is that just wrong?

    Thanks for the help

  15. Jun 26, 2012 #14


    Science Advisor
    Homework Helper
    Gold Member

    A couple of thoughts...
    By only using the data for the specific distance you want the expected time for, you're not getting the most of the data. You could estimate a speed for each time of day, e.g.
    Presumably the distances are also rounded somehow. If they are not rounded to the nearest value, some adjustment is needed there.
  16. Jun 27, 2012 #15
    Hi Viraltux and Haruspex,

    Just wondering if you had a chance to read my question above?

    Is this a conditional expectation problem or a conditional probability problem? What is the difference?

    I asked this question in another thread https://www.physicsforums.com/showthread.php?t=580126&page=2

    and the answer was that it is a conditional probability problem, and the respondent describes a different way to calculate it.

    I'm confused now as to which method is correct, and about the theory behind them.

    Maybe both methods are correct?


  17. Jun 27, 2012 #16


    Science Advisor

    This is an expectation problem since you are trying to find an average or a mean.

    Also in the other thread in case you're wondering, to find the expectation I calculated the probability distribution first and then the mean (or average) using the probability. You don't have to do it this way and you can do it as viraltux has mentioned, but they are the same thing.

    The only difference is that my method converted the frequency data to proper probabilities first whereas viraltux goes straight from frequency data to the mean (or average) in one step.
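
    A quick numerical check of this equivalence, as a sketch. The times and counts are the ones from viraltux's earlier calculation and are used here purely for illustration.

```python
import math

times = [5, 10, 15, 20, 25]   # journey times in minutes
counts = [3, 25, 33, 7, 2]    # observed frequencies
total = sum(counts)

# Method 1: convert frequencies to probabilities, then take the mean.
probs = [c / total for c in counts]
mean_via_probs = sum(t * p for t, p in zip(times, probs))

# Method 2: go straight from the frequencies to the mean.
mean_direct = sum(t * c for t, c in zip(times, counts)) / total

print(math.isclose(mean_via_probs, mean_direct))  # True
```

    The two agree because dividing every count by the total before summing is the same as dividing the final sum by the total.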
  18. Jun 27, 2012 #17
    Hello Chiro,

    Thank you for your reply - it must be late in Australia.

    I understand the two problems are related now. You worked out the probabilities first and applied them to frequency data to get the expected value.

    Thanks for clearing that up for me.

  19. Jun 27, 2012 #18
    That's exactly so. I went straight to the point to minimize the coding in John's spreadsheet. Nonetheless, John, the difficulties you have are quite basic, so I would recommend that you follow an introductory course in statistics, so that you can work through some exercises and questions and get a better understanding of the basics. I think that is a better strategy than learning concepts at random here and there. If you do, you'll see how everything falls into place. :smile:
  20. Jun 27, 2012 #19


    Science Advisor
    Homework Helper
    Gold Member

    Just to add a little to chiro's reply:
    If you determine the conditional probabilities then you have everything, and conditional expectation is just one of many numbers you can derive from that. But in many problems, such as this, you only care about the expectation, and it can be much easier to go straight to that and forget about the detailed probabilities. In short, it's a shortcut.
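
    To illustrate the point that the conditional distribution contains more than the expectation, here is a small sketch that derives several summaries from one set of conditional probabilities. The counts are again the 8am, 3-mile figures quoted earlier in the thread; the particular summaries chosen (mode, a cumulative probability, variance) are only examples.

```python
times = [5, 10, 15, 20, 25]   # journey times in minutes
counts = [3, 25, 33, 7, 2]    # frequencies for the chosen start time and distance
total = sum(counts)

# Conditional probabilities P(JT = t | ST, D)
cond_probs = {t: c / total for t, c in zip(times, counts)}

# The expectation is one number derived from the distribution...
mean = sum(t * p for t, p in cond_probs.items())

# ...and so are the most likely journey time, a cumulative probability,
# and the variance.
mode = max(cond_probs, key=cond_probs.get)
p_at_most_10 = sum(p for t, p in cond_probs.items() if t <= 10)
variance = sum((t - mean) ** 2 * p for t, p in cond_probs.items())

print(mode)                     # 15
print(round(p_at_most_10, 2))   # 0.4
```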
  21. Jul 24, 2012 #20
    Hi Guys,

    Could I ask another question with regards to probability theory?

    I have recorded some journey distances for cars. Here is an example. The total distance for the day is in the first column and the individual distances are in the adjacent columns. This table contains data for a total travel distance of 12 miles.

    https://dl.dropbox.com/u/54057365/All/table.JPG [Broken]

    I have a program that simulates a total daily travel distance and the number of journeys.

    For example it could simulate a total travel distance of 12 miles and 3 journeys.

    I am trying to determine the individual distances and the order of the individual journeys from my observed data.

    For example it could be 3 miles + 3 miles + 6 miles = 12 miles

    Could you advise me how you would do this? I believe it is a branch of probability called stick breaking construction.

    I believe I would begin by determining x1 from the f(x1), and then x2 from f(x2|x1), and then x3 = D-x1-x2

    I would be grateful if perhaps you could demonstrate a quick example from the above table?
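
    A minimal sketch of the sequential scheme described above, under stated assumptions: the observed records below are hypothetical (the original table image is unavailable), f(x1) and f(x2 | x1) are taken to be plain empirical frequencies, and the function name `sample_journeys` is made up for illustration. It is not a full stick-breaking construction.

```python
import random
from collections import Counter

TOTAL = 12  # simulated total daily distance in miles

# Hypothetical observed days: each tuple holds three journey distances summing to 12.
observed = [(3, 3, 6), (3, 3, 6), (3, 6, 3), (4, 4, 4),
            (2, 4, 6), (6, 3, 3), (3, 5, 4)]

def sample_journeys(observed, total, rng):
    """Draw x1 from f(x1), x2 from f(x2 | x1), then set x3 = total - x1 - x2."""
    # Empirical distribution of the first journey distance, f(x1).
    x1_counts = Counter(rec[0] for rec in observed)
    x1 = rng.choices(list(x1_counts), weights=list(x1_counts.values()))[0]

    # Empirical distribution of the second distance given the first, f(x2 | x1).
    x2_counts = Counter(rec[1] for rec in observed if rec[0] == x1)
    x2 = rng.choices(list(x2_counts), weights=list(x2_counts.values()))[0]

    # The last distance is fixed by the total.
    x3 = total - x1 - x2
    return x1, x2, x3

rng = random.Random(0)  # seeded only to make runs repeatable
print(sample_journeys(observed, TOTAL, rng))
```

    With purely empirical frequencies, every draw reproduces one of the observed days; generating unseen combinations would require smoothing or a parametric model for f(x1) and f(x2 | x1).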

    Kind Regards

    Last edited by a moderator: May 6, 2017