How to work out expected frequency from normal distribution

question dude · Aug 26, 2015

[Attachment: table showing the expected frequency column for each interval of trains]

How is the expected frequency column worked out for each interval of trains?

2) My attempt

Take the first interval, 60 - 62, I thought about doing this:

(62 - mean) / standard deviation

(62 - 67.45) / 2.92 = - 1.866

Using z < -1.866, from the normal distribution table, I get:

1 - 0.9686 = 0.0314

0.0314*(100) = 3.14

please note 100 is the total observed frequency

As you can see, I get 3.14 instead of 4.13 as given in the expected frequency column.

Ray Vickson · Aug 26, 2015

I don't get your answers; I don't get the tabulated expected frequencies, either, but I come close to the latter.

The number of trains is integer-valued (i.e., whole numbers), but you are approximating its distribution by a continuous distribution (the normal). So the statement {60 ≤ trains ≤ 62} is the same as {59.5 ≤ trains ≤ 62.5} for actual, physical trains. If you use the normal distribution on the interval (59.5, 62.5) you will get an expected frequency of 100 * 0.04178 ≈ 4.178, which is not that far from the tabulated value of 4.13. For the interval (63 → 65) = (62.5 → 65.5) I get an expected frequency of 100 * 0.2071 = 20.71, which is close to the tabulated 20.68.

I used Maple to do accurate computations; if the tabulator used cruder tools he/she could get less accurate answers.
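For anyone who wants to check these figures without Maple, here is a minimal Python sketch of the same continuity-corrected calculation. It assumes the mean 67.45, standard deviation 2.92 and total of 100 quoted in the question. The function names `normal_cdf` and `expected_frequency` are made up for illustration, and this is not necessarily how the table in the attachment was produced.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal(mean, sd) variable."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def expected_frequency(lo, hi, mean, sd, total):
    """Expected count for the integer class lo..hi.

    Applies a continuity correction of 0.5 at each end, so the class
    60..62 is treated as the interval (59.5, 62.5).
    """
    p = normal_cdf(hi + 0.5, mean, sd) - normal_cdf(lo - 0.5, mean, sd)
    return total * p

# Values taken from the question (assumed, not recomputed from raw data).
MEAN, SD, TOTAL = 67.45, 2.92, 100

for lo, hi in [(60, 62), (63, 65)]:
    freq = expected_frequency(lo, hi, MEAN, SD, TOTAL)
    print(f"{lo}-{hi}: {freq:.3f}")
```

The two printed values should land close to the 4.178 and 20.71 quoted above. Any remaining gap from the tabulated 4.13 and 20.68 would come from the rounding or tables used by whoever built the original table.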

BTW: in goodness-of-fit tests we do NOT usually round off the "expected frequencies" to whole numbers; for the first cell we would typically leave it as 4.178 (or maybe 4.18, or maybe 4.2). There is no reason at all to assume that expected frequencies are integers. This has nothing to do with whether the distribution is discrete (on whole numbers) or continuous (like the normal): the expected cell frequency for an integer-valued random variable can be, and usually is, a non-integer quantity.
