Confidence interval for a cohort

Mogarrr · Sep 8, 2014

Homework Statement

A cohort of hemophiliacs is followed to elicit information on the distribution of time to onset of
AIDS following seroconversion (referred to as latency time). All patients who seroconvert become
symptomatic within 10 years, according to the distribution in Table 6.11.

Table 6.11 Latency time to AIDS among hemophiliacs who become HIV positive
Latency time (years): Number of patients
0: 2
1: 6
2: 9
3: 33
4: 49
5: 66
6: 52
7: 37
8: 18
9: 11
10: 4

(I don't know how to make a proper table with latex... tried \being{tabular}{l r} but this doesn't work)

6.64 Assuming an underlying normal distribution, compute 95% CIs for the mean and variance of
the latency times.

Homework Equations

When the variance is unknown, the t-distribution with n-1 degrees of freedom may be used:
[tex] \mu = \bar{x} \pm t_{n-1,1- \frac {\alpha}2} \cdot \frac {s}{\sqrt {n}} [/tex]

and estimating the variance, we have...

[itex] (n-1) \cdot \frac {s^2}{ \chi^2_{n-1,1- \frac {\alpha}2}} \leq \sigma^2 \leq (n-1) \cdot \frac {s^2}{ \chi^2_{n-1,\frac {\alpha}2}} [/itex]
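As a rough sketch, the two normal-theory intervals can be written as small Python helpers. The function names `mean_ci` and `variance_ci` are made up for illustration, and the critical values (with n-1 degrees of freedom) are assumed to be looked up by the caller from a table or software. The numbers in the usage lines are a made-up example with n = 25, not the homework data.

```python
import math

def mean_ci(x_bar, s, n, t_crit):
    """Two-sided CI for the mean: x_bar +/- t_crit * s / sqrt(n).

    t_crit is the 1 - alpha/2 quantile of Student's t with n - 1 degrees
    of freedom, supplied by the caller (table lookup or software).
    """
    half_width = t_crit * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

def variance_ci(s2, n, chi2_upper, chi2_lower):
    """Two-sided CI for the variance.

    chi2_upper is the 1 - alpha/2 quantile and chi2_lower the alpha/2
    quantile of chi-square with n - 1 degrees of freedom.
    """
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower

# Made-up illustration (not the homework data): n = 25, x_bar = 10, s = 2.
# Table values for 24 degrees of freedom at alpha = 0.05:
# t = 2.064, chi-square = 39.364 (upper) and 12.401 (lower).
print(mean_ci(10.0, 2.0, 25, 2.064))
print(variance_ci(4.0, 25, 39.364, 12.401))
```

Note that the larger chi-square quantile gives the lower limit for the variance, because it sits in the denominator.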

Lastly, for the Poisson distribution with observed count [itex] x [/itex], the confidence interval is given by the [itex] \mu_1, \mu_2 [/itex] that satisfy

[itex] \frac {\alpha}2 = P(X \geq x | \mu = \mu_1) = \sum_{k=x}^{\infty} \frac {e^{-\mu_1} \mu_1^{k}}{k!}[/itex]

[itex] \frac {\alpha}2 = P(X \leq x | \mu = \mu_2) = \sum_{k=0}^{x} \frac {e^{-\mu_2} \mu_2^{k}}{k!}[/itex]
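The two Poisson conditions have no closed-form solution, but they can be solved numerically. Below is a minimal sketch using plain bisection, where x is the observed count. The helper names (`poisson_cdf`, `solve_increasing`, `poisson_exact_ci`) are invented for this illustration. The problem itself assumes a normal distribution, so this only illustrates the two equations.

```python
import math

def poisson_cdf(x, mu):
    """P(X <= x) for X ~ Poisson(mu), summed term by term."""
    if x < 0:
        return 0.0
    term = math.exp(-mu)      # P(X = 0)
    total = term
    for k in range(1, x + 1):
        term *= mu / k        # P(X = k) from P(X = k - 1)
        total += term
    return total

def solve_increasing(f, lo, hi, tol=1e-9):
    """Bisection for f(m) = 0, assuming f is increasing and f(lo) < 0 <= f(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def poisson_exact_ci(x, alpha=0.05):
    """Return (mu_1, mu_2) for an observed count x."""
    # Lower limit: P(X >= x | mu_1) = alpha/2 (taken as 0 when x = 0).
    if x == 0:
        mu_1 = 0.0
    else:
        mu_1 = solve_increasing(
            lambda m: (1 - poisson_cdf(x - 1, m)) - alpha / 2, 0.0, float(x))
    # Upper limit: P(X <= x | mu_2) = alpha/2. Grow the bracket until it
    # contains the root, then bisect.
    hi = float(x) + 1.0
    while poisson_cdf(x, hi) > alpha / 2:
        hi *= 2
    mu_2 = solve_increasing(
        lambda m: alpha / 2 - poisson_cdf(x, m), float(x), hi)
    return mu_1, mu_2

print(poisson_exact_ci(10))
```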

The Attempt at a Solution

I'm not really sure how to handle this. I'm used to just one column of data, from which I can compute the sample mean and sample variance. Here I'm asked to compute the mean and variance of the latency time. Since this is a time interval, I think I should be using the Poisson distribution; however, it's given that the distribution is normal.

I don't know how to proceed. Any help would be appreciated.

mfb · Sep 8, 2014

You can ignore the medical background - it is an interval that always starts at zero, so mean and variance of the interval are the mean and variance of your data. Sure, the values cannot get negative (so it cannot be a perfect gaussian distribution), but that's not important here.

Mogarrr · Sep 8, 2014

mfb said:

You can ignore the medical background - it is an interval that always starts at zero, so mean and variance of the interval are the mean and variance of your data. Sure, the values cannot get negative (so it cannot be a perfect gaussian distribution), but that's not important here.

Not sure what you mean by a perfect Gaussian distribution. Stripping away the medical terminology, I still don't know what to do.

Given that I have two columns of data, I picture (perhaps incorrectly) the first column as values of X, and the second column as the associated probabilities. From there I don't know what to do.

Continuing that line of thought: suppose I find an interval for [itex] \mu [/itex]. How could I relate it back to the first column? It's not like I know [itex]f^{-1}(x) [/itex].

Mogarrr · Sep 8, 2014

Wait a sec...

Am I just finding a confidence interval for the data in the first column?

Ray Vickson · Sep 9, 2014

Mogarrr said:

Wait a sec...

Am I just finding a confidence interval for the data in the first column?

Yes, using probabilities estimated from the second column.

Mogarrr · Sep 9, 2014

Ray Vickson said:

Yes, using probabilities estimated from the second column.

Well, then I'd use Student's t-test, but...

How am I supposed to use the probabilities? I thought I could just find [itex] \bar{x}[/itex] and [itex] s^2[/itex] from the 1st column.

Ray Vickson · Sep 9, 2014

Mogarrr said:

Well, then I'd use Student's t-test, but...

How am I supposed to use the probabilities? I thought I could just find [itex] \bar{x}[/itex] and [itex] s^2[/itex] from the 1st column.

You can, if you expand it all out to get 287 sample points, like this:
X = 0,0,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,3,3, ...,9,9,10,10,10,10. Of course there is an easier way, and that is what you need to figure out.
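For what it's worth, the expansion into 287 sample points can be sketched in a few lines of Python using only the standard library. The variable names (`counts`, `sample`) are chosen here for illustration. This is the long way, not the shortcut.

```python
import statistics

# Table 6.11 as a mapping: latency time (years) -> number of patients.
counts = {0: 2, 1: 6, 2: 9, 3: 33, 4: 49, 5: 66,
          6: 52, 7: 37, 8: 18, 9: 11, 10: 4}

# Repeat each latency value once per patient: 0,0,1,1,1,1,1,1,2,...
sample = [years for years, patients in counts.items()
          for _ in range(patients)]

n = len(sample)
x_bar = statistics.mean(sample)
s2 = statistics.variance(sample)   # sample variance, n - 1 denominator

print(n, x_bar, s2)
```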

Mogarrr · Sep 9, 2014

Ok, I think I got it now. This is like a table describing the frequency of a distribution.

Computing [itex] s^2 [/itex] might take a while.

Ray Vickson · Sep 9, 2014

Mogarrr said:

Ok, I think I got it now. This is like a table describing the frequency of a distribution.

Computing [itex] s^2 [/itex] might take a while.

Not if you think first and calculate later.

Mogarrr · Sep 9, 2014

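I thought about it, though I had already done the calculation in Excel the long way.

Expressing the rows as ordered pairs [itex] (a_i , b_i) [/itex], where [itex] a_i [/itex] is the latency time and [itex] b_i [/itex] is the number of patients, I have...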

[itex] \bar{x} = (\sum b_i \cdot a_i)/( \sum b_i) [/itex], and

[itex] s^2 = \frac 1{\sum b_i - 1} \cdot \sum b_i (a_i - \bar{x} )^2 [/itex]...

That's what I think the easy computation is.
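A quick way to check the two weighted formulas is to evaluate them directly on the table. This is a minimal sketch in plain Python, with `latency` playing the role of the a_i and `patients` the role of the b_i (both names are just for illustration).

```python
# Table 6.11: a_i = latency time in years, b_i = number of patients.
latency = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
patients = [2, 6, 9, 33, 49, 66, 52, 37, 18, 11, 4]

n = sum(patients)   # total sample size, sum of the b_i

# Weighted mean: sum(b_i * a_i) / sum(b_i)
x_bar = sum(b * a for a, b in zip(latency, patients)) / n

# Sample variance: sum(b_i * (a_i - x_bar)^2) / (sum(b_i) - 1)
s2 = sum(b * (a - x_bar) ** 2 for a, b in zip(latency, patients)) / (n - 1)

print(n, x_bar, s2)
```

This needs only 11 terms per sum instead of 287, and it should agree with the result of expanding every row into individual sample points.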
