# A problem about levels of confidence

1. Mar 20, 2012

### QuArK21343

It is known that a process occurs with a frequency of 10 events per second and a second process with a frequency of 12 per second. Consider the two situations:

1. It is possible to repeat N times the experiment, that consists in counting the processes, over a period of 10 seconds.

2. It is possible to take a single measurement of the number of processes over a time T arbitrarily large.

How should I choose N and T in order to claim, with a confidence level of 95%, that the two processes indeed occur at different rates?

2. Mar 20, 2012

### chiro

Hey QuArK21343 and welcome to the forums.

For 1., as long as you collect data at the resolution of your process, you should be OK. In other words, if you are guaranteed to have at most one event per period of time, then as long as you collect data it should be OK. If you can't guarantee this, or can't guarantee that events are independent of one another, then you should not use a Poisson distribution, which is the distribution commonly used to model counting and rate processes. Are you familiar with the Poisson distribution and the interpretation of its parameter?

I'm not exactly sure what you mean by number 2. Could you explain physically what you are doing in terms of data collection and what specifically that data is referring to?

In terms of showing that the two processes are different, that is a different issue.

The first thing you need to do is figure out what constraints you have on the model, though not necessarily on the parameters. The fewer constraints you impose, the more general the model and the harder it is to make a specific assertion.

If your processes seem to fit or are at least well approximated by a Poisson distribution then that becomes your first constraint. This alone makes a huge simplification in terms of analysis.

The next thing is whether you want to assume that these two processes are independent of one another, in that the data pertaining to one process is independent of the data of the other. That is to say, there is no dependency between elements in one process and elements in the other. If these two groups of data have some dependency, then this will change things a lot and you need to alter your model and analytic techniques to take this into account.

So if you have two Poisson distributions with no dependency between one set of data and the other (corresponding to the two processes), then you will need to move on to a statistical test of whether the two distributions (which under the above assumptions are independent Poisson) differ by a statistically significant amount.

This means that, for a given confidence criterion, given sample sizes for both datasets, and the data itself, you will do a hypothesis test of whether both processes have the same parameter. The outcome either leaves you with no evidence against the processes being the same (so your claim that they differ is not supported) or gives you evidence that they are different (so your claim is supported).

Now a Poisson distribution has the property that the mean is equal to the variance, which means you only have to test that one parameter. In terms of doing a hypothesis test I would need to look up the assumptions, but if you have enough data, I would think that a t-test should be appropriate. The central limit theorem says that, with enough data, the distribution of your sample mean approaches a normal distribution once you standardize it (subtract the true mean and divide by the standard error). Also, just for clarification, the t-test should be a two-sample test.

Actually determining N comes down to using tables for your chosen level of confidence, where the tabulated critical value depends on the sample size N of both datasets.

3. Mar 21, 2012

### QuArK21343

Chiro, thank you for your answer. The concrete situation may be this: radioactive decay; isotope 1 has a characteristic frequency of 10 Hz and isotope 2 of 12 Hz. You measure the number of decays over a period of 10 seconds for isotope 1. You do the same experiment N times, where N is to be determined. Then, you consider isotope 2. You can't repeat the experiment, you have only one chance, but you can choose the interval of time over which to measure the number of decays. Intuitively, I think that the longer T, the more accurate is the estimate of the frequency of emission of isotope 2 (but I don't know how to formalize that). Also, the greater N, the more accurate is the estimate of the frequency of isotope 1. The question is, how to choose N and T so that I can say, with a confidence level of 95%, that the two isotopes are indeed different?

I suspect that there is some formula that gives the dependence of the uncertainty of the estimated frequency on the number of experiments N. And there should also be a relation involving the time of observation T, but I am not sure about the 95% confidence level... (yes, I know the Poisson distribution and I'm familiar with confidence levels: you compute the number of deviations from the mean and look up in tables the probability of a result outside a fixed range, but this is for a Gaussian distribution, right?)
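The intuition about N and T can be made concrete under a Poisson assumption. A count taken over t seconds has variance equal to its mean, rate × t, so the estimated rate (count / t) has standard error sqrt(rate / t). N runs of 10 s each then behave like one run of 10·N seconds. A minimal Python sketch of that relation (the function name is made up for illustration):

```python
import math


def rate_standard_error(rate_hz: float, total_time_s: float) -> float:
    """Standard error of an estimated Poisson rate (count / time)."""
    # A Poisson count over total_time_s has variance rate_hz * total_time_s,
    # so dividing the count by the time gives variance rate_hz / total_time_s.
    return math.sqrt(rate_hz / total_time_s)


# Isotope 1: N runs of 10 s each amount to 10 * N seconds of counting.
for n_runs in (1, 10, 100):
    print("N =", n_runs, "SE =", rate_standard_error(10.0, 10.0 * n_runs))

# Isotope 2: a single run of T seconds.
for t_seconds in (10.0, 100.0, 1000.0):
    print("T =", t_seconds, "SE =", rate_standard_error(12.0, t_seconds))
```

Both uncertainties shrink like one over the square root of the total counting time, which is the formal version of "the longer T, the more accurate the estimate".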

4. Mar 21, 2012

### Stephen Tashi

I don't think you are familiar with the proper terminology. A confidence level is associated with "confidence interval" and confidence intervals are used to quantify how effective a sampling plan is for estimating something. What you are proposing is to make a decision - i.e. Do the two isotopes have different decay rates? This falls under the heading of "hypothesis testing", as Chiro mentioned, not under the heading of "confidence intervals".

If you are using "confidence" simply as a synonym for "probability" then you aren't dealing with either of the above two scenarios. If you want to "know with a probability of 0.95 that the two isotopes have different decay rates", you would have to use Bayesian statistics. You would have to specify some probability of the two isotopes having (or not having) different decay rates before any data is taken.

5. Mar 21, 2012

### QuArK21343

I am not sure I get your point, partly because all I know is from a basic course in statistics in my native language. So, basically, confidence interval or confidence level is the terminology used when you want to say that a certain fraction of all measurements falls in a given interval around the mean (95% confidence level means that in 100 measurements the expected number of results outside 2 deviations from the mean is 5). But, here it is different because, as you say, I want to "know with a probability of 0.95 that the two isotopes have different decay rates". Is my understanding correct?

Also, apart from the terminology, how is the problem handled in practice and how does the number N of repetitions of the experiment enter the story? (I don't know Bayesian statistics and hopefully it is not relevant, since this is intended to be an elementary exercise)

6. Mar 21, 2012

### chiro

The thing with hypothesis testing is that you have Type I and Type II errors. In plain English, a Type I error means you reject the null hypothesis when it is actually true, and a Type II error is the other way around: you fail to reject the null hypothesis when it is actually false.

Because of the above, we can't really say that there is a 95% chance of the mean being in an interval, in the kind of language you have described, because that doesn't take into account the conditional nature of hypothesis testing, which needs to account for these Type I and Type II errors.

If you want to do classical analysis, then I recommend using a non-pooled, non-paired two-sample t-test. If you have enough data values then it will be OK to use because of the Central Limit Theorem, which basically states that as the number of values goes to infinity, the distribution of the standardized sample mean tends to the standard normal. This is why we can use things like the t-test for testing means for any distribution in this way: it's because of the CLT.

If you give us figures for N, then that will help assess if using the t-test should be ok. Maybe Stephen Tashi might know this and give you some kind of rule of thumb for this number.
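For reference, the non-pooled, non-paired two-sample t-test recommended above is usually called Welch's t-test. Below is a minimal sketch using only the Python standard library. It returns the t statistic and the Welch-Satterthwaite degrees of freedom, which can then be compared against printed t tables. The counts are made-up numbers for illustration, not measurements. Note that the test needs at least two counts per sample, so as written it does not cover a single long measurement of isotope 2.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Welch's unpooled t-test."""
    mean_a = statistics.fmean(sample_a)
    mean_b = statistics.fmean(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = statistics.variance(sample_a) / len(sample_a)
    se2_b = statistics.variance(sample_b) / len(sample_b)
    t_stat = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t_stat, dof


# Hypothetical 10-second counts for the two isotopes (illustration only).
counts_isotope_1 = [98, 105, 93, 101, 110, 97]
counts_isotope_2 = [121, 118, 125, 113, 127, 116]
t_stat, dof = welch_t(counts_isotope_1, counts_isotope_2)
print("t =", t_stat, "degrees of freedom =", dof)
```

The p-value itself needs the t distribution, which the standard library does not provide, so the sketch stops at the statistic and the degrees of freedom.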

7. Mar 22, 2012

### Stephen Tashi

Your statement about 2 deviations is only true for certain problems. You are correct that a "confidence level of 0.95" is not the same as "know with a probability of 0.95". That is my point.

If you use a non-Bayesian approach, you cannot "know with a probability of 0.95". I think you are confusing "know with a probability of 0.95" with "conclude at the 0.05 significance level". You can do a hypothesis test and (if things come out right) say "At the 0.05 significance level, the two isotopes have the same decay rate". However, this does not imply that there is a 0.95 probability that the two isotopes have the same decay rate. It doesn't imply any definite probability of them having the same decay rate.

The number of repetitions N usually affects the variance of the statistic that you use in the hypothesis test. The variance of the statistic affects the shape of its probability distribution. The probability distribution is what you use to translate a significance requirement (like 0.05) to a threshold for the statistic (such as a difference of 13.28 between the mean counts of the isotopes).
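To illustrate how N and T feed into such a threshold, here is a rough sketch. It assumes a two-sided test on the difference of the two estimated rates, a normal approximation, and a common rate under the null hypothesis that is simply plugged in (in practice it would have to be estimated from the data). The function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist


def rate_difference_threshold(common_rate_hz, n_runs, run_length_s,
                              single_time_s, alpha=0.05):
    """Smallest |rate difference| that is significant at level alpha."""
    # Two-sided critical value of the standard normal (about 1.96 for 0.05).
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Under the null both rates equal common_rate_hz (an assumed value);
    # the variances of the two independent rate estimates add.
    variance = common_rate_hz * (
        1 / (n_runs * run_length_s) + 1 / single_time_s
    )
    return z_crit * math.sqrt(variance)


# More repetitions and a longer single run both lower the threshold.
for n_runs, single_time_s in ((1, 10.0), (10, 100.0), (100, 1000.0)):
    threshold = rate_difference_threshold(10.0, n_runs, 10.0, single_time_s)
    print(n_runs, single_time_s, threshold)
```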

If you are trying to formulate a question based on the idea that more repetitions should produce a result that is, in some sense, more reliable than one with fewer repetitions, you need to study the topic of the "power" of a statistical test.

You described this problem as an "exercise". Is it a problem from a textbook? If it is, you should quote it exactly. Or did you mean it is a real-world task that you must do?

8. Mar 22, 2012

### QuArK21343

Thank you, Chiro and Stephen, for your replies. This exercise is taken from a past exam paper in statistics. I cannot quote the exact exercise because I have already translated it into English as best I can. As you said, what I want is to draw a conclusion at the 0.05 significance level about the two isotopes being the same or not. First, I think I should tell you what I know and suspect may be relevant:

1. A Poisson distribution models the decay of particles. It has a variance equal to its mean.

2. If we assume normal distribution, our best guess is the mean of the various measurements, with an uncertainty that goes like 1/sqrt(N-1). The fractional uncertainty in the variance is 1/(sqrt(2N-2)).

3. The CLT guarantees normal distribution for a vast range of situations. Normal distribution can be obtained as the limiting case of Poisson distribution, when the mean "gets bigger".

My reasoning, based on what you said: hypothesis testing; null hypothesis is "the two isotopes are equal"; I do the experiment N times, the more experiments I do the smaller my uncertainty is in assessing the mean and the variance of the normal distribution (I assume normal distribution from the start). I do the experiment with the second isotope. If the result falls outside two sigma, the two isotopes "are" different. I don't know if this is more or less the right track and also how to use the fact that the experiment with isotope 2 lasts T seconds?

9. Mar 22, 2012

### Stephen Tashi

Then we still don't know what the problem is asking. The difficulty is interpreting the last sentence: "How should I choose N and T in order to claim, with a confidence level of 95%, that the two processes indeed occur at different rates?".

Unfortunately, I don't see that it helps to replace "with a confidence level of 95%" with "at a significance level of 0.05". For almost any value of N and any reasonable statistical test, you could establish a threshold for the statistic corresponding to a significance level of 0.05. Also, the usual null hypothesis would be that the rates are equal, not that they are different without specifying exactly how different they are.

Because you are given the actual decay rates, I suspect this question is about the "power" of a statistical test. If the problem told us what test we were to use, then the problem might be read something like this:

The trouble with my interpretation is that your statement of the problem doesn't mention a specific test to use. Perhaps Chiro has some better ideas.
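One way to read the question as a power calculation is sketched below. Everything beyond the two rates and the 10-second run length is an assumption that is not in the problem statement: the test is taken to be a two-sided z-test on the difference of the estimated rates with unpooled variances, the normal approximation is assumed to hold, and the target power of 0.95 is only a guess at what the exercise intends. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def approximate_power(n_runs, single_time_s, rate1_hz=10.0, rate2_hz=12.0,
                      run_length_s=10.0, alpha=0.05):
    """Approximate chance that the z-test rejects 'equal rates'."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Standard error of (estimated rate 2 - estimated rate 1), unpooled.
    se = math.sqrt(rate1_hz / (n_runs * run_length_s)
                   + rate2_hz / single_time_s)
    shift = abs(rate2_hz - rate1_hz) / se
    # Probability that the standardized difference lands in either tail.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


def smallest_n_runs(single_time_s, target_power=0.95, max_runs=10_000):
    """Smallest N reaching target_power for a given T, or None."""
    for n_runs in range(1, max_runs + 1):
        if approximate_power(n_runs, single_time_s) >= target_power:
            return n_runs
    return None


for n_runs, single_time_s in ((2, 20.0), (7, 70.0), (8, 80.0)):
    print(n_runs, single_time_s, approximate_power(n_runs, single_time_s))
print("smallest N for T = 1000 s:", smallest_n_runs(1000.0))
```

Under these assumptions N and T trade off against each other: if T is too short, no value of N reaches the target power, because the uncertainty on isotope 2 alone is already too large.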

10. Mar 22, 2012

### QuArK21343

I have had a quick look at Wikipedia's pages on Student's t-test and the t-distribution and, as far as I understand, your interpretation is very sensible. Assuming that the test to use is the t-test, how should I proceed in practice? Unfortunately my translation is literal, so the problem's statement is itself a bit ambiguous. The only other thing I can do is to ask my professor...

11. Mar 23, 2012

### chiro

I'm assuming that the distribution of the standardized sample mean is approximated well enough by a standard normal distribution that the OP doesn't have to worry about more complicated methods.

I also think you have mentioned an important point about the power of the test since that is a good way to mathematically show some kind of boundary for the usefulness of the test.

In terms of the actual test, I'm not really an expert in statistics but assuming that the normal approximation is good enough for the estimator of the means (for both distributions) then an unpaired, unpooled normal two-sample t-test in my mind seems appropriate.

In terms of power calculations I did a quick google and I got this:

http://www.ats.ucla.edu/stat/stata/dae/t_test_power2.htm

The thing is though, you would have to find out given a sample size for each data set and the interval for an estimated mean how 'close' the distribution of the estimator of the standardized mean actually is to a normal.

I'm pretty sure someone has already worked on this problem and come up with some ideas relating the point estimate of the mean using the sample information and the number of samples, particularly for a Poisson distribution. You could possibly use the 'rule of thumb' for the Binomial and then take the appropriate limits, but if you can find existing results for these bounds then it's probably a good idea to just use those.

An important note for the OP: if the normality condition isn't "good enough", your results won't be an accurate reflection of the data. The specific conditions depend on the different non-linearities of the distribution.

What you could do if you have time later on, is to simulate an estimator of the mean for various simulated variables with a given mean (and subsequently variance) in a statistical package and then see both graphically (histogram) and numerically (Shapiro-Wilk) how the normality condition adds up.
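A small version of that simulation can be sketched with the Python standard library alone. Two things here are substitutions rather than what is described above: Poisson counts are generated by summing exponential waiting times, and sample skewness (which should be near zero for a normal distribution) stands in for Shapiro-Wilk, which the standard library does not provide. With SciPy installed, `scipy.stats.shapiro` could be applied to the same list of means. The rate, run length, N and number of trials are illustrative choices.

```python
import random
import statistics


def poisson_count(rate_hz, duration_s, rng):
    """Count events of a Poisson process by summing exponential waits."""
    elapsed = rng.expovariate(rate_hz)
    count = 0
    while elapsed <= duration_s:
        count += 1
        elapsed += rng.expovariate(rate_hz)
    return count


def sample_mean_counts(rate_hz, duration_s, n_runs, n_trials, seed=0):
    """Simulate n_trials experiments, each averaging n_runs counts."""
    rng = random.Random(seed)
    return [
        statistics.fmean(
            poisson_count(rate_hz, duration_s, rng) for _ in range(n_runs)
        )
        for _ in range(n_trials)
    ]


def skewness(values):
    """Sample skewness; close to 0 for a symmetric, normal-looking sample."""
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / spread) ** 3 for v in values)


# 10 Hz source, 10 s runs, N = 5 runs per experiment, 2000 experiments.
means = sample_mean_counts(10.0, 10.0, n_runs=5, n_trials=2000)
print("mean of means:", statistics.fmean(means))
print("variance of means:", statistics.pvariance(means))
print("skewness of means:", skewness(means))
```

For a Poisson count with mean 100, the mean of 5 runs should have variance close to 100 / 5 = 20, so the printed variance doubles as a check that the variance equals the mean.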

For your statistics class I'm going to go out on a limb and say that they want you to use a t-test like the one mentioned in the link above, but if you do further statistics it's important to be aware of issues like this.

12. Mar 23, 2012

### Stephen Tashi

Are you supposed to use some statistical software package to solve the exercise or are you expected to get an answer by hand calculation and consulting printed tables? (I myself have never done power calculations for the t-test, but I think we can figure this out.)

In the USA, it would be good to ask the professor. I don't know about your environment. If you ask, be sure and mention the topic of "power of a statistical test".