##X,Y \in L^2##, what is the critical region

  • Thread starter GabrielN00
Z, but using the z-score formula you wrote. Don't forget to put the ##\sqrt{\frac{s_1^2}{n} + \frac{s_2^2}{m}}## term into the denominator of the z-score formula, just as you did when you wrote the formula for the t-score.
  • #1
GabrielN00

Homework Statement


##X_1,\dots,X_n## and ##Y_1,\dots,Y_m## are simple random samples of ##X,Y \in L^2##, with ##X,Y## independent. ##H_0:\mu_x=\mu_y## is tested against ##H_1: \mu_x\neq\mu_y## at level ##\alpha\in(0,1).## If ##n,m## are large enough, find an approximation to the rejection region.

Homework Equations

The Attempt at a Solution


No particular distribution is given for ##X,Y## in the problem. Maybe it should follow straight from the fact ##X,Y \in L^2##? It seems natural to think that if ##n,m## are large enough then the approximation of the critical region will be the whole region under the curve.

I considered that the t-score could be used: ##\displaystyle t=\frac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{\sqrt{\frac{s_1^2}{n}+\frac{s_2^2}{m}}}##

But now if ##n,m## are "large enough", which I understand as considering ##n,m \rightarrow \infty##, does the value of ##\alpha## still matter? It seems that regardless of ##\alpha## the critical region will be the whole area.
 
  • #2
For large enough n,m we can use the Central Limit Theorem (CLT) to validate an assumption that the statistic whose formula you give above has an approximately normal distribution, which then enables the use of the formula to find the CI. That does not mean letting n and m go to infinity. One can get good approximations to normality with sample sizes as small as ten if the underlying distribution is not too pathological. Just state that you are using CLT, which requires n and m be reasonably big.

Then you can write a formula for the bounds of the acceptance region in terms of ##t_\alpha## and the numerator and denominator of the score you wrote above (so yes, ##\alpha## is still needed). The acceptance region will get smaller as n,m increase, but we can still express its bounds in terms of ##\alpha,n,m,\bar x_1,\bar x_2,s_1^2,s_2^2##.
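
As a rough illustration of the bounds described above, here is a minimal Python sketch. The function name `two_sample_z_test` and the sample data are hypothetical, and the normal quantile is used in place of ##t_\alpha##, which assumes ##n,m## are large enough for the CLT approximation to be reasonable.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def two_sample_z_test(xs, ys, alpha):
    """Large-sample two-sided test of H0: mu_x = mu_y (illustrative helper)."""
    n, m = len(xs), len(ys)
    # Estimated standard error of the difference of the two sample means.
    se = sqrt(variance(xs) / n + variance(ys) / m)
    z = (mean(xs) - mean(ys)) / se
    # Upper alpha/2 point of the standard normal, used as the critical value.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Acceptance region for the observed difference of sample means under H0.
    accept = (-z_crit * se, z_crit * se)
    return z, z_crit, accept, abs(z) > z_crit


# Made-up data: two samples of size 100 whose means differ by 5.
xs = [float(i % 10) for i in range(100)]
ys = [x + 5 for x in xs]
z, z_crit, accept, reject = two_sample_z_test(xs, ys, alpha=0.05)
print(z, z_crit, accept, reject)
```

The acceptance interval has half-width ##z_{\alpha/2}\sqrt{s_1^2/n+s_2^2/m}##, so it shrinks as ##n,m## grow but still depends on ##\alpha##.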
 
  • #3
andrewkirk said:
For large enough n,m we can use the Central Limit Theorem (CLT) to validate an assumption that the statistic whose formula you give above has an approximately normal distribution, which then enables the use of the formula to find the CI. That does not mean letting n and m go to infinity. One can get good approximations to normality with sample sizes as small as ten if the underlying distribution is not too pathological. Just state that you are using CLT, which requires n and m be reasonably big.

Then you can write a formula for the bounds of the acceptance region in terms of ##t_\alpha## and the numerator and denominator of the score you wrote above (so yes, ##\alpha## is still needed). The acceptance region will get smaller as n,m increase, but we can still express its bounds in terms of ##\alpha,n,m,\bar x_1,\bar x_2,s_1^2,s_2^2##.
Thank you.

I have two questions:

(1) How does ##X,Y\in L^2## come into play? Why is it needed?

(2) Because of the CLT we can assume a normal distribution, and the rejection region of the normal distribution will be a good approximation of the rejection region in the problem. But how can the hypothesis be tested?

I should test ##H_0: \mu_x=\mu_y## against ##H_a: \mu_x\neq \mu_y##. Considering the normal distribution because of the CLT, I have to apply the z-test to test the hypothesis, and here ##\displaystyle z=\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}}##? I don't have the deviation needed for testing; should I replace it by the formula? Then ##\displaystyle z=\frac{\bar{X}-\mu_0}{\sum_{i=1}^n (x_i-\mu_y)^2}##. I'm not sure how to relate ##\alpha## and the last equation. Normally I would calculate the z-score numerically and use a z table for the ##\alpha##, but I can't do that in this problem.
 
  • #4
GabrielN00 said:
Thank you.

I have two questions:

(1) How does ##X,Y\in L^2## come into play? Why is it needed?

(2) Because of the CLT we can assume a normal distribution, and the rejection region of the normal distribution will be a good approximation of the rejection region in the problem. But how can the hypothesis be tested?

I should test ##H_0: \mu_x=\mu_y## against ##H_a: \mu_x\neq \mu_y##. Considering the normal distribution because of the CLT, I have to apply the z-test to test the hypothesis, and here ##\displaystyle z=\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}}##? I don't have the deviation needed for testing; should I replace it by the formula? Then ##\displaystyle z=\frac{\bar{X}-\mu_0}{\sum_{i=1}^n (x_i-\mu_y)^2}##. I'm not sure how to relate ##\alpha## and the last equation. Normally I would calculate the z-score numerically and use a z table for the ##\alpha##, but I can't do that in this problem.
The condition ##X \in L^2## implies that ##X## is "square integrable", so has ##E X^2 < \infty##. Without that, your random variables have infinite variance, and may not obey the Central Limit Theorem at all (as is shown by some examples).
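
As a sketch of why the second moment matters, the two statements below spell out what the condition buys. The symbols ##\sigma_x^2## and ##\bar X_n## are notation introduced here for illustration, not taken from the problem statement.

```latex
% A finite second moment gives a finite mean and a finite variance:
\sigma_x^2 = \operatorname{Var}(X) = E[X^2] - (E[X])^2 < \infty .
% Classical CLT for the sample mean \bar X_n = \frac{1}{n}\sum_{i=1}^n X_i
% (assuming \sigma_x^2 > 0):
\frac{\sqrt{n}\,(\bar X_n - \mu_x)}{\sigma_x} \xrightarrow{d} N(0,1)
\quad \text{as } n \to \infty .
```

The standard Cauchy distribution is a familiar counterexample: it has no finite mean or variance, and the mean of ##n## independent standard Cauchy variables is again standard Cauchy for every ##n##, so it never becomes approximately normal.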
 
  • #5
GabrielN00 said:
(2) Because of the CLT we can assume a normal distribution, and the rejection region of the normal distribution will be a good approximation of the rejection region in the problem. But how can the hypothesis be tested?

I should test ##H_0: \mu_x=\mu_y## against ##H_a: \mu_x\neq \mu_y##. Considering the normal distribution because of the CLT, I have to apply the z-test to test the hypothesis, and here ##\displaystyle z=\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}}##? I don't have the deviation needed for testing; should I replace it by the formula? Then ##\displaystyle z=\frac{\bar{X}-\mu_0}{\sum_{i=1}^n (x_i-\mu_y)^2}##.
Yes, use the formula as the z-score. The t distribution is very close to the standard normal once the degrees of freedom are above about 15 - which they will be if we've chosen n and m large enough to invoke CLT - so the fact that it's technically a t-statistic can be ignored.

Find the acceptance region limits from ##\alpha## in the same way you would for a Z test.
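
To get a feel for how close the two distributions are, the sketch below compares the 0.975 quantiles of the t and standard normal distributions using only the Python standard library. The helper names `t_pdf`, `t_cdf` and `t_quantile` are hypothetical, and the t quantile is obtained numerically (Simpson's rule plus bisection) rather than from a statistics package.

```python
from math import gamma, pi, sqrt
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0: Simpson's rule on [0, x] plus 0.5 by symmetry."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Quantile for p > 0.5, found by bisection on the numerical CDF."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


z = NormalDist().inv_cdf(0.975)
for df in (15, 30, 100):
    print(df, round(t_quantile(0.975, df), 3), round(z, 3))
```

At 15 degrees of freedom the t critical value is about 2.13 against about 1.96 for the normal, and the gap narrows as the degrees of freedom grow, so how much it matters depends on the accuracy required.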
 
  • #6
andrewkirk said:
Yes, use the formula as the z-score. The t distribution is very close to the standard normal once the degrees of freedom are above about 15 - which they will be if we've chosen n and m large enough to invoke CLT - so the fact that it's technically a t-statistic can be ignored.

Find the acceptance region limits from ##\alpha## in the same way you would for a Z test.

Technically, if either the ##X_i## or ##Y_j## are non-normal, the ratio ##(\bar{X} - \mu)/(s / \sqrt{n} )## is not "t" or anything convenient/familiar. However, for "very large" ##n## it is "close to" normal. Many researchers have studied what "very large" and "close to" actually mean for various classes of non-normal random variables.
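
One informal way to probe what "very large" and "close to" mean is to simulate. The sketch below is only an illustration under assumed settings (exponential data, ##n=m=200##, a fixed seed), and the function names are hypothetical. It estimates how often the large-sample test rejects a true ##H_0## at nominal level ##\alpha##.

```python
import random
from math import sqrt
from statistics import NormalDist


def mean_and_var(values):
    """Sample mean and unbiased sample variance."""
    k = len(values)
    mu = sum(values) / k
    return mu, sum((v - mu) ** 2 for v in values) / (k - 1)


def rejection_rate(n, m, alpha, reps, seed=0):
    """Fraction of simulated data sets (H0 true) in which the z-test rejects."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        # Exponential data with mean 1: skewed and clearly non-normal.
        xs = [rng.expovariate(1.0) for _ in range(n)]
        ys = [rng.expovariate(1.0) for _ in range(m)]
        mx, vx = mean_and_var(xs)
        my, vy = mean_and_var(ys)
        if abs(mx - my) / sqrt(vx / n + vy / m) > z_crit:
            rejections += 1
    return rejections / reps


print(rejection_rate(n=200, m=200, alpha=0.05, reps=2000))
```

If the normal approximation is adequate, the printed rate should land near the nominal 0.05; rerunning with smaller ##n,m## or a heavier-tailed distribution shows how the approximation degrades.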
 
  • #7
Ray Vickson said:
Technically, if either the ##X_i## or ##Y_j## are non-normal, the ratio ##(\bar{X} - \mu)/(s / \sqrt{n} )## is not "t" or anything convenient/familiar. However, for "very large" ##n## it is "close to" normal. Many researchers have studied what "very large" and "close to" actually mean for various classes of non-normal random variables.
Don't you mean ##X,Y## being normal? ##X_i, Y_j## are just samples from ##X,Y## respectively.
 
  • #8
WWGD said:
Don't you mean ##X,Y## being normal? ##X_i, Y_j## are just samples from ##X,Y## respectively.

Same thing, in different words.
 
  • #9
andrewkirk said:
Yes, use the formula as the z-score. The t distribution is very close to the standard normal once the degrees of freedom are above about 15 - which they will be if we've chosen n and m large enough to invoke CLT - so the fact that it's technically a t-statistic can be ignored.

Find the acceptance region limits from ##\alpha## in the same way you would for a Z test.
But don't I need a value of ##\alpha## in order to do a Z test? If I had a value I would look it up in the z table after computing the z score. Here I have a generic ##\alpha \in (0,1)##.
 
  • #10
GabrielN00 said:
But don't I need a value of ##\alpha## in order to do a Z test? If I had a value I would look it up in the z table after computing the z score. Here I have a generic ##\alpha \in (0,1)##.
It is irrelevant whether you are told a value of ##\alpha##. Just do it all symbolically. For example, you can just say something like "let ##z_{\beta}## be the probability-##\beta## point of the standard normal distribution; that is, ##z_{\beta}## is the solution of the equation ##1-\Phi(z) = \beta##". (Here ##\Phi## is the CDF of the standard normal distribution.) Then you can express your answer in terms of ##z_{\beta}##, where ##\beta## is related in some way to your given ##\alpha##; I leave it up to you to figure out what ##\beta## value to use.
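
For concreteness, the probability-##\beta## point defined above can be evaluated numerically once a value of ##\beta## is chosen. A minimal sketch using Python's standard library follows; the function name `upper_point` is made up for illustration.

```python
from statistics import NormalDist


def upper_point(beta):
    """Return z_beta, the solution z of 1 - Phi(z) = beta."""
    return NormalDist().inv_cdf(1 - beta)


for beta in (0.10, 0.05, 0.025):
    print(beta, round(upper_point(beta), 3))
```

The symbolic answer does not need any of these numbers; they only show that ##z_\beta## is an ordinary function of ##\beta##.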
 
  • #11
Ray Vickson said:
It is irrelevant whether you are told a value of ##\alpha##. Just do it all symbolically. For example, you can just say something like "let ##z_{\beta}## be the probability-##\beta## point of the standard normal distribution; that is, ##z_{\beta}## is the solution of the equation ##1-\Phi(z) = \beta##". (Here ##\Phi## is the CDF of the standard normal distribution.) Then you can express your answer in terms of ##z_{\beta}##, where ##\beta## is related in some way to your given ##\alpha##; I leave it up to you to figure out what ##\beta## value to use.

I think it could be something like this: since I am testing ##H_0: \mu_x = \mu_y## against ##H_a: \mu_x\neq \mu_y## at level ##\alpha##, I am looking at a two-tailed rejection region given by ##(-\infty,-|z_\alpha|)\cup(|z_\alpha|,+\infty)## where ##z_\alpha## is the solution to ##\int_{-\infty}^{z_\alpha}e^{-t^2/2}\,dt=(\alpha/2)\sqrt{2\pi}##

But should I give this ##z_\alpha## in more explicit terms? I don't see how to involve the ##n,m, \bar{x}_1, \bar{y}_1##
 
  • #12
GabrielN00 said:
I think it could be something like this: since I am testing ##H_0: \mu_x = \mu_y## against ##H_a: \mu_x\neq \mu_y## at level ##\alpha##, I am looking at a two-tailed rejection region given by ##(-\infty,-|z_\alpha|)\cup(|z_\alpha|,+\infty)## where ##z_\alpha## is the solution to ##\int_{-\infty}^{z_\alpha}e^{-t^2/2}\,dt=(\alpha/2)\sqrt{2\pi}##

But should I give this ##z_\alpha## in more explicit terms? I don't see how to involve the ##n,m, \bar{x}_1, \bar{y}_1##
Yes, essentially, the result should be a function of ##\alpha##.
 
  • #13
WWGD said:
Yes, essentially, the result should be a function of ##\alpha##.
I don't think this is alright because it doesn't involve ##\mu_x## nor ##\mu_y##, but ##\Phi## is bijective when restricted to ##(-\infty, 0)##, so ##\Phi |_{(-\infty, 0)} (z_\alpha)=\alpha/2## means ##z_\alpha=\Phi |_{(-\infty, 0)}^{-1} (\alpha/2)##.

The rejection region would be ##(-\infty, -|\Phi |_{(-\infty, 0)}^{-1} (\alpha/2)|)\cup(|\Phi |_{(-\infty, 0)}^{-1} (\alpha/2)|,+\infty)##
 
  • #14
GabrielN00 said:
But don't I need a value of ##\alpha## in order to do a Z test? If I had a value I would look it up in the z table after computing the z score. Here I have a generic ##\alpha \in (0,1)##.
My interpretation of the problem statement is that you are to take ##\alpha## as given and write your answer as formulas in terms of ##\alpha## and the values of the data.
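
Putting the pieces of the thread together, one way to write such a formula is sketched below. It uses the notation ##\bar x_1, \bar x_2, s_1^2, s_2^2## from earlier in the thread and takes ##z_{\alpha/2}## to be the upper ##\alpha/2## point of the standard normal, which is an assumption about notation rather than part of the problem statement.

```latex
% Under H_0 the term (\mu_1 - \mu_2) in the numerator is zero.
Z = \frac{\bar x_1 - \bar x_2}{\sqrt{\dfrac{s_1^2}{n} + \dfrac{s_2^2}{m}}},
\qquad 1 - \Phi(z_{\alpha/2}) = \frac{\alpha}{2}.
% Approximate level-\alpha rejection region for large n, m:
\text{reject } H_0 \iff |Z| > z_{\alpha/2}
\iff |\bar x_1 - \bar x_2| > z_{\alpha/2}\sqrt{\frac{s_1^2}{n} + \frac{s_2^2}{m}} .
```

The unknown means do not appear in the region itself; they enter only through the hypothesis ##\mu_x = \mu_y## used to simplify the numerator.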
 
  • #15
GabrielN00 said:
I don't think this is alright because it doesn't involve ##\mu_x## nor ##\mu_y##, but ##\Phi## is bijective when restricted to ##(-\infty, 0)##, so ##\Phi |_{(-\infty, 0)} (z_\alpha)=\alpha/2## means ##z_\alpha=\Phi |_{(-\infty, 0)}^{-1} (\alpha/2)##.

The rejection region would be ##(-\infty, -|\Phi |_{(-\infty, 0)}^{-1} (\alpha/2)|)\cup(|\Phi |_{(-\infty, 0)}^{-1} (\alpha/2)|,+\infty)##
Your answer in #14 was perfect, although we usually use the notation ##z_{\alpha/2}## for the ##1-\alpha/2##-probability point, rather than your ##z_{\alpha}##. Your answer in #15 is too cluttered, and is not necessary, as you stated it already in a better, clearer, form in your previous post.
 
  • #16
Ray Vickson said:
Your answer in #14 was perfect, although we usually use the notation ##z_{\alpha/2}## for the ##1-\alpha/2##-probability point, rather than your ##z_{\alpha}##. Your answer in #15 is too cluttered, and is not necessary, as you stated it already in a better, clearer, form in your previous post.

Thank you, but don't you mean #11 and #13?

Also, shouldn't ##\mu_x, \mu_y## be in the final answer?
 
  • #17
GabrielN00 said:
Thank you, but don't you mean #11 and #13?

Also, shouldn't ##\mu_x, \mu_y## be in the final answer?
I'd say ##\alpha## itself will be a function of ##\mu_x -\mu_y##, so, in a sense, they are in the final answer.
 
  • #18
GabrielN00 said:
Thank you, but don't you mean #11 and #13?

Also, shouldn't ##\mu_x, \mu_y## be in the final answer?

Yes, #11 and #13.
 

1. What is L^2 and how does it relate to X and Y?

L^2 is the space of square-integrable random variables, that is, those with ##E X^2 < \infty##. It is a Hilbert space with inner product ##\langle X, Y \rangle = E[XY]##. In this thread, X and Y are the two random variables being sampled.

2. What does it mean for X and Y to be in L^2?

It means each has a finite second moment, ##E X^2 < \infty## and ##E Y^2 < \infty##, and therefore a finite mean and a finite variance. This is the condition the Central Limit Theorem needs for the sample means.

3. What is a critical region in relation to X and Y in L^2?

A critical (rejection) region is the set of values of the test statistic for which ##H_0## is rejected. It is a subset of the range of the statistic, not of L^2. In this thread it is the two-tailed set where the standardized difference of sample means is large in absolute value.

4. How is the critical region determined for X and Y in L^2?

For large n and m, the CLT makes the standardized difference of sample means approximately standard normal under ##H_0##. The critical region is then the set where its absolute value exceeds ##z_{\alpha/2}##, with ##1-\Phi(z_{\alpha/2}) = \alpha/2##.

5. Why is understanding the critical region important?

The critical region fixes the decision rule: ##H_0## is rejected exactly when the statistic falls in it. Its level ##\alpha## is the probability of rejecting a true ##H_0##, which holds only approximately here because the normal distribution is itself an approximation.
