##X,Y \in L^2##, what is the critical region

GabrielN00 · Oct 11, 2017

Homework Statement

##X_1,\dots,X_n## and ##Y_1,\dots,Y_m## are simple random samples of ##X,Y \in L^2##, with ##X,Y## independent. ##H_0:\mu_x=\mu_y## is tested against ##H_1: \mu_x\neq\mu_y## at level ##\alpha\in(0,1).## If ##n,m## are large enough, find an approximation to the rejection region.

Homework Equations

The Attempt at a Solution

No particular distribution is given for ##X,Y## in the problem. Maybe it should follow straight from the fact ##X,Y \in L^2##? It seems natural to think that if ##n,m## are large enough then the approximation of the critical region will be the whole region under the curve.

I considered that the T-score could be used: ##\displaystyle t=\frac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{\sqrt{\frac{s_1^2}{n}+\frac{s_2^2}{m}}}##

But now if ##n,m## are "large enough", which I understand as taking ##n,m \rightarrow \infty##, does the value of ##\alpha## still matter? It seems that regardless of ##\alpha## the critical region will be the whole area.

andrewkirk · Oct 11, 2017

For large enough n,m we can use the Central Limit Theorem (CLT) to validate an assumption that the statistic whose formula you give above has an approximately normal distribution, which then enables the use of the formula to find the CI. That does not mean letting n and m go to infinity. One can get good approximations to normality with sample sizes as small as ten if the underlying distribution is not too pathological. Just state that you are using CLT, which requires n and m be reasonably big.

Then you can write a formula for the bounds of the acceptance region in terms of ##t_\alpha## and the numerator and denominator of the score you wrote above (so yes, ##\alpha## is still needed). The acceptance region will get smaller as n,m increase, but we can still express its bounds in terms of ##\alpha,n,m,\bar x_1,\bar x_2,s_1^2,s_2^2##.
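As a rough illustration of the recipe above, here is a minimal Python sketch. It assumes the normal approximation is used for the critical value (so ##t_\alpha## is replaced by a standard normal quantile) and that ##H_0## sets ##\mu_1-\mu_2=0## in the numerator. The function name `two_sample_z_test` and the toy data are hypothetical and not from the thread.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def two_sample_z_test(xs, ys, alpha):
    """Large-sample test of H0: mu_x == mu_y against H1: mu_x != mu_y.

    Returns (z, z_crit, reject). Assumes both samples are large enough
    for the CLT approximation to be reasonable.
    """
    n, m = len(xs), len(ys)
    # Standard error of the difference of sample means, with the sample
    # variances standing in for the unknown population variances.
    se = sqrt(variance(xs) / n + variance(ys) / m)
    # Under H0 the (mu_x - mu_y) term in the numerator is zero.
    z = (mean(xs) - mean(ys)) / se
    # Two-sided critical value: the upper alpha/2 point of N(0, 1).
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z_crit, abs(z) > z_crit


# Toy data (made up for illustration): the second sample is shifted by 3.
xs = [i % 7 for i in range(70)]
ys = [3 + (i % 7) for i in range(70)]
z, z_crit, reject = two_sample_z_test(xs, ys, alpha=0.05)
print(z, z_crit, reject)
```

The value of ##\alpha## only enters through `z_crit`, while ##n,m## and the data enter through the statistic `z`.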

GabrielN00 · Oct 12, 2017

andrewkirk said:

For large enough n,m we can use the Central Limit Theorem (CLT) to validate an assumption that the statistic whose formula you give above has an approximately normal distribution, which then enables the use of the formula to find the CI. That does not mean letting n and m go to infinity. One can get good approximations to normality with sample sizes as small as ten if the underlying distribution is not too pathological. Just state that you are using CLT, which requires n and m be reasonably big.

Then you can write a formula for the bounds of the acceptance region in terms of ##t_\alpha## and the numerator and denominator of the score you wrote above (so yes, ##\alpha## is still needed). The acceptance region will get smaller as n,m increase, but we can still express its bounds in terms of ##\alpha,n,m,\bar x_1,\bar x_2,s_1^2,s_2^2##.

Thank you.

I have two questions:

(1) How does ##X,Y\in L^2## come into play? Why is it needed?

(2) Because of the CLT we can assume a normal distribution, and the rejection region of the normal distribution will be a good approximation of the rejection region in the problem. But how can the hypothesis be tested?

I should test ##H_0: \mu_x=\mu_y## against ##H_a: \mu_x\neq \mu_y##. Since the CLT lets me treat the statistic as normal, I have to apply the z-test, and here ##\displaystyle z=\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}}##? I don't have the standard deviation needed for the test; should I replace it by the formula? Then ##\displaystyle z=\frac{\bar{X}-\mu_0}{\sum_{i=1}^n (x_i-\mu_y)^2}##. I'm not sure how to relate ##\alpha## to the last equation. Normally I would calculate the z-score numerically and use a z table for the ##\alpha##, but I can't do that in this problem.

Ray Vickson · Oct 12, 2017

GabrielN00 said:

Thank you.

I have two questions:

(1) How does ##X,Y\in L^2## come into play? Why is it needed?

(2) Because of the CLT we can assume a normal distribution, and the rejection region of the normal distribution will be a good approximation of the rejection region in the problem. But how can the hypothesis be tested?

I should test ##H_0: \mu_x=\mu_y## against ##H_a: \mu_x\neq \mu_y##. Since the CLT lets me treat the statistic as normal, I have to apply the z-test, and here ##\displaystyle z=\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}}##? I don't have the standard deviation needed for the test; should I replace it by the formula? Then ##\displaystyle z=\frac{\bar{X}-\mu_0}{\sum_{i=1}^n (x_i-\mu_y)^2}##. I'm not sure how to relate ##\alpha## to the last equation. Normally I would calculate the z-score numerically and use a z table for the ##\alpha##, but I can't do that in this problem.

The condition ##X \in L^2## implies that ##X## is "square integrable", so has ##E X^2 < \infty##. Without that, your random variables have infinite variance, and may not obey the Central Limit Theorem at all (as is shown by some examples).
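To make the role of the finite second moment concrete, here is a small simulation sketch using only the Python standard library. It compares the spread of sample means for a standard normal variable (which is in ##L^2##) with that of a standard Cauchy variable (which is not, and is one of the usual counterexamples). The helper names, seed, sample sizes, and replication count are arbitrary choices made for illustration.

```python
import math
import random
from statistics import quantiles


def draw_normal(rng):
    return rng.gauss(0.0, 1.0)


def draw_cauchy(rng):
    # Inverse-CDF sampling of a standard Cauchy variable (infinite variance).
    return math.tan(math.pi * (rng.random() - 0.5))


def iqr_of_sample_means(draw, n, reps, rng):
    """Interquartile range of `reps` sample means, each taken over n draws."""
    means = [sum(draw(rng) for _ in range(n)) / n for _ in range(reps)]
    q1, _, q3 = quantiles(means, n=4)
    return q3 - q1


rng = random.Random(0)  # fixed seed so the run is reproducible
spread = {}
for name, draw in (("normal", draw_normal), ("cauchy", draw_cauchy)):
    for n in (25, 400):
        spread[name, n] = iqr_of_sample_means(draw, n, reps=400, rng=rng)
        print(name, n, round(spread[name, n], 3))
```

For the normal draws the spread of the sample mean shrinks roughly like ##1/\sqrt{n}##, while for the Cauchy draws it stays about the same no matter how large ##n## is, so no normal approximation for the mean becomes available.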

andrewkirk · Oct 12, 2017

GabrielN00 said:

(2) Because of the CLT we can assume a normal distribution, and the rejection region of the normal distribution will be a good approximation of the rejection region in the problem. But how can the hypothesis be tested?

I should test ##H_0: \mu_x=\mu_y## against ##H_a: \mu_x\neq \mu_y##. Since the CLT lets me treat the statistic as normal, I have to apply the z-test, and here ##\displaystyle z=\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}}##? I don't have the standard deviation needed for the test; should I replace it by the formula? Then ##\displaystyle z=\frac{\bar{X}-\mu_0}{\sum_{i=1}^n (x_i-\mu_y)^2}##.

Yes, use the formula as the z-score. The t distribution is very close to the standard normal once the degrees of freedom are above about 15 - which they will be if we've chosen n and m large enough to invoke CLT - so the fact that it's technically a t-statistic can be ignored.

Find the acceptance region limits from ##\alpha## in the same way you would for a Z test.

Ray Vickson · Oct 12, 2017

andrewkirk said:

Yes, use the formula as the z-score. The t distribution is very close to the standard normal once the degrees of freedom are above about 15 - which they will be if we've chosen n and m large enough to invoke CLT - so the fact that it's technically a t-statistic can be ignored.

Find the acceptance region limits from ##\alpha## in the same way you would for a Z test.

Technically, if either the ##X_i## or ##Y_j## are non-normal, the ratio ##(\bar{X} - \mu)/(s / \sqrt{n} )## is not "t" or anything convenient/familiar. However, for "very large" ##n## it is "close to" normal. Many researchers have studied what "very large" and "close to" actually mean for various classes of non-normal random variables.
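One way to get a feel for what "close to" means in a particular case is to simulate the test under ##H_0## and count how often it rejects. The sketch below does this for exponential data, which is an arbitrary choice of non-normal distribution made here for illustration; the function names, seed, sample sizes, and number of replications are likewise assumptions and not part of the problem.

```python
import random
from math import sqrt
from statistics import NormalDist


def sample_mean_var(data):
    """Sample mean and unbiased sample variance."""
    n = len(data)
    mu = sum(data) / n
    var = sum((v - mu) ** 2 for v in data) / (n - 1)
    return mu, var


def rejection_rate(n, m, alpha, reps, rng):
    """Fraction of simulated data sets (H0 true) that the z test rejects."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        # Both samples come from Exp(1), so mu_x == mu_y holds.
        xs = [rng.expovariate(1.0) for _ in range(n)]
        ys = [rng.expovariate(1.0) for _ in range(m)]
        mx, vx = sample_mean_var(xs)
        my, vy = sample_mean_var(ys)
        z = (mx - my) / sqrt(vx / n + vy / m)
        rejections += abs(z) > z_crit
    return rejections / reps


rate = rejection_rate(n=100, m=100, alpha=0.05, reps=2000, rng=random.Random(1))
print(rate)
```

If the normal approximation is adequate for these sample sizes, the printed rate should land near the nominal ##\alpha = 0.05##; a rate far from it would suggest that ##n,m## are not yet "very large" for that distribution.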

WWGD · Oct 13, 2017

Ray Vickson said:

Technically, if either the ##X_i## or ##Y_j## are non-normal, the ratio ##(\bar{X} - \mu)/(s / \sqrt{n} )## is not "t" or anything convenient/familiar. However, for "very large" ##n## it is "close to" normal. Many researchers have studied what "very large" and "close to" actually mean for various classes of non-normal random variables.

Don't you mean ##X,Y## being normal ? ##X_i, Y_j## are just samples from ##X,Y## respectively.

Ray Vickson · Oct 13, 2017

WWGD said:

Don't you mean ##X,Y## being normal ? ##X_i, Y_j## are just samples from ##X,Y## respectively.

Same thing, in different words.

GabrielN00 · Oct 16, 2017

andrewkirk said:

Yes, use the formula as the z-score. The t distribution is very close to the standard normal once the degrees of freedom are above about 15 - which they will be if we've chosen n and m large enough to invoke CLT - so the fact that it's technically a t-statistic can be ignored.

Find the acceptance region limits from ##\alpha## in the same way you would for a Z test.

But don't I need a value of ##\alpha## in order to do a Z test? If I had a value I would look it up in the z table after computing the z score. Here I have a generic ##\alpha \in (0,1)##.

Ray Vickson · Oct 16, 2017

GabrielN00 said:

But don't I need a value of ##\alpha## in order to do a Z test? If I had a value I would look it up in the z table after computing the z score. Here I have a generic ##\alpha \in (0,1)##.

It is irrelevant whether you are told a value of ##\alpha##. Just do it all symbolically. For example, you can just say something like "let ##z_{\beta}## be the probability-##\beta## point of the standard normal distribution; that is, ##z_{\beta}## is the solution of the equation ##1-\Phi(z) = \beta##". (Here ##\Phi## is the CDF of the standard normal distribution.) Then you can express your answer in terms of ##z_{\beta}##, where ##\beta## is related in some way to your given ##\alpha##; I leave it up to you to figure out what ##\beta## value to use.
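For readers who want to see the "probability-##\beta## point" as something computable, here is a small sketch using `statistics.NormalDist` from the Python standard library. The function name `upper_point` is made up for this example, and the value of ##\beta## passed in is just a sample input, not the answer to the question left open above.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def upper_point(beta):
    """Return z_beta, the solution of 1 - Phi(z) = beta, for 0 < beta < 1."""
    return STD_NORMAL.inv_cdf(1 - beta)


z = upper_point(0.025)
print(z)                       # roughly 1.96
print(1 - STD_NORMAL.cdf(z))   # recovers beta up to rounding
```

The symbolic answer simply leaves `upper_point(beta)` unevaluated as ##z_\beta##.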

GabrielN00 · Oct 16, 2017

Ray Vickson said:

It is irrelevant whether you are told a value of ##\alpha##. Just do it all symbolically. For example, you can just say something like "let ##z_{\beta}## be the probability-##\beta## point of the standard normal distribution; that is, ##z_{\beta}## is the solution of the equation ##1-\Phi(z) = \beta##". (Here ##\Phi## is the CDF of the standard normal distribution.) Then you can express your answer in terms of ##z_{\beta}##, where ##\beta## is related in some way to your given ##\alpha##; I leave it up to you to figure out what ##\beta## value to use.

I think it could be something like this: since I am testing ##H_0: \mu_x = \mu_y## against ##H_a: \mu_x\neq \mu_y## at significance level ##\alpha##, I am looking at a two-tailed rejection region given by ##(-\infty,-|z_\alpha|)\cup(|z_\alpha|,+\infty)##, where ##z_\alpha## is the solution to ##\int_{-\infty}^{z_\alpha}e^{-t^2/2}dt=(\alpha/2)\sqrt{2\pi}##.

But should I give this ##z_\alpha## in more explicit terms? I don't see how to involve the ##n,m, \bar{x_1}, \bar{y_1}##.

WWGD · Oct 16, 2017

GabrielN00 said:

I think it could be something like this: since I am testing ##H_0: \mu_x = \mu_y## against ##H_a: \mu_x\neq \mu_y## at significance level ##\alpha##, I am looking at a two-tailed rejection region given by ##(-\infty,-|z_\alpha|)\cup(|z_\alpha|,+\infty)##, where ##z_\alpha## is the solution to ##\int_{-\infty}^{z_\alpha}e^{-t^2/2}dt=(\alpha/2)\sqrt{2\pi}##.

But should I give this ##z_\alpha## in more explicit terms? I don't see how to involve the ##n,m, \bar{x_1}, \bar{y_1}##.

Yes, essentially, result should be a function of ##\alpha ##.

GabrielN00 · Oct 16, 2017

WWGD said:

Yes, essentially, result should be a function of ##\alpha ##.

I don't think this is right because it doesn't involve ##\mu_x## or ##\mu_y##, but ##\Phi## is bijective when restricted to ##(-\infty, 0)##, so ##\Phi |_{(-\infty, 0)} (z_\alpha)=\alpha/2## means ##z_\alpha=\Phi |_{(-\infty, 0)}^{-1} (\alpha/2)##.

The rejection region would be ##(-\infty, -|\Phi |_{(-\infty, 0)}^{-1} (\alpha/2)|)\cup(|\Phi |_{(-\infty, 0)}^{-1} (\alpha/2)|,+\infty)##

andrewkirk · Oct 16, 2017

GabrielN00 said:

But don't I need a value of ##\alpha## in order to do a Z test? If I had a value I would look it up in the z table after computing the z score. Here I have a generic ##\alpha \in (0,1)##.

My interpretation of the problem statement is that you are to take ##\alpha## as given and write your answer as formulas in terms of ##\alpha## and the values of the data.

Ray Vickson · Oct 17, 2017

GabrielN00 said:

I don't think this is right because it doesn't involve ##\mu_x## or ##\mu_y##, but ##\Phi## is bijective when restricted to ##(-\infty, 0)##, so ##\Phi |_{(-\infty, 0)} (z_\alpha)=\alpha/2## means ##z_\alpha=\Phi |_{(-\infty, 0)}^{-1} (\alpha/2)##.

The rejection region would be ##(-\infty, -|\Phi |_{(-\infty, 0)}^{-1} (\alpha/2)|)\cup(|\Phi |_{(-\infty, 0)}^{-1} (\alpha/2)|,+\infty)##

Your answer in #14 was perfect, although we usually use the notation ##z_{\alpha/2}## for the ##1-\alpha/2##-probability point, rather than your ##z_{\alpha}##. Your answer in #15 is too cluttered, and is not necessary, as you stated it already in a better, clearer, form in your previous post.

GabrielN00 · Oct 17, 2017

Ray Vickson said:

Your answer in #14 was perfect, although we usually use the notation ##z_{\alpha/2}## for the ##1-\alpha/2##-probability point, rather than your ##z_{\alpha}##. Your answer in #15 is too cluttered, and is not necessary, as you stated it already in a better, clearer, form in your previous post.

Thank you, but don't you mean #11 and #13?

Also, shouldn't ##\mu_x, \mu_y## be in the final answer?

WWGD · Oct 17, 2017

GabrielN00 said:

Thank you, but don't you mean #11 and #13?

Also, shouldn't ##\mu_x, \mu_y## be in the final answer?

I'd say ##\alpha## itself will be a function of ##\mu_x -\mu_y##, so, in a sense, they are in the final answer.

Ray Vickson · Oct 17, 2017

GabrielN00 said:

Thank you, but don't you mean #11 and #13?

Also, shouldn't ##\mu_x, \mu_y## be in the final answer?

Yes, #11 and #13.
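For reference, one way to collect the pieces of the thread into a single statement is sketched below. It uses the symbols from the first reply (##\bar x_1,\bar x_2,s_1^2,s_2^2,n,m##) and the ##z_{\alpha/2}## convention mentioned above, and it assumes the normal approximation from the CLT; treat it as a summary sketch, not as the official solution to the exercise.

```latex
% Large-sample rejection region at level \alpha (normal approximation).
% z_{\alpha/2} is the upper \alpha/2 point: 1 - \Phi(z_{\alpha/2}) = \alpha/2.
\[
  \text{Reject } H_0
  \iff
  \left|
    \frac{\bar{x}_1 - \bar{x}_2}
         {\sqrt{\dfrac{s_1^2}{n} + \dfrac{s_2^2}{m}}}
  \right| > z_{\alpha/2}
\]
% Equivalent form, stated directly for the difference of sample means:
\[
  \bar{x}_1 - \bar{x}_2 \notin
  \left[
    -z_{\alpha/2}\sqrt{\frac{s_1^2}{n} + \frac{s_2^2}{m}},\;
    +z_{\alpha/2}\sqrt{\frac{s_1^2}{n} + \frac{s_2^2}{m}}
  \right]
\]
```

In this form ##n,m## and the data enter through the test statistic, ##\alpha## enters only through ##z_{\alpha/2}##, and ##\mu_x,\mu_y## do not appear because under ##H_0## the term ##\mu_1-\mu_2## in the numerator of the original score is zero.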
