# Factorization Theorem for Sufficient Statistics & Indicator Function

• kingwinner
In summary: the maximum of a random sample from a uniform distribution over (0, theta) is a sufficient statistic for theta.

#### kingwinner

Problem:
Let $$Y_1, Y_2, \ldots, Y_n$$ denote a random sample from the uniform distribution over the interval $$(0,\theta)$$. Show that $$Y_{(n)} = \max(Y_1, Y_2, \ldots, Y_n)$$ is a sufficient statistic for $$\theta$$ by the factorization theorem.

Solution:
http://www.geocities.com/asdfasdf23135/stat10.JPG
1) While I understand that $$I_A(x) I_B(x) = I_{A \cap B}(x)$$, I don't understand the equality circled in red above.

In the solutions, they say that $$I_{(0,\theta)}(y_1) \cdots I_{(0,\theta)}(y_n) = I_{(0,\theta)}(y_{(n)})$$. Is this really correct?
Shouldn't the right-hand side be $$I_{(0,\theta)}(y_{(n)}) \, I_{(0,\infty)}(y_{(1)})$$? I believe the second factor is necessary, because the fact that the largest observation is greater than zero does not guarantee that the smallest observation is greater than zero.
Which one is correct?

2) Also, is $$I_{(0,\theta)}(y_{(n)})$$ a function of $$y_{(n)}$$, a function of $$\theta$$, or a function of both $$y_{(n)}$$ and $$\theta$$?
If it is a function of both $$y_{(n)}$$ and $$\theta$$, then there is something that I don't understand. By the definition of the indicator function, $$I_A(x)$$ is a function of x alone (it is a function of only the stuff in the parentheses), so shouldn't $$I_{(0,\theta)}(y_{(n)})$$ be a function of $$y_{(n)}$$ alone?

Thank you for explaining! I've been confused about these ideas for at least a week.

The left side of

$$\prod_{i=1}^n I_{[0,\theta)} (y_i) = I_{[0,\theta)} (y_{(n)})$$

means that all the $$y_i$$ values are in the interval $$[0,\theta)$$.

This is true if, and only if, the maximum of the y's is in the same interval, and that is the meaning of the right-side.

The indicator $$I_{[0,\theta)} (y_{(n)})$$ is a function of $$y_{(n)}$$ only, since $$\theta$$ is fixed (it's a parameter).

I understand that
$$0 < X_1, \ldots, X_n < \theta$$ (here these are the unordered data)
is the same as (iff)
$$0 < X_{(1)} < X_{(2)} < \cdots < X_{(n)} < \theta$$

But I don't think
$$0 < X_{(1)} < X_{(2)} < \cdots < X_{(n)} < \theta$$
is EQUIVALENT to (iff)
$$0 < X_{(n)} < \theta$$
The => direction is true but <= is not. (The fact that the largest observation $$x_{(n)}$$ is greater than zero does not guarantee that the smallest observation $$x_{(1)}$$ is greater than zero.)

So that's why I think we should have
$$I_{(0,\theta)}(y_1) \cdots I_{(0,\theta)}(y_n) = I_{(0,\theta)}(y_{(n)}) \, I_{(0,\infty)}(y_{(1)})$$

Right?
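
A quick numerical check may help with this disagreement. It is only a sketch and not part of the posted solution; the helper names (`indicator`, `max_only`, `max_and_min`) and the sample values are made up for illustration. For arbitrary real numbers, the product of indicators agrees with the two-factor form and not with the max-only form. The two forms coincide whenever every observation is already positive, which holds with probability 1 for a sample drawn from Uniform(0, theta).

```python
from math import prod


def indicator(lo, hi, x):
    """Return 1 if lo < x < hi (open interval), else 0."""
    return 1 if lo < x < hi else 0


def product_of_indicators(ys, theta):
    """Left-hand side: product of I_(0,theta)(y_i) over the sample."""
    return prod(indicator(0, theta, y) for y in ys)


def max_only(ys, theta):
    """Right-hand side as written in the solution: I_(0,theta)(y_(n))."""
    return indicator(0, theta, max(ys))


def max_and_min(ys, theta):
    """Right-hand side with the extra factor: I_(0,theta)(y_(n)) * I_(0,inf)(y_(1))."""
    return indicator(0, theta, max(ys)) * indicator(0, float("inf"), min(ys))


theta = 1.0
valid_sample = [0.2, 0.7, 0.4]     # every value lies in (0, theta)
invalid_sample = [-0.3, 0.7, 0.4]  # one negative value, but the max is still in (0, theta)

for name, ys in [("valid", valid_sample), ("invalid", invalid_sample)]:
    print(name,
          product_of_indicators(ys, theta),
          max_only(ys, theta),
          max_and_min(ys, theta))
```

Because the extra factor $$I_{(0,\infty)}(y_{(1)})$$ does not involve $$\theta$$, it can be absorbed into the data-only part of the factorization, so $$Y_{(n)}$$ comes out sufficient under either way of writing the product.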

The indicator $$I_{[0,\theta)} (y_{(n)})$$ is a function of $$y_{(n)}$$ only, since $$\theta$$ is fixed (it's a parameter).

But we can also write it as $$I_{(y_{(n)},\infty)}(\theta)$$. In this case theta is in the parentheses, so would it be a function of theta alone? (In the general case, f(x) means a function of x and f(y) means a function of y, i.e. of the stuff in the parentheses.)

When we talk about functions, is it always only a function of the stuff in the parentheses? It looks like the restrictions/constraints are also important, so shouldn't it also be a function of the variables in the restrictions/constraints?

e.g.)
f(x)=x if x>y
f(x)=x^3 if x<y
Here not only the value of x but also the value of y controls f, so is f a function of BOTH x and y here?

Thanks!
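
One way to look at this question, sketched in code: the indicator can be treated as a function of two arguments, and fixing either argument leaves a function of the other one. The function name `indicator` and the numbers below are illustrative assumptions, not anything from the posted solution.

```python
from functools import partial


def indicator(y, theta):
    """I_(0,theta)(y): the value depends on BOTH y and theta."""
    return 1 if 0 < y < theta else 0


# Fix theta = 2.0 (a known parameter value): what remains is a function of y alone.
as_function_of_y = partial(indicator, theta=2.0)

# Fix y = 1.5 (say, an observed maximum): what remains is a function of theta alone.
# For y > 0 it has the same values as the indicator I_(y,inf)(theta).
as_function_of_theta = partial(indicator, 1.5)

print(as_function_of_y(1.0), as_function_of_y(3.0))
print(as_function_of_theta(2.0), as_function_of_theta(1.0))
```

Which argument counts as "the variable" is a matter of viewpoint. In the factorization theorem, the factor containing $$I_{(0,\theta)}(y_{(n)})$$ is read as a function of the pair $$(y_{(n)}, \theta)$$.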

## 1. What is the Factorization Theorem for Sufficient Statistics?

The Factorization Theorem for Sufficient Statistics states that a statistic $$T(Y_1,\ldots,Y_n)$$ is sufficient for a parameter $$\theta$$ if and only if the joint density (or probability mass function) of the data can be factored into a product of two functions: one that depends on $$\theta$$ and on the data only through the statistic, and one that depends on the data alone and not on $$\theta$$.
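
As a worked instance for the Uniform(0, theta) problem in this thread, the factorization can be sketched as follows. The labels $$g$$ (the factor involving $$\theta$$) and $$h$$ (the data-only factor) are a common textbook convention assumed here, not notation taken from the posted solution.

$$f(y_1,\ldots,y_n;\theta) = \prod_{i=1}^n \frac{1}{\theta} I_{(0,\theta)}(y_i) = \underbrace{\frac{1}{\theta^n} I_{(0,\theta)}(y_{(n)})}_{g(y_{(n)},\,\theta)} \cdot \underbrace{I_{(0,\infty)}(y_{(1)})}_{h(y_1,\ldots,y_n)}$$

The first factor involves the data only through $$y_{(n)}$$ and the second does not involve $$\theta$$, so $$Y_{(n)}$$ is sufficient for $$\theta$$.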

## 2. What is a sufficient statistic?

A sufficient statistic is a function of the data that contains all the information in the sample about a parameter of interest. Formally, the conditional distribution of the data given the statistic does not depend on the parameter, so once the statistic is known, the rest of the sample carries no further information about the parameter.

## 3. How is the factorization theorem related to the indicator function?

The indicator function takes the value 1 if a certain condition is met and 0 otherwise. In the factorization theorem it is used to write the support of the density as part of the formula. This matters when the support depends on the parameter, as with the uniform distribution over (0, theta): the constraint that every observation lies in the support becomes an explicit factor of the joint density, and that factor can then be assigned to the part that depends on the parameter or to the part that depends on the data alone.

## 4. What are some practical applications of the Factorization Theorem for Sufficient Statistics?

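The Factorization Theorem is a fundamental tool in statistics. Its main use is data reduction: it identifies a low-dimensional summary, such as the sample maximum for the uniform distribution over (0, theta), that can replace the full sample without losing information about the parameter. Sufficient statistics also underlie results in estimation and testing. For example, the Rao-Blackwell theorem shows that conditioning an unbiased estimator on a sufficient statistic never increases its variance.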
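
The data-reduction idea can be illustrated with a small sketch for the Uniform(0, theta) case: two samples of the same size that share the same maximum give the same likelihood at every value of theta. The function name `uniform_likelihood` and the sample values are assumptions made for this example.

```python
def uniform_likelihood(ys, theta):
    """Joint density of an i.i.d. Uniform(0, theta) sample, viewed as a function of theta."""
    if min(ys) <= 0 or max(ys) >= theta:
        return 0.0  # some observation falls outside (0, theta)
    return theta ** (-len(ys))


sample_a = [0.1, 0.5, 0.9]
sample_b = [0.9, 0.3, 0.7]  # different values, same size, same maximum
thetas = [0.5, 0.95, 1.0, 2.0, 5.0]

likelihood_a = [uniform_likelihood(sample_a, t) for t in thetas]
likelihood_b = [uniform_likelihood(sample_b, t) for t in thetas]

print(likelihood_a == likelihood_b)
```

Both samples here are positive, so the only way the data enter the likelihood is through the sample size and the maximum.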

## 5. How does the Factorization Theorem for Sufficient Statistics relate to other statistical concepts such as maximum likelihood estimation and the method of moments?

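The Factorization Theorem is closely related to maximum likelihood estimation. Because the likelihood factors into a part that involves the parameter only together with the sufficient statistic and a part that does not involve the parameter, maximizing the likelihood depends on the data only through the sufficient statistic, so a unique maximum likelihood estimator is a function of it. Method-of-moments estimators need not have this property. For the uniform distribution over (0, theta), the maximum likelihood estimator is the sample maximum, while the method-of-moments estimator is twice the sample mean, which is not a function of the maximum alone.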
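
A small simulation can make the comparison concrete for the Uniform(0, theta) case. This is a sketch under stated assumptions: the function name `simulate_mse`, the sample size, the number of replications, and the seed are all arbitrary choices for illustration. It uses the sample maximum as the maximum likelihood estimator and twice the sample mean as the method-of-moments estimator.

```python
import random


def simulate_mse(theta=1.0, n=10, reps=20000, seed=0):
    """Estimate the mean squared error of two estimators of theta by simulation."""
    rng = random.Random(seed)
    squared_error_mle = 0.0
    squared_error_mom = 0.0
    for _ in range(reps):
        ys = [rng.uniform(0, theta) for _ in range(n)]
        mle = max(ys)          # function of the sufficient statistic
        mom = 2 * sum(ys) / n  # twice the sample mean
        squared_error_mle += (mle - theta) ** 2
        squared_error_mom += (mom - theta) ** 2
    return squared_error_mle / reps, squared_error_mom / reps


mse_mle, mse_mom = simulate_mse()
print(f"MSE of max(Y): {mse_mle:.4f}   MSE of 2 * mean(Y): {mse_mom:.4f}")
```

In this setup the estimator based on the maximum should show the smaller mean squared error, which is consistent with the idea that an estimator ignoring the sufficient statistic leaves information unused.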