Information contained in minimum value of truncated distribution

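In summary: The question can be framed in terms of sufficient statistics. The thread does not establish that the minimum observed value t is a good (for example, maximum likelihood) estimator of the cutoff threshold a. On its own, t cannot identify a, because shifting μK and a by the same amount leaves the distribution of t unchanged.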
  • #1
estebanox
Suppose that a given population is endowed with a pair of characteristics T and K. Let's think of these characteristics as random variables

(T,K)∼BiNormal((μT,μS),(σT,σS),ρ)

I observe the realisations of T for a sample consisting of those individuals with K<a, where the selection threshold a is unknown. Let t denote the minimum observed realisation of T in this sample.

In terms of the distributions and parameters above, what is t an estimator of?

To be more precise, I am trying to establish what information is contained in t that is not already contained in the truncated sample mean and variance. My intuition is that there must be some information: if selection was taking place on T itself, then it would seem intuitive to think of t as an estimator of a; but that's not the case here...
 
  • #2
What exactly is that BiNormal?

If T and K are independent, then selecting for K<a does not tell you anything, it just reduces the sample size.
 
  • #3
The notation I used in the OP stands for bivariate normal with correlation coefficient ρ. So I'm asking about the general case in which K may have information about T (not the orthogonal case, which as you mention is uninteresting).
 
  • #4
@mfb: I just realized your confusion might be due to the fact that I denoted the mean and standard deviation of the marginal distribution of K by μ_S and σ_S. That's a typo. The obvious notation would read (T,K)∼BiNormal((μT,μK),(σT,σK),ρ) for -1<ρ<1
 
  • #5
Okay, so selecting K<a depletes your normal distribution of T in some one-sided way.

If the cut is weak (e.g. a>μK+2*σK), then the mean and variance are not influenced much, but t can hold more information, especially with a strong correlation between the two variables. You will still need the mean and variance (observed or expected) to relate this to K; otherwise you are completely insensitive to shifts/rescalings.
If the cut is strong and correlation is weak, then I would expect that the overall shape gives you more information. The sample mean and variance alone don't help unless you know the mean and variance without cut.
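
The weak-cut case can be checked with a small simulation. The following is only a sketch under assumed values, none of which come from the thread: standard normal marginals, ρ = -0.9, a = μK + 2σK, 20000 draws, and a fixed seed. The correlation is taken negative so that the cut K<a removes the smallest values of T; with a positive correlation the cut depletes the upper tail of T and the maximum would be the affected extreme. The helper names `draw_pairs`, `summarize` and `compare_cut` are made up for this illustration.

```python
import math
import random
import statistics


def draw_pairs(n, mu_t, mu_k, sigma_t, sigma_k, rho, rng):
    """Draw n (T, K) pairs from a bivariate normal with correlation rho."""
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        t = mu_t + sigma_t * z1
        # K shares the component z1 with T, which gives corr(T, K) = rho.
        k = mu_k + sigma_k * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        pairs.append((t, k))
    return pairs


def summarize(ts):
    """Return (sample mean, sample variance, sample minimum)."""
    return statistics.fmean(ts), statistics.variance(ts), min(ts)


def compare_cut(n=20000, rho=-0.9, seed=0):
    """Summaries of T without any cut and with the weak cut K < a."""
    rng = random.Random(seed)
    mu_t, mu_k, sigma_t, sigma_k = 0.0, 0.0, 1.0, 1.0  # assumed values
    a = mu_k + 2.0 * sigma_k                            # weak cut
    pairs = draw_pairs(n, mu_t, mu_k, sigma_t, sigma_k, rho, rng)
    all_t = [t for t, _ in pairs]
    kept_t = [t for t, k in pairs if k < a]
    return summarize(all_t), summarize(kept_t)


full, cut = compare_cut()
print("no cut (mean, var, min):", full)
print("K < a  (mean, var, min):", cut)
```

Under these assumptions the mean and variance should move only slightly, while the minimum of T can move by a much larger amount, because the few discarded observations are concentrated in the lower tail of T.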
 
  • #6
Thanks. This makes sense: if I understand correctly, your intuition is that t is informative about shift/rescaling (and how much information will be a function of the value of t and ρ). Do you know how to derive this more precisely? In other words: is there an analytical expression for it in terms of the parameters (i.e. (μT,μK),(σT,σK),ρ)?
 
  • #7
estebanox said:
your intuition is that t is informative about shift/rescaling
t alone is not.
(T,K)∼BiNormal((μT,μK),(σT,σK),ρ) and (T,K)∼BiNormal((μT,μK+c),(σT,σK),ρ) lead to the same distribution of t, but the best estimate of a has to be shifted by c. With a similar but a bit more complicated formula you can show that you can also change σK and a without changing anything related to t.
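
One way to see both invariances is to write down the density of T in the selected population. This is a sketch, assuming the selected observations are independent draws and that exactly n of them are observed; the symbols z, α, n and F are shorthand introduced here and are not part of the original posts.

```latex
% z: standardized value of T; \alpha: standardized cutoff on K
z = \frac{t-\mu_T}{\sigma_T}, \qquad \alpha = \frac{a-\mu_K}{\sigma_K}

% K given T = t is normal, so the probability of being selected is
P(K<a \mid T=t) = \Phi\!\left(\frac{\alpha-\rho z}{\sqrt{1-\rho^2}}\right)

% density of T in the selected population (Bayes' rule, with P(K<a) = \Phi(\alpha))
f(t \mid K<a) = \frac{1}{\sigma_T}\,\varphi(z)\,
  \frac{\Phi\!\left(\dfrac{\alpha-\rho z}{\sqrt{1-\rho^2}}\right)}{\Phi(\alpha)}

% minimum of n independent draws, with F the CDF belonging to f
P(t_{\min}\le x) = 1-\bigl(1-F(x \mid K<a)\bigr)^{n}
```

The parameters μK, σK and a enter only through α. Replacing μK by μK+c and a by a+c leaves α unchanged, and so does replacing σK by sσK together with a by μK+s(a−μK) for any s>0, so neither change affects the distribution of t.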

I would be surprised if there is an analytic expression for the distribution of t as function of the other parameters (if we apply the cut on K>a).
 
  • #8
That's quite clear and helpful. Thanks!
 
  • #9
mfb said:
I would be surprised if there is an analytic expression for the distribution of t as function of the other parameters (if we apply the cut on K>a).

Your answer makes me wonder if applying the cut on both K and T would change things. For instance, for some other cut b, can we know what E[Min[ T | T>t , K<b]] is?
 
  • #10
If you know the parameters of the distribution, and b, you can calculate this numerically. Same as above, I don't expect an exact analytic expression. For some parameters there can be a good analytic approximation.
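
As an illustration of what "calculate this numerically" can look like, here is a sketch for a simpler quantity than the one asked about in #9: the expected minimum of n independent draws of T from the population selected by K<a. It assumes -1<ρ<1, uses the standard density of T given K<a for a bivariate normal, and integrates with a plain trapezoid rule. The function names, the grid width and the step count are arbitrary choices made for this example.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def selected_density(t, mu_t, sigma_t, rho, alpha):
    """Density of T given K < a, with alpha = (a - mu_k) / sigma_k."""
    z = (t - mu_t) / sigma_t
    weight = Phi((alpha - rho * z) / math.sqrt(1.0 - rho ** 2))
    return phi(z) / sigma_t * weight / Phi(alpha)


def expected_min(n, mu_t, mu_k, sigma_t, sigma_k, rho, a,
                 width=10.0, steps=20000):
    """E[min of n draws of T | K < a], by trapezoid integration.

    Uses E[min] = lo + integral over [lo, hi] of (1 - F(x))**n dx,
    which holds when essentially all probability mass lies in [lo, hi].
    """
    alpha = (a - mu_k) / sigma_k
    lo = mu_t - width * sigma_t
    hi = mu_t + width * sigma_t
    h = (hi - lo) / steps
    cdf = 0.0
    prev_f = selected_density(lo, mu_t, sigma_t, rho, alpha)
    prev_s = 1.0  # (1 - F(lo))**n, with F(lo) taken as 0
    total = 0.0
    for i in range(1, steps + 1):
        x = lo + i * h
        f = selected_density(x, mu_t, sigma_t, rho, alpha)
        cdf = min(cdf + 0.5 * (prev_f + f) * h, 1.0)  # running CDF
        s = (1.0 - cdf) ** n
        total += 0.5 * (prev_s + s) * h
        prev_f, prev_s = f, s
    return lo + total


# Example call with made-up parameter values.
print(expected_min(n=50, mu_t=0.0, mu_k=0.0, sigma_t=1.0, sigma_k=1.0,
                   rho=0.7, a=1.0))
```

For n=1 the result reduces to the mean of T given K<a, which has the closed form μT − ρσT·φ(α)/Φ(α) and can serve as a sanity check. An additional cut on T, as in #9, would require changing the density and its normalization accordingly.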
 
  • #11
estebanox said:
I observe the realisations of T for a sample consisting of those individuals with K<a, where the selection threshold a is unknown. Let t denote the minimum observed realisation of T in this sample.

In terms of the distributions and parameters above, what is t an estimator of?

I think the question should be rephrased. For each parameter of a distribution, any function of the sample values can be called an estimator of that parameter. We pay more attention to functions that are "good" estimators of the parameter. There are various ways to define what "good" means (e.g. unbiased, minimum variance, maximum likelihood).

A question about estimators could be made specific by asking things like

1. Is the minimum value of S in the sample an unbiased estimator of the distribution parameter "a" that defines the cutoff threshold for S ?

or

2. Is there a simple-looking function f of the distribution parameters for which the minimum value of S in the sample is a maximum likelihood estimator for f evaluated at the particular parameter values of the distribution being sampled ?

The more general question of "is there information" could be phrased in terms of a question about "sufficient statistics", which I think mfb is effectively doing.
 

1. What is a truncated distribution?

A truncated distribution is a statistical probability distribution that has been limited or truncated to a certain range of values. This means that any values outside of the specified range have been removed from the distribution.

2. How is the minimum value of a truncated distribution determined?

The minimum possible value of a truncated distribution is the lower truncation bound, when the truncation imposes one. The minimum observed in a sample is a random quantity that lies at or above that bound.

3. What information does the minimum value of a truncated distribution contain?

When the truncation is applied directly to the observed variable, the sample minimum carries information about the lower limit of the distribution. When the selection is on a different, correlated variable, as in the thread above, the sample minimum alone does not determine the cutoff and has to be combined with other information such as the mean and variance.

4. How is the minimum value of a truncated distribution useful?

The minimum value of a truncated distribution is useful in understanding the shape and characteristics of the distribution. It can also be used in hypothesis testing and determining the probability of certain events occurring within the truncated range.

5. Can the minimum value of a truncated distribution change?

Yes, the minimum possible value of a truncated distribution changes if the lower truncation bound is altered. Lowering the bound lowers the minimum, and raising the bound raises it.
