# Information contained in minimum value of truncated distribution

1. Apr 18, 2016

### estebanox

Suppose that a given population is endowed with a pair of characteristics T and K. Let's think of these characteristics as random variables

(T, K) ∼ BiNormal((μ_T, μ_S), (σ_T, σ_S), ρ)

I observe the realisations of T for a sample consisting of those individuals with K<a, where the selection threshold a is unknown. Let t denote the minimum observed realisation of T in this sample.

In terms of the distributions and parameters above, what is t an estimator of?

To be more precise, I am trying to establish what information is contained in t that is not already contained in the truncated sample mean and variance. My intuition is that there must be some information: if selection were taking place on T itself, it would seem natural to think of t as an estimator of a; but that's not the case here...

2. Apr 18, 2016

### Staff: Mentor

What exactly is that BiNormal?

If T and K are independent, then selecting for K<a does not tell you anything; it just reduces the sample size.

3. Apr 18, 2016

### estebanox

The notation I used in the OP stands for bivariate normal with correlation coefficient ρ. So I'm asking about the general case in which K may have information about T (not the orthogonal case, which as you mention is uninteresting).

4. Apr 18, 2016

### estebanox

@mfb: I just realised your confusion might be due to the fact that I denoted the mean and standard deviation of the marginal distribution of K by μ_S and σ_S. That's a typo; the obvious notation would read (T, K) ∼ BiNormal((μ_T, μ_K), (σ_T, σ_K), ρ) for -1 < ρ < 1.

5. Apr 19, 2016

### Staff: Mentor

Okay, so selecting K<a depletes your normal distribution of T in some one-sided way.

If the cut is weak (e.g. a > μ_K + 2σ_K), then the mean and variance are not influenced much, but t can hold more information, especially with a strong correlation between the two variables. You will still need the mean and variance (observed or expected) to relate this to K; otherwise you are completely insensitive to shifts and rescalings.
If the cut is strong and the correlation is weak, then I would expect that the overall shape gives you more information. The sample mean and variance alone don't help unless you know the mean and variance without the cut.
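
To get a feel for the weak-cut and strong-cut cases, here is a minimal simulation sketch. All parameter values (zero means, unit standard deviations, ρ = 0.8, the two thresholds) and the helper name `sample_truncated_t` are illustrative assumptions, not values from this thread.

```python
import math
import random
import statistics

def sample_truncated_t(n, mu_t, mu_k, sigma_t, sigma_k, rho, a, rng):
    """Draw n bivariate normal (T, K) pairs and keep the T values with K < a."""
    c = math.sqrt(1.0 - rho**2)
    kept = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        t = mu_t + sigma_t * z1
        # K is built from z1 and z2 so that corr(T, K) = rho.
        k = mu_k + sigma_k * (rho * z1 + c * z2)
        if k < a:
            kept.append(t)
    return kept

rng = random.Random(0)
results = {}
# Assumed thresholds: a = mu_K + 2*sigma_K (weak) and a = mu_K - sigma_K (strong).
for label, a in [("weak cut", 2.0), ("strong cut", -1.0)]:
    ts = sample_truncated_t(50_000, 0.0, 0.0, 1.0, 1.0, 0.8, a, rng)
    results[label] = (statistics.fmean(ts), statistics.pvariance(ts), min(ts))
    print(label, len(ts), results[label])
```

Comparing the printed mean, variance and minimum for the two thresholds shows how strongly each statistic reacts to the cut for one particular choice of ρ; other values of ρ can be tried by changing the sixth argument.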

6. Apr 19, 2016

### estebanox

Thanks. This makes sense: if I understand correctly, your intuition is that t is informative about shift/rescaling (and how much information will be a function of the value of t and ρ). Do you know how to derive this more precisely? In other words: is there an analytical expression for it in terms of the parameters (i.e. (μ_T, μ_K), (σ_T, σ_K), ρ)?

Last edited: Apr 19, 2016
7. Apr 19, 2016

### Staff: Mentor

t alone is not.
(T, K) ∼ BiNormal((μ_T, μ_K), (σ_T, σ_K), ρ) with cut a and (T, K) ∼ BiNormal((μ_T, μ_K + c), (σ_T, σ_K), ρ) with cut a + c lead to the same distribution of t, so the best estimate of a has to be shifted by c. With a similar but slightly more complicated formula you can show that you can also change σ_K and a together without changing anything related to t.
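
The invariance can be checked with a small simulation sketch. The function name `min_t` and all numeric values are assumptions chosen for illustration; reusing the same seed means all three runs see the same underlying normal draws.

```python
import math
import random

def min_t(n, mu_t, mu_k, sigma_t, sigma_k, rho, a, seed):
    """Minimum of T among n simulated (T, K) pairs that satisfy K < a."""
    rng = random.Random(seed)
    c = math.sqrt(1.0 - rho**2)
    best = math.inf
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        t = mu_t + sigma_t * z1
        k = mu_k + sigma_k * (rho * z1 + c * z2)
        if k < a:
            best = min(best, t)
    return best

shift, scale = 3.0, 2.0  # illustrative values only
# Baseline: mu_K = 0, sigma_K = 1, a = 0.5.
t_base = min_t(20_000, 0.0, 0.0, 1.0, 1.0, 0.7, a=0.5, seed=1)
# Shift mu_K and a by the same amount.
t_shifted = min_t(20_000, 0.0, shift, 1.0, 1.0, 0.7, a=0.5 + shift, seed=1)
# Rescale sigma_K and the distance (a - mu_K) by the same factor.
t_scaled = min_t(20_000, 0.0, 0.0, 1.0, scale, 0.7, a=0.5 * scale, seed=1)
print(t_base, t_shifted, t_scaled)
```

The three printed minima coincide because the selection K < a depends on μ_K, σ_K and a only through (a − μ_K)/σ_K, so t on its own cannot separate them.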

I would be surprised if there is an analytic expression for the distribution of t as a function of the other parameters (if we apply the cut K<a).
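
Not a closed form, but the distribution can at least be written down in terms of standard normal CDFs. The sketch below assumes that the n retained observations are independent draws and uses the cut K<a from the opening post; Φ is the standard normal CDF and Φ₂(·, ·; ρ) the standard bivariate normal CDF, which has to be evaluated numerically.

```latex
\[
F(x) \;=\; P(T \le x \mid K < a)
\;=\; \frac{\Phi_2\!\left(\dfrac{x-\mu_T}{\sigma_T},\, \dfrac{a-\mu_K}{\sigma_K};\, \rho\right)}
           {\Phi\!\left(\dfrac{a-\mu_K}{\sigma_K}\right)},
\qquad
P(t \le x) \;=\; 1 - \bigl[\,1 - F(x)\,\bigr]^{n}.
\]
```

Here a, μ_K and σ_K enter only through the ratio (a − μ_K)/σ_K, which is the shift and rescaling invariance described above.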

8. Apr 19, 2016

### estebanox

That's quite clear and helpful. Thanks!

9. Apr 19, 2016

### estebanox

Your answer makes me wonder if applying the cut on both K and T would change things. For instance, for some other cut b, can we know what E[Min[ T | T>t , K<b]] is?

Last edited: Apr 19, 2016
10. Apr 19, 2016

### Staff: Mentor

If you know the parameters of the distribution, and b, you can calculate this numerically. Same as above, I don't expect an exact analytic expression. For some parameters there can be a good analytic approximation.
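
A minimal Monte Carlo sketch of such a numerical calculation follows. It assumes the quantity of interest is the expected minimum over m retained draws with T > t and K < b; the function name `expected_min`, the sample size m and every parameter value are hypothetical choices for illustration.

```python
import math
import random
import statistics

def expected_min(m, reps, mu_t, mu_k, sigma_t, sigma_k, rho, t0, b, seed=0):
    """Estimate E[min of m draws of T | T > t0, K < b] by rejection sampling.

    Returns the Monte Carlo mean and its standard error.
    """
    rng = random.Random(seed)
    c = math.sqrt(1.0 - rho**2)
    minima = []
    for _ in range(reps):
        current_min = math.inf
        accepted = 0
        while accepted < m:
            z1 = rng.gauss(0.0, 1.0)
            z2 = rng.gauss(0.0, 1.0)
            t = mu_t + sigma_t * z1
            k = mu_k + sigma_k * (rho * z1 + c * z2)
            if t > t0 and k < b:  # keep only pairs that pass both cuts
                accepted += 1
                current_min = min(current_min, t)
        minima.append(current_min)
    return statistics.fmean(minima), statistics.stdev(minima) / math.sqrt(reps)

# Assumed inputs: standard marginals, rho = 0.5, cuts T > -1 and K < 0.5.
est, se = expected_min(m=20, reps=2000, mu_t=0.0, mu_k=0.0,
                       sigma_t=1.0, sigma_k=1.0, rho=0.5, t0=-1.0, b=0.5)
print(est, se)
```

Rejection sampling becomes slow when the cuts keep only a small fraction of the population; in that regime, numerical integration of the truncated density would be the more practical route.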

11. Apr 20, 2016

### Stephen Tashi

I think the question should be rephrased. For each parameter of a distribution, any function of the sample values can be called an estimator of that parameter. We pay more attention to functions that are "good" estimators of the parameter. There are various ways to define what "good" means (e.g. unbiased, minimum variance, maximum likelihood).