Information contained in minimum value of truncated distribution

  1. Apr 18, 2016 #1
    Suppose that a given population is endowed with a pair of characteristics T and K. Let's think of these characteristics as random variables

    (T,K)∼BiNormal((μT,μS),(σT,σS),ρ)

    I observe the realisations of T for a sample consisting of those individuals with K<a, where the selection threshold a is unknown. Let t denote the minimum observed realisation of T in this sample.

    In terms of the distributions and parameters above, what is t an estimator of?

    To be more precise, I am trying to establish what information is contained in t that is not already contained in the truncated sample mean and variance. My intuition is that there must be some information: if selection were taking place on T itself, then it would seem natural to think of t as an estimator of a; but that's not the case here...
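
    The setup can be explored by simulation. The following is a minimal sketch, not part of the original post: the parameter values, the population size and the helper names `draw_pair` and `selected_sample` are all illustrative assumptions.

```python
import math
import random

def draw_pair(rng, mu_t, mu_k, sigma_t, sigma_k, rho):
    """Draw one (T, K) pair from a bivariate normal using two independent standard normals."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    k = mu_k + sigma_k * z1
    t = mu_t + sigma_t * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return t, k

def selected_sample(rng, n_pop, a, **params):
    """Return the T values of the individuals whose K falls below the threshold a."""
    pairs = (draw_pair(rng, **params) for _ in range(n_pop))
    return [t for t, k in pairs if k < a]

# Hypothetical parameter values, chosen only for illustration.
rng = random.Random(0)
params = dict(mu_t=0.0, mu_k=0.0, sigma_t=1.0, sigma_k=1.0, rho=0.8)
sample = selected_sample(rng, n_pop=10_000, a=0.5, **params)

t_min = min(sample)                      # the statistic t from the question
mean = sum(sample) / len(sample)         # truncated sample mean
var = sum((x - mean) ** 2 for x in sample) / (len(sample) - 1)
print(len(sample), t_min, mean, var)
```

    Rerunning with different values of a and ρ shows how t moves relative to the truncated mean and variance.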
     
  3. Apr 18, 2016 #2

    mfb

    Staff: Mentor

    What exactly is that BiNormal?

    If T and K are independent, then selecting for K<a does not tell you anything; it just reduces the sample size.
     
  4. Apr 18, 2016 #3
    The notation I used in the OP stands for bivariate normal with correlation coefficient ρ. So I'm asking about the general case in which K may have information about T (not the orthogonal case, which as you mention is uninteresting).
     
  5. Apr 18, 2016 #4
    @mfb: I just realised your confusion might be due to the fact that I denoted the mean and variance of the marginal distribution of K by μ_S and σ_S. That's a typo: the intended notation is (T,K)∼BiNormal((μT,μK),(σT,σK),ρ) for -1<ρ<1.
     
  6. Apr 19, 2016 #5

    mfb

    Staff: Mentor

    Okay, so selecting K<a depletes your normal distribution of T in some one-sided way.

    If the cut is weak (e.g. a>μK+2*σK), then the mean and variance are not influenced much, and t can hold more information, especially if the two variables are strongly correlated. You will still need the mean and variance (observed or expected) to relate this to K; otherwise you are completely insensitive to shifts and rescalings.
    If the cut is strong and correlation is weak, then I would expect that the overall shape gives you more information. The sample mean and variance alone don't help unless you know the mean and variance without cut.
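
    For reference, the effect of the cut on the first two moments of T can be written down using the standard truncated-normal results. This is a sketch added for illustration; the symbols α and λ are introduced here and do not appear in the thread, with φ and Φ the standard normal density and distribution function.

```latex
\alpha = \frac{a-\mu_K}{\sigma_K}, \qquad
\lambda(\alpha) = \frac{\phi(\alpha)}{\Phi(\alpha)}

\mathbb{E}[T \mid K<a] = \mu_T - \rho\,\sigma_T\,\lambda(\alpha)

\operatorname{Var}[T \mid K<a] = \sigma_T^2\left[1 - \rho^2\,\lambda(\alpha)\bigl(\lambda(\alpha)+\alpha\bigr)\right]
```

    For a weak cut such as α = 2, λ(α) is about 0.055, so the mean and variance barely move. Recovering the uncut parameters from these two moments alone requires knowing the others, which matches the remark that the sample mean and variance do not help on their own.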
     
  7. Apr 19, 2016 #6
    Thanks. This makes sense: if I understand correctly, your intuition is that t is informative about shift/rescaling (and how much information will be a function of the value of t and ρ). Do you know how to derive this more precisely? In other words: is there an analytical expression for it in terms of the parameters (i.e. (μT,μK),(σT,σK),ρ)?
     
    Last edited: Apr 19, 2016
  8. Apr 19, 2016 #7

    mfb

    Staff: Mentor

    t alone is not.
    (T,K)∼BiNormal((μT,μK),(σT,σK),ρ) and (T,K)∼BiNormal((μT,μK+c),(σT,σK),ρ) lead to the same distribution of t, but the best estimate of a has to be shifted by c. With a similar but a bit more complicated formula you can show that you can also change σK and a without changing anything related to t.
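
    This invariance can be checked numerically. The sketch below is an illustration under assumed parameter values (the function name `min_t_under_cut` is hypothetical): it reuses the same random draws while shifting μK and a by c, or rescaling σK and the distance a-μK by a common factor, and compares the resulting t.

```python
import math
import random

def min_t_under_cut(seed, n_pop, mu_t, mu_k, sigma_t, sigma_k, rho, a):
    """Simulate n_pop individuals, keep those with K < a, and return the minimum T."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_pop):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        k = mu_k + sigma_k * z1
        t = mu_t + sigma_t * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        if k < a:
            kept.append(t)
    return min(kept)

# Hypothetical values; the same seed gives the same underlying normal draws.
common = dict(seed=42, n_pop=5_000, mu_t=0.0, sigma_t=1.0, rho=0.6)
c, s = 3.0, 2.0

base = min_t_under_cut(mu_k=0.0, sigma_k=1.0, a=0.5, **common)
shifted = min_t_under_cut(mu_k=0.0 + c, sigma_k=1.0, a=0.5 + c, **common)
rescaled = min_t_under_cut(mu_k=0.0, sigma_k=1.0 * s, a=0.5 * s, **common)

# All three agree: t cannot distinguish these parameter settings.
print(base, shifted, rescaled)
```

    Because the three settings select exactly the same individuals, no function of the observed T values can separate them, which is why t alone cannot pin down a.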

    I would be surprised if there is an analytic expression for the distribution of t as a function of the other parameters (if we apply the cut on K<a).
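
    The distribution of t can at least be written in terms of standard special functions. The following is a sketch added for illustration, for the selection K<a used in the opening post and conditional on the number n of selected observations; Φ is the standard normal distribution function and Φ₂(·,·;ρ) the standard bivariate normal distribution function with correlation ρ (notation assumed here, not taken from the thread).

```latex
F(x) = P(T \le x \mid K < a)
     = \frac{\Phi_2\!\left(\dfrac{x-\mu_T}{\sigma_T},\ \dfrac{a-\mu_K}{\sigma_K};\ \rho\right)}
            {\Phi\!\left(\dfrac{a-\mu_K}{\sigma_K}\right)}

P(t \le x) = 1 - \bigl[1 - F(x)\bigr]^{n}
```

    Since Φ₂ has no elementary closed form, this expression has to be evaluated numerically, which is consistent with the expectation stated above. Note that a and the K parameters enter only through (a-μK)/σK.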
     
  9. Apr 19, 2016 #8
    That's quite clear and helpful. Thanks!
     
  10. Apr 19, 2016 #9
    Your answer makes me wonder if applying the cut on both K and T would change things. For instance, for some other cut b, can we know what E[Min[ T | T>t , K<b]] is?
     
    Last edited: Apr 19, 2016
  11. Apr 19, 2016 #10

    mfb

    Staff: Mentor

    If you know the parameters of the distribution, and b, you can calculate this numerically. Same as above, I don't expect an exact analytic expression. For some parameters there can be a good analytic approximation.
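
    One way to do the numerical calculation is plain Monte Carlo with rejection sampling. The sketch below is an assumption-laden illustration: it reads the quantity as the expected minimum of n draws of T that satisfy both T>t0 and K<b, and the function name `expected_min`, the sample size n and all parameter values are hypothetical.

```python
import math
import random

def expected_min(rng, n, reps, t0, b, mu_t, mu_k, sigma_t, sigma_k, rho):
    """Monte Carlo estimate of E[min of n draws of T | T > t0, K < b]."""
    total = 0.0
    for _ in range(reps):
        current_min = math.inf
        accepted = 0
        # Rejection sampling: keep drawing until n pairs pass both cuts.
        while accepted < n:
            z1 = rng.gauss(0.0, 1.0)
            z2 = rng.gauss(0.0, 1.0)
            k = mu_k + sigma_k * z1
            t = mu_t + sigma_t * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
            if t > t0 and k < b:
                accepted += 1
                current_min = min(current_min, t)
        total += current_min
    return total / reps

# Hypothetical values, chosen only for illustration.
rng = random.Random(1)
est = expected_min(rng, n=20, reps=2_000, t0=-1.0, b=0.5,
                   mu_t=0.0, mu_k=0.0, sigma_t=1.0, sigma_k=1.0, rho=0.8)
print(est)
```

    Rejection sampling becomes slow when the two cuts leave little probability mass, in which case numerical integration of the truncated density is the more practical route.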
     
  12. Apr 20, 2016 #11

    Stephen Tashi

    Science Advisor

    I think the question should be rephrased. For each parameter of a distribution, any function of the sample values can be called an estimator of that parameter. We pay more attention to functions that are "good" estimators of the parameter. There are various ways to define what "good" means (e.g. unbiased, minimum variance, maximum likelihood).

    A question about estimators could be made specific by asking things like

    1. Is the minimum value of T in the sample an unbiased estimator of the distribution parameter "a" that defines the cutoff threshold for K?

    or

    2. Is there a simple-looking function f of the distribution parameters for which the minimum value of T in the sample is a maximum likelihood estimator of f evaluated at the particular parameter values of the distribution being sampled?

    The more general question of "is there information" could be phrased in terms of a question about "sufficient statistics", which I think mfb is effectively doing.
     