
Calculating a covariance matrix with missing data

  1. Nov 26, 2013 #1
    Consider a covariance matrix A such that each element [itex] a_{i,j} = E(X_i X_j) - E(X_i) E(X_j) [/itex], where [itex] X_i, X_j [/itex] are random variables.

    Consider the case where the variables have different sample sizes. Let's say that [itex] X_i [/itex] has the observations [itex] x_{i,1}, \ldots, x_{i,N} [/itex] and [itex] X_j [/itex] has the observations [itex] x_{j,1}, \ldots, x_{j,n} [/itex], where the observations are paired up to index n and N > n.

    In this case, for each covariance [itex] a_{i,j} [/itex], is it acceptable to trim the samples of both [itex] X_i [/itex] and [itex] X_j [/itex] to size n and continue the calculation? (I'm not sure if "trim" is the correct terminology, but it seems to meet my needs.)

    If it is acceptable to trim, then is it necessary to trim to the smallest n of all of the random variables X, or can I just trim to the smallest of the pair?

    I'd appreciate it if anyone can point me in the direction of some literature that explains this in detail. I've been struggling to find something that is specific to this case.
     
  3. Nov 28, 2013 #2
    See if non-parametric statistical analysis might be what you are needing...
     
  4. Nov 28, 2013 #3

    Stephen Tashi

    Science Advisor

    A matrix defined that way is computed from the joint distribution of [itex] X_i, X_j [/itex]. It isn't computed from sample data.

    Apparently, what you want to do is estimate the covariance matrix.

    You should look up methods of estimating covariance from samples that have missing data.


    You haven't given enough information to define the case. There is no general "best" method for doing this unless you make some assumptions - for example, assumptions about what family of distributions generated the data.
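
    As a rough illustration of the two "trimming" options raised in the question, here is a minimal Python sketch, assuming NumPy and assuming missing values are coded as NaN. The function names listwise_cov and pairwise_cov and the simulated data are made up for this example; they are not from any reference in this thread. Trimming every variable to the rows that are complete for all variables is usually called listwise (complete-case) deletion, and trimming each pair separately is usually called pairwise deletion. Neither is "best" in general, and the pairwise version is not guaranteed to produce a positive semidefinite matrix.

```python
import numpy as np

def listwise_cov(data):
    """Complete-case estimate: keep only rows with no missing value in any column."""
    complete = data[~np.isnan(data).any(axis=1)]
    return np.cov(complete, rowvar=False)

def pairwise_cov(data):
    """Pairwise estimate: for each (i, j), use the rows where both columns are observed."""
    p = data.shape[1]
    cov = np.full((p, p), np.nan)
    for i in range(p):
        for j in range(i, p):
            both = ~np.isnan(data[:, i]) & ~np.isnan(data[:, j])
            n_ij = both.sum()
            if n_ij > 1:
                xi, xj = data[both, i], data[both, j]
                # Unbiased sample covariance over the paired observations only.
                value = np.sum((xi - xi.mean()) * (xj - xj.mean())) / (n_ij - 1)
                cov[i, j] = cov[j, i] = value
    return cov

# Simulated example (assumed data): three variables with different sample sizes.
rng = np.random.default_rng(0)
true_cov = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.0]]
data = rng.multivariate_normal([0.0, 0.0, 0.0], true_cov, size=200)
data[150:, 1] = np.nan  # second variable observed only for the first 150 rows
data[100:, 2] = np.nan  # third variable observed only for the first 100 rows

cov_listwise = listwise_cov(data)   # every entry uses the same 100 complete rows
cov_pairwise = pairwise_cov(data)   # each entry uses as many rows as that pair allows

# Check the smallest eigenvalue: a negative value would mean the pairwise
# estimate is not a valid covariance matrix.
min_eig = np.linalg.eigvalsh(cov_pairwise).min()
```

    Whether either estimate is unbiased depends on why the data are missing, which is one of the assumptions that has to be stated before a method can be chosen.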

    http://icml.cc/discuss/2012/313.html [Broken]
     
    Last edited by a moderator: May 6, 2017