
Combining Distributions (e.g., mixture distribution, copula)

  1. May 13, 2014 #1
    This is a vague question and I apologize in advance for not being able to explain it better.

    I'm combining r.v.'s from different populations (distributions). The resulting population can be thought of as coming from a mixture distribution. I think another way of describing the resulting distribution may be to use copulas.

    I'm wondering if there are other ways aside from mixture distributions and copulas.

    Many thanks
  3. May 14, 2014 #2

    Stephen Tashi

    Science Advisor

    A copula is used to handle a joint distribution of two random variables, so the two variables might represent two different physical quantities (e.g. weight and temperature). A mixture distribution represents a single random variable (e.g. weight). You can model the joint distribution of several random variables that have the same physical unit (e.g. weights of pieces of candy) and this would imply a distribution for their sum (e.g. weight of a box of 20 pieces of candy).

    What are the physical units of the random variables involved in your study?
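
    The distinction drawn in this post can be written out in symbols. This is only a sketch: the notation (marginal CDFs F_X and F_Y, mixing weight w, copula C, joint CDF H) is assumed here and does not appear in the thread.

```latex
% Mixture: a single random variable Z that is drawn from X with probability w, else from Y
F_Z(z) = w\,F_X(z) + (1 - w)\,F_Y(z), \qquad 0 \le w \le 1

% Copula: the joint distribution of the pair (X, Y), built from the marginals (Sklar's theorem)
H(x, y) = C\bigl(F_X(x),\, F_Y(y)\bigr)

% Sum: the distribution of the total S = X + Y, which is determined by the joint distribution H
F_S(s) = P(X + Y \le s)
```

    The mixture describes one variable, the copula describes two variables jointly, and the distribution of the sum follows from whatever joint model is chosen.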
  4. May 14, 2014 #3
    The physical unit of my variables is dollars ($).

    One random variable represents dollars lost due to fraud. The second represents dollars lost due to external circumstances.

    These r.v.'s may come from different distributions, and they are not necessarily independent.
  5. May 15, 2014 #4

    Stephen Tashi

    Science Advisor

    How are you seeking to "combine" the random variables? If the object is to compute total cost, this would imply adding them. Did you mean to ask for ways to "relate" two random variables?
  6. May 15, 2014 #5
    Hi Stephen,

    Thank you for your post. I'm sorry if I'm failing to describe the situation clearly. Here's a second attempt...

    Suppose I take all incidents of loss due to fraud (r.v. X) and those due to external circumstances (r.v. Y) and put them in one "box". Then the members of my box can come from either the X or the Y population, each of which has a different distribution.

    A mixture distribution allows me to describe the distribution of my box.

    Is there another way of doing that aside from mixture distributions?
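
    The "box" described in this post can be simulated as a two-component mixture. The sketch below is illustrative only: the lognormal and exponential loss models, their parameters, and the 30% fraud share are assumptions made for the example, not values from the thread.

```python
import math
import random

# Hypothetical parameters, chosen only for illustration.
W_FRAUD = 0.3                      # assumed share of fraud losses in the box
FRAUD_MU, FRAUD_SIGMA = 8.0, 1.0   # assumed lognormal parameters for fraud losses
EXTERNAL_MEAN = 2000.0             # assumed mean of exponential external losses ($)


def sample_box(n, w_fraud=W_FRAUD, seed=0):
    """Draw n losses at random from the box, i.e. from the mixture."""
    rng = random.Random(seed)
    losses = []
    for _ in range(n):
        if rng.random() < w_fraud:
            # This loss comes from the fraud population (X).
            losses.append(rng.lognormvariate(FRAUD_MU, FRAUD_SIGMA))
        else:
            # This loss comes from the external-circumstances population (Y).
            losses.append(rng.expovariate(1.0 / EXTERNAL_MEAN))
    return losses


def mixture_mean(w_fraud=W_FRAUD):
    """Mean of the mixture: the weighted average of the component means."""
    fraud_mean = math.exp(FRAUD_MU + FRAUD_SIGMA ** 2 / 2)
    return w_fraud * fraud_mean + (1 - w_fraud) * EXTERNAL_MEAN


if __name__ == "__main__":
    losses = sample_box(100_000)
    print("sample mean:", sum(losses) / len(losses))
    print("mixture mean:", mixture_mean())
```

    Note that this model says nothing about dependence between X and Y: each draw from the box is a single loss from one population or the other.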
  7. May 15, 2014 #6

    Stephen Tashi

    Science Advisor

    I'd say no, meaning that the natural model for "drawing a loss at random from the box containing two types of losses" is a mixture distribution. If you change your mental picture of how a loss is generated, then the mathematics could change.

    For example, suppose we think some frauds result from exaggerating actual losses from external circumstances. This could lead to a model where one first draws a loss due to external circumstances at random and then makes another random selection to determine the amount of fraud added to that loss. From that point of view, any way of representing a joint distribution of the two variables (external loss, added fraud) would model the situation.
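
    The two-stage picture in this post can be sketched as a simulation. Everything specific here is an assumption made for illustration: the exponential external-loss model, the 20% chance that a claim is inflated, and the rule that the added fraud is a uniform fraction (10% to 100%) of the true loss.

```python
import random


def sample_claim(rng, p_fraud=0.2, external_mean=2000.0):
    """Return (external_loss, added_fraud) for one hypothetical claim."""
    # Stage 1: draw the genuine loss due to external circumstances.
    external = rng.expovariate(1.0 / external_mean)
    # Stage 2: a second random selection decides how much fraud is added.
    if rng.random() < p_fraud:
        added = external * rng.uniform(0.1, 1.0)
    else:
        added = 0.0
    return external, added


def sample_claims(n, seed=0, p_fraud=0.2):
    """Draw n (external_loss, added_fraud) pairs from the joint model."""
    rng = random.Random(seed)
    return [sample_claim(rng, p_fraud) for _ in range(n)]


if __name__ == "__main__":
    claims = sample_claims(5)
    for external, added in claims:
        print(f"external={external:10.2f}  added fraud={added:10.2f}")
```

    Because the added fraud is generated from the external loss, the two variables are dependent by construction, which is the kind of situation a joint distribution is meant to capture.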
  8. May 15, 2014 #7
    Hi Stephen,

    Yes, I think a mixture distribution is a very natural way of describing the distribution of the "box".

    Also, copulas allow a way to describe the joint distribution of the two r.v.'s.

    Do you know if there is any other statistic which combines or links the r.v.'s?
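
    The copula idea mentioned in this post can be sketched with a Gaussian copula, which is only one of many possible copula families. The marginals (lognormal fraud losses, exponential external losses), their parameters, and the correlation value rho are assumptions made for the example.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for cdf and inv_cdf


def gaussian_copula_pair(rng, rho):
    """Return (u, v) in (0, 1) whose dependence follows a Gaussian copula."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    eps = 1e-12  # keep u and v strictly inside (0, 1) for the inverse CDFs
    u = min(max(STD_NORMAL.cdf(z1), eps), 1.0 - eps)
    v = min(max(STD_NORMAL.cdf(z2), eps), 1.0 - eps)
    return u, v


def sample_losses(n, rho=0.6, seed=0):
    """Draw n dependent (fraud_loss, external_loss) pairs."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u, v = gaussian_copula_pair(rng, rho)
        # Assumed marginal for fraud: lognormal with mu=8, sigma=1.
        fraud = math.exp(8.0 + 1.0 * STD_NORMAL.inv_cdf(u))
        # Assumed marginal for external losses: exponential with mean $2000.
        external = -2000.0 * math.log(1.0 - v)
        pairs.append((fraud, external))
    return pairs


if __name__ == "__main__":
    for fraud, external in sample_losses(5):
        print(f"fraud={fraud:10.2f}  external={external:10.2f}")
```

    The copula controls only the dependence. Each marginal can be swapped for a different distribution without touching `gaussian_copula_pair`.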
  9. May 15, 2014 #8

    Stephen Tashi

    Science Advisor

    There are various empirical ways of representing data as a sum of components. Data can be analyzed by "principal component analysis" or "independent component analysis". I suppose representing data as a sum of components could be called using a mixture of distributions, but the same data can be represented in different ways by mixtures.

    Almost any function of two variables can be modified to create a joint distribution. What you mean by a "way" of relating two random variables isn't clear. Is a "way" a "family" of models that has known techniques for finding a member of that family that fits some given data?

    The word "statistic" has a technical meaning in mathematical statistics. Do you mean "statistic" in that technical sense?