Combining Distributions (ex. Mixture distribution, copula)

Discussion Overview

The discussion revolves around the combination of random variables from different populations, specifically exploring the concepts of mixture distributions and copulas. Participants are seeking alternative methods for describing the resulting distribution of combined random variables, which may not necessarily be independent, and are considering both theoretical and practical implications of these approaches.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant suggests that combining random variables from different populations can be modeled using mixture distributions or copulas.
  • Another participant explains that a copula handles the joint distribution of several random variables, which may represent different physical quantities, while a mixture distribution describes a single random variable.
  • A participant specifies that their random variables represent monetary losses due to fraud and external circumstances, which may come from different distributions but are not necessarily independent.
  • There is a question about how the random variables are being combined, with a suggestion that if the goal is to compute total cost, this implies adding them.
  • One participant asserts that a mixture distribution is a natural model for drawing a loss at random from a box containing two types of losses, while also suggesting that the model could change based on how losses are generated.
  • Another participant agrees that a mixture distribution is a natural way to describe the distribution of the combined losses and inquires about other statistical methods that might link the random variables.
  • One participant mentions empirical methods such as principal component analysis and independent component analysis as ways to represent data as a sum of components, noting that the same data can be represented by mixtures in different ways.
  • There is a discussion about the clarity of the term "way" in relation to relating random variables, questioning whether it refers to a family of models with known fitting techniques.
  • A participant seeks clarification on the use of the term "statistic" and its technical meaning in mathematical statistics.

Areas of Agreement / Disagreement

Participants generally agree that mixture distributions and copulas are relevant models for the discussion, but there is no consensus on whether other methods exist or how to define the relationship between the random variables. Multiple competing views remain regarding the best approach to combine the distributions.

Contextual Notes

Participants express uncertainty about the independence of the random variables and the specific methods for combining them. There are also unresolved questions about the definitions and implications of terms used in the discussion, such as "statistic" and "way" of relating random variables.

Apteronotus
This is a vague question and I apologize in advance for not being able to explain it better.

I'm combining r.v.'s from different populations (distributions). The resulting population can be thought to come from a mixture distribution. I think another way of describing the resulting distribution may be by the use of copulas.

I'm wondering if there are other ways aside from mixture distributions and copulas.

Many thanks
 
Apteronotus said:
I think another way of describing the resulting distribution may be by the use of copulas.

A copula is used to handle a joint distribution of two random variables, so the two variables might represent two different physical quantities (e.g. weight and temperature). A mixture distribution represents a single random variable (e.g. weight). You can model the joint distribution of several random variables that have the same physical unit (e.g. weights of pieces of candy) and this would imply a distribution for their sum (e.g. weight of a box of 20 pieces of candy).
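To make the distinction concrete, the three objects mentioned above can be written out as follows. The symbols are assumptions introduced here for illustration: X and Y are random variables with CDFs F_X, F_Y and densities f_X, f_Y, h is their joint density, w is a mixing weight, and C is a copula.

```latex
% Mixture: a single random variable Z, drawn from X with probability w, else from Y
f_Z(z) = w\, f_X(z) + (1 - w)\, f_Y(z), \qquad 0 \le w \le 1

% Copula (Sklar's theorem): joint CDF of the pair (X, Y) built from the marginals
H(x, y) = C\bigl(F_X(x),\, F_Y(y)\bigr)

% Sum: the density of S = X + Y implied by the joint density h(x, y)
f_S(s) = \int_{-\infty}^{\infty} h(x,\, s - x)\, dx
```

The mixture describes one number per draw, while the copula and the sum both start from a pair of numbers per draw.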

What are the physical units of the random variables involved in your study?
 
The physical units of my variables are dollars ($).

One random variable represents dollars lost due to fraud, the second dollars lost due to external circumstances.

These r.v.'s may come from different distributions but they may not necessarily be independent.
 
How are you seeking to "combine" the random variables? If the object is to compute total cost, this would imply adding them. Did you mean to ask for ways to "relate" two random variables?
 
Hi Stephen,

Thank you for your post. I'm sorry if I'm failing to describe the situation clearly. Here's a second attempt...

Suppose I take all incidents of loss due to fraud (r.v. X) and those due to external circumstances (r.v. Y) and put them in one "box". Then each member of my box comes from either the X or the Y population, each of which has a different distribution.

A mixture distribution allows me to describe the distribution of my box.
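A minimal sketch of the "box" picture, using only the Python standard library. The mixing weight and both loss distributions (lognormal for fraud, gamma for external circumstances) are hypothetical choices made for illustration; nothing in the thread says which distributions apply.

```python
import math
import random

W_FRAUD = 0.3                      # hypothetical share of fraud losses in the box
MU, SIGMA = 8.0, 1.0               # hypothetical lognormal parameters (fraud)
SHAPE, SCALE = 2.0, 5000.0         # hypothetical gamma parameters (external)

def draw_from_box(rng):
    """Draw one loss from the box: fraud with probability W_FRAUD, else external."""
    if rng.random() < W_FRAUD:
        return "fraud", rng.lognormvariate(MU, SIGMA)
    return "external", rng.gammavariate(SHAPE, SCALE)

rng = random.Random(0)
box = [draw_from_box(rng) for _ in range(100_000)]

share_fraud = sum(1 for label, _ in box if label == "fraud") / len(box)
mean_loss = sum(loss for _, loss in box) / len(box)

# The mixture mean is the weighted average of the two component means.
expected_mean = W_FRAUD * math.exp(MU + SIGMA**2 / 2) + (1 - W_FRAUD) * SHAPE * SCALE
print(f"share of fraud draws: {share_fraud:.3f}")
print(f"sample mean: {mean_loss:.0f}, mixture mean: {expected_mean:.0f}")
```

Each draw yields a single loss, which is why dependence between X and Y plays no role in this model.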

Is there another way of doing that aside from mixture distributions?
 
Apteronotus said:
Is there another way of doing that aside from mixture distributions?

I'd say no - meaning that the natural model for "drawing a loss at random from the box containing two types of losses" is a mixture distribution. If you change your mental picture of how a loss is generated then the mathematics could change.

For example, suppose we think some frauds result from exaggerating actual losses from external circumstances. This could lead to a model where one first draws a loss due to external circumstances at random and then makes another random selection to determine the amount of fraud added to that loss. From that point of view, any way of representing a joint distribution of the two variables (external loss, added fraud) would model the situation.
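The two-stage picture above can be sketched as follows. The gamma model for external losses, the 20% share of padded claims, and the 10% to 50% exaggeration range are all hypothetical numbers chosen only to make the example run.

```python
import math
import random

def draw_claim(rng):
    """Draw an external loss, then a fraudulent exaggeration added on top of it."""
    external = rng.gammavariate(2.0, 5000.0)        # hypothetical external-loss model
    if rng.random() < 0.2:                          # hypothetical share of padded claims
        added_fraud = external * rng.uniform(0.1, 0.5)
    else:
        added_fraud = 0.0
    return external, added_fraud

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)
claims = [draw_claim(rng) for _ in range(50_000)]
externals = [e for e, _ in claims]
frauds = [f for _, f in claims]
totals = [e + f for e, f in claims]                 # total cost per claim is the sum

corr = pearson(externals, frauds)
print(f"correlation(external, added fraud) = {corr:.2f}")
```

Here the two variables are dependent by construction, and the quantity of interest is their sum rather than a draw from one population or the other.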
 
Hi Stephen,

Yes, I think a mixture distribution is a very natural way of describing the distribution of the "box".

Also, copulas allow a way to describe the joint distribution of the two r.v.'s.
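As a rough illustration of the copula idea, here is a sketch of a Gaussian copula joining two different marginals, written with the standard library only. The correlation parameter, the lognormal marginal for fraud, and the exponential marginal for external losses are assumptions made for the example, not anything established in the thread.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the copula and the inverse CDF

def gaussian_copula_pair(rng, rho):
    """Return (u, v) with uniform marginals and Gaussian-copula dependence."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    clamp = lambda p: min(max(p, 1e-12), 1.0 - 1e-12)  # keep inverse CDFs finite
    return clamp(STD_NORMAL.cdf(z1)), clamp(STD_NORMAL.cdf(z2))

def fraud_loss(u):
    """Inverse CDF of a hypothetical lognormal(mu=8, sigma=1) marginal."""
    return math.exp(8.0 + 1.0 * STD_NORMAL.inv_cdf(u))

def external_loss(v):
    """Inverse CDF of a hypothetical exponential marginal with mean 10000."""
    return -10000.0 * math.log(1.0 - v)

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(1)
uniforms = [gaussian_copula_pair(rng, rho=0.7) for _ in range(20_000)]
losses = [(fraud_loss(u), external_loss(v)) for u, v in uniforms]

# Correlation of the uniforms measures the dependence carried by the copula alone.
dependence = pearson([u for u, _ in uniforms], [v for _, v in uniforms])
mean_external = sum(y for _, y in losses) / len(losses)
print(f"dependence on the uniform scale: {dependence:.2f}")
```

The marginals and the dependence are specified separately, so either marginal could be swapped for another distribution without touching the copula step.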

Do you know if there is any other statistic which combines or links the r.v.'s?
 
There are various empirical ways of representing data as a sum of components. The data can be analyzed by "principal component analysis" or "independent component analysis". I suppose representing data as a sum of components could be called using a mixture of distributions, but the same data can be represented in different ways by mixtures.
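For a sense of what principal component analysis does with paired data, here is a small two-variable sketch using the closed-form eigen-decomposition of a 2x2 covariance matrix. The simulated data (two variables sharing a common driver) are hypothetical and stand in for paired loss observations.

```python
import math
import random

rng = random.Random(2)
# Hypothetical paired observations (x, y) driven by a shared factor, so they correlate.
data = []
for _ in range(5000):
    common = rng.gauss(0.0, 1.0)
    data.append((2.0 * common + rng.gauss(0.0, 0.5), common + rng.gauss(0.0, 0.5)))

n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n

# Sample covariance matrix [[a, b], [b, c]]
a = sum((x - mean_x) ** 2 for x, _ in data) / (n - 1)
c = sum((y - mean_y) ** 2 for _, y in data) / (n - 1)
b = sum((x - mean_x) * (y - mean_y) for x, y in data) / (n - 1)

# Eigenvalues of a symmetric 2x2 matrix in closed form
half_trace = (a + c) / 2.0
radius = math.hypot((a - c) / 2.0, b)
lam1, lam2 = half_trace + radius, half_trace - radius

# Unit eigenvector for the largest eigenvalue: the first principal direction
vx, vy = b, lam1 - a
norm = math.hypot(vx, vy)
pc1 = (vx / norm, vy / norm)

explained = lam1 / (lam1 + lam2)
print(f"first principal direction: ({pc1[0]:.2f}, {pc1[1]:.2f})")
print(f"share of variance explained: {explained:.2f}")
```

Each centered observation can then be written as a sum of its projections onto the two principal directions, which is the "sum of components" representation mentioned above.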

Almost any function of two variables can be modified to create a joint distribution. What you mean by a "way" of relating two random variables isn't clear. Is a "way" a "family" of models that has known techniques for finding a member of that family that fits some given data?


Apteronotus said:
Do you know if there is any other statistic which combines or links the r.v.'s?

The word "statistic" has a technical meaning in mathematical statistics. Do you mean "statistic" in that technical sense?
 
