Combining Distributions (e.g., mixture distribution, copula)

SUMMARY

This discussion focuses on combining random variables (r.v.'s) from different populations using mixture distributions and copulas. A mixture distribution effectively describes a single random variable, while copulas manage the joint distribution of two r.v.'s, such as losses due to fraud and external circumstances. The participants conclude that mixture distributions are the most natural method for modeling the combined distribution of these r.v.'s, although alternative approaches like principal component analysis and independent component analysis can also represent data as sums of components.

PREREQUISITES
  • Understanding of mixture distributions in statistics
  • Familiarity with copulas for joint distribution modeling
  • Knowledge of random variables and their properties
  • Basic concepts of principal component analysis (PCA) and independent component analysis (ICA)
NEXT STEPS
  • Research the mathematical foundations of mixture distributions
  • Explore the application of copulas in financial modeling
  • Study principal component analysis (PCA) and its use in data representation
  • Investigate independent component analysis (ICA) for separating mixed signals
USEFUL FOR

Statisticians, data scientists, and financial analysts interested in modeling complex distributions and understanding the relationships between different random variables.

Apteronotus
This is a vague question and I apologize in advance for not being able to explain it better.

I'm combining r.v.'s from different populations (distributions). The resulting population can be thought to come from a mixture distribution. I think another way of describing the resulting distribution may be by the use of copulas.

I'm wondering if there are other ways aside from mixture distributions and copulas.

Many thanks
 
Apteronotus said:
I think another way of describing the resulting distribution may be by the use of copulas.

A copula is used to handle a joint distribution of two random variables, so the two variables might represent two different physical quantities (e.g. weight and temperature). A mixture distribution represents a single random variable (e.g. weight). You can model the joint distribution of several random variables that have the same physical unit (e.g. weights of pieces of candy) and this would imply a distribution for their sum (e.g. weight of a box of 20 pieces of candy).

What are the physical units of the random variables involved in your study?
 
The physical unit of my variables is dollars ($).

One random variable represents dollars lost due to fraud. The second represents dollars lost due to external circumstances.

These r.v.'s may come from different distributions but they may not necessarily be independent.
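A copula is one way to picture two losses that have different marginal distributions and are not independent. The following is a minimal sketch, not a fitted model: the function name `sample_dependent_losses`, the choice of a Gaussian copula, the exponential and lognormal marginals, and every parameter value are assumptions made purely for illustration.

```python
import math
import random
from statistics import NormalDist


def sample_dependent_losses(n, rho, seed=0):
    """Draw n (fraud, external) loss pairs linked by a Gaussian copula.

    Assumed marginals (illustrative only):
      fraud    ~ Exponential with mean 1000
      external ~ Lognormal with mu = 6, sigma = 1
    rho is the correlation of the underlying normals, with -1 < rho < 1.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    eps = 1e-12  # keeps the uniforms strictly inside (0, 1)
    pairs = []
    for _ in range(n):
        # Step 1: correlated standard normals with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Step 2: map to uniforms; their joint law is the Gaussian copula.
        u1 = min(max(std_normal.cdf(z1), eps), 1.0 - eps)
        u2 = min(max(std_normal.cdf(z2), eps), 1.0 - eps)
        # Step 3: push each uniform through its marginal's inverse CDF.
        fraud = -1000.0 * math.log(1.0 - u1)
        external = math.exp(6.0 + 1.0 * std_normal.inv_cdf(u2))
        pairs.append((fraud, external))
    return pairs
```

Summing each pair, for example `[f + e for f, e in sample_dependent_losses(1000, 0.7)]`, would give simulated total losses under these assumptions. The marginals stay the same whatever `rho` is; only the dependence between the two losses changes.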
 
How are you seeking to "combine" the random variables? If the object is to compute total cost, this would imply adding them. Did you mean to ask for ways to "relate" two random variables?
 
Hi Stephen,

Thank you for your post. I'm sorry if I'm failing to describe the situation clearly. Here's a second attempt...

Suppose I take all incidents of loss due to fraud (r.v. X) and those due to external circumstances (r.v. Y) and put them in one "box". Then the members of my box can come from either the X or the Y population, each of which has a different distribution.

A mixture distribution allows me to describe the distribution of my box.

Is there another way of doing that aside from mixture distributions?
 
Apteronotus said:
Is there another way of doing that aside from mixture distributions?

I'd say no, meaning that the natural model for "drawing a loss at random from the box containing two types of losses" is a mixture distribution. If you change your mental picture of how a loss is generated, then the mathematics could change.
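The "box" picture can be sketched in a few lines. This is only an illustration under stated assumptions: the names `sample_box` and `box_density`, the exponential components, their means (1000 and 200), and the mixing weight `w_fraud` are all hypothetical and do not come from any data in this thread.

```python
import math
import random


def sample_box(n, w_fraud, seed=0):
    """Draw n losses from the box.

    With probability w_fraud the loss is a fraud loss, otherwise it is an
    external-circumstances loss. Assumed components (illustrative only):
      fraud    ~ Exponential with mean 1000
      external ~ Exponential with mean 200
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        if rng.random() < w_fraud:
            draws.append(rng.expovariate(1.0 / 1000.0))  # fraud component
        else:
            draws.append(rng.expovariate(1.0 / 200.0))   # external component
    return draws


def box_density(z, w_fraud):
    """Mixture density f(z) = w * f_X(z) + (1 - w) * f_Y(z)."""
    if z < 0:
        return 0.0
    f_fraud = math.exp(-z / 1000.0) / 1000.0
    f_external = math.exp(-z / 200.0) / 200.0
    return w_fraud * f_fraud + (1.0 - w_fraud) * f_external
```

Each draw is a single number, which is why the mixture describes one random variable. Any dependence between X and Y plays no role here, because no draw ever uses both at once.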

For example, suppose we think some frauds result from exaggerating actual losses from external circumstances. This could lead to a model where one first draws a loss due to external circumstances at random and then makes another random selection to determine the amount of fraud added to that loss. From that point of view, any way of representing a joint distribution of the two variables (external loss, added fraud) would model the situation.
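A two-stage model of that kind might look like the sketch below. Everything specific in it is an assumption for illustration: the function name `sample_reported_loss`, the exponential external loss with mean 500, the probability `p_fraud` that a claim is exaggerated, and the rule that the exaggeration is a uniform fraction of the external loss.

```python
import random


def sample_reported_loss(rng, p_fraud=0.2):
    """Return (external_loss, added_fraud) for one simulated claim.

    Stage 1: draw the genuine external loss (assumed Exponential, mean 500).
    Stage 2: with probability p_fraud, add an exaggeration equal to a
             Uniform(0, 1) fraction of that external loss; otherwise add 0.
    """
    external = rng.expovariate(1.0 / 500.0)
    if rng.random() < p_fraud:
        added_fraud = external * rng.random()
    else:
        added_fraud = 0.0
    return external, added_fraud
```

Here the two variables are dependent by construction, since the added fraud can never exceed the external loss it is attached to. The reported total for a claim would be `external + added_fraud`, which is a sum rather than a draw from a box.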
 
Hi Stephen,

Yes, I think a mixture distribution is a very natural way of describing the distribution of the "box".

Also, copulas provide a way to describe the joint distribution of the two r.v.'s.

Do you know if there is any other statistic which combines or links the r.v.'s?
 
There are various empirical ways of representing data as a sum of components. Such data can be analyzed by "principal component analysis" or "independent component analysis". I suppose representing data as a sum of components could be called using a mixture of distributions, but the same data can be represented in different ways by mixtures.
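As a rough illustration of principal component analysis on paired observations, the sketch below finds the two principal axes of a set of 2-D points from the closed-form eigendecomposition of their sample covariance matrix. The function name `principal_axes_2d` is hypothetical, and applying it to (fraud, external) pairs is an assumption about how the data would be arranged, not something stated in this thread.

```python
import math


def principal_axes_2d(points):
    """Return [(variance, (vx, vy)), ...] for 2-D data, largest variance first.

    Uses the sample covariance matrix [[sxx, sxy], [sxy, syy]] and the
    closed-form eigendecomposition of a symmetric 2x2 matrix.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)

    # Eigenvalues: half the trace plus or minus a "radius" term.
    half_trace = (sxx + syy) / 2.0
    radius = math.hypot((sxx - syy) / 2.0, sxy)
    lam1, lam2 = half_trace + radius, half_trace - radius

    # Direction of the leading axis; the second axis is perpendicular to it.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    v1 = (math.cos(theta), math.sin(theta))
    v2 = (-math.sin(theta), math.cos(theta))
    return [(lam1, v1), (lam2, v2)]
```

Each centered observation can then be written as a sum of its projections onto the two axes, which is the "sum of components" representation. This requires observing both coordinates together for each data point, unlike the box model, where each draw is a single loss.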

Almost any function of two variables can be modified to create a joint distribution. What you mean by a "way" of relating two random variables isn't clear. Is a "way" a "family" of models that has known techniques for finding a member of that family that fits some given data?


Apteronotus said:
Do you know if there is any other statistic which combines or links the r.v.'s?

The word "statistic" has a technical meaning in mathematical statistics. Do you mean "statistic" in that technical sense?
 
