How to combine 2 distributions with different sample sizes?

A mixture distribution can be used to model data from two different sources, where one source contributes more of the data than the other. In summary, the two distributions are combined by taking a weighted average of their probabilities, with weights proportional to the sample sizes, which gives a mixture distribution in the same (probability, outcome) format.
  • #1
batmantrippin
TL;DR Summary
I would like to know how to combine 2 distributions with different sample sizes and have a new 3rd distribution.
I apologise in advance for what is a very basic question for someone with a maths degree (it was a long time ago!).

I have 2 distributions that look something like this (but with much bigger samples), in the form of (probability,outcome). The outcome is literally just a number.

Distribution 1:
(0.1 , -1)
(0.2 , -0.9)
(0.25 , 0)
(0.3 , 4.5)
(0.15 , 7)

Distribution 2:

(0.05 , -1)
(0.05 , -0.8)
(0.05 , -0.5)
(0.05 , -0.2)
(0.1 , 0)
(0.1 , 3)
(0.1 , 6)
(0.1 , 6.5)
(0.2 , 7)
(0.2 , 7.5)

What is the best way to combine (average?) the two and have a third distribution in the same format that is basically an average of the two? Or am I asking for something that's not really doable?
 
  • #2
Add them with weights proportional to the sample sizes.
 
  • #3
The procedure described by @mathman gives what is called a mixture distribution.
 

1. How do you combine two distributions with different sample sizes?

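To combine two distributions with different sample sizes, take a weighted average of their probabilities. If the samples have sizes n1 and n2, give the first distribution the weight n1/(n1 + n2) and the second the weight n2/(n1 + n2). Multiply every probability in each distribution by that distribution's weight, then list all outcomes from both; where the same outcome appears in both distributions, add the two weighted probabilities. The outcomes themselves are not changed, and the resulting probabilities still sum to 1.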
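The procedure above can be sketched in Python using the two example distributions from the original post. The sample sizes `n1 = 200` and `n2 = 300` are made up for illustration (the thread does not give them), and the function name `mix` is hypothetical.

```python
from collections import defaultdict

def mix(dist1, n1, dist2, n2):
    """Mix two discrete distributions given as (probability, outcome) pairs,
    weighting each one by its share of the total sample size."""
    w1 = n1 / (n1 + n2)
    w2 = n2 / (n1 + n2)
    combined = defaultdict(float)
    # Scale each probability by its distribution's weight and
    # add up the contributions that land on the same outcome.
    for p, x in dist1:
        combined[x] += w1 * p
    for p, x in dist2:
        combined[x] += w2 * p
    # Return the result in the same (probability, outcome) format, sorted by outcome.
    return [(p, x) for x, p in sorted(combined.items())]

dist1 = [(0.1, -1), (0.2, -0.9), (0.25, 0), (0.3, 4.5), (0.15, 7)]
dist2 = [(0.05, -1), (0.05, -0.8), (0.05, -0.5), (0.05, -0.2), (0.1, 0),
         (0.1, 3), (0.1, 6), (0.1, 6.5), (0.2, 7), (0.2, 7.5)]

n1, n2 = 200, 300  # assumed sample sizes, not from the thread
mixture = mix(dist1, n1, dist2, n2)

for p, x in mixture:
    print(f"({p:.3f} , {x})")
```

With these assumed sizes the weights are 0.4 and 0.6, so the shared outcome -1 ends up with probability 0.4 × 0.1 + 0.6 × 0.05 = 0.07. Outcomes that appear in only one distribution keep just their own weighted probability.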

2. What is the purpose of combining two distributions with different sample sizes?

The purpose of combining two distributions with different sample sizes is to describe the pooled population that both samples came from. Weighting by sample size makes every individual observation count equally, so the combined distribution is the same one you would get by pooling the raw data from both samples and tabulating it once.

3. Can you combine distributions with different sample sizes if they have different means and standard deviations?

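Yes, you can combine distributions with different sample sizes even if they have different means and standard deviations. The mean of the mixture is the weighted average of the two means. The variance is not simply the weighted average of the two variances, because the gap between the two means adds extra spread. If the two means are far apart, the mixture can be bimodal, in which case a single mean and standard deviation describe it poorly.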
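For reference, these are the standard formulas for the mean and variance of a two-component mixture. The symbols are introduced here and do not appear in the thread: w1 and w2 are the weights, and each component has mean μi and variance σi².

$$
w_i = \frac{n_i}{n_1 + n_2}, \qquad
\mu = w_1 \mu_1 + w_2 \mu_2, \qquad
\sigma^2 = w_1\left(\sigma_1^2 + (\mu_1 - \mu)^2\right) + w_2\left(\sigma_2^2 + (\mu_2 - \mu)^2\right)
$$

The terms (μi − μ)² are the extra spread that comes from the component means differing from the overall mean; they vanish only when both components have the same mean.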

4. How does the sample size affect the combined distribution?

Each distribution's weight is its share of the total sample size, so the larger sample has proportionally more influence on the result. For example, if one sample is three times the size of the other, it gets weight 0.75 and the smaller one gets weight 0.25. If the two samples are the same size, the combination reduces to a plain average of the probabilities.

5. Are there any limitations to combining distributions with different sample sizes?

Yes, there are some limitations. If the two samples come from different underlying populations, the mixture describes the pooled data but not either population on its own, and its shape (for example, two separate peaks) may be hard to interpret. Weighting by sample size also assumes that the sample sizes reflect how much each source really contributes; if one source was simply sampled more heavily, weights based on some other measure of each source's share may be more appropriate.
