Estimate the source of a variable when there are several distributions


Discussion Overview

The discussion revolves around estimating the source of a variable, specifically the probability that a measured dielectric property value originated from a specific injection volume in a series of experiments. The context includes statistical methods for analyzing overlapping normal distributions resulting from different fluid injection volumes.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant suggests using maximum likelihood estimation (MLE) as a suitable method for the problem.
  • Another participant proposes Bayesian inference, especially if prior information about the injection volumes is available.
  • A different viewpoint emphasizes the use of a minimum variance estimator, specifically mentioning the Best Linear Unbiased Estimator (BLUE) in the context of linear processes.
  • One participant questions how to calculate the probability that observed data came from a specific normal distribution after applying MLE, providing an example of estimated parameters.
  • Another participant mentions applying linear regression to estimate the injection volume and suggests that Bayesian inference could yield a true probability.
  • A later reply raises concerns about the well-defined nature of the mathematical question, suggesting that the interpretation of the measured value can affect the probability estimation.
  • This participant emphasizes the importance of knowing the prior distribution of injection volumes to compute probabilities accurately.

Areas of Agreement / Disagreement

Participants express multiple competing views on the appropriate statistical methods to use, including MLE and Bayesian inference. The discussion remains unresolved regarding the best approach to estimate the probability of the injection volume based on the observed dielectric property.

Contextual Notes

There are limitations regarding the assumptions made about the linearity of the process and the distribution of injection volumes, which are not fully defined in the discussion.

uzi kiko
Hello everyone

In my study, I inject a different amount of fluid in each experiment, such as 1 ml, 2 ml..., and test the change in the general dielectric properties of the solution.

Now that I have done many (over 100) measurements for each injection volume, one can see that for each volume the values of the new dielectric properties follow a normal distribution with a specific mean and variance.

For example: for an injection of 1 ml, the results are distributed with μ1 and σ1, and for an injection of 2 ml we get μ2 and σ2. There is some overlap between the various distributions.

Now suppose I get an X value of dielectric property, what is the correct way to estimate the probability that the value X came from a specific injection volume?

Seemingly, I could use the Z test for each distribution separately, but it seems to me that the Z test does not take into account the other distributions.

Thank you!
 
This sounds like an application best suited for maximum likelihood
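As a rough illustration of the maximum likelihood idea for a single measurement: evaluate the normal density of X under each volume's fitted distribution and pick the volume with the largest value. This is only a minimal sketch. The function names and the parameter values (N(0,1) for 1 ml, N(1,1) for 2 ml) are hypothetical placeholders, not measured data.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def most_likely_volume(x, params):
    """Return the volume whose fitted normal gives x the highest density.

    params maps volume (ml) -> (mu, sigma); values here are illustrative.
    """
    return max(params, key=lambda v: normal_pdf(x, *params[v]))

# Hypothetical fitted parameters: volume in ml -> (mean, standard deviation)
params = {1: (0.0, 1.0), 2: (1.0, 1.0)}

print(most_likely_volume(0.2, params))  # closer to the 1 ml mean
print(most_likely_volume(0.9, params))  # closer to the 2 ml mean
```

Note that this picks the most likely source but does not by itself give a probability for it.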
 
Maybe Bayes?
 
WWGD said:
Maybe Bayes?
Good idea, particularly if there is some prior information about which volumes are more probable.
 
I'm always in favor of Bayesian inference if you figure out how to do it. Gaussian distributions give you priors that are fairly easy to manipulate. If not, and because I'm pretty sure that your process is linear, then a minimum variance estimator is a good fallback. Look up BLUE (Best Linear Unbiased Estimator) as used in, e.g., "spectral unmixing."
 
Thank you very much for the super fast and great answers!
They are very helpful.

Let's say that I used MLE and came out with the most likely model.
For example, let's say that for an injection of 1 ml the values are distributed as X~N(0,1) and for 2 ml as X~N(1,1).

Now, after I calculated the MLE for the newly observed data, I came out with estimates of μ and σ; let's say, for the example, μ=1.1 and σ=0.9.

My question is, how can I calculate the probability that the observed data came from N(1,1)?
 
Assuming the problem is linear, applying linear regression to the mean values gives you a first-order estimate. In the example above, the estimated volume would be 2.1 ml (not N(1,1)), and confidence (probability) can be inferred from the correlation coefficient. You need to first verify from your data that your system is linear.

To get a true probability, you could try applying Bayesian inference. There are other tests as well, but someone versed in classical statistics would know better than I.
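A minimal sketch of what the Bayesian route could look like for a whole sample of new observations, assuming the observations are independent, the candidate distributions are N(0,1) for 1 ml and N(1,1) for 2 ml as in the example above, and the priors are strictly positive. The data values, the equal priors, and the function names are all made up for illustration (the sample is chosen so its mean is 1.1).

```python
import math

def log_likelihood(data, mu, sigma):
    """Log-likelihood of independent observations under N(mu, sigma^2)."""
    const = -math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
    return sum(const - 0.5 * ((x - mu) / sigma) ** 2 for x in data)

def posterior_from_sample(data, params, priors):
    """Posterior probability of each volume given the whole sample.

    params maps volume -> (mu, sigma); priors maps volume -> prior
    probability (assumed > 0). Works in log space to avoid underflow.
    """
    log_w = {v: math.log(priors[v]) + log_likelihood(data, *params[v])
             for v in params}
    shift = max(log_w.values())
    unnorm = {v: math.exp(lw - shift) for v, lw in log_w.items()}
    total = sum(unnorm.values())
    return {v: u / total for v, u in unnorm.items()}

# Hypothetical inputs: candidate distributions, equal priors, invented data
params = {1: (0.0, 1.0), 2: (1.0, 1.0)}
priors = {1: 0.5, 2: 0.5}
data = [0.3, 1.8, 1.2, 0.9, 1.3]

post = posterior_from_sample(data, params, priors)
print(post)
```

The more observations there are, the more sharply the posterior concentrates on one volume, which is why the sample size matters as much as the estimated μ and σ.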
 
uzi kiko said:
In my study, I inject a different amount of fluid in each experiment, such as 1 ml, 2 ml..., and test the change in the general dielectric properties of the solution.

Now suppose I get an X value of dielectric property, what is the correct way to estimate the probability that the value X came from a specific injection volume?

This is not a "well defined" mathematical question. (It's similar to asking how one finds the remaining sides of a triangle when given the length of one side and the size of one angle.)

One way to interpret "I get an X value of a dielectric property" is that you pick the X value at random from all the X values that you measured in your experiments, giving all the measured values an equal probability of being chosen. Let's suppose you did the same number of experiments with a 2 ml concentration as with the other concentrations.

Another way to interpret "I get an X value of a dielectric property" is that another person who did experiments tells you the X value. He picks this value at random from the experiments that he did. Perhaps he did many more tests using a 2 ml concentration than the other concentrations.

The answer to "the probability the value X came from the specific injection volume" depends on knowing the probability distribution for how the injection volumes are selected. If you can specify a distribution for how the injection volumes are selected (a "prior distribution"), then we can compute things like "the probability the injection volume is 2 ml given X = 3.1".
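To make the dependence on the prior concrete, here is a small sketch using Bayes' rule for a single measurement: P(volume | X) is proportional to prior(volume) times the normal density of X under that volume's distribution. The distribution parameters and both sets of priors below are assumed purely for illustration, as are the function names.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def posterior(x, params, priors):
    """P(volume | X = x) via Bayes' rule.

    params maps volume -> (mu, sigma); priors maps volume -> P(volume).
    """
    weighted = {v: priors[v] * normal_pdf(x, *params[v]) for v in params}
    total = sum(weighted.values())
    return {v: w / total for v, w in weighted.items()}

# Hypothetical fitted distributions: volume in ml -> (mean, std deviation)
params = {1: (0.0, 1.0), 2: (1.0, 1.0)}

# Same measurement, two different assumptions about how volumes are chosen
equal = posterior(3.1, params, {1: 0.5, 2: 0.5})
skewed = posterior(3.1, params, {1: 0.9, 2: 0.1})

print(equal[2], skewed[2])
```

With the same X, the two priors give different probabilities for 2 ml, which is the point of the two interpretations described above.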
 
