Probability of Normal Distribution Generating a Sample

verdverm · Jan 3, 2012

I would like to know how to calculate the probability that a normal distribution generated a sample.

More specifically, I am clustering lines so I have several assumed normal distributions. Each cluster has a mean and variance/StdDev of both slope and length.

Given a set of clusters (normal distributions) AND a sample line,
I would like to be able to calculate the probabilities for each cluster.

I think it is something like:
P(L|C_i) = P_len(L.len|C_i) * P_slp(L.slp|C_i)
I don't know how to calculate the two RHS probabilities.

Thanks in advance,
Tony

Stephen Tashi · Jan 3, 2012

verdverm said:

I would like to know how to calculate the probability that a normal distribution generated a sample.

Unless you adopt a Bayesian approach, you can't calculate "the probability that a normal distribution generated a sample".

Judging by the remainder of your post, you may be able to calculate something that might be loosely interpreted as "the probability of a particular sample, given that we assume a particular normal distribution generated it." If that's what you meant, then we can discuss how to do it. First let's clarify what you are trying to do.

(The type of distinction you must make is between "The probability of A given B" versus "The probability of B given A". They aren't the same. )
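To make the distinction concrete, here is a minimal Go sketch of Bayes' rule for turning "probability of the line given the cluster" into "probability of the cluster given the line". The function name `posterior` and all numbers are hypothetical, chosen only for illustration; in particular the prior probabilities of the clusters are an assumption that has to come from somewhere.

```go
package main

import "fmt"

// posterior applies Bayes' rule:
//   P(C_i | L) = P(L | C_i) * P(C_i) / sum_j P(L | C_j) * P(C_j)
// likelihoods[i] is P(L | C_i) (or a density value) and priors[i] is P(C_i).
// It returns nil when the inputs differ in length or the evidence is zero.
func posterior(likelihoods, priors []float64) []float64 {
	if len(likelihoods) != len(priors) {
		return nil
	}
	evidence := 0.0
	for i := range likelihoods {
		evidence += likelihoods[i] * priors[i]
	}
	if evidence == 0 {
		return nil
	}
	post := make([]float64, len(likelihoods))
	for i := range likelihoods {
		post[i] = likelihoods[i] * priors[i] / evidence
	}
	return post
}

func main() {
	// Hypothetical numbers, chosen only to show that the two quantities differ.
	likelihoods := []float64{0.30, 0.10} // P(L | C_1), P(L | C_2)
	priors := []float64{0.20, 0.80}      // P(C_1), P(C_2)
	fmt.Println(posterior(likelihoods, priors))
}
```

With these made-up inputs the first cluster has the larger P(L | C) but the second has the larger P(C | L) (roughly 0.43 versus 0.57), which is exactly why the two must not be confused.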

verdverm · Jan 3, 2012

For a detailed specific reference: research.microsoft.com/pubs/144983/mod480-wang.pdf
( specifically the calculation of b_i(L) from section 3.3 )

I'm a little unclear on how the Bayesian comes into play...
perhaps because of the formula, perhaps because there are several clusters

a little clarification on the objective...

given a time series, I break it into line segments (Piecewise Linear Approximation).
Each line segment has a θ and a length.
Next I group the lines into clusters based on these values.
Then from each group/cluster we can calculate the mean and variance of the θ and length of the lines.

So at this point I have a bunch of clusters with 2 normal distributions each.
(one for θ and one for length) (joint probability?)

Now, given a new line, I want to associate a probability with each cluster.
This probability should encapsulate the likelihood that the cluster generated the new line.

Quote:
"(The type of distinction you must make is between "The probability of A given B" versus "The probability of B given A". They aren't the same.)"

I will always have the case of "The probability of LINE given CLUSTER"

Stephen Tashi · Jan 3, 2012

verdverm said:

This probability should encapsulate the likelihood that the cluster generated the new line.

I will always have the case of "The probability of LINE given CLUSTER"

Assuming you are attempting to define your goal, don't you see that these are contradictory statements?

Your first statement has the tortuous phrase "the probability should encapsulate the likelihood", but it amounts to saying that you want "the probability that a specific cluster generated the line given the data that defines the line". The second statement obviously refers to "the probability of the data that defines a line given the cluster that generated it".

The paper you mention assumes the reader is familiar with the context of applying "the Viterbi" algorithm. I'm not, but from a few minutes of Wikipedia surfing, this algorithm can be applied to data assumed to be from a Markov model. The Markov model has a vector of probabilities for its initial states. I suppose these might function as "prior probabilities" for a Bayesian analysis. Can you explain the probability model that the paper assumes?

verdverm · Jan 3, 2012

Not contradictory given that it is an iterative algorithm...

a Hidden Markov Model (HMM) has many states, each with:
- initial probability ( to start an observation series )
- transition probabilities ( to move from one state to another state ) { Matrix }
* output probability(ies) ( the probability of generating an observation )

the idea is to determine the hidden states of the model from the observations.

In the paper, instead of the points in time being the observations, the lines that approximate the data are the observations.

so my problem is with calculating the *output probabilities*

To initialize an iterative refinement, we first segment the series using the previously mentioned PLA
Then we cluster the lines created by PLA
Next, each cluster becomes a hidden state in an initial HMM (pHMM in the paper)
The output probabilities are calculated from the cluster of lines that is associated with the state (1-1 correspondence)

The output probabilities of a state are the {mean and variance} of the {angle and length}
of the lines that comprise the cluster. ( 4 values for the output in order to calculate probabilities later)

So now we get to the iterative refinement stage after creating an initial HMM...
-- Re-segment the time series under guidance of the initial HMM
( this is where my question arises from )

given a candidate line from the new segmentation,
for each state in the HMM,
*** measure how likely it is that this state generated the candidate line [ b_i(L) in section 3.3 ] ***

the measure is somehow related to the two Gaussian distributions from each state ( angle & length )
and the current candidate line under consideration

The HMM will remain constant through the course of the re-segmentation
The candidate line will always be a different 'sample'
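
One common way to write such a measure, assuming the angle and the length are independent within a state and each is Gaussian, is the product of two normal density values. Whether this matches the paper's exact definition of b_i(L) in section 3.3 is not confirmed here; the symbols \mu and \sigma (per-state mean and standard deviation) and \ell (length) are introduced only for this sketch.

```latex
b_i(L) \;=\;
\frac{1}{\sqrt{2\pi}\,\sigma_{\theta,i}}
\exp\!\left(-\frac{(\theta_L-\mu_{\theta,i})^2}{2\sigma_{\theta,i}^2}\right)
\cdot
\frac{1}{\sqrt{2\pi}\,\sigma_{\ell,i}}
\exp\!\left(-\frac{(\ell_L-\mu_{\ell,i})^2}{2\sigma_{\ell,i}^2}\right)
```

Note that this is a density value rather than a probability: it can exceed 1, and it is only meaningful for comparing states against each other for the same candidate line.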

b_i(L) is used as part of a larger computation to find a new, optimal segmentation given the current HMM

the iterative process continues by

until HMM doesn't change
-- resegment with current HMM
-- create new HMM from resegmentation

I could provide sample clusters and a single line if actual numbers are desired

Tony

chiro · Jan 3, 2012

verdverm said:

Not contradictory given that it is an iterative algorithm...

a Hidden Markov Model (HMM) has many states, each with:
- initial probability ( to start an observation series )
- transition probabilities ( to move from one state to another state ) { Matrix }
* output probability(ies) ( the probability of generating an observation )

the idea is to determine the hidden states of the model from the observations.

If this bears any resemblance to a standard Markov modeling problem (which it seems to), then given the initial probabilities and the transition matrix, what you need to do is find the steady-state solutions, which should correspond to the "output probabilities".

Is the above assumption correct or is there something else that we are missing?

verdverm · Jan 3, 2012

okay, i think people are looking too far into this...

the problem I am having is simply this:

given (possibly joint) gaussian probability distributions

pHMM
1 |339|
theta: 1.4544 0.2695
lens: 26.8225 6.2101
2 |24|
theta: 0.8524 0.1335
lens: 2.4693 0.5381
3 |72|
theta: -0.9516 0.2081
lens: 3.7492 0.8248
4 |21|
theta: 0.0000 0.0000
lens: 2.0000 0.0000
5 |24|
theta: -0.1932 0.2335
lens: 3.1475 0.1783
6 |21|
theta: 0.6506 0.3428
lens: 3.3084 0.0837

and given a line:

line
theta: 1.0
lens: 3.0

what is the probability that the line belongs to / was generated by / fits in with / ... each cluster:
1: ?
2: ?
3: ?
4: ?
5: ?
6: ?

I need the probability of the line with each of the clusters in the pHMM
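
The question above can be sketched in Go as the product of two normal density values per cluster. This is only one possible reading, under loudly stated assumptions: the second number on each row is taken to be a standard deviation (the post does not say whether it is a variance), theta and length are treated as independent within a cluster, and a cluster with zero spread (cluster 4) is given density 0. The names `cluster`, `normalPDF` and `lineDensity` are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// cluster holds the four numbers listed per state in the post.
// Assumption: the second number on each row is a standard deviation.
type cluster struct {
	thetaMean, thetaSD float64
	lenMean, lenSD     float64
}

// normalPDF returns the normal density at x.
// It returns 0 for a non-positive sd, which is one arbitrary way
// to handle the degenerate (zero-spread) case.
func normalPDF(x, mean, sd float64) float64 {
	if sd <= 0 {
		return 0
	}
	z := (x - mean) / sd
	return math.Exp(-0.5*z*z) / (sd * math.Sqrt(2*math.Pi))
}

// lineDensity multiplies the two densities, assuming theta and length
// are independent within a cluster.
func lineDensity(c cluster, theta, length float64) float64 {
	return normalPDF(theta, c.thetaMean, c.thetaSD) *
		normalPDF(length, c.lenMean, c.lenSD)
}

func main() {
	clusters := []cluster{
		{1.4544, 0.2695, 26.8225, 6.2101},
		{0.8524, 0.1335, 2.4693, 0.5381},
		{-0.9516, 0.2081, 3.7492, 0.8248},
		{0.0000, 0.0000, 2.0000, 0.0000},
		{-0.1932, 0.2335, 3.1475, 0.1783},
		{0.6506, 0.3428, 3.3084, 0.0837},
	}
	for i, c := range clusters {
		fmt.Printf("%d: %.6g\n", i+1, lineDensity(c, 1.0, 3.0))
	}
}
```

The printed values are densities, not probabilities, so they need not lie in [0,1]; they are useful for ranking clusters for a given line. Under these assumptions, cluster 2 yields the largest value for the line (theta 1.0, length 3.0).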

currently I am using a hack I think
( function of the standard deviations away from the mean with domain [-3,3] and range [0,1] )

func (s *pState) calcLineGenProb(length, theta float64) float64 {
    lDiff, tDiff := length-s.lMean, theta-s.tMean // difference from mean
    // NOTE: if lVari/tVari hold variances rather than standard deviations,
    // the divisors here should be math.Sqrt(s.lVari) and math.Sqrt(s.tVari)
    lNorm, tNorm := lDiff/s.lVari, tDiff/s.tVari // normalize to Std Deviations
    ret := calcZscore(lNorm) * calcZscore(tNorm) // calc hack ~= [0,1]*[0,1]
    return ret // close to 1 for a 'probable' line, close to zero for an 'unlikely' line
}

// hack helper function
func calcZscore(X float64) float64 {
    X = math.Abs(X) // only care about magnitude
    if X > 3.0 || math.IsNaN(X) { // also catches +Inf from a zero divisor
        return 0.000001
    }
    d := int(X * 100.0) // index scaling
    d1 := d / 10        // calc vert axis
    d2 := d % 10        // calc horz axis
    z := ZSCORE[d1][d2] // [0.5,0.9999] table of zscores from back of probability book
    R := (z - 0.5) * 2.0 // scale to range [0,0.5] then [0,1] so that close to mean is close to 0
    return 1.0 - R       // invert for close to 1 for good lines
}
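
For what it's worth, the table lookup in the helper above computes 1 - 2*(Phi(|z|) - 0.5), which equals the two-sided tail probability 2*(1 - Phi(|z|)). A minimal sketch of the same quantity without a table uses `math.Erfc` from the Go standard library. The function name `twoSidedTail` is hypothetical, and mapping NaN to 0 is a choice made for this sketch; unlike the helper above it does not clip at 3 standard deviations.

```go
package main

import (
	"fmt"
	"math"
)

// twoSidedTail returns P(|Z| >= |z|) for a standard normal Z,
// that is 2*(1 - Phi(|z|)), computed as erfc(|z| / sqrt(2)).
// A NaN input maps to 0 here, a choice made for this sketch.
func twoSidedTail(z float64) float64 {
	if math.IsNaN(z) {
		return 0
	}
	return math.Erfc(math.Abs(z) / math.Sqrt2)
}

func main() {
	for _, z := range []float64{0, 1, 2, 3} {
		fmt.Printf("z=%.0f tail=%.4f\n", z, twoSidedTail(z))
	}
}
```

This is a tail probability, which is a different measure from a density value: it always lies in [0,1] and equals 1 exactly at the mean, which may or may not be what the segmentation step needs.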
