Basic notation (conditional probability delim in linear equation)


Discussion Overview

The discussion revolves around the use of notation in conditional probability within the context of Bayesian prior distributions, specifically as presented in "Pattern Recognition and Machine Learning" by Bishop. Participants explore the implications of using different delimiters in mathematical functions and their interpretations in relation to curve fitting problems.

Discussion Character

  • Exploratory, Technical explanation, Debate/contested

Main Points Raised

  • One participant questions the use of a conditional probability delimiter in a linear function, seeking clarification on its meaning in the context of Bayesian prior distributions.
  • Another participant suggests that delimiters other than commas, such as semicolons and vertical bars, are sometimes used to clarify the roles of variables and parameters in mathematical expressions.
  • A different participant expresses skepticism about the appropriateness of the delimiter in the original equation, proposing an alternative interpretation based on the evaluation of a function.
  • Some participants discuss the Bayesian expression for conditional probability and its derivation, noting the roles of likelihood and prior distributions, as well as the presence of hyperparameters.
  • There is a reiteration of the original equation and its components, indicating a focus on the relationship between the prior and likelihood in Bayesian inference.

Areas of Agreement / Disagreement

Participants express differing views on the interpretation of the notation and the appropriateness of the delimiter used in the conditional probability expression. There is no consensus on the correct interpretation or application of the notation.

Contextual Notes

Participants reference specific mathematical expressions and their components, indicating a reliance on definitions and context that may not be universally agreed upon. The discussion includes unresolved interpretations of notation and its implications in Bayesian statistics.

dspiegel
Hey all.

Looking at "Pattern Recognition and Machine Learning" (Bishop, 2006), pp. 28-31, the author appears to be using what would ordinarily be the delimiter for a conditional probability inside an ordinary function. See the first argument of NormPDF below. This is in the context of defining a Bayesian prior distribution over polynomial coefficients in a curve-fitting problem.

[tex]p(\textbf{w} | \alpha) = NormPDF(\textbf{w} | \textbf{0}, \alpha^{-1}\textbf{I}) = \left(\frac{\alpha}{2\pi}\right)^{(M+1)/2} \exp\left(-\frac{\alpha}{2}\textbf{w}^T\textbf{w}\right)[/tex]

Can anybody shine some light on this for me please?

Many thanks.
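For reference, the closed-form expression above can be sanity-checked numerically: it is just the product of M+1 independent zero-mean univariate Gaussians with precision [itex]\alpha[/itex] (variance [itex]\alpha^{-1}[/itex]). A minimal sketch in Python/NumPy, with M and alpha chosen as arbitrary example values:

```python
import numpy as np

# Check that the closed-form prior density equals a product of
# independent zero-mean Gaussians with precision alpha.
# M (polynomial order) and alpha are arbitrary example values.
M, alpha = 3, 2.0
rng = np.random.default_rng(0)
w = rng.normal(size=M + 1)  # w has M+1 coefficients

# Closed form: (alpha/2pi)^((M+1)/2) * exp(-alpha/2 * w^T w)
closed = (alpha / (2 * np.pi)) ** ((M + 1) / 2) * np.exp(-0.5 * alpha * (w @ w))

# Product of univariate normal pdfs N(w_m | 0, alpha^{-1})
def norm_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

product = np.prod(norm_pdf(w, 0.0, 1.0 / alpha))
print(np.isclose(closed, product))  # the two expressions agree
```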
 
I don't know if this is precisely the case here, but sometimes delimiters other than the comma are used in functions. I have mostly seen semicolons (;) and vertical bars (|).
Often this is done to separate arguments by meaning. For example, an author may write
Consider a normal distribution with mean [itex]\mu[/itex] and standard deviation [itex]\sigma[/itex]. We define the probability of finding a value between a and b as [tex]P(a, b \mid \mu, \sigma)[/tex] ...
You can just as well write [tex]P(a, b, \mu, \sigma)[/tex]. However, writing a separate delimiter hopefully makes it clearer to the reader that a and b are really the variables here and that, though technically [itex]\mu[/itex] and [itex]\sigma[/itex] are variables as well, in this case they are more like parameters that have been fixed in advance (some arbitrary values for some normal distribution we are interested in).
 
CompuChip said:
I don't know if this is precisely the case here, but sometimes delimiters other than the comma are used in functions. I have mostly seen semicolons (;) and vertical bars (|).
Often this is done to separate arguments by meaning. For example, an author may write

You can just as well write [tex]P(a, b, \mu, \sigma)[/tex]. However, writing a separate delimiter hopefully makes it clearer to the reader that a and b are really the variables here and that, though technically [itex]\mu[/itex] and [itex]\sigma[/itex] are variables as well, in this case they are more like parameters that have been fixed in advance (some arbitrary values for some normal distribution we are interested in).

Thanks for your reply.

I agree that, in general, fixed (non-variable) parameters may be written after a semicolon, although I am quite sure that is not what is happening in this particular instance.

I believe it reads as "the value of [tex]t_n[/tex] evaluated for [tex]y(x_n, \textbf{w})[/tex]", as described at http://en.wikipedia.org/wiki/Vertical_bar#Mathematics.

Elsewhere, the likelihood of the parameters [tex]\{\textbf{w},\beta\}[/tex] is written for the i.i.d. data [tex]\{\textbf{x},\textbf{t}\}[/tex], where the function [tex]y(x, \textbf{w})[/tex] computes the predicted value of t.

[tex]p(\textbf{t}|\textbf{x},w,\beta) = \prod_{n=1}^N NormPDF(t_n|y(x_n, \textbf{w}),\beta^{-1})[/tex]



So it seems a reasonable interpretation in this context.
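Under that reading, each [tex]t_n[/tex] is scored under a normal density whose mean is the model prediction [tex]y(x_n, \textbf{w})[/tex] and whose variance is [tex]\beta^{-1}[/tex], and the likelihood is the product over n. A minimal numerical sketch of that reading (the polynomial coefficients, data, and beta below are made-up example values):

```python
import numpy as np

def y(x, w):
    # Polynomial model: sum_j w_j * x**j (np.polyval wants highest degree first).
    return np.polyval(w[::-1], x)

def norm_pdf(t, mean, var):
    return np.exp(-0.5 * (t - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

w = np.array([0.5, -1.0, 2.0])   # example coefficients
beta = 4.0                       # precision hyperparameter; variance is 1/beta
x = np.array([0.0, 0.5, 1.0])    # example inputs
t = np.array([0.4, 0.3, 1.6])    # example targets

# p(t | x, w, beta) = prod_n NormPDF(t_n | y(x_n, w), beta^{-1})
likelihood = np.prod(norm_pdf(t, y(x, w), 1.0 / beta))
print(likelihood)
```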
 
dspiegel said:
Hey all.

This is in the context of defining a Bayesian prior distribution over polynomial coefficients in a curve fitting problem.

[tex]p(\textbf{w} | \alpha) = NormPDF(\textbf{w} | \textbf{0}, \alpha^{-1}\textbf{I}) = \left(\frac{\alpha}{2\pi}\right)^{(M+1)/2} \exp\left(-\frac{\alpha}{2}\textbf{w}^T\textbf{w}\right)[/tex]

Can anybody shine some light on this for me please?

Many thanks.

I don't know what this is. The Bayesian expression for the conditional probability p(w|a) is:

p(w|a)=p(a|w)p(w)/p(a).
 
SW VandeCarr said:
I don't know what this is. The Bayesian expression for the conditional probability p(w|a) is:

p(w|a)=p(a|w)p(w)/p(a).

Well, there's a bit more to it. The formula you quoted is just the prior.

The derivation is as follows.

[tex]p(w|x,t,\alpha,\beta) = \frac{\text{likelihood} \times \text{prior}}{\text{marginal likelihood}}[/tex]

[tex]p(w|x,t,\alpha,\beta) \propto p(t|x,w,\beta) * p(w|\alpha)[/tex]

[tex]\{\alpha,\beta\}[/tex] are hyperparameters.
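Putting the two pieces together, the unnormalized posterior over w is just likelihood times prior. A minimal sketch, evaluated in log space for numerical stability (all data and hyperparameter values below are illustrative):

```python
import numpy as np

def log_norm_pdf(t, mean, var):
    # Log of the univariate normal density.
    return -0.5 * (t - mean) ** 2 / var - 0.5 * np.log(2 * np.pi * var)

def log_posterior_unnorm(w, x, t, alpha, beta):
    pred = np.polyval(w[::-1], x)                          # y(x_n, w)
    log_lik = np.sum(log_norm_pdf(t, pred, 1.0 / beta))    # log p(t|x,w,beta)
    log_prior = np.sum(log_norm_pdf(w, 0.0, 1.0 / alpha))  # log p(w|alpha)
    return log_lik + log_prior                             # ∝ log p(w|x,t,alpha,beta)

x = np.array([0.0, 0.5, 1.0])
t = np.array([0.4, 0.3, 1.6])
val = log_posterior_unnorm(np.array([0.5, -1.0, 2.0]), x, t, alpha=2.0, beta=4.0)
print(val)
```

The marginal likelihood only normalizes the posterior, which is why the proportionality above drops it.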
 
dspiegel said:
Well, there's a bit more to it. The formula you quoted is just the prior.

The derivation is as follows.

[tex]p(w|x,t,\alpha,\beta) = \frac{\text{likelihood} \times \text{prior}}{\text{marginal likelihood}}[/tex]

[tex]p(w|x,t,\alpha,\beta) \propto p(t|x,w,\beta) * p(w|\alpha)[/tex]

[tex]\{\alpha,\beta\}[/tex] are hyperparameters.

OK. I was going by the original equation, where the left side was simply [tex]p(\textbf{w}|\alpha)[/tex].
 
