# Joint probability of partitioned vectors

1. Jul 28, 2014

### scinoob

Hi everybody, I apologize if this question is too basic, but I spent a solid hour searching Google without finding an answer, and I'm stuck.

I'm reading Bishop's Pattern Recognition and Machine Learning and in the second chapter he introduces partitioned vectors. Say, if X is a D-dimensional vector, it can be partitioned like:

X = [Xa, Xb], where Xa consists of the first M components of X and Xb consists of the remaining D-M components.

I have no problem with this simple concept. Later in the same chapter he talks about conditional and marginal multivariate Gaussian distributions and he uses the notation p(Xa, Xb). I'm trying to understand how certain integrals involving this notation are expanded but I'm actually struggling to understand even this expression. It seems to suggest that we're denoting the joint probability of the components of Xa and the components of Xb. But those are just the components of X anyway!

What is the difference between P(Xa, Xb) and P(X)?

It would be more helpful for me if we considered a more concrete example. Say, X = [X1, X2, X3, X4] and Xa = [X1, X2] while Xb = [X3, X4]. Now, the joint probability P(X) would simply be P(X1, X2, X3, X4), right? What is P(Xa, Xb) in this case?

2. Jul 28, 2014

### mathman

My guess: in later chapters he discusses Xa and Xb as separate entities.

3. Jul 28, 2014

### gill1109

There is no difference between p(Xa, Xb) and p(X), because X = (Xa, Xb). It starts to get interesting when we introduce marginal and conditional probability densities, e.g. p(Xa | Xb) and p(Xb). By the product rule, p(Xa, Xb) = p(X) = p(Xa | Xb) p(Xb).
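Applied to the concrete example from the question (X = [X1, X2, X3, X4], Xa = [X1, X2], Xb = [X3, X4]), the same statement can be written out component by component. The lowercase symbols below are just the values taken by the corresponding components; this is a restatement of the identity above, not an additional result.

```latex
\begin{aligned}
p(\mathbf{x}_a, \mathbf{x}_b)
  &= p(x_1, x_2, x_3, x_4) = p(\mathbf{x}) \\
  &= p(\mathbf{x}_a \mid \mathbf{x}_b)\, p(\mathbf{x}_b)
   = p(x_1, x_2 \mid x_3, x_4)\, p(x_3, x_4)
\end{aligned}
```

So the partitioned notation only groups the arguments; the density itself is unchanged.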

You get p(Xb) from p(X) by integrating out Xa, i.e. p(Xb) = ∫ p(Xa, Xb) dXa.
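A small numerical check may make this less abstract. The sketch below assumes the simplest Gaussian case, D = 2 and M = 1, so Xa and Xb are each a single number; the mean and covariance values are made up for illustration and do not come from the book or this thread. It integrates the joint density over Xa and divides the joint by the marginal to get the conditional.

```python
import math

# Assumed illustrative parameters (not from the thread): D = 2, M = 1,
# so Xa and Xb are each one scalar component of X.
MU_A, MU_B = 1.0, -0.5
S_AA, S_BB, S_AB = 2.0, 1.5, 0.8  # entries of the 2x2 covariance matrix


def normal_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def joint_pdf(xa, xb):
    """p(xa, xb), which is the same function as p(x) with x = (xa, xb)."""
    det = S_AA * S_BB - S_AB ** 2
    da, db = xa - MU_A, xb - MU_B
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu) for a 2x2 covariance.
    quad = (S_BB * da * da - 2 * S_AB * da * db + S_AA * db * db) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))


def marginal_b_numeric(xb, lo=-30.0, hi=30.0, n=20000):
    """p(xb), obtained by integrating the joint over xa (midpoint rule)."""
    h = (hi - lo) / n
    return h * sum(joint_pdf(lo + (i + 0.5) * h, xb) for i in range(n))


def conditional_a_given_b(xa, xb):
    """p(xa | xb) = p(xa, xb) / p(xb)."""
    return joint_pdf(xa, xb) / normal_pdf(xb, MU_B, S_BB)


if __name__ == "__main__":
    xb = 0.3
    print("integrated joint:", marginal_b_numeric(xb))
    print("Gaussian marginal:", normal_pdf(xb, MU_B, S_BB))
    print("p(xa=0.7 | xb=0.3):", conditional_a_given_b(0.7, xb))
```

The first two printed numbers should agree closely, since the marginal of a Gaussian over Xb is itself a Gaussian with mean MU_B and variance S_BB. The conditional should match a Gaussian with mean MU_A + (S_AB / S_BB)(xb - MU_B) and variance S_AA - S_AB² / S_BB, which is the scalar form of the conditional result for partitioned Gaussians.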

NB: use a lowercase "p" for a probability density and a capital "P" for a probability.