I'm posting this here because I feel it's more about probability than image processing.

I'm reading this lecture PDF.
At the end of page 1 and the beginning of page 2, it says:

Is this a bit vague, or am I missing something?
Since we are calculating the frequency of the value a in the pixels of image A, when was any information about position introduced?

Or does it mean that for *any* pixel in image A that contains the value a, check if the pixel(s) in image B at the corresponding position have the value b?

If this is too specific or too dependent on image processing, can you point me to some material to build my intuition about joint probability distributions?

Let's say the images are
[tex] A = \begin{pmatrix} 1&1& 2 \\ 2&7&1 \end{pmatrix} [/tex]
and
[tex] B = \begin{pmatrix} 1&2&2 \\ 2&1&2 \end{pmatrix} [/tex]

Although there are (3)(3) = 9 possible ordered pairs of pixel values to consider, there are only 6 actual match-ups, one for each pixel position.
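To make the pairing by position concrete, here is a minimal Python sketch that counts how often each pair (a, b) occurs at the same position in the two example images and divides by the number of pixels. The function name `joint_distribution` and the nested-list representation of the images are my own choices for illustration, and this is only one reading of what the lecture means by the joint distribution.

```python
from collections import Counter
from fractions import Fraction

# The two example images, stored row by row.
A = [[1, 1, 2],
     [2, 7, 1]]
B = [[1, 2, 2],
     [2, 1, 2]]

def joint_distribution(img_a, img_b):
    """Relative frequency of each (a, b) pair found at the same pixel position.

    Assumes both images have the same shape.
    """
    # Pair up the pixel of img_a with the pixel of img_b at each position.
    pairs = [(a, b)
             for row_a, row_b in zip(img_a, img_b)
             for a, b in zip(row_a, row_b)]
    counts = Counter(pairs)
    n = len(pairs)  # one match-up per pixel position
    return {pair: Fraction(count, n) for pair, count in counts.items()}

joint = joint_distribution(A, B)
for (a, b), p in sorted(joint.items()):
    print(f"P(A={a}, B={b}) = {p}")

# Summing the joint probabilities over b gives back the plain frequency of a in A.
marginal_a1 = sum(p for (a, _), p in joint.items() if a == 1)
print(f"P(A=1) = {marginal_a1}")
```

Only four distinct pairs actually occur here: (1, 1) and (7, 1) each with probability 1/6, and (1, 2) and (2, 2) each with probability 1/3. Summing over b for a = 1 gives 1/2, which matches the three 1s among the six pixels of A.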

As to whether we should call this "the" joint distribution of two images, that is merely the terminology those authors chose to use. There are many ways to take properties of images and define joint distributions.