Niles said:
Why is it more probable to detect coincident photons of a single mode thermal light than in coherent light?
This is a consequence of the noise (the variance) of the underlying photon number distribution. You can describe the momentary photon number as \langle n \rangle +\Delta, the sum of the mean photon number and a fluctuation. After reordering the operators, g^{(2)}(0) is defined as:
g^{(2)}(0)=\frac{\langle n (n-1) \rangle}{\langle n \rangle^2}=\frac{\langle (\langle n \rangle+\Delta) (\langle n \rangle +\Delta-1) \rangle}{\langle n \rangle^2}
which is
\frac{\langle \langle n \rangle^2 + 2 \Delta \langle n \rangle + \Delta^2 -\langle n \rangle -\Delta \rangle}{\langle n \rangle^2}
The expectation values of all terms linear in \Delta vanish, because the deviation from the mean value is zero on average, \langle \Delta \rangle = 0. What is left is:
g^{(2)}(0)=\frac{\langle n\rangle^2 +\langle \Delta^2 \rangle -\langle n \rangle}{\langle n \rangle^2}=1-\frac{1}{\langle n \rangle }+\frac{\langle \Delta^2 \rangle}{\langle n \rangle^2}
The interesting term is \langle \Delta^2 \rangle, which is the variance of the photon number distribution. For coherent light this is a Poissonian distribution, while it is a Bose-Einstein distribution for single-mode thermal light. The variances of these distributions are well known: \langle \Delta^2 \rangle=\langle n\rangle for a Poissonian distribution and \langle \Delta^2 \rangle=\langle n \rangle^2 +\langle n\rangle for the Bose-Einstein distribution. Inserting those into the equation gives 1-\frac{1}{\langle n \rangle}+\frac{1}{\langle n \rangle}=1 for coherent light and 1-\frac{1}{\langle n \rangle}+1+\frac{1}{\langle n \rangle}=2 for thermal light, the well-known values.
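If you want to check these two values numerically, here is a minimal Python sketch. It builds truncated Poissonian and Bose-Einstein photon number distributions and evaluates g^{(2)}(0) both from \langle n(n-1) \rangle and from the variance. The function names, the mean photon number of 0.5 and the cutoff at 200 photons are my own arbitrary choices for illustration.

```python
import math

def poisson_pmf(mean, n_max):
    """Poissonian photon number distribution, truncated at n_max (assumed cutoff)."""
    probs = [math.exp(-mean)]
    for n in range(1, n_max + 1):
        # P(n) = P(n-1) * mean / n avoids computing large factorials
        probs.append(probs[-1] * mean / n)
    return probs

def bose_einstein_pmf(mean, n_max):
    """Bose-Einstein (single-mode thermal) distribution, truncated at n_max."""
    ratio = mean / (1.0 + mean)
    return [ratio**n / (1.0 + mean) for n in range(n_max + 1)]

def g2_from_pairs(probs):
    """g2(0) = <n(n-1)> / <n>^2, evaluated directly from the distribution."""
    mean = sum(n * p for n, p in enumerate(probs))
    pairs = sum(n * (n - 1) * p for n, p in enumerate(probs))
    return pairs / mean**2

def g2_from_variance(probs):
    """g2(0) = 1 - 1/<n> + <Delta^2>/<n>^2, using the variance <Delta^2>."""
    mean = sum(n * p for n, p in enumerate(probs))
    variance = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
    return 1.0 - 1.0 / mean + variance / mean**2

mean_n = 0.5   # arbitrary mean photon number for the illustration
n_max = 200    # cutoff; the neglected tail is negligible for this mean

coherent = poisson_pmf(mean_n, n_max)
thermal = bose_einstein_pmf(mean_n, n_max)

g2_coherent = g2_from_pairs(coherent)
g2_thermal = g2_from_pairs(thermal)

print("coherent:", g2_coherent, g2_from_variance(coherent))
print("thermal: ", g2_thermal, g2_from_variance(thermal))
```

Both ways of evaluating g^{(2)}(0) should agree up to rounding, and changing mean_n should not change the result for either distribution, since the values 1 and 2 do not depend on the mean photon number.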
Niles said:
Also, I cannot see how it even make sense to have g(2)(tau) > 1 for some fields. Wasn't g(2)(tau) a probability?
It is a relative or normalized probability. In detail, it is the probability to detect a second photon at a time delay \tau after the first photon was detected, compared to the same probability if the photons were emitted statistically independently of each other. In other words: If you take some snapshots (integration time much shorter than the coherence time), count the total photon number inside them and get on average one photon every 6 pictures, you would expect to detect a pair of photons every 36 pictures on average. If you have the same mean photon count rate, but detect a pair every 18 pictures on average (and accordingly more pictures without any photons inside), you have a twofold increase, which would be the signature of thermal light.
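The snapshot numbers can be reproduced with a few lines of Python. This is only a sketch under one assumption of mine: "a pair every N pictures" is read as the mean number of photon pairs per picture, \langle n(n-1) \rangle = g^{(2)}(0) \langle n \rangle^2, with \langle n \rangle = 1/6. The variable names are made up for the example. It also compares the fraction of empty pictures for a Poissonian and a Bose-Einstein distribution with the same mean.

```python
import math
from fractions import Fraction

mean_n = Fraction(1, 6)  # on average one photon every 6 pictures

# Mean number of pairs per picture, <n(n-1)> = g2(0) * <n>^2
pairs_independent = 1 * mean_n**2   # g2(0) = 1: statistically independent photons
pairs_thermal = 2 * mean_n**2       # g2(0) = 2: single-mode thermal light

pictures_per_pair_independent = 1 / pairs_independent
pictures_per_pair_thermal = 1 / pairs_thermal

# Probability of an empty picture, P(0), for the same mean photon number
p_empty_coherent = math.exp(-float(mean_n))   # Poissonian: exp(-<n>)
p_empty_thermal = 1.0 / (1.0 + float(mean_n)) # Bose-Einstein: 1 / (1 + <n>)

print("pictures per pair, independent:", pictures_per_pair_independent)
print("pictures per pair, thermal:    ", pictures_per_pair_thermal)
print("empty pictures, coherent:", p_empty_coherent)
print("empty pictures, thermal: ", p_empty_thermal)
```

The thermal case gives twice as many pairs at the same mean count rate, and this is compensated by a slightly larger fraction of pictures with no photons at all.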