For my explanation, you need to know some elementary QM. The probability to go from an initial state to a final state is the squared modulus of the probability amplitude. This probability amplitude is denoted <f|i>, so P(i→f)=|<f|i>|². The total probability amplitude is the sum of the amplitudes over all indistinguishable paths between the initial state and the final state. If the paths are distinguishable, then it is the probabilities that are summed. Now, let's set up the double slit in the following way:
Let |i> represent the initial state at the source, and |f> represent the state at the detector. We denote the particle passing through slit one by the state |1>, and the particle passing through slit two by |2>.
The probability amplitude for the particle to be emitted by the source and to go through slit one is <1|i>. The probability amplitude for the particle to travel from slit one to the detector is <f|1>. Thus the probability amplitude for the particle to travel from the source, through slit one, then to the detector, is <f|1><1|i>. The same goes for slit two, just replace |1> with |2>.
With no way of telling whether the particle passed through slit one or slit two, the probability amplitude is <f|i>=<f|1><1|i>+<f|2><2|i>=A₁+A₂. Then the probability to go from source to detector is P(i→f)=|A₁+A₂|²=|A₁|²+|A₂|²+2|A₁||A₂|cos(Δψ), where Δψ is the relative phase between the two amplitudes. The last term, 2|A₁||A₂|cos(Δψ), is the interference term.
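For anyone who wants the expansion of the squared modulus spelled out, here is a short derivation. It assumes each amplitude is written in polar form, A_k = |A_k| e^{iψ_k}, with Δψ = ψ_1 − ψ_2; the individual phases ψ_1 and ψ_2 are labels I am introducing only for this step.

```latex
\begin{align}
P(i \to f) &= |A_1 + A_2|^2 = (A_1 + A_2)^{*}\,(A_1 + A_2) \\
           &= |A_1|^2 + |A_2|^2 + A_1^{*} A_2 + A_1 A_2^{*} \\
           &= |A_1|^2 + |A_2|^2 + 2\,\mathrm{Re}\!\left(A_1 A_2^{*}\right) \\
           &= |A_1|^2 + |A_2|^2 + 2\,|A_1|\,|A_2|\cos(\Delta\psi)
\end{align}
```

The two cross terms are complex conjugates of each other, which is why their sum is real and reduces to a cosine of the relative phase.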
Now, if you put some particle detection mechanism at slit one, you CAN distinguish between the two paths from initial state to final state. This means that we may no longer add probability amplitudes, but must rather add the probabilities, just like in classical mechanics. Explicitly then, P(i→f)=P(i→1→f)+P(i→2→f)=|<f|1><1|i>|²+|<f|2><2|i>|²=|A₁|²+|A₂|². No interference term. All because you can distinguish the path.
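If it helps to see numbers, here is a minimal Python sketch comparing the two rules: add amplitudes then square (indistinguishable paths) versus square then add (distinguishable paths). The function names and the sample amplitudes (modulus 0.5 each, relative phase π/3) are made up for illustration and do not come from any real experiment.

```python
import cmath
import math


def detection_probabilities(a1, a2):
    """Return (P_indistinguishable, P_distinguishable) for two path amplitudes.

    a1 and a2 stand for the complex amplitudes <f|1><1|i> and <f|2><2|i>.
    """
    p_indistinguishable = abs(a1 + a2) ** 2           # add amplitudes, then square
    p_distinguishable = abs(a1) ** 2 + abs(a2) ** 2   # square, then add probabilities
    return p_indistinguishable, p_distinguishable


def interference_term(a1, a2):
    """Return 2|A1||A2|cos(delta_psi), with delta_psi the relative phase."""
    delta_psi = cmath.phase(a1) - cmath.phase(a2)
    return 2 * abs(a1) * abs(a2) * math.cos(delta_psi)


# Hypothetical amplitudes chosen only for illustration.
a1 = cmath.rect(0.5, 0.0)
a2 = cmath.rect(0.5, math.pi / 3)

p_int, p_wp = detection_probabilities(a1, a2)
print("paths indistinguishable:", p_int)
print("paths distinguishable:  ", p_wp)
print("difference:             ", p_int - p_wp)
print("interference term:      ", interference_term(a1, a2))
```

The difference between the two probabilities should match the interference term up to floating-point rounding. Setting a2 = -a1 gives complete destructive interference in the first case, while the second case is unaffected by the phase.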
That's the formal/mathematical explanation. The physical explanation is given elsewhere in this thread.