The most direct way to illustrate this is with a single-slit measurement. Let's say you have a source of classical particles that emits them one at a time on demand, each with the same velocity and kinetic energy. At some distance from this source is a single slit. For clarity's sake, let's say the slit is aligned along the x-direction, so that the width of the slit is along the y-direction. The x and y coordinate axes are oriented so that (using a right-handed coordinate system) the z-axis is along the direction of propagation of the particles. So the z-axis points from the source to the slit, and beyond.
Beyond the slit is a screen of detectors (could be a photographic plate, a CCD, etc.). This screen records where each particle hits after it passes through the slit. Let's say this screen is a distance L beyond the slit.
Now, let's get some basics out of the way:
1. If a particle gets through the slit, then I can say that my knowledge of the position of the particle at the slit has an uncertainty equal to the width of the slit. Thus, the width of the slit \Delta y is the uncertainty in the position of the particle when it passes through the slit.
2. The y-component of the momentum of the particle can be found by looking at how far the particle drifts along the y-direction by the time it hits the screen. This makes the explicit assumption that no external forces act on the particle at and after the slit, so that its momentum remains constant from the slit to the screen (which is a reasonable assumption). Let's say the particle drifts away from the center, straight-through line and hits the screen at a distance Y. If it takes the particle a time T to reach the screen (which we can assume to be a constant if the screen distance from the slit is much larger than the width of the slit, i.e. L >> \Delta y), then the y-component of the momentum is p_y \propto Y/T. Now, there is a measurement uncertainty here in determining exactly where the particle hits the detector. This measurement uncertainty depends on the resolution of the detector, how fine the "mesh" is, etc. But this is NOT the "uncertainty" that is meant in the HUP. We haven't gotten to the uncertainty of the momentum YET. All we have is a measurement of the y-component of the momentum of the particle.
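As a rough illustration of point 2, here is a minimal sketch of how a single hit position Y turns into a value of p_y. It assumes a non-relativistic particle in free flight, so that T = L / v_z and p_y = m Y / T. The function name and the numbers below are made up for the example; they are not from any particular experiment.

```python
def transverse_momentum(mass, v_z, L, Y):
    """Estimate p_y (kg*m/s) from the landing position Y on the screen.

    Assumes free flight between slit and screen (no forces), a
    non-relativistic particle, and L much larger than the slit width.
    """
    T = L / v_z        # time of flight from slit to screen
    v_y = Y / T        # constant transverse velocity during the flight
    return mass * v_y

# Hypothetical numbers: a 1 g pellet at 10 m/s, screen 2 m away,
# landing 1 cm off the straight-through line.
p_y = transverse_momentum(mass=1e-3, v_z=10.0, L=2.0, Y=0.01)
print(p_y)  # about 5e-5 kg*m/s
```

Note that this gives one value of p_y per particle. The spread of many such values is what the discussion is building towards.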
Now, let's do the experiment with the classical particles. You shoot the particles one at a time and record where each one hits the screen. Ideally, what you will end up with on the screen is only one spot where the particles that pass through the slit hit. However, closer to reality is that you end up with a gaussian distribution on the screen, where the peak lies directly along the straight-through direction that has zero y-component of momentum. The uncertainty in the momentum then corresponds to the width of the gaussian distribution (full width at half maximum). Now THIS is \Delta p_y as referred to in the HUP!
Let's make the width of the slit smaller. This means \Delta y is smaller. You are now allowing only a smaller range of angles of incidence from the source to get through the slit. This means that there will be a smaller spread detected on the screen. The gaussian distribution will be thinner. So classically, what we expect is that as \Delta y gets smaller, \Delta p_y also correspondingly becomes smaller.
This is what we expect in classical mechanics. If all the initial conditions remain identical (I have the same source), then the more I know where the object is at any given instant, the better I can predict its subsequent properties. I can say what its y-momentum will be with increasing accuracy as I increase my certainty of its position by decreasing the width of the slit. I can easily predict where the next particle is going to hit since I will know what its momentum is going to be after it passes through a very narrow slit. My ability to predict such things increases with decreasing slit width.
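The classical expectation can be checked with a toy simulation. This is only a sketch under simplifying assumptions that are mine, not part of the setup above: the source is an ideal point on the z-axis a distance D before the slit, particles travel in straight lines, and they cross the slit uniformly across its width. This gives a flat-topped rather than gaussian profile, but the trend is the same. The name classical_spread and all numbers are hypothetical.

```python
import random
import statistics

def classical_spread(slit_width, D=1.0, L=1.0, n=20000, seed=0):
    """Standard deviation of hit positions Y for straight-line particles.

    Toy model: on-axis point source a distance D before the slit,
    screen a distance L after it, uniform crossing position in the slit.
    """
    rng = random.Random(seed)
    hits = []
    for _ in range(n):
        y0 = rng.uniform(-slit_width / 2, slit_width / 2)  # crossing point in the slit
        slope = y0 / D               # direction fixed by source and crossing point
        hits.append(y0 + slope * L)  # straight-line flight to the screen
    return statistics.pstdev(hits)

widths = [1e-3, 5e-4, 1e-4]          # slit widths in metres
spreads = [classical_spread(w) for w in widths]
for w, s in zip(widths, spreads):
    print(f"slit width {w:.1e} m -> spread on screen {s:.2e} m")
```

In this model the spread on the screen shrinks in direct proportion to the slit width, which is the classical behaviour described above.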
Fine, but what happens with a quantum particle such as a photon, electron, neutron, etc.?
We need to consider two different cases. If the slit width is considerably larger than the de Broglie wavelength of the particle (or in the case of a photon, its wavelength), then what you have is simply the image of the slit itself. The ideal situation would give you simply a "square" or gaussian distribution at the screen of the intensity of particles hitting the detector. This is no different from the classical case.
It gets interesting as you decrease the slit width. By the time the width of the slit is comparable to the de Broglie wavelength, something strange happens. On the screen, the spread of the particles being detected starts expanding! In fact, the smaller you make the slit width, the larger the range of values for Y that you detect. The "gaussian spread" is now becoming fatter and fatter. This is the single-slit diffraction pattern that everyone is familiar with.
Now THIS is the uncertainty principle at work. The slit width, and thus \Delta y, is getting smaller. This implies that \Delta p_y is getting larger. Take note that the measurement uncertainty in a single detection event is still the same as in the classical case. If I shoot the particles one at a time, I still see a distinct, accurate "dot" on the screen to tell me that this is where the particle hits the detector. However, unlike the classical case, my ability to predict where the NEXT one is going to hit becomes worse as I make the slit smaller. As the slit narrows and \Delta y becomes smaller and smaller, I know less and less where the particle is going to hit the screen. Thus, my knowledge of its y-component of the momentum correspondingly becomes more uncertain.
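To put rough numbers on this, here is a minimal sketch using the standard far-field (Fraunhofer) single-slit result that the first diffraction minimum sits at sin(theta) = wavelength / slit width. As an order-of-magnitude assumption, it takes the momentum spread to be p sin(theta) at that first minimum, with p = h / wavelength. This is a convenient stand-in for the width of the central peak, not the full width at half maximum. The function names and the 500 nm wavelength are illustrative choices.

```python
import math

H = 6.62607015e-34  # Planck constant in J*s

def first_minimum_offset(slit_width, wavelength, L):
    """Distance on the screen from the axis to the first diffraction minimum."""
    sin_theta = wavelength / slit_width   # requires slit_width > wavelength
    return L * math.tan(math.asin(sin_theta))

def delta_p_y(slit_width, wavelength):
    """Rough momentum spread: p * sin(theta) at the first minimum."""
    p = H / wavelength                    # de Broglie relation
    return p * (wavelength / slit_width)

wavelength = 500e-9                       # hypothetical 500 nm wavelength
L = 1.0                                   # screen distance in metres
widths = [50e-6, 10e-6, 2e-6]             # progressively narrower slits
half_widths = [first_minimum_offset(a, wavelength, L) for a in widths]
products = [a * delta_p_y(a, wavelength) for a in widths]
for a, Y, prod in zip(widths, half_widths, products):
    print(f"slit {a:.1e} m: central peak half-width {Y:.3e} m, "
          f"dy * dp_y = {prod:.3e} J*s")
```

With this estimate the central peak widens as the slit narrows, while the product \Delta y \Delta p_y stays pinned at roughly h: squeezing \Delta y forces \Delta p_y up.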