Let's start from scratch.
First of all, one has to unlearn the wrong idea that a photon is some localized (or localizable) massless point particle. That picture from very early quantum theory (Einstein 1905) has been outdated since 1925, when one learned that the quantum phenomenology can be described by relativistic quantum field theory (to this day that's the only theory we have for that phenomenology, and it's among the best theories ever in being consistent with all observations at very high precision).
It's way better to think about a single photon, which is a very specific state of the electromagnetic field (a so-called one-"particle" Fock state), as an electromagnetic wave with the lowest intensity possible for a given frequency. It behaves in all aspects like such an electromagnetic wave except when it is detected, because then it can make exactly one excitation of the material in a detector. This means that whenever you detect a single-photon state of the electromagnetic field, you can detect it only once. It's not possible to detect "half a photon" at one place and the other half at another place. This implies that if you use a photoplate (or, more modern, a CCD camera), a single-photon state of the em. field will be detected at one place, i.e., it leaves one spot on the photoplate or CCD camera.
If you want to demonstrate the wave nature of light, i.e., of an electromagnetic wave, you must use a coherent source, i.e., light with a pretty well defined wave vector (and thus also a pretty well defined frequency). In an idealized limit it's something close to a plane wave of a given frequency and wave vector. If you use such light to illuminate a double slit, you'll see a nice interference pattern behind the slits.
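For orientation, here is the standard far-field (Fraunhofer) expression for the double-slit interference pattern of such a nearly plane wave. The symbols are assumptions introduced only for this sketch: ##\lambda## is the wavelength, ##d## the centre-to-centre distance of the slits, ##a## the width of each slit, and ##\theta## the angle of observation measured from the forward direction:

$$I(\theta) = I_0 \, \cos^2\!\left(\frac{\pi d \sin\theta}{\lambda}\right) \left[\frac{\sin\left(\pi a \sin\theta/\lambda\right)}{\pi a \sin\theta/\lambda}\right]^2 .$$

The ##\cos^2## factor gives the fringes (maxima at ##d \sin\theta = n \lambda## with integer ##n##), while the second factor is the single-slit diffraction envelope. It holds only for a screen far away from the slits.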
Now if you use a single-photon source and observe the single photons one by one hitting the screen, you'll see that each photon indeed leaves only one spot on the screen, i.e., the single photon is not observed as somehow "smeared" like a classical em. wave (which, quantum-field theoretically, is a so-called coherent state, i.e., a state of the em. field that has no definite photon number). For each single photon you cannot predict where it will hit the screen behind the double slit, but if you wait long enough to collect very many photons, you'll find the same interference pattern as with a classical em. wave. This means that the intensity of the classical em. wave translates into the detection probability distribution of the single photon on the screen, where each single photon is "prepared" to have the same properties (i.e., a pretty well defined wave vector or, using the Einstein-de Broglie relation, momentum ##\vec{p}=\hbar \vec{k}##).
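To make the statement "classical intensity = single-photon detection probability" concrete, here is a minimal numerical sketch. It is only an illustration under stated assumptions: the far-field double-slit formula is used for the classical intensity, the numbers in `SETUP` are made up, and the function names (`classical_intensity`, `detect_photons`, `histogram`) are hypothetical helpers, not part of any library. Each photon gives exactly one detection position, drawn at random with probability proportional to the classical intensity.

```python
import math
import random

# Illustrative values only (assumptions for this sketch), all in metres.
SETUP = dict(wavelength=500e-9, slit_separation=50e-6,
             slit_width=10e-6, screen_distance=1.0)


def classical_intensity(x, wavelength, slit_separation, slit_width, screen_distance):
    """Far-field double-slit intensity at screen position x, scaled to 1 at x = 0."""
    sin_theta = x / math.hypot(x, screen_distance)
    beta = math.pi * slit_separation * sin_theta / wavelength   # two-slit phase
    alpha = math.pi * slit_width * sin_theta / wavelength       # single-slit phase
    envelope = 1.0 if alpha == 0.0 else (math.sin(alpha) / alpha) ** 2
    return math.cos(beta) ** 2 * envelope


def detect_photons(n_photons, x_max, rng):
    """Return one detection position per photon (rejection sampling).

    A candidate position is accepted with probability equal to the
    classical intensity there, which never exceeds 1 by construction.
    """
    hits = []
    while len(hits) < n_photons:
        x = rng.uniform(-x_max, x_max)
        if rng.random() < classical_intensity(x, **SETUP):
            hits.append(x)
    return hits


def histogram(hits, x_max, n_bins):
    """Count detections in n_bins equal bins covering [-x_max, x_max]."""
    counts = [0] * n_bins
    for x in hits:
        i = min(int((x + x_max) / (2 * x_max) * n_bins), n_bins - 1)
        counts[i] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (10, 100, 5000):
        # Few photons: scattered single spots. Many photons: fringes emerge.
        print(n, histogram(detect_photons(n, 0.02, rng), 0.02, 40))
```

With only a handful of photons the histogram looks like random spots; with several thousand, the bins near the classical maxima fill up while those near the minima stay almost empty.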
In this way modern QFT delivers a consistent description of what in the old quantum theory was called "wave-particle dualism": On the one hand you have "particle properties" of single photons: Each one is always detected at only one spot on the screen. On the other hand you have "wave properties" of an ensemble of equally prepared single photons: The pattern found after collecting very many such photons on the screen is given by the interference pattern of the classical em. wave, but the meaning of the corresponding intensity is a probability distribution for the detection of the single photon at any place on the screen.
What's also important to note is the "contextuality", i.e., what you observe also depends on the measurement you make on the em. field or on the single-photon state of the em. field: If you place the screen very close to the slits, you won't observe an interference pattern at all. For the classical em. wave you'll simply see the two slits on the screen. That's easy to understand using Huygens's principle: From each point in the two slits there comes a spherical wave, and the pattern on the screen is due to the superposition of all these partial waves. If you put the screen very close to the slits, the spherical waves originating from one slit won't have much overlap with those coming from the other slit. Thus there's no interference between partial waves from different slits, which would have a phase difference relative to each other when hitting a given point on the screen, and this explains why there's no interference pattern then.
For a single photon you have to use the same wave picture but interpret the intensity of the classical wave as a probability distribution for hitting a given point on the screen. With the screen pretty close to the slits, our argument with the classical wave shows that you can always say with pretty much certainty through which slit each photon came, but you won't see the interference pattern when collecting many photons. You also won't be able to predict through which slit each photon will come; only when you observe it do you know pretty well through which slit it came.
This gedanken experiment shows how modern QFT resolves the other mind-boggling feature of the old wave-particle dualism, i.e., that it depends on the measurement you make whether you observe particle- or wave-like features of the same situation, and why it's impossible to observe both aspects with one experiment: If you put the screen very close to the slits, you gain "which-way information" for each single photon and observe "particle properties", but then you don't observe wave properties.
If you put the screen very far away from the slits, it's the other way around: You'll find interference patterns when sending very many single photons through the slits, but for each single photon it's impossible to say through which of the slits it came. I.e., if you want an accurate observation of particle properties, you don't find any wave properties, and vice versa. Of course you can make a compromise and put the screen not too close to and also not too far from the slits. Then you get an interference pattern with "lower contrast", i.e., you'll neither be able to say with certainty through which slit a single photon came nor will you see a well developed interference pattern, but what you see can be predicted using modern QFT.
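The Huygens argument for the near and the far screen can be checked with a rough numerical sketch. Everything here is an assumption made for illustration: a scalar model in a two-dimensional cut, wavelets of the form ##e^{\mathrm{i} k r}/r## from equally spaced point sources in each slit, no polarization or obliquity factor, and made-up geometry. The names `slit_amplitude`, `both_slits_intensity` and `path_overlap` are hypothetical helpers. The quantity `path_overlap` compares the largest possible interference term, ##2|A_1||A_2|##, with the total intensity ##|A_1|^2+|A_2|^2##, summed over the screen: close to 0 means the partial waves from the two slits hardly overlap (good which-way information), close to 1 means they overlap almost completely (good fringes).

```python
import cmath
import math

# Illustrative geometry (assumed values for this sketch), in metres.
WAVELENGTH = 500e-9
SLIT_SEPARATION = 100e-6   # centre-to-centre distance of the slits
SLIT_WIDTH = 20e-6


def slit_amplitude(x, screen_distance, slit_center, n_sources=200):
    """Superpose wavelets exp(i k r) / r from point sources across one slit."""
    k = 2 * math.pi / WAVELENGTH
    total = 0j
    for j in range(n_sources):
        # Source positions at the midpoints of n_sources equal sub-intervals.
        xs = slit_center + ((j + 0.5) / n_sources - 0.5) * SLIT_WIDTH
        r = math.hypot(x - xs, screen_distance)
        total += cmath.exp(1j * k * r) / r
    return total


def both_slits_intensity(x, screen_distance):
    """Intensity at screen position x with both slits open."""
    a1 = slit_amplitude(x, screen_distance, -SLIT_SEPARATION / 2)
    a2 = slit_amplitude(x, screen_distance, +SLIT_SEPARATION / 2)
    return abs(a1 + a2) ** 2


def path_overlap(screen_distance, x_max, n_points=201):
    """Sum of 2|A1||A2| divided by sum of |A1|^2 + |A2|^2 over [-x_max, x_max]."""
    cross = 0.0
    total = 0.0
    for i in range(n_points):
        x = -x_max + 2 * x_max * i / (n_points - 1)
        a1 = abs(slit_amplitude(x, screen_distance, -SLIT_SEPARATION / 2))
        a2 = abs(slit_amplitude(x, screen_distance, +SLIT_SEPARATION / 2))
        cross += 2 * a1 * a2
        total += a1 ** 2 + a2 ** 2
    return cross / total


if __name__ == "__main__":
    # (screen distance, half-width of the screen region that is scanned)
    for distance, x_max in ((100e-6, 150e-6), (4e-3, 400e-6), (1.0, 0.02)):
        print(f"screen at {distance:g} m: overlap = {path_overlap(distance, x_max):.3f}")
```

For these assumed numbers the screen at 0.1 mm sits in the regime where the two slit images are well separated, the screen at 1 m is in the far field, and the middle distance is chosen so that the diffraction spread of one slit is comparable to the slit separation, i.e., the "compromise" case with reduced contrast.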