The problem with many popular-science texts about quantum theory is that they use photons as examples, because the historical development of quantum theory started on December 14, 1900, with Planck's talk on the derivation of the Planck spectrum of a radiating black body. He had found this spectrum earlier empirically, using high-precision measurements from the Physikalisch-Technische Reichsanstalt by Rubens and Kurlbaum over a then unprecedented range of wavelengths (frequencies).
Of all elementary "particles" the photon is the most difficult one to describe correctly in non-mathematical terms, because it is (as far as we know from experiments today) described by a quantized massless (and thus relativistic) vector field. The emphasis here lies on "quantized" and "massless" and thus "relativistic".
You must not envisage a "photon" as a little billiard-ball-like object. It's far from that. For massive particles this picture is often good as a classical approximation: e.g., you can describe the motion of electrons or protons in a particle accelerator with very high accuracy by the motion of classical massive particles in the electromagnetic field provided by the accelerator's cavities.
But photons are very different. For massless particles with spin 1 it is not even possible to define a position observable in the strict sense of the word.
For both massive and massless "particles" (I'd prefer to say "quanta" to avoid the word "particle" here altogether) the idea of wave-particle duality is also misleading. This is an idea of the so-called "old quantum theory", which is a set of rules for dealing with quantum phenomena rather than a consistent theory. These ideas were around for a quite short time, between 1900 and 1925, and were developed by Einstein, Bohr, Sommerfeld, and others to describe phenomena in atomic physics, i.e., the motion of electrons in the field of atomic nuclei and their interactions with the electromagnetic field (usually treated as a classical rather than a quantum field).
A bit later, Louis de Broglie came up with the idea that these rules could be derived from the assumption that not only "photons" as the quanta of the electromagnetic field but also massive particles could be described as a wave-field phenomenon. E.g., the Bohr-Sommerfeld orbits of electrons around atomic nuclei could be understood as standing waves which need appropriate boundary conditions to "fit" into these orbits, much like the standing waves on a violin string, giving discrete frequencies and thus explaining the emission of sharp line spectra as observed already by Fraunhofer, Balmer, Kirchhoff, and others in the 19th century. His thesis of 1924 got a good evaluation from Einstein, and that was the moment when the idea of "wave-particle dualism" was born.
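As a worked version of the standing-wave argument, take the simplest case of a circular orbit. The symbols used here (orbit radius $r$, momentum $p$, wavelength $\lambda$, integer $n$) are introduced only for this illustration. Demanding that a whole number of de Broglie wavelengths fits around the circumference reproduces Bohr's quantization rule for the angular momentum $L$:

$$2\pi r = n\,\lambda, \qquad \lambda = \frac{h}{p} \quad\Longrightarrow\quad L = r\,p = n\,\frac{h}{2\pi} = n\,\hbar, \qquad n = 1, 2, 3, \ldots$$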
What was totally unclear was what these waves might be. The next step was Schrödinger's development of "wave mechanics" in 1926, which was one of the first formulations of quantum theory (shortly before, in 1925, Heisenberg, Born, and Jordan had developed "matrix mechanics", which very soon was proven to be the very same mathematical theory as "wave mechanics", just in other mathematical terms). At this time most physicists thought one could understand the Schrödinger wave function as a new kind of classical field, very similar to the electromagnetic field or the velocity, pressure, temperature, etc. fields of classical fluid dynamics.
However, this idea leads to totally misleading expectations about the real behavior of quanta. E.g., one could show pretty soon after the development of wave mechanics that electrons indeed show interference phenomena when going through a double slit. On the other hand, the interference pattern only occurs when shooting very many well-prepared electrons, all with the same momentum, at the slits. Shooting only one electron at a time always gives one spot on the screen, which is more like what you expect from a particle picture. However, after collecting very many such single electrons, shot at the double slit independently of each other, the interference pattern appears, as if each electron somehow went through both slits, or at least "knew" about the presence of the second slit while going through only one. In any case, neither idea makes real logical sense.
One of the most ingenious ideas was then put forward by Born, as a little footnote in a now famous paper on the quantum theory of scattering (earning him a Nobel Prize in 1954): The absolute square of the wave function, when properly normalized, gives the probability density for the electron's position and does not describe a kind of "density" of the electron in terms of a classical field, as thought before in the "wave-particle duality" picture.
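To make Born's rule concrete for the double-slit experiment described above, here is a minimal sketch in Python. It assumes the textbook far-field (Fraunhofer) formula for the intensity behind two slits and draws single-electron hit positions at random with probability density proportional to |psi|^2. All names and numbers (`intensity`, `sample_hits`, `histogram`, the slit separation, slit width, wavelength, and screen distance in arbitrary units) are hypothetical choices for illustration, not taken from any real experiment.

```python
import math
import random


def intensity(x, slit_sep=5.0, slit_width=1.0, wavelength=1.0, distance=100.0):
    """Unnormalized |psi(x)|^2 on the screen (far-field approximation).

    All parameters are in arbitrary units and are illustrative assumptions.
    The value is 1 at x = 0 and never exceeds 1.
    """
    beta = math.pi * slit_width * x / (wavelength * distance)
    delta = math.pi * slit_sep * x / (wavelength * distance)
    envelope = 1.0 if beta == 0.0 else (math.sin(beta) / beta) ** 2
    return envelope * math.cos(delta) ** 2


def sample_hits(n, x_max=60.0, rng=None):
    """Draw n independent hit positions from |psi|^2 by rejection sampling."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    hits = []
    while len(hits) < n:
        x = rng.uniform(-x_max, x_max)
        # Accept x with probability intensity(x), which lies in [0, 1].
        if rng.random() < intensity(x):
            hits.append(x)
    return hits


def histogram(hits, bins=40, x_max=60.0):
    """Count hits in equal-width bins covering [-x_max, x_max]."""
    counts = [0] * bins
    width = 2.0 * x_max / bins
    for x in hits:
        idx = min(int((x + x_max) / width), bins - 1)
        counts[idx] += 1
    return counts


if __name__ == "__main__":
    # Few electrons: isolated spots. Many electrons: interference fringes.
    for n in (10, 100, 10000):
        counts = histogram(sample_hits(n))
        peak = max(counts)
        print(f"{n} electrons")
        for c in counts:
            print("#" * round(40 * c / peak))
```

Each call to `sample_hits` produces individual, point-like hits. The fringes become visible only in the histogram of many hits, which is exactly the statement that |psi|^2 is a probability density and not a smeared-out electron.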
For many people this was unacceptable as a complete description of Nature (among them Einstein, Planck, and even de Broglie and Schrödinger, who even felt sorry to have introduced the wave idea at all because of this conclusion of Born's). Even today some people seem not to be satisfied with this Statistical Interpretation, although it's still the most consistent interpretation of all quantum phenomena ever observed. Quantum theory must really be seen as the best-tested theory ever: it has always been found to be correct, even in circumstances where its predictions are farthest from anything you would expect from classical deterministic theories. In such theories one introduces statistical methods (à la Maxwell and Boltzmann) only if a fully deterministic description of a system of very many particles in full microscopic detail is too complicated, and one settles for describing some rough macroscopic "relevant" degrees of freedom. In quantum theory, by contrast, probabilities are needed even for a single particle, for which you could in principle know all microscopic details at once, if only it behaved like a little classical billiard ball.
But precisely this is not the case! Quantum theory says that it is impossible in principle to prepare a particle to have, e.g., both a definite momentum and a definite position. Rather, if you make an electron's momentum very well defined, you must give up sharp knowledge about its position and vice versa. This is a pretty straightforward conclusion from the quantum-theoretical formalism, known as the Heisenberg uncertainty principle, which tells you which observables can be determined for a particle at the same time ("compatible observables") and which cannot ("incompatible observables"). Within quantum theory there is no easy way to interpret it in terms of a deterministic theory, in which the indeterminism of the description comes into the game by just not knowing some observables that are themselves deterministic ("hidden variables"). As has been proven by John Bell, such a theory must necessarily be non-local. For the non-relativistic case Bohm and de Broglie have developed such ideas pretty far. There the Schrödinger wave function is interpreted as a kind of "pilot wave" forcing the particles onto "orbits" not unlike those in the classical-particle picture, but it implies non-local interactions, and such an idea is pretty hard to reconcile with relativistic physics, which has to be applied when it comes to fast particles and to understanding the fundamental interactions in terms of the very successful Standard Model of elementary particles.
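For reference, the uncertainty relation mentioned above can be written in its standard textbook form. The notation is introduced here only for illustration: $\Delta A$ is the standard deviation of an observable $A$ in the given state, and $\hat{x}$, $\hat{p}$ are the position and momentum operators. For any two observables $A$ and $B$,

$$\Delta A \, \Delta B \;\geq\; \frac{1}{2}\left|\langle [\hat{A}, \hat{B}] \rangle\right|, \qquad [\hat{x}, \hat{p}] = \mathrm{i}\hbar \quad\Longrightarrow\quad \Delta x \, \Delta p \;\geq\; \frac{\hbar}{2}.$$

In this language, "compatible" observables are those whose operators commute, so that the right-hand side vanishes and both can be sharply determined in the same state.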
In the relativistic case, it turns out that colliding interacting particles at relativistic energies inevitably leads to the possibility of creating new particles, destroying some of the particles originally present, or both at once. E.g., it is in principle possible to create an electron-positron pair by destroying a pair of photons. Although this has never been observed so far, it is an inevitable conclusion of Quantum Electrodynamics, which is the most accurate theory ever developed, describing some observables like the anomalous magnetic moment of the electron to an accuracy of 12 significant decimal places, in accordance with the measurement, which is possible at even higher precision than the theoretical calculation! Right now, some experimentalists are planning an experiment to demonstrate the annihilation of two photons into an electron-positron pair, and it's pretty likely that they will succeed:
http://en.wikipedia.org/wiki/Breit–Wheeler_process