Well, math is the only language which adequately describes the (quantum) world. There's no other way to even express the content of quantum mechanics.
That said, it's good advice to learn non-relativistic quantum theory before entering the more complicated topic of relativistic quantum field theory. Photons in particular are, in my opinion, the worst starting point from a didactic point of view, since photons are very far from anything "particle like" we are used to from our experience of macroscopic objects like billiard balls. Photons do not even admit a proper definition of a position observable, i.e., there's no way to localize a single photon in any sense of classical particles at all.
Also, forget the age-old outdated concept of "wave-particle dualism" and phrases like "something interferes with itself" or, even worse, "a particle is at several places at once" etc. The only consistent way to talk about quantum phenomena is a formal mathematical language in terms of Hilbert spaces and operators acting on Hilbert space. The physical meaning is provided by Born's Rule, leading to the probability interpretation of quantum states.
That said, let's look at the interference phenomena with single photons. First of all, a single photon is an elementary quantum of the electromagnetic field, and to get a more intuitive picture it's better to think in terms of classical electromagnetic field modes. One complete set of such classical field modes is given by the plane-wave solutions of the charge-current-free Maxwell equations. Each mode is defined by a wave vector ##\vec{k} \in \mathbb{R}^3## and a polarization vector ##\vec{\epsilon}_{\lambda}(\vec{k})##, which for simplicity we can also choose to be a real ##\mathbb{R}^3## vector. Electromagnetic waves are transverse, i.e., you have ##\vec{\epsilon}_{\lambda}(\vec{k}) \cdot \vec{k}=0##. Thus there are two linearly independent polarization vectors, i.e., ##\lambda \in \{1,2\}##. We conveniently choose these to be unit vectors which are perpendicular to each other and obey ##\vec{\epsilon}_1(\vec{k}) \times \vec{\epsilon}_2(\vec{k})=\vec{k}/|\vec{k}|##. The frequency of the plane wave is ##\omega=c |\vec{k}|##, where ##c## is the speed of light (in vacuum).
In quantum mechanics each free-field mode corresponds to a harmonic oscillator, and you can define a basis of state vectors ##|\{N(\vec{k},\lambda)\} \rangle##, where ##N(\vec{k},\lambda) \in \mathbb{N}_0=\{0,1,2,\ldots \}##. If all ##N(\vec{k},\lambda)=0##, this defines the "vacuum", i.e., no em. field/photons are present. If ##\sum_{\vec{k},\lambda} N(\vec{k},\lambda)=1## you have a one-photon state, and so on. These basis states span the (bosonic) Fock space, which is the Hilbert space of states of the quantized electromagnetic field.
Interference for single photons is now a no-brainer: you can superimpose several single-photon states, which gives a new single-photon state. The probability to find a photon at a certain place ##\vec{x}## is simply proportional to the intensity of the em. field, i.e.,
$$I(\vec{x}) \propto \langle \Psi |\hat{\vec{E}}^2(t,\vec{x})|\Psi \rangle,$$
where ##\hat{\vec{E}}(t,\vec{x})## is the electric field operator (in the Heisenberg picture of time evolution) and ##|\Psi \rangle## the quantum state under investigation, e.g., in our discussion a single-photon state.
Nowadays it's pretty easy to prepare single-photon states, though historically this was not so easy. One way is parametric down conversion, using a laser and a birefringent crystal to provide true single-photon states, with which one can then do experiments.
The simplest experiment to show that one has prepared a single-photon state uses a Mach-Zehnder interferometer:
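To see the interference explicitly, here is a short worked example. It is only a sketch under assumptions not made above: two plane-wave modes ##\vec{k}_1## and ##\vec{k}_2## of the same frequency ##\omega##, a common polarization vector perpendicular to both wave vectors (polarization label suppressed), and equal weights in the superposition,
$$|\Psi \rangle = \frac{1}{\sqrt{2}} \left ( |1_{\vec{k}_1} \rangle + |1_{\vec{k}_2} \rangle \right ).$$
Only the annihilation part of the field operator connects this state to the vacuum, and each mode then contributes its classical mode function. After dropping the state-independent vacuum contribution one finds
$$\langle \Psi |\hat{\vec{E}}^2(t,\vec{x})|\Psi \rangle \propto \left | \exp[\mathrm{i} (\vec{k}_1 \cdot \vec{x}-\omega t)] + \exp[\mathrm{i} (\vec{k}_2 \cdot \vec{x}-\omega t)] \right |^2 = 2 \left \{ 1+\cos[(\vec{k}_1-\vec{k}_2) \cdot \vec{x}] \right \}.$$
So the detection probability shows exactly the fringes of the corresponding classical waves, although the state contains only one photon.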
https://en.wikipedia.org/wiki/Mach–Zehnder_interferometer
Just consider the setup of Fig. 3 of the Wikipedia article without any phase shifter or sample in it (as described in https://en.wikipedia.org/wiki/Mach%E2%80%93Zehnder_interferometer#Observing_the_effect_of_a_sample). First use classical coherent light, nowadays easily provided by a laser, and adjust the beam splitter such that there is full intensity at detector 1 and zero intensity at detector 2. As described in the Wikipedia article, by the Fresnel rules known from classical wave optics, the intensities at the detectors are due to constructive interference at detector 1 and destructive interference at detector 2. The interference occurs when combining the partial waves going the one or the other way through the optical setup, including the half-silvered mirrors ("beam splitters"). Putting in the sample then leads to some intensity at detector 1 and some intensity at detector 2, again due to the interference of the partial beams going either way through the beam splitters. For classical light, one can indeed think in terms of "splitting" of the partial waves going through the apparatus, no matter how dim this light may be.
Quantum mechanically such a classical em. wave as provided by a laser is a so-called "coherent state". In terms of photons, as defined above, it's a superposition of state vectors with any number of photons, i.e., the photon number is indeterminate. No matter how dim the laser light may be, there's always some probability to have more than one photon in the apparatus, and it can indeed always happen that you find one photon at detector 1 and one at detector 2 at the same time (or rather "in coincidence"). So with dimmed laser light you cannot prove the existence of single photons, simply because you don't investigate one-photon Fock states but coherent states.
Now consider using true single-photon states. In the setup without the sample, you always register the single photon at detector 1 and never at detector 2, which is no surprise, given that the probabilities to register the photon at detector 1 or 2 are given by the intensities of the classical em. waves. Of course, the fact that this is really found experimentally proves this picture right, i.e., the constructive interference at detector 1 and the destructive interference at detector 2 also hold for true single-photon states. In sloppy language one says "the single photon interferes with itself", but it should have become clear that this is indeed only sloppy language, and nowhere in my description did the idea of photons as localized billiard-ball-like particles in the apparatus occur! On the contrary, everything is explained in the picture of "quantized fields", and very little by a particle-like picture.
The particle-like properties of single-photon states come into the game, however, if we look at the situation with the sample in place, where there is some non-zero intensity at both detectors for the classical waves. Now using single photons, these intensities provide the probabilities (and nothing else!) to find the single photon either at detector 1 or at detector 2. There's no way to predict where a specific single photon will end up, but what's found with very high accuracy is that one never ever measures coincident counts at both detectors! Only either detector 1 or detector 2 registers the photon. Using very many single photons, you'll find precisely the statistics predicted by the modern description of single photons in terms of quantized electromagnetic fields given above. The only particle-like property is that these "tiniest lumps of electromagnetic field at a given frequency" cannot be split into two such quanta. This is impossible already by energy conservation: each single photon has an energy of ##\hbar \omega##, and you simply cannot create two photons of the same frequency from it, because then you'd have an energy of ##2\hbar \omega##, and this cannot happen in a "passive" apparatus like the Mach-Zehnder interferometer. On the other hand, all we know are the probabilities that the photon gets registered either at detector 1 or at detector 2, and these probabilities are given by the intensities of the corresponding classical em. wave. Only in this sense is there some kind of "wave-particle duality" left in the modern picture of quantum optics provided by the theory of quantized fields, i.e., quantum electrodynamics (QED).
It's long overdue that textbooks stop presenting the old-fashioned picture of photons as billiard-ball-like particles!
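The statement about dimmed laser light can be made quantitative with a small sketch. It assumes what is standard for a coherent state but not spelled out above: the photon number is Poisson distributed with mean ##\mu##, and behind a lossless interferometer the two detectors count independently with means ##\mu p_1## and ##\mu p_2##. The function names, the ideal detectors and the sample values of ##\mu## are purely illustrative.

```python
import math


def photon_number_prob(n, mu):
    """Probability of exactly n photons in a coherent state with mean photon number mu (Poisson)."""
    return math.exp(-mu) * mu**n / math.factorial(n)


def prob_two_or_more(mu):
    """Probability that the coherent state contains at least two photons."""
    return 1.0 - photon_number_prob(0, mu) - photon_number_prob(1, mu)


def coincidence_prob(mu, p1):
    """Probability that both (ideal) detectors click at least once.

    Assumption: the counts at the two detectors are independent Poisson
    variables with means mu*p1 and mu*(1 - p1).
    """
    p2 = 1.0 - p1
    click_1 = -math.expm1(-mu * p1)  # 1 - exp(-mu*p1), accurate for small mu
    click_2 = -math.expm1(-mu * p2)
    return click_1 * click_2


if __name__ == "__main__":
    for mu in (1.0, 0.1, 0.01):
        print(f"mu = {mu}: P(n >= 2) = {prob_two_or_more(mu):.3e}, "
              f"P(coincidence, p1 = 0.5) = {coincidence_prob(mu, 0.5):.3e}")
```

Both probabilities shrink as the light gets dimmer, but they never become exactly zero as long as both detectors receive some intensity.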
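The amplitudes behind these probabilities can be sketched in a few lines. This is a minimal illustration, not the Fresnel bookkeeping of the Wikipedia article: it assumes lossless symmetric 50/50 beam splitters with a factor ##\mathrm{i}## on reflection, ignores the mirrors (they add the same phase to both arms), labels the bright output port "detector 1", and uses a hypothetical phase shift `phi` in one arm to stand in for a sample. The same complex amplitudes describe the classical wave and the single-photon state.

```python
import cmath
import math


def beam_splitter(a, b):
    """Symmetric lossless 50/50 beam splitter acting on two mode amplitudes.

    Assumed convention: transmission 1/sqrt(2), reflection i/sqrt(2).
    """
    s = 1.0 / math.sqrt(2.0)
    return s * (a + 1j * b), s * (1j * a + b)


def mach_zehnder_probs(phi):
    """Detection probabilities (detector 1, detector 2) for light entering one input port.

    phi is an illustrative extra phase in one arm; phi = 0 means no sample.
    """
    a, b = beam_splitter(1.0, 0.0)   # first beam splitter
    a *= cmath.exp(1j * phi)         # phase shift in one arm
    out_a, out_b = beam_splitter(a, b)  # second beam splitter recombines the arms
    p_det1 = abs(out_b) ** 2         # port with constructive interference at phi = 0
    p_det2 = abs(out_a) ** 2         # port with destructive interference at phi = 0
    return p_det1, p_det2


if __name__ == "__main__":
    for phi in (0.0, math.pi / 3, math.pi / 2):
        p1, p2 = mach_zehnder_probs(phi)
        print(f"phi = {phi:.3f}: P(detector 1) = {p1:.3f}, P(detector 2) = {p2:.3f}")
```

With this convention the probabilities come out as ##\cos^2(\phi/2)## and ##\sin^2(\phi/2)##, so for ##\phi=0## everything arrives at detector 1.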
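The statistics described here can be mimicked with a toy simulation. It is only a sketch: the detection probability `p1 = 0.7` at detector 1 is a hypothetical value, the detectors are assumed to be ideal, and the rule that each photon produces exactly one click is put in by hand (it encodes the absence of coincidences rather than deriving it).

```python
import random


def run_single_photons(n_photons, p1, seed=0):
    """Send n_photons single photons through the apparatus, one at a time.

    Each photon is registered at detector 1 with probability p1 and
    otherwise at detector 2, never at both (assumed ideal detectors).
    """
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    clicks_1 = 0
    clicks_2 = 0
    for _ in range(n_photons):
        if rng.random() < p1:
            clicks_1 += 1
        else:
            clicks_2 += 1
    return clicks_1, clicks_2


if __name__ == "__main__":
    n = 100_000
    c1, c2 = run_single_photons(n, p1=0.7)
    print(f"detector 1: {c1 / n:.4f}, detector 2: {c2 / n:.4f}, total clicks: {c1 + c2}")
```

Where an individual photon ends up is random, but for many photons the relative frequencies approach the probabilities given by the classical intensities.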