Hi, I'm a grad student, and this past semester I took a QFT course which led me to the exact same questions. So here is my humble understanding of this topic.
To understand what is really going on you need to understand what it means mathematically to canonically quantize your system.
In classical mechanics, after solving the Hamiltonian e.o.m, one ends up with a phase space M where each point on M represents a solution of the e.o.m, i.e. a state of our system. Infinitesimal symmetries are realized on M through the space of vector fields V on M, e.g. time translation symmetry is represented by a (Hamiltonian) vector field whose integral curves are curves of constant energy. Thus moving along such a curve amounts to time evolution of your state. Clearly, evaluating the Hamiltonian "function" at any point on such a curve will always give you the same number - the energy of your state.
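As a sketch of the statement above, assume canonical coordinates (q, p) on M (a choice made here only for illustration). The Hamiltonian vector field X_H, the e.o.m. it encodes, and the constancy of H along its integral curves then read:

```latex
X_H = \frac{\partial H}{\partial p}\,\frac{\partial}{\partial q}
    - \frac{\partial H}{\partial q}\,\frac{\partial}{\partial p},
\qquad
\dot{q} = \frac{\partial H}{\partial p}, \quad
\dot{p} = -\frac{\partial H}{\partial q},
\qquad
X_H(H) = \frac{\partial H}{\partial p}\frac{\partial H}{\partial q}
       - \frac{\partial H}{\partial q}\frac{\partial H}{\partial p} = 0 .
```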
This relation between a function and a vector field is no accident: it turns out that the space of functions F on M is in a 1-to-1 correspondence (up to additive constants) with the Hamiltonian vector fields in V. So one could encode the same physics in the dual space of M, the dual phase space D. First let us choose a very convenient basis on D, the coordinate "functions" q and p, which allows us to view all functions on D as polynomials in q and/or p. Symmetries are now implemented through generating functions, e.g. the Hamiltonian, and these close under the Poisson bracket to form the ∞-dimensional Poisson algebra.
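For concreteness, here is a minimal sketch of that algebra in one degree of freedom. The sign conventions (the bracket with \partial_q first, and X_f(h) = \{h, f\}) are assumptions made here; other texts differ by an overall sign.

```latex
\{f, g\} = \frac{\partial f}{\partial q}\frac{\partial g}{\partial p}
         - \frac{\partial f}{\partial p}\frac{\partial g}{\partial q},
\qquad
\{q, p\} = 1,
\qquad
\frac{df}{dt} = \{f, H\},
\\[4pt]
X_f(h) := \{h, f\}
\quad\Longrightarrow\quad
[X_f, X_g] = -X_{\{f, g\}} .
```

The last relation follows from the Jacobi identity and is what makes the map from functions to vector fields a homomorphism of Lie algebras (up to the sign), with constants mapped to the zero vector field.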
To proceed with quantization you must find a Hilbert space where your symmetry generators are realized as unitary operators. Equivalently, one needs to find a unitary representation of the Poisson algebra. It turns out that this can be achieved only for functions up to 2nd order in p and/or q; beyond that you run into ordering ambiguities, etc.
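A small worked example of what goes wrong beyond 2nd order, assuming the usual rule that brackets go over to commutators. The quartic monomial q^2 p^2 and the two Hermitian orderings compared below are picked here purely for illustration:

```latex
[\hat{f}, \hat{g}] = i\hbar\,\widehat{\{f, g\}}
\quad (f, g \text{ at most quadratic in } q, p),
\qquad
[\hat{q}, \hat{p}] = i\hbar,
\\[4pt]
q^2 p^2 \;\longmapsto\;
\left(\frac{\hat{q}\hat{p} + \hat{p}\hat{q}}{2}\right)^{2}
\quad\text{or}\quad
\frac{\hat{q}^2\hat{p}^2 + \hat{p}^2\hat{q}^2}{2},
\\[4pt]
\left(\frac{\hat{q}\hat{p} + \hat{p}\hat{q}}{2}\right)^{2}
- \frac{\hat{q}^2\hat{p}^2 + \hat{p}^2\hat{q}^2}{2}
= \frac{3\hbar^2}{4} .
```

Both candidates are Hermitian and reduce to q^2 p^2 as \hbar \to 0, yet they are different operators, so the classical function alone does not fix the quantum one.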
In NRQM one mostly deals with compact groups (e.g. rotations), which admit finite-dim unitary representations. However, the Poincare group is non-compact and therefore all of its nontrivial unitary reps are ∞-dim. Thus this naive (1st) quantization of coordinate functions, etc., does not yield a nice Hilbert space where the Poincare generators can be dressed up with a unitary rep. So we need to quantize some other space.
Having said that, one may assume that the starting point of quantization is the space of wavefunctions which obey some wave equation, e.g. the KG equation. A better name for this space is the space of classical fields, since at this point the wavefunctions have nothing to do with probability theory (they don't form a unitary rep). Once again we must find a unitary rep for the dual space of the field space and the associated Poisson algebra - this is sometimes called 2nd quantization; however, you really quantize only once, and so I prefer to simply call it canonical quantization, like in QM. Once again, it is convenient to choose the "coordinate" functionals as the basis of the dual space, and upon quantization these will satisfy the CCR. (If you are not convinced by this step, recall how one constructs the Hamiltonian functional in field theory and how that generates time translations through the Poisson bracket.)
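To make the parenthetical remark concrete, here is a sketch for a free real scalar field. The units c = 1, the symbol \pi for the conjugate momentum and the explicit form of H are assumptions made here, not something fixed by the discussion above:

```latex
H[\varphi, \pi] = \int d^3x \left[ \tfrac{1}{2}\pi^2
  + \tfrac{1}{2}(\nabla\varphi)^2 + \tfrac{1}{2} m^2 \varphi^2 \right],
\qquad
\{\varphi(\vec{x}, t), \pi(\vec{y}, t)\} = \delta^3(\vec{x} - \vec{y}),
\\[4pt]
\dot{\varphi} = \{\varphi, H\} = \pi,
\qquad
\dot{\pi} = \{\pi, H\} = \nabla^2\varphi - m^2\varphi
\quad\Longrightarrow\quad
\ddot{\varphi} - \nabla^2\varphi + m^2\varphi = 0,
\\[4pt]
[\hat{\varphi}(\vec{x}, t), \hat{\pi}(\vec{y}, t)] = i\hbar\,\delta^3(\vec{x} - \vec{y}) .
```

The first two lines are entirely classical: the Hamiltonian functional generates the KG equation through the Poisson bracket, exactly as H generates the e.o.m. for q and p. The last line is the CCR obtained by replacing the bracket with a commutator.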
So it is now understood how a field becomes an operator after quantization (in fact, one should've been curious how a coordinate becomes an operator in QM, in the first place).
Why does it describe an ∞ number of particles? Well, we have quantized a field which carries a continuous index, \varphi_{\vec{x}}(t), and by analogy with q_{i}(t), this must describe an ∞ number of particles. So let us put our argument in the following sequence:
Quantization of a relativistic system → unitary reps of Poincare → only ∞-dim reps → field representations give rise to an ∞ number of harmonic oscillators, one for each \vec{x}
→ energy eigenvalues are not discrete like in QM, but rather continuous → cannot distinguish an N-particle state from an M-particle state of the same energy based purely on energy measurements (e.g. a particle can decay into two high-energy particles traveling in opposite directions, or into four particles of lower energy, again with two of them traveling in one direction and the other two in the opposite direction).
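The last two arrows can be sketched for the free KG field. The normalization of the ladder operators, the units \hbar = c = 1 and the dropped vacuum energy are assumptions made here; also note that the oscillators only decouple after passing from \vec{x} to the Fourier modes \vec{k}:

```latex
\hat{H} = \int \frac{d^3k}{(2\pi)^3}\; \omega_{\vec{k}}\,
          \hat{a}^\dagger_{\vec{k}} \hat{a}_{\vec{k}},
\qquad
\omega_{\vec{k}} = \sqrt{\vec{k}^{\,2} + m^2},
\qquad
[\hat{a}_{\vec{k}}, \hat{a}^\dagger_{\vec{k}'}] = (2\pi)^3\,\delta^3(\vec{k} - \vec{k}'),
\\[4pt]
E_{1} = \omega_{\vec{k}} \in [m, \infty),
\qquad
E_{2} = \omega_{\vec{k}_1} + \omega_{\vec{k}_2} \in [2m, \infty) .
```

Since \vec{k} is continuous, so are the energies, and for any E \geq 2m there are both 1-particle and 2-particle states with that energy. An energy measurement alone therefore cannot tell them apart.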
Regarding your last question, in a quantum theory a particle is best thought of as an irreducible representation of the group of symmetries of the given theory. The Poincare irreps are labeled by mass and spin, and the states within an irrep by momentum and either helicity or spin-z, which gives four labels in total.
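In formulas, a sketch for the massive case, assuming a mostly-minus metric and \hbar = c = 1 (the overall sign in the definition of W_\mu varies between textbooks):

```latex
P^\mu P_\mu = m^2,
\qquad
W^\mu W_\mu = -m^2\, s(s+1),
\qquad
W_\mu = \tfrac{1}{2}\,\epsilon_{\mu\nu\rho\sigma}\, J^{\nu\rho} P^{\sigma},
\\[4pt]
\hat{P}^\mu\, |\vec{p}, \sigma\rangle = p^\mu\, |\vec{p}, \sigma\rangle,
\qquad
p^0 = \sqrt{\vec{p}^{\,2} + m^2},
\qquad
\sigma = -s, \dots, s .
```

Here P^\mu are the translation generators, J^{\nu\rho} the Lorentz generators and W_\mu the Pauli-Lubanski vector. The two Casimirs fix m and s for the whole irrep, while \vec{p} and \sigma label the individual 1-particle states inside it.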
Now, to visualize a 1-particle state, let us go back to the classical theory and do the following: pick the lowest-energy classical field configuration and consider a small perturbation about it, then linearize the e.o.m to find that the fluctuation obeys a relativistic e.o.m, e.g. the KG equation, etc. The solution (for a scalar field) is the sum of all Fourier modes a e^{ipx}. From Feynman's point of view, what we really quantize is these fluctuations (and more) about the "vacuum". In momentum space a single mode looks like a delta function, which means it is localized at a point. In position space it propagates as a plane wave. In practice, one must deal with the wavepacket versions of these modes, which propagate nearly undistorted and therefore resemble a lump of field (or, if you prefer, of energy).
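A minimal sketch of this linearization, assuming a single real scalar field with some potential V (both the Lagrangian and the names \varphi_0, \eta are chosen here for illustration), a mostly-minus metric and c = 1:

```latex
\mathcal{L} = \tfrac{1}{2}\,\partial_\mu\varphi\,\partial^\mu\varphi - V(\varphi),
\qquad
\varphi = \varphi_0 + \eta,
\qquad
V'(\varphi_0) = 0,
\\[4pt]
\Box\varphi + V'(\varphi) = 0
\quad\xrightarrow{\ \text{linearize in } \eta\ }\quad
(\Box + m^2)\,\eta = 0,
\qquad
m^2 = V''(\varphi_0),
\\[4pt]
\eta(x) = \int \frac{d^3p}{(2\pi)^3\, 2\omega_{\vec{p}}}
  \left[ a(\vec{p})\, e^{-ipx} + a^*(\vec{p})\, e^{ipx} \right],
\qquad
p^0 = \omega_{\vec{p}} = \sqrt{\vec{p}^{\,2} + m^2} .
```

The mass of the particle is thus read off from the curvature of the potential at its minimum, and a wavepacket corresponds to a coefficient function a(\vec{p}) that is sharply peaked around some momentum instead of being a delta function.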
In canonical quantization we seek to find a unitary rep of these fluctuations and what we find is simply the irreps described before. Notice that we have constructed the 1-particle Hilbert space within the context of canonical quantization, but we visualized the physics with the help of the path integral.
Finally, let me talk a little bit about interactions. The way that I visualize an interaction of particles is that I have some particle states coming in, i.e. some well-separated lumps of field which can be identified with 1-particle states. These mix up at a point (due to locality), resulting in all sorts of fuzzy, ill-defined shapes of field distribution which cannot be identified as any kind of particle state, and then yield some new localized lumps of field which, if you wait long enough for them to separate, you will be able to identify with some new 1-particle states. Mathematically, this occurs because you have attempted to quantize a highly non-linear set of e.o.m, which makes your task of finding an (exact) interacting Hilbert space extremely difficult. (Or, if you prefer: can you build up a unitary rep for this nonlinear system?)
Having drawn this picture, it is obvious why the number of particles is not fixed in a relativistic quantum theory.