PeroK said:
In other words, you can't really prove that ##\mathbf p \rightarrow -i\hbar \nabla##. But, if you could work QM out first, then you could show that ##-i\hbar \nabla \rightarrow \mathbf p##
This you cannot say in this general way. It depends on, where you start, whether something is provable or not.
I think the best way to learn quantum mechanics is to first learn classical mechanics in the analytical formulation, i.e., the Hamilton principle of least action in the Hamilton way and then a good deal of elementary Lie-group and Lie-algebra theory in terms of Poisson brackets. The key point is to understand Noether's theorem.
Then, what you indeed cannot derive starting from classical mechanics is of course quantum mechanics, because classical mechanics is an approximation of quantum mechanics for macroscopic observables on many-body systems. So you need some heuristics to understand, why there's a Hilbert space of states and operators that represent observables.
If you are that far, it's quite clear, how to get educated guesses for the operators representing a given observable known from classical mechanics: The 10 conservation laws of Newtonian mechanics connect the observables with Lie symmetries, and that's why the operators build a unitary representation of the corresponding symmetries on Hilbert space.
Now momentum is the conserved quantity related to spatial translations, i.e., it describes "infinitesimal spatial translations". Now take the representation of QM in terms of wave mechanics and check, how spatial translations are represented in terms of the wave equation. The most simple idea is to make the wave function a scalar field under translations, i.e., if the translation is described by ##\vec{x}'=\vec{x}-\vec{a}##, then
$$\psi'(\vec{x}')=\psi(\vec{x})=\psi(\vec{x}'+\vec{a}).$$
Now make this translation infinitesimal, so that
$$\psi'(\vec{x}')=\psi(\vec{x}'+\delta \vec{a})=\psi(\vec{x}')+\delta \vec{a} \cdot \vec{\nabla} \psi(\vec{x}')+\mathcal{O}(\delta \vec{a}^2).$$
Now ##\hat{\vec{p}}## should precisely do such in infinitesimal translation, i.e.,
$$\mathrm{i} \delta \vec{a} \hat{\vec{p}} \psi(\vec{x}) \stackrel{!}{=} \delta \vec{a} \cdot \vec{\nabla} \psi(\vec{x}).$$
Comparing both sides gives
$$\hat{\vec{p}}=-\mathrm{i} \vec{\nabla}.$$
Note that I used natural units with ##\hbar=1## here. It's easy to implement the ##\hbar## from dimensional reasoning:
$$\hat{\vec{p}}=-\mathrm{i} \hbar \vec{\nabla}.$$
That I introduced the imaginary ##\mathrm{i}## is for the reason that we want the generators of the symmetry and thus the operator representing momentum to be self-adjoint rather than anti-self-adjoint, because that's more convenient in the formalism.
Now from the infinitesimal symmetry generators you get the finite transformations by the operator exponential function, i.e., the unitary translation operator by definition fulfills
$$\vec{\nabla}_a \hat{U}(\vec{a})=\mathrm{i} \hat{\vec{p}} \hat{U}(\vec{a}),$$
and with the initial condition ##\hat{U}(0)=\hat{1}## this gives
$$\hat{U}(\vec{a})=\exp(\mathrm{i} \vec{a} \cdot \hat{\vec{p}}) .$$
Now indeed you get with our above result ##\hat{\vec{p}}=-\mathrm{i} \vec{\nabla}##
$$\hat{U}(\vec{a}) \psi(\vec{x})=\exp(\vec{a} \cdot \vec{\nabla} \psi(\vec{x}) = \sum_{k=0}^{\infty} \frac{1}{k!} (\vec{a} \cdot \vec{\nabla}) \psi(\vec{x})=\psi(\vec{x}+\vec{a}),$$
where in the last step I have used the Taylor expansion of functions.
As you see, indeed ##\hat{\vec{p}}=-\mathrm{i} \vec{\nabla}## is the generator of translations in quantum theory in the same way it is in classical mechanics in Hamiltonian formulation.
This heuristic connection brought Dirac to his formulation of quantum theory, by translating the classical quantitities and Poisson brackets to quantum descriptions by making the classical quantities self-adjoint operators on a Hilbert space and the Poisson brackets commutators (up to a factor ##\mathrm{i}##). Then the starting point to define position and momentum operators are the canonical Poisson brackets translating into the canonical commutation relations for the operators in quantum theory,
$$[\hat{x}_j,\hat{x}_k]=0, \quad [\hat{p}_j,\hat{p}_k]=0, \quad [\hat{x}_j,\hat{p}_k]=\mathrm{i} \hat{1}.$$
You can also use this "Heisenberg algebra" to derive the above result for the momentum operator using the position representation, i.e., the description of the state vectors in terms of "components" wrt. the position eigenbasis: ##\psi(\vec{x})=\langle \vec{x}|\psi \rangle##.