Lagrange fanboy said:
But the norm of the wavefunction won't be 1 then; shouldn't this cause a problem down the line when calculating probabilities?
That's one of the more difficult subtleties you meet at the beginning of learning quantum theory. If a self-adjoint operator has a continuous spectrum, the corresponding "eigenfunctions" are almost never normalizable to 1 and thus do not belong to the Hilbert space. The mathematics was also a problem in the early days of quantum theory; Dirac's invention of his ##\delta## distribution inspired the mathematicians to develop modern functional analysis.
The pragmatic physicist's answer is that in this case the "eigenfunctions" are "normalized to a ##\delta## distribution", i.e.,
$$\langle p|p' \rangle=\int_{\mathbb{R}} \mathrm{d} x \, u_p^*(x) u_{p'}(x)=\delta(p-p').$$
Now you have
$$\hat{p} u_p(x)=-\mathrm{i} \hbar u_p'(x)=p u_p(x) \; \Rightarrow \; u_p(x)=A(p) \exp(\mathrm{i} p x/\hbar).$$
From the theory of Fourier integrals you indeed get
$$\langle p|p' \rangle=\int_{\mathbb{R}} \mathrm{d} x \, A^*(p) A(p') \exp[\mathrm{i} x (p'-p)/\hbar] = 2 \pi A^*(p) A(p') \delta[(p-p')/\hbar] = 2 \pi \hbar |A(p)|^2 \delta(p-p'),$$
where in the last step the ##\delta## distribution lets you set ##p'=p## in the prefactor.
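If it seems mysterious where the ##\delta## comes from, one standard way to see it (an illustrative regularization, not part of the derivation above) is to cut the integral off at a finite length ##L##:
$$\int_{-L}^{L} \mathrm{d} x \, \exp[\mathrm{i} x (p'-p)/\hbar] = \frac{2 \hbar}{p'-p} \sin\!\left[\frac{L(p'-p)}{\hbar}\right] \; \xrightarrow{L \to \infty} \; 2 \pi \hbar \, \delta(p-p'),$$
since ##\sin(k L)/(\pi k) \to \delta(k)## in the sense of distributions.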
To "normalize the momentum eigenfunction to the ##\delta## distribution" thus implies that you can set
$$A(p)=\frac{1}{\sqrt{2 \pi \hbar}}.$$
The normalization factor is determined only up to an arbitrary phase factor, which is unimportant, i.e., you can take the simple choice above without losing anything.
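To connect this back to the original question about probabilities: the improper eigenfunctions ##u_p## are not themselves states, but any normalizable wave packet expanded in them keeps its total probability equal to 1 (Plancherel's theorem). Here is a minimal numerical sketch of that, assuming ##\hbar=1## and a Gaussian wave packet (the grid sizes and the Gaussian are arbitrary choices for illustration, not anything from the discussion above):
```python
import numpy as np

# Minimal numerical sketch (hbar = 1, Gaussian wave packet): even though the
# momentum "eigenfunctions" u_p(x) = exp(i p x / hbar) / sqrt(2 pi hbar) are
# not square integrable, a normalizable state expanded in them keeps its
# total probability equal to 1.
hbar = 1.0
x = np.linspace(-50.0, 50.0, 4001)
dx = x[1] - x[0]

sigma = 1.0
psi = (np.pi * sigma**2) ** -0.25 * np.exp(-x**2 / (2 * sigma**2))
print("norm in x:", np.sum(np.abs(psi) ** 2) * dx)   # ~ 1

p = np.linspace(-10.0, 10.0, 2001)
dp = p[1] - p[0]
# psi_tilde(p) = \int dx u_p^*(x) psi(x), with A(p) = 1 / sqrt(2 pi hbar)
u_conj = np.exp(-1j * np.outer(p, x) / hbar) / np.sqrt(2 * np.pi * hbar)
psi_tilde = u_conj @ psi * dx
print("norm in p:", np.sum(np.abs(psi_tilde) ** 2) * dp)  # ~ 1
```
Both printed norms come out close to 1, which is the practical sense in which the ##\delta##-normalized ##u_p## cause no problems when calculating probabilities.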
If you want to learn about the "rigged Hilbert space" mentioned in the previous posting, a good starting point is the textbook
L. E. Ballentine, Quantum Mechanics: A Modern Development, World Scientific, Singapore (1998).