It rests on two fundamental postulates of quantum theory.
To describe a physical system within quantum theory, you start from a Hilbert space.
(1) An observable ##T## of the system is described by a self-adjoint operator ##\hat{T}##. The possible results of an accurate measurement of the observable are the eigenvalues of that operator. Its eigenvectors can be chosen to form a complete set of orthonormal vectors.
(2) A pure state of a system is described by a normalized vector ##|\psi \rangle##, i.e., one with ##\langle \psi|\psi \rangle=1## (determined only modulo a phase factor). If ##t## is an eigenvalue of ##\hat{T}## and ##|t,\lambda \rangle## is an orthonormal set of eigenvectors for this eigenvalue, then
$$P(t) =\sum_{\lambda} |\langle t,\lambda|\psi \rangle|^2$$
is the probability to obtain ##t## when measuring ##T##.
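To make the role of the sum over ##\lambda## concrete, here is a minimal sketch in plain Python. The three-dimensional operator, its eigenbasis, and the state are made up purely for illustration: the eigenvalue 1 is twofold degenerate, so its probability collects two terms.

```python
import math

# Hypothetical example: T has eigenvalue 1 (twofold degenerate, eigenvectors
# e0 and e1) and eigenvalue 2 (eigenvector e2). Each list plays the role of
# the orthonormal set |t,lambda> belonging to one eigenvalue t.
eigenbasis = {
    1.0: [(1, 0, 0), (0, 1, 0)],
    2.0: [(0, 0, 1)],
}

def inner(a, b):
    """Scalar product <a|b>, antilinear in the first argument."""
    return sum(complex(x).conjugate() * complex(y) for x, y in zip(a, b))

def probabilities(psi, eigenbasis):
    """Return {t: P(t)} with P(t) = sum_lambda |<t,lambda|psi>|^2."""
    return {
        t: sum(abs(inner(v, psi)) ** 2 for v in vectors)
        for t, vectors in eigenbasis.items()
    }

# Normalized state |psi> = (1, i, 1)/sqrt(3); the choice is arbitrary.
psi = tuple(c / math.sqrt(3) for c in (1, 1j, 1))

P = probabilities(psi, eigenbasis)
print(P)
```

For this state the two degenerate eigenvectors each contribute 1/3, so ##P(1)=2/3## and ##P(2)=1/3##, and the probabilities add up to 1 because ##|\psi \rangle## is normalized.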
From this it follows that the expectation value for the outcome of measurements of ##T##, given that the system is prepared in the pure state described by ##|\psi \rangle##, must be
$$\langle T \rangle = \sum_t t P(t)=\sum_{t,\lambda} t \langle \psi | t,\lambda \rangle \langle t,\lambda|\psi \rangle=\sum_{t,\lambda} \langle \psi|\hat{T}|t,\lambda \rangle \langle t,\lambda| \psi \rangle = \langle \psi |\hat{T}|\psi \rangle,$$
where in the last step I used the completeness of the eigenstates of ##\hat{T}##,
$$\sum_{t,\lambda} |t,\lambda \rangle \langle t,\lambda |=\hat{1}.$$
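As a quick numerical sanity check of this result, here is a small sketch in plain Python. The choice of observable (the Pauli matrix ##\hat{\sigma}_x## for a two-level system) and the angles fixing the state are my own assumptions for illustration, not part of the argument above. It evaluates the expectation value once as ##\sum_t t P(t)## and once as ##\langle \psi|\hat{T}|\psi \rangle##.

```python
import cmath
import math

def inner(a, b):
    """Scalar product <a|b>, antilinear in the first argument."""
    return sum(complex(x).conjugate() * complex(y) for x, y in zip(a, b))

def apply(matrix, vec):
    """Matrix-vector product T|vec> for a matrix given as a tuple of rows."""
    return tuple(sum(m * v for m, v in zip(row, vec)) for row in matrix)

# Assumed example observable: sigma_x, with eigenvalues +1 and -1.
T = ((0, 1), (1, 0))
s = 1 / math.sqrt(2)
eigenbasis = {+1.0: [(s, s)], -1.0: [(s, -s)]}

# Arbitrary normalized state (cos a, e^{i phi} sin a) with a = 0.3, phi = 0.7.
psi = (math.cos(0.3), cmath.exp(0.7j) * math.sin(0.3))

# Expectation value from the probabilities: sum_t t P(t).
via_probabilities = sum(
    t * sum(abs(inner(v, psi)) ** 2 for v in vectors)
    for t, vectors in eigenbasis.items()
)

# Expectation value from the operator: <psi|T|psi> (real for self-adjoint T).
via_operator = inner(psi, apply(T, psi)).real

print(via_probabilities, via_operator)
```

Both numbers agree up to rounding, and for this particular state they equal ##\sin(0.6)\cos(0.7)##, which is what you get by evaluating ##\langle \psi|\hat{\sigma}_x|\psi \rangle## by hand.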