I'm also still struggling with how to teach QT well, particularly since my challenge is to teach it to students training to become high-school teachers, and I have only about half a semester for it. It's difficult to convey both some intuition about quantum mechanics and enough of the math.
I think it is important to stress that pure states are represented not by normalized vectors but by rays, i.e., the entire physically observable content of a pure state consists of the probabilities (or probability distributions) for the outcomes of measurements, and these are given by Born's rule,
$$P(a)=|\langle a|\psi \rangle|^2,$$
where ##|a \rangle## are the eigenvectors of the self-adjoint operator ##\hat{A}##, representing some observable ##A##, and I assume that all eigenvalues are non-degenerate. Both the eigenvectors and the state vector should be normalized to 1, and it's clear that you can multiply each of them by an arbitrary phase factor without changing the probabilities.
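To spell out the phase invariance explicitly (the phase angles ##\alpha## and ##\beta_a## are just my notation for this illustration): replacing ##|\psi \rangle \to e^{\mathrm{i} \alpha} |\psi \rangle## and ##|a \rangle \to e^{\mathrm{i} \beta_a} |a \rangle## gives
$$P'(a)=\left|e^{-\mathrm{i} \beta_a} e^{\mathrm{i} \alpha} \langle a|\psi \rangle\right|^2=\left|e^{\mathrm{i}(\alpha-\beta_a)}\right|^2 |\langle a|\psi \rangle|^2=|\langle a|\psi \rangle|^2=P(a),$$
since a phase factor has modulus 1.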
If you expand ##|\psi \rangle## in an arbitrary basis, the only freedom you have is the choice of one overall phase; the relative phases between the coefficients are physically significant. Thus in your example you can choose one of the coefficients to be real and let the other be an arbitrary complex number (subject to normalization) to get all states of your two-level system.
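As a sketch of what this looks like concretely, assume the two basis vectors are labeled ##|1 \rangle## and ##|2 \rangle## (these labels and the angles ##\vartheta##, ##\varphi## are my choice, not taken from your example). Making the first coefficient real and non-negative and imposing normalization leads to the usual Bloch-sphere form
$$|\psi \rangle=\cos \left(\frac{\vartheta}{2}\right) |1 \rangle + e^{\mathrm{i} \varphi} \sin \left(\frac{\vartheta}{2}\right) |2 \rangle, \quad \vartheta \in [0,\pi], \quad \varphi \in [0,2 \pi),$$
so that only two real parameters remain: one for the relative weight of the two components and one for their relative phase.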
Since (unit) rays are somewhat complicated, an equivalent alternative is to stress that states are in fact represented not by state vectors at all but by statistical operators, i.e., self-adjoint positive semidefinite operators with trace 1. A pure state is then just the special case in which the statistical operator is a projection operator and thus of the form ##\hat{\rho}=|\psi \rangle \langle \psi|## with a normalized vector ##|\psi \rangle##. It's immediately clear that for a given pure state ##\hat{\rho}## the vector ##|\psi \rangle## is defined only up to an arbitrary phase factor.
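For students who like to check such statements numerically, here is a minimal sketch in plain Python for a two-level system. The helper names (`ket`, `projector`, `born`, etc.) and the particular angles are hypothetical choices of mine, and the measurement probability is computed as ##P(a)=\langle a|\hat{\rho}|a \rangle##, which for a pure state reduces to Born's rule above.

```python
import cmath
import math

def ket(theta, phi):
    """Normalized two-level state cos(theta/2)|1> + e^{i phi} sin(theta/2)|2>."""
    return [complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2)]

def projector(psi):
    """Statistical operator rho = |psi><psi| as a nested list, rho[i][j] = psi_i * conj(psi_j)."""
    return [[a * b.conjugate() for b in psi] for a in psi]

def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def close(x, y, tol=1e-12):
    """Element-wise comparison of two square matrices."""
    n = len(x)
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(n) for j in range(n))

def born(a, rho):
    """Probability P(a) = <a|rho|a> for a normalized vector a."""
    n = len(a)
    return sum(a[i].conjugate() * rho[i][j] * a[j] for i in range(n) for j in range(n)).real

# Arbitrary example state and an arbitrary overall phase factor (illustrative values).
psi = ket(1.1, 0.7)
phase = cmath.exp(1j * 2.3)
psi_rephased = [phase * c for c in psi]

rho = projector(psi)
rho_rephased = projector(psi_rephased)

same_rho = close(rho, rho_rephased)          # the overall phase drops out of rho
trace = rho[0][0] + rho[1][1]                # should equal 1
is_projector = close(matmul(rho, rho), rho)  # rho^2 = rho for a pure state
p_first = born([1 + 0j, 0j], rho)            # probability to find the first basis state

print(same_rho, abs(trace - 1) < 1e-12, is_projector, p_first)
```

The point of the sketch is only that `rho` and `rho_rephased` agree, i.e., the overall phase never appears in the statistical operator in the first place.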