But a superposition of pure states is still a pure state, because the coefficients in the superposition are complex numbers with not only magnitude but also phase. This means there are "coherences" between the states you are superposing, and that is what preserves the purity of the state (and also allows for interference between the contributing states as you follow the future time evolution of the entire pure state). A mixed state is essentially a weighted combination of pure states, where the weights are the probabilities of each state being actualized in a measurement. Those weights are real numbers that carry no phase (or you can think of the phase relations between the states as random, if you prefer), so the presence of multiple states does not lead to interference.
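If it helps to see the difference in numbers, here is a minimal sketch for a two-state system, written in terms of density matrices. The helper names (`outer`, `mix`, `purity`) and the particular relative phase are my own choices for illustration, not anything standard. The coherences show up as the off-diagonal entries, and Tr(rho^2) is 1 for a pure state and less than 1 for a mixture.

```python
import cmath

def outer(psi):
    """Density matrix |psi><psi| of a normalized state vector."""
    return [[a * b.conjugate() for b in psi] for a in psi]

def mix(weighted_states):
    """Weighted sum of projectors: sum_k p_k |psi_k><psi_k|."""
    dim = len(weighted_states[0][1])
    rho = [[0j] * dim for _ in range(dim)]
    for p, psi in weighted_states:
        proj = outer(psi)
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += p * proj[i][j]
    return rho

def purity(rho):
    """Tr(rho^2): equals 1 for a pure state, less than 1 for a mixed state."""
    dim = len(rho)
    return sum(rho[i][j] * rho[j][i] for i in range(dim) for j in range(dim)).real

# Equal-weight superposition with an arbitrary relative phase (assumed value).
phase = cmath.pi / 3
amp = 1 / 2 ** 0.5
superposition = [amp, amp * cmath.exp(1j * phase)]

rho_pure = outer(superposition)                   # off-diagonals carry the phase
rho_mixed = mix([(0.5, [1, 0]), (0.5, [0, 1])])   # same weights, no coherences

print("purity (superposition):", purity(rho_pure))
print("purity (mixture):      ", purity(rho_mixed))
print("|coherence| (superposition):", abs(rho_pure[0][1]))
print("|coherence| (mixture):      ", abs(rho_mixed[0][1]))
```

Both states give 50/50 odds if you measure in the basis of the two contributing states. The difference only appears in the off-diagonal terms, which is exactly where the interference lives.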
To give you an example, in the two-slit experiment, if the apparatus does not establish which slit the particle went through, then the emergent wave function is a pure state that is a superposition of the states that go through each slit (and it creates an interference pattern). If the apparatus does establish which slit the particle went through, but we are not privy to that information, then the description we would use is a mixture of the states that involve going through each slit. Strictly speaking, that mixture is no longer a single wave function but a density matrix, and it does not create a two-slit interference pattern because the pieces of the mixture don't interfere with each other. The way this might be said in common parlance is that in the first case "the particle went through both slits" (though I prefer saying the issue of which slit is simply not meaningfully defined by that experiment), and in the second case "the particle went through one slit or the other, we just don't know which."
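Here is a rough numerical sketch of those two cases, assuming an idealized far-field model with two point-like slits of equal amplitude. The function names and the numbers for wavelength, slit separation, and screen distance are made up for illustration. In the coherent case the amplitudes are added before squaring; in the mixed case the probabilities are added instead.

```python
import cmath
import math

def slit_amplitudes(x, wavelength=1.0, slit_separation=5.0, screen_distance=100.0):
    """Unit-modulus amplitudes at screen position x from each slit (far-field model)."""
    # Phase difference between the two paths in the small-angle approximation.
    delta = 2 * math.pi * slit_separation * x / (wavelength * screen_distance)
    return cmath.exp(0.5j * delta), cmath.exp(-0.5j * delta)

def coherent_intensity(x):
    """Pure superposition: add amplitudes, then square."""
    a1, a2 = slit_amplitudes(x)
    return abs((a1 + a2) / math.sqrt(2)) ** 2

def mixed_intensity(x):
    """50/50 mixture: square each amplitude, then add with probability weights."""
    a1, a2 = slit_amplitudes(x)
    return 0.5 * abs(a1) ** 2 + 0.5 * abs(a2) ** 2

positions = [i * 0.5 for i in range(-40, 41)]
coherent = [coherent_intensity(x) for x in positions]
mixed = [mixed_intensity(x) for x in positions]

print("coherent: min %.3f, max %.3f" % (min(coherent), max(coherent)))
print("mixed:    min %.3f, max %.3f" % (min(mixed), max(mixed)))
```

The coherent intensity swings between bright and dark fringes, while the mixed intensity is flat in this model. A realistic single-slit envelope would shape both curves, but it would not put fringes back into the mixed case.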
Whether there is such a thing as a mixed state that does not represent information that has been established but that we are not privy to is a matter of interpretation of quantum mechanics. For example, suppose we have an ensemble of identically prepared particles, and they encounter an apparatus whose outcome we would describe with a mixed state. That might mean the particles in the ensemble "really are" in one state or the other, and our lack of information is just about which one is which (that is, which wavefunction applies, not which particle, since the particles are usually identical and indistinguishable). Or it might mean that every particle is itself "really" in a mixed state, and no further information really exists. The former is more commonly taught, as it is the "Copenhagen" approach, but the latter has many proponents as well (it is the "many worlds" approach).