How does Voice Recognition work?

    How does voice recognition work? Not speech recognition, but voice. For example, I would know Justin Bieber's voice without seeing his face. If he had a new song that I did not know about and I heard it in the car while driving, I would know it was Justin Bieber. How does that work? I looked it up on Google, but I keep getting sites about speech recognition.

    Would you just use an FFT and save the key frequencies and magnitudes,
    and then compare them every time a person talks?
    Frustrating, isn't it, the search.

    Apparently no two people have the same manner of speaking: for one thing, their vocal tracts are different, and so are their speech mannerisms.
    Wiki has a bit on it.
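
    On the search problem: the usual term for this is "speaker recognition" (or "speaker identification"), which tends to give better results than "voice recognition". As for the FFT idea above, here is a rough Python sketch of it, only to make the suggestion concrete. It averages the magnitude spectrum over short frames and compares two recordings with cosine similarity. Everything in it is an assumption for illustration: the function names, the frame length, and the synthetic "voices" (plain sums of sine waves) are made up, and the slow hand-written DFT just keeps the example dependency-free. Real speaker recognition systems generally use richer features and statistical models than a raw spectrum comparison, so treat this as a toy, not as how it is actually done.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2), fine for a short demo)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def average_spectrum(signal, frame_len=64):
    """Split the signal into non-overlapping frames and average their spectra."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    spectra = [magnitude_spectrum(f) for f in frames]
    return [sum(column) / len(spectra) for column in zip(*spectra)]


def cosine_similarity(a, b):
    """Cosine of the angle between two spectra (1.0 means same shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def synthetic_voice(pitch_bin, harmonic_weights, n=512, frame_len=64, phase=0.0):
    """Toy 'voice' (hypothetical): harmonics of a pitch sitting on a DFT bin."""
    return [
        sum(w * math.sin(2 * math.pi * pitch_bin * (h + 1) * t / frame_len + phase)
            for h, w in enumerate(harmonic_weights))
        for t in range(n)
    ]


# "Enroll" speaker A, then compare against a new sample of A and a sample of B.
enrolled_a = average_spectrum(synthetic_voice(3, [1.0, 0.6, 0.3]))
new_a = average_spectrum(synthetic_voice(3, [1.0, 0.6, 0.3], phase=0.7))
speaker_b = average_spectrum(synthetic_voice(5, [1.0, 0.2, 0.8]))

same = cosine_similarity(enrolled_a, new_a)
different = cosine_similarity(enrolled_a, speaker_b)
print("A vs new A:", same)
print("A vs B:", different)
```

    With these toy signals the first score should come out near 1 and the second near 0, because the two "speakers" put their energy in different frequency bins. Actual voices overlap far more and vary with what is being said, so a plain spectrum comparison like this would likely be too crude on real recordings.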
