Cosine Similarity

  1. Nov 21, 2013 #1
    Just for clarification...

    If I take the cosine similarity of two vectors and get an answer of 1, then both vectors are the same.

    If I do the same again with another two vectors and get an answer of 0, then the vectors are at an angle of 90 degrees to each other and dissimilar.

    If I do the same again with another two vectors and get an answer of -1, then the vectors are at an angle of 180 degrees to each other. Are they then even more dissimilar than they were at 90 degrees, or should I always take the absolute value of the answer? The equation I'm referring to can be found here.


  3. Nov 21, 2013 #2
    If the cosine of the angle between two vectors is 1, then they are positive scalar multiples of each other. If it is -1, the same holds except that the scalar is negative. If the cosine is 0, they are orthogonal. Don't take the absolute value. The greater the number, the greater the similarity.
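
    The three cases above can be checked numerically. This is only a minimal sketch, assuming the usual definition cos(theta) = (a . b) / (|a| |b|); the function name `cosine_similarity` and the example vectors are made up for illustration and are not taken from the linked equation.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Illustrative vectors (assumed values, not real data).
same_direction = cosine_similarity([1.0, 2.0], [3.0, 6.0])   # positive scalar multiple
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 5.0])       # right angle
opposite = cosine_similarity([1.0, 2.0], [-2.0, -4.0])       # negative scalar multiple

print(same_direction, orthogonal, opposite)
```

    Up to floating-point rounding, the three results should come out close to 1, 0 and -1. Note that [1, 2] and [3, 6] are not equal, yet their cosine is still 1, because only direction matters and not length.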

    As far as similarity goes, I'm guessing you are looking at the vector space model for text search, since it represents documents as term-weight vectors and takes the similarity between documents to be the cosine between their corresponding vectors. This similarity is a separate notion that leverages the cosine and is not inherent in the cosine of two vectors. That said, term-weight vectors consist of non-negative weights, so all vectors will be within 90 degrees of each other and negative similarity is not possible; you only consider 0 to 1.
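
    To illustrate the text-search case, here is a rough sketch that uses raw term counts as the weights. The three toy documents, the whitespace tokenizer and the helper `term_vector` are assumptions for the example; real systems typically use a weighting such as tf-idf, which is also non-negative.

```python
import math
from collections import Counter

def term_vector(text, vocabulary):
    """Raw term counts for `text`, ordered by `vocabulary` (all non-negative)."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical documents, chosen only for illustration.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "quantum field theory",
]
vocabulary = sorted(set(" ".join(docs).split()))
vectors = [term_vector(d, vocabulary) for d in docs]

sim_cat_dog = cosine_similarity(vectors[0], vectors[1])      # shared terms
sim_cat_physics = cosine_similarity(vectors[0], vectors[2])  # no shared terms

print(sim_cat_dog, sim_cat_physics)
```

    Because every component is a count, no product in the dot product can be negative, so the result cannot drop below 0. Documents with no terms in common come out at exactly 0.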
  4. Nov 22, 2013 #3
    Hi TheOldHag,

    First off, thanks for getting back to me.

    I'm comparing audio features (MFCCs), so it's possible for me to get vectors with negative values.

    Thanks for your answer. You can't beat clarification! :)
  5. Nov 22, 2013 #4
    Vectors that are scalar multiples of each other lie in the same one-dimensional subspace. I think similarity is domain-specific, so I'm not sure how to answer the question. Vectors with a cosine of -1 are similar because they are scalar multiples, and different because they are oppositely signed.
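
    One way to see what taking the absolute value would do: the signed cosine tells oppositely directed vectors apart, while the absolute value only asks whether they lie on the same line. A small sketch, where the feature values are invented and are not real MFCCs:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical feature vector with mixed signs, and a negative multiple of it.
v = [0.5, -1.5, 2.0]
w = [-2.0 * x for x in v]

signed = cosine_similarity(v, w)   # close to -1: same line, opposite direction
unsigned = abs(signed)             # close to 1: the direction information is lost

print(signed, unsigned)
```

    Whether the signed or the unsigned value is the right measure depends on whether a sign flip is meaningful for the features being compared.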

    FYI, I'm answering questions because I'm in the process of learning myself, but there are limits to what I know.