Cosine Similarity: Explanation and Examples

  • Context: Undergrad 
  • Thread starter: daveronan
  • Tags: Cosine

Discussion Overview

The discussion centers on the concept of cosine similarity, particularly in the context of comparing vectors. Participants explore its implications, definitions, and applications, including specific scenarios such as text search and audio feature comparison.

Discussion Character

  • Exploratory, Technical explanation, Conceptual clarification, Debate/contested

Main Points Raised

  • One participant states that a cosine similarity of 1 indicates that two vectors are equal, while a similarity of 0 indicates they are orthogonal, and -1 suggests they are oppositely directed.
  • Another participant clarifies that a cosine similarity of 1 means the vectors are scalar multiples of each other, and that negative values are not considered in certain contexts, such as text search.
  • A participant mentions they are comparing audio features (MFCCs), which can include negative values, thus allowing for a broader interpretation of cosine similarity.
  • Another contribution notes that similarity is domain-specific and that scalar multiples indicate similarity, while oppositely signed vectors indicate dissimilarity.

Areas of Agreement / Disagreement

Participants express differing views on the interpretation of cosine similarity, particularly regarding the treatment of negative values and the context in which similarity is assessed. There is no consensus on whether to take the absolute value of the cosine similarity or how to apply it across different domains.

Contextual Notes

Some participants highlight the limitations of their understanding and the domain-specific nature of similarity, indicating that the discussion may not cover all aspects of cosine similarity comprehensively.

Who May Find This Useful

Individuals interested in vector analysis, particularly in fields like machine learning, audio processing, and information retrieval, may find this discussion relevant.

daveronan
Just for clarification...

If I take the cosine similarity of two vectors and I get an answer of 1, then both vectors are equal.

If I do the same again with another two vectors and get an answer of 0, then the vectors are at an angle of 90 degrees to each other and dissimilar.

If I do the same again with another two vectors and get an answer of -1, then the vectors are at an angle of 180 degrees to each other and are even more dissimilar than they were at 90 degrees. Or should I always take the absolute value of the answer? The equation I'm referring to can be found here.

http://upload.wikimedia.org/math/f/3/6/f369863aa2814d6e283f859986a1574d.png

Thanks!
 
If the cosine of the angle between two vectors is 1, then they are positive scalar multiples of each other. If it is -1, the same holds except that the scalar is negative. If the cosine is 0, they are orthogonal. Don't take the absolute value. The greater the number, the greater the similarity.
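
As a rough illustration of the three cases above, here is a minimal Python sketch. It assumes the usual definition, the dot product divided by the product of the Euclidean norms, which is presumably what the linked equation shows. The helper name `cosine_similarity` and the sample vectors are made up for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

v = [1.0, 2.0, 3.0]

# Positive scalar multiple of v: result is 1 up to floating-point rounding.
same_direction = cosine_similarity(v, [2.0, 4.0, 6.0])

# Perpendicular vectors: the dot product is 0, so the result is 0.
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])

# Negative scalar multiple of v: result is -1 up to floating-point rounding.
opposite = cosine_similarity(v, [-1.0, -2.0, -3.0])

print(same_direction, orthogonal, opposite)
```

Note that `[2.0, 4.0, 6.0]` is not equal to `v`, yet it scores the same as `v` would against itself: the measure only sees direction, not length.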

As far as similarity goes, I'm guessing you are looking at the vector space model for text search, since it represents documents as term weight vectors and takes the similarity between documents to be the cosine between their corresponding vectors. This similarity is a separate notion that leverages the cosine and is not inherent in the cosine of two vectors. With that said, term weight vectors consist of non-negative weights, so any two of them are at most 90 degrees apart. Negative similarity is therefore not possible, and you only consider 0 to 1.
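
To see why non-negative weights rule out negative similarity, note that every product in the dot product is then non-negative, so the cosine cannot drop below 0. The sketch below uses raw term counts over a hypothetical four-word vocabulary; the documents and the helper name are invented for illustration, not taken from any particular search library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Term counts over a hypothetical vocabulary: ["cosine", "vector", "audio", "filter"]
doc_a = [2, 1, 0, 0]
doc_b = [1, 1, 0, 0]   # shares terms with doc_a
doc_c = [0, 0, 3, 1]   # shares no terms with doc_a

sim_ab = cosine_similarity(doc_a, doc_b)  # strictly between 0 and 1
sim_ac = cosine_similarity(doc_a, doc_c)  # no shared terms, so the dot product is 0

print(sim_ab, sim_ac)
```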
 
Hi TheOldHag,

First off, thanks for getting back to me.

I'm comparing audio features (MFCCs), so it's possible for me to get vectors with negative values.

Thanks for your answer. You can't beat clarification! :)
 
Vectors that are scalar multiples of each other lie in the same one-dimensional subspace. I think similarity is domain specific, so I'm not sure how to answer the question. They are similar because they are scalar multiples, and they are different because they are oppositely signed.
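
One way to make that trade-off concrete is to compute both the signed cosine and its absolute value for a vector and a negative multiple of it. This is only a sketch: the numbers are made up and are not real MFCC values, and which of the two scores is appropriate is an assumption the application has to settle.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

frame = [-3.0, 1.5, 0.5]               # made-up feature vector with negative entries
flipped = [-2.0 * x for x in frame]    # same line through the origin, opposite direction

signed = cosine_similarity(frame, flipped)   # treats opposite direction as most dissimilar
unsigned = abs(signed)                       # treats "same subspace" as most similar

print(signed, unsigned)
```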

FYI, I'm answering questions because I'm in the process of learning myself but there are limits to what I know.
 
