- #1
Adel Makram
In latent semantic analysis (LSA), the rank-##r## truncated singular value decomposition (SVD) of an ##m \times n## term-document matrix ##A_{mn}## is
$$A \approx U_rS_rV^T_r$$
In many references, including Wikipedia, the reduced document vector in ##r##-space is scaled by the singular values before it is compared with other vectors by cosine similarity. This yields ##q^T_r S##, where ##q_r## is a column of the ##V^T_r## matrix (the document's coordinates in the reduced space) and ##S## is the diagonal matrix of the corresponding singular values.
But in other references, only ##q^T_r## is used for cosine similarity. Which of the two is more appropriate, and why?
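
To make the two options concrete, here is a minimal sketch, assuming NumPy and a small made-up term-document matrix (the values, the choice ##r = 2##, and the helper `cosine` are illustrative assumptions, not taken from any reference). It computes the cosine similarity between two documents using ##q^T_r## alone, using ##q^T_r S##, and using the corresponding columns of the rank-##r## approximation ##U_rS_rV^T_r##.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two 1-D vectors (hypothetical helper)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy 5-term x 4-document count matrix (made-up values for illustration).
A = np.array([
    [2.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 1.0],
    [0.0, 0.0, 3.0, 1.0],
    [1.0, 0.0, 0.0, 2.0],
])

r = 2  # number of singular values kept (assumed)

# Thin SVD, then truncate to the r largest singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_r, S_r, Vt_r = U[:, :r], np.diag(s[:r]), Vt[:r, :]

# Each row is one document in r-space.
docs_unscaled = Vt_r.T        # rows are q_r^T
docs_scaled = Vt_r.T @ S_r    # rows are q_r^T S

# Rank-r approximation of A; its columns are the reconstructed documents.
A_r = U_r @ S_r @ Vt_r

i, j = 0, 2  # compare document 0 with document 2
cos_unscaled = cosine(docs_unscaled[i], docs_unscaled[j])
cos_scaled = cosine(docs_scaled[i], docs_scaled[j])
cos_columns_of_A_r = cosine(A_r[:, i], A_r[:, j])

print("unscaled q_r       :", cos_unscaled)
print("scaled q_r^T S     :", cos_scaled)
print("columns of A_r     :", cos_columns_of_A_r)
```

In this sketch the scaled similarity agrees with the cosine between the corresponding columns of ##U_rS_rV^T_r##, because ##U_r## has orthonormal columns and so preserves inner products; the unscaled version instead gives every retained dimension equal weight. Which behaviour is preferable is exactly the question above.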