
Latent semantic analysis (LSA) singular value decomposition (SVD) understanding

Bear with me through my modest understanding of LSI (my background is mechanical engineering):

After performing SVD in LSI, you have 3 matrices:

U, S, and V transpose.

U relates words to topics, S is a diagonal matrix whose entries measure the strength of each latent feature, and Vt relates topics to documents.

 U dot S dot Vt

returns the original matrix before SVD. Without doing much (any) in-depth algebra, it seems that:
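The reconstruction step above can be checked numerically. A minimal sketch with NumPy, assuming a small toy term-document matrix (the values are illustrative, not from the original post):

```python
import numpy as np

# Toy term-document matrix: rows = terms, columns = documents.
# The counts here are made up purely for illustration.
A = np.array([[2., 0., 1.],
              [1., 1., 0.],
              [0., 2., 1.]])

# Reduced SVD: A = U @ S @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)
S = np.diag(s)

# Multiplying the three factors back together recovers the
# original matrix (up to floating-point error).
print(np.allclose(U @ S @ Vt, A))  # True
```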

 U dot S dot **Ut**

returns a term-by-term matrix, which provides a comparison between the terms, i.e. how related one term is to the other terms: a DSM (design structure matrix) of sorts that compares words instead of components. I could be completely wrong, but I tried it on a sample data set, and the results seemed to make sense. It could just be bias, though (I wanted it to work, so I saw what I wanted). I can't post the results because the documents are protected.

My question though is: Does this make any sense? Logically? Mathematically?

Thanks for any time/responses.

If you want to know how related one term is to another you can just compute

(U dot S)

The terms are represented by the row vectors. You can then compute a distance matrix by applying a distance function, such as Euclidean distance, to every pair of row vectors. The resulting matrix is hollow symmetric: the diagonal is zero and all other distances are nonnegative. If the distance A[i,j] is small, the two terms are related; otherwise they are not.
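This can be sketched in a few lines of NumPy. The toy term-document matrix below is an assumption for illustration; the distance computation follows the answer's recipe (rows of U·S as term vectors, pairwise Euclidean distances):

```python
import numpy as np

# Toy term-document matrix (terms x documents); values are illustrative.
A = np.array([[2., 0., 1.],
              [1., 1., 0.],
              [0., 2., 1.]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Each row of U @ diag(s) is one term's coordinates in latent-topic space.
term_vecs = U @ np.diag(s)

# Pairwise Euclidean distance matrix via broadcasting.
diff = term_vecs[:, None, :] - term_vecs[None, :, :]
D = np.sqrt((diff ** 2).sum(axis=-1))

# D is hollow symmetric: zero diagonal, nonnegative entries.
# A small D[i, j] means terms i and j are close in the latent space.
print(np.allclose(np.diag(D), 0), np.allclose(D, D.T))  # True True
```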
