
What does the score indicate in topic modelling

I used gensim for LSA, following this tutorial: https://www.datacamp.com/community/tutorials/discovering-hidden-topics-python

and I got the following output after running it on a list of texts:


[(1, '-0.708*"London" + 0.296*"like" + 0.294*"go" + 0.287*"dislike" + 0.268*"great" + 0.200*"romantic" + 0.174*"stress" + 0.099*"lovely" + 0.082*"good" + -0.075*"Tower" + 0.072*"see" + 0.063*"nice" + 0.061*"amazing" + -0.053*"Palace" + 0.053*"walk" + -0.050*"Eye" + 0.046*"eat" + -0.042*"Bridge" + 0.041*"Garden" + 0.040*"Covent" + -0.040*"old" + -0.039*"visit" + 0.039*"really" + 0.035*"spend" + 0.034*"watch" + 0.034*"get" + -0.032*"Buckingham" + 0.032*"Weather" + -0.032*"Museum" + -0.032*"Westminster"')]

What does -0.708*"London" indicate?

Those are the words that contribute most to your topic, both positively and negatively. One characteristic of this topic seems to be that it has nothing to do with London: you can see that other London-related words also contribute negatively to it, with Westminster, Tower, Palace, Bridge and Eye all carrying negative weights.

So if a text lacks the word London, your model considers it highly plausible that the text is about this topic.
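To see where these signed weights come from: LSA is an SVD of the term-document matrix, and each topic is a singular vector whose entries are the word weights. A minimal sketch with plain NumPy (the vocabulary and counts below are hypothetical, not from the tutorial) shows how words that co-occur get same-signed weights while unrelated words get near-zero or opposite-signed ones:

```python
# Sketch, not the original post's code: reproducing LSA-style signed
# word weights with a plain SVD on a tiny term-document count matrix.
# The vocabulary and documents are made-up examples.
import numpy as np

vocab = ["london", "tower", "like", "go"]
# Rows = words, columns = documents (raw word counts).
X = np.array([
    [2, 0, 0],   # "london" appears only in doc 0
    [1, 0, 0],   # "tower" appears only in doc 0
    [0, 2, 1],   # "like" appears in docs 1 and 2
    [0, 1, 2],   # "go" appears in docs 1 and 2
], dtype=float)

# Columns of U are the topics; each entry is a signed word weight,
# analogous to gensim's '-0.708*"London" + 0.296*"like" + ...' string.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

topic = U[:, 1]  # the second topic: the london/tower direction here
for word, weight in zip(vocab, topic):
    print(f'{weight:+.3f}*"{word}"')
```

Note that the overall sign of a singular vector is arbitrary, so "london" may come out positive or negative; what matters is that "london" and "tower" share a sign (they co-occur) while "like" and "go" get weight zero in this topic, mirroring how London-related words cluster together in your output.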

