
Visualize Gensim's Phrases' vectors in 2D

I'm using the Phrases class and want to visualize the vectors in 2D space. To do this with Word2Vec I used t-SNE, and it worked perfectly. When I try the same with Phrases, the result doesn't make any sense (words appear next to irrelevant words).

Any suggestions on how to visualize the Phrases output?

As suggested/reported on the gensim mailing list, the key problem was that merely wrapping a corpus in Phrases results in an iterator that offers only one pass over the data. The Word2Vec model needs a corpus it can iterate over repeatedly: first for vocabulary discovery, then for multiple passes of training. (If you watch the INFO-level logging closely, there should be indications that 'training' ended almost instantly in such a situation.)
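The single-pass problem can be reproduced without gensim at all. The sketch below is only an illustration under stated assumptions: `merge_bigrams`, `KNOWN_BIGRAMS`, `RestartableCorpus`, and `count_passes` are hypothetical stand-ins (not gensim APIs) for a phrase transformation and for a trainer that scans the corpus several times. It shows why a one-shot stream leaves later passes empty, and how wrapping the transformation in a class with `__iter__` restores multiple passes.

```python
# Hypothetical stand-in for a phrase transformation (NOT a gensim API):
# joins known bigrams with an underscore, streaming one sentence at a time.
KNOWN_BIGRAMS = {("new", "york")}


def merge_bigrams(sentences):
    """Generator: yields each sentence with known bigrams merged (single pass)."""
    for sentence in sentences:
        merged, i = [], 0
        while i < len(sentence):
            pair = tuple(sentence[i:i + 2])
            if pair in KNOWN_BIGRAMS:
                merged.append("_".join(pair))
                i += 2
            else:
                merged.append(sentence[i])
                i += 1
        yield merged


class RestartableCorpus:
    """Iterable that rebuilds the transformed stream on every pass."""

    def __init__(self, sentences):
        self.sentences = sentences

    def __iter__(self):
        # A fresh generator is created each time iteration starts.
        return merge_bigrams(self.sentences)


def count_passes(corpus, passes=3):
    """Mimic a trainer that scans the corpus several times.

    Returns the number of sentences seen on each pass.
    """
    return [sum(1 for _ in corpus) for _ in range(passes)]


raw = [["i", "love", "new", "york"], ["new", "york", "is", "big"]]

one_shot = merge_bigrams(raw)
print(count_passes(one_shot))                 # [2, 0, 0]: only the first pass sees data
print(count_passes(RestartableCorpus(raw)))   # [2, 2, 2]: every pass sees data
```

With gensim, the analogous remedy would presumably be to hand Word2Vec either a fully materialized list of the phrase-transformed sentences (if it fits in memory) or a restartable iterable like the one above, rather than a one-shot stream.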
