
Issues in Gensim WordRank Embeddings

I am using the Gensim wrapper to obtain WordRank embeddings (I am following their tutorial to do this) as follows.

from gensim.models.wrappers import Wordrank

model = Wordrank.train(wr_path="models", corpus_file="proc_brown_corp.txt",
                       out_name="wr_model")

model.save("wordrank")
model.save_word2vec_format("wordrank_in_word2vec.vec")

However, I am getting the following error: FileNotFoundError: [WinError 2] The system cannot find the file specified. I am wondering what I have done wrong, as everything looks correct to me. Please help me.

Moreover, I want to know whether the way I am saving the model is correct. I saw that Gensim offers the method save_word2vec_format. What is the advantage of using it instead of using the original WordRank model directly?

FileNotFoundError: [WinError 2] The system cannot find the file specified

I am going to assume that the traceback was raised on this line:

model = Wordrank.train(wr_path="models", corpus_file="proc_brown_corp.txt",
                       out_name="wr_model")

The wr_path argument is supposed to point to where WordRank is installed; more specifically, it is the path to the folder where your wordrank binary is saved.

Mine was path_to_wordrank_binary = '/home/ubuntu/wordrank', where wordrank is the folder that contains wordrank.cpp.

Then make sure that your corpus file is in the current working directory, since you passed corpus_file as a bare file name, which is a relative path.
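Both checks can be done before calling Wordrank.train. Here is a minimal sketch, assuming a Gensim version that still ships gensim.models.wrappers (the wrappers were removed in Gensim 4.0). The helper name check_wordrank_inputs is my own and not part of Gensim, and the paths are just the ones from this thread:

```python
from pathlib import Path


def check_wordrank_inputs(wr_path, corpus_file):
    """Hypothetical helper (not part of Gensim).

    Returns a list of problems found with the two paths;
    an empty list means both paths exist.
    """
    problems = []

    # wr_path should be the folder holding the WordRank installation.
    wr_dir = Path(wr_path)
    if not wr_dir.is_dir():
        problems.append(f"wr_path is not an existing directory: {wr_dir}")

    # A relative corpus_file is resolved against the current working directory.
    corpus = Path(corpus_file)
    if not corpus.is_file():
        problems.append(f"corpus_file not found (resolved to {corpus.resolve()})")

    return problems


# Example paths taken from the question and answer; adjust them to your machine.
for problem in check_wordrank_inputs("/home/ubuntu/wordrank", "proc_brown_corp.txt"):
    print(problem)

# Only when nothing is printed is it worth calling:
# from gensim.models.wrappers import Wordrank
# model = Wordrank.train(wr_path="/home/ubuntu/wordrank",
#                        corpus_file="proc_brown_corp.txt", out_name="wr_model")
```

If this prints a message for wr_path, that is the likely source of the FileNotFoundError. Note that it only checks that the paths exist, not that the WordRank binary itself was built correctly.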

This is the tutorial you should be looking into.
