
Gensim mallet CalledProcessError: returned non-zero exit status

I'm getting an error while trying to use gensim's MALLET wrapper in a Jupyter notebook. I have the specified file 'mallet' in the same folder as my notebook, but can't seem to access it. I tried pointing to it on the C drive, but I still get the same error. Please help :)

import os
from gensim.models.wrappers import LdaMallet

# os.environ.update({'MALLET_HOME': r'C:/Users/new_mallet/mallet-2.0.8/'})
mallet_path = 'mallet'  # update this path
ldamallet = LdaMallet(mallet_path, corpus=bow_corpus, num_topics=20, id2word=dictionary)
result = ldamallet.show_topics(num_topics=3, num_words=10, formatted=False)
for each in result:
    print(each)

[Image: Mallet error CalledProcessError]

Update the path to:

mallet_path = 'C:/mallet/mallet-2.0.8/bin/mallet.bat'

and edit mallet.bat in the mallet-2.0.8 bin folder (for example with Notepad) so that it reads:

@echo off

rem This batch file serves as a wrapper for several
rem  MALLET command line tools.

if not "%MALLET_HOME%" == "" goto gotMalletHome

echo MALLET requires an environment variable MALLET_HOME.
goto :eof

:gotMalletHome

set MALLET_CLASSPATH=C:\mallet\mallet-2.0.8\class;C:\mallet\mallet-2.0.8\lib\mallet-deps.jar
set MALLET_MEMORY=1G
set MALLET_ENCODING=UTF-8

set CMD=%1
shift

set CLASS=
if "%CMD%"=="import-dir" set CLASS=cc.mallet.classify.tui.Text2Vectors
if "%CMD%"=="import-file" set CLASS=cc.mallet.classify.tui.Csv2Vectors
if "%CMD%"=="import-svmlight" set CLASS=cc.mallet.classify.tui.SvmLight2Vectors
if "%CMD%"=="info" set CLASS=cc.mallet.classify.tui.Vectors2Info
if "%CMD%"=="train-classifier" set CLASS=cc.mallet.classify.tui.Vectors2Classify
if "%CMD%"=="classify-dir" set CLASS=cc.mallet.classify.tui.Text2Classify
if "%CMD%"=="classify-file" set CLASS=cc.mallet.classify.tui.Csv2Classify
if "%CMD%"=="classify-svmlight" set CLASS=cc.mallet.classify.tui.SvmLight2Classify
if "%CMD%"=="train-topics" set CLASS=cc.mallet.topics.tui.TopicTrainer
if "%CMD%"=="infer-topics" set CLASS=cc.mallet.topics.tui.InferTopics
if "%CMD%"=="evaluate-topics" set CLASS=cc.mallet.topics.tui.EvaluateTopics
if "%CMD%"=="prune" set CLASS=cc.mallet.classify.tui.Vectors2Vectors
if "%CMD%"=="split" set CLASS=cc.mallet.classify.tui.Vectors2Vectors
if "%CMD%"=="bulk-load" set CLASS=cc.mallet.util.BulkLoader
if "%CMD%"=="run" set CLASS=%1 & shift

if not "%CLASS%" == "" goto gotClass

echo Mallet 2.0 commands: 
echo   import-dir        load the contents of a directory into mallet instances (one per file)
echo   import-file       load a single file into mallet instances (one per line)
echo   import-svmlight   load a single SVMLight format data file into mallet instances (one per line)
echo   info              get information about Mallet instances
echo   train-classifier  train a classifier from Mallet data files
echo   classify-dir      classify data from a single file with a saved classifier
echo   classify-file     classify the contents of a directory with a saved classifier
echo   classify-svmlight classify data from a single file in SVMLight format
echo   train-topics      train a topic model from Mallet data files
echo   infer-topics      use a trained topic model to infer topics for new documents
echo   evaluate-topics   estimate the probability of new documents given a trained model
echo   prune             remove features based on frequency or information gain
echo   split             divide data into testing, training, and validation portions
echo   bulk-load         for big input files, efficiently prune vocabulary and import docs
echo Include --help with any option for more information


goto :eof

:gotClass

set MALLET_ARGS=

:getArg

if "%1"=="" goto run
set MALLET_ARGS=%MALLET_ARGS% %1
shift
goto getArg

:run

"C:\Program Files\Java\jdk-12\bin\java" -ea -Dfile.encoding=%MALLET_ENCODING% -classpath %MALLET_CLASSPATH% %CLASS% %MALLET_ARGS%

:eof

On the command line, these commands were helpful for figuring out what was going on:

notepad mallet.bat
java
C:\Program Files\Java\jdk-12\bin\java
dir /OD
cd %userdir%
cd %userpath%
cd\
cd users
cd your_username
cd appdata\local\temp\2
dir /OD
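
The same kind of diagnosis can be scripted from Python. A CalledProcessError on its own only reports the exit status, so a minimal sketch, assuming you want to see what the launcher actually prints, is to run the command yourself and capture its output. The helper name `run_and_capture` is hypothetical and not part of gensim or MALLET; the command used below is a stand-in so that the sketch runs anywhere.

```python
import subprocess
import sys


def run_and_capture(command):
    """Run a command and return (exit_code, stdout, stderr) as text."""
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return completed.returncode, completed.stdout, completed.stderr


# Stand-in command (an assumption for illustration). Replace it with your own
# launcher path, for example [r"C:/mallet/mallet-2.0.8/bin/mallet.bat"].
code, out, err = run_and_capture([sys.executable, "-c", "print('ok')"])
print("exit code:", code)
print("stdout:", out.strip())
print("stderr:", err.strip())
```

A non-zero exit code together with a message on stderr (for example about Java or the classpath) usually points to the real cause. If the launcher path itself is wrong, `subprocess.run` raises FileNotFoundError instead.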

The problem is either that Java is not installed correctly, or that the PATH does not include Java and the MALLET classpath is not defined correctly. More info here: https://docs.oracle.com/javase/7/docs/technotes/tools/windows/classpath.html. This solved my error; hopefully it helps someone else :)
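
A quick way to check the Java side from the notebook itself is sketched below. It only uses the Python standard library; the function names `find_java` and `java_version_text` are hypothetical helpers introduced for illustration, and the sketch assumes the launcher is simply called `java`.

```python
import shutil
import subprocess


def find_java(executable="java"):
    """Return the full path of the Java launcher found on PATH, or None."""
    return shutil.which(executable)


def java_version_text(java_path):
    """Return what `java -version` prints (Java writes it to stderr)."""
    completed = subprocess.run(
        [java_path, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        universal_newlines=True,
    )
    return completed.stdout


java_path = find_java()
if java_path is None:
    print("java was not found on PATH; install a JDK or add its bin folder to PATH")
else:
    print("java found at:", java_path)
    print(java_version_text(java_path))
```

If this reports that Java is missing while a JDK is installed, the notebook's PATH probably differs from the one in your command prompt, and restarting Jupyter after fixing PATH may be needed.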

I had the same problem. I moved the mallet folder to C:/new_mallet, and then it worked nicely:

    import os
    from gensim.models.wrappers import LdaMallet

    os.environ.update({'MALLET_HOME': r'C:/new_mallet/mallet-2.0.8/'})
    mallet_path = 'C:/new_mallet/mallet-2.0.8/bin/mallet'  # update this path
    ldamallet = LdaMallet(mallet_path, corpus=corpus, num_topics=10, id2word=id2word)
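
Since a wrong folder only shows up later as a CalledProcessError, it can help to validate the location before building the model. The following is a minimal sketch, not part of gensim: `configure_mallet` is a hypothetical helper, and it assumes the launcher is `bin/mallet.bat` on Windows and `bin/mallet` elsewhere.

```python
import os


def configure_mallet(mallet_home):
    """Set MALLET_HOME and return the launcher path, or raise if it is missing."""
    mallet_home = os.path.abspath(mallet_home)
    # Assumption: Windows uses the .bat launcher, other systems the shell script.
    launcher_name = "mallet.bat" if os.name == "nt" else "mallet"
    launcher = os.path.join(mallet_home, "bin", launcher_name)
    if not os.path.isfile(launcher):
        raise FileNotFoundError("MALLET launcher not found: " + launcher)
    os.environ["MALLET_HOME"] = mallet_home
    return launcher


# Example usage (adjust the folder to your own installation):
# mallet_path = configure_mallet(r"C:/new_mallet/mallet-2.0.8")
```

The returned path can then be passed as `mallet_path` to `LdaMallet`.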

Working with Python in a Jupyter Notebook, I ran

conda uninstall gensim
conda install gensim

in cmd as an administrator and restarted my kernel. It worked like a charm after I had spent horrendous hours searching online.

Make sure you have installed the Java Development Kit (JDK).

Credit goes to another answer.

After installing the JDK, the following code for LDA Mallet worked like a charm!

import os
from gensim.models.wrappers import LdaMallet

os.environ.update({'MALLET_HOME': r'C:/mallet/mallet-2.0.8/'})
mallet_path = r'C:/mallet/mallet-2.0.8/bin/mallet.bat'

lda_mallet = LdaMallet(
    mallet_path,
    corpus=corpus_bow,
    num_topics=n_topics,
    id2word=dct,
)

For me, this was not an import or a path problem.

I spent hours trying to solve it. I tried this solution and nothing worked.

Looking at a previous successful call I had made to LDA Mallet, I noticed that some parameters were not being set, so I changed the call to this:

gensim.models.wrappers.LdaMallet(mallet_path=mallet_path, corpus=corpus, num_topics=num_topics, id2word=id2word, prefix='temp_file_', workers=4)

I really hope it helps you. Finding a solution to this problem was a pain.
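
The answer does not say why setting `prefix` helped. One possible explanation, which is only an assumption here, is that the temporary files ended up in a location MALLET could not use. Under that assumption, a minimal sketch is to build the prefix inside a fresh directory that is known to be writable. `make_mallet_prefix` is a hypothetical helper, not a gensim function.

```python
import os
import tempfile


def make_mallet_prefix(name="temp_file_"):
    """Create a fresh writable directory and return a file prefix inside it."""
    work_dir = tempfile.mkdtemp(prefix="mallet_")
    return os.path.join(work_dir, name)


prefix = make_mallet_prefix()
print(prefix)
```

The resulting string can be passed as the `prefix` argument in the call above instead of the bare 'temp_file_'.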

On Linux, I found that you need to explicitly specify the path to the mallet binary. The following code works.

from gensim.test.utils import common_corpus, common_dictionary
from gensim.models.wrappers import LdaMallet

mallet_path = "/path/Mallet/bin/mallet"
model = LdaMallet(mallet_path=mallet_path, corpus=common_corpus, num_topics=2, id2word=common_dictionary)

For anyone else who is still struggling after spending hours trying many different suggestions: I finally got it working!

Follow the instructions here (I was on a Mac):

https://ps.au.dk/fileadmin/ingen_mappe_valgt/installing_mallet.pdf

I also closed Anaconda before I started this; I don't know if that's important.

In the terminal I got the following error:

(base) myname-MacBook-Air:mallet-2.0.8 myname$ ./bin/mallet
-bash: ./bin/mallet: /bin/bash: bad interpreter: Operation not permitted

Then I followed the instructions in this answer to un-quarantine the file:

“bad interpreter: Operation not permitted” Error on El Capitan

I reopened Anaconda and it all worked!

I had the same error because I forgot to install Java on my Ubuntu machine.
