
How can I download NLTK corpora via `requirements.txt` using `pip install -r requirements.txt`?

One can download the NLTK corpora punkt and wordnet from the command line:

python3 -m nltk.downloader punkt wordnet

How can I download NLTK corpora via requirements.txt using `pip install -r requirements.txt`?

For example, one can install spaCy models with `pip install -r requirements.txt` by adding the URL of the model to requirements.txt (e.g. https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm==2.0.0).

How can I download NLTK corpora via requirements.txt?

Short answer: you can't.

The URL for spaCy models points to a Python package (`setup.py` and all that), so pip can download and install it. There are no such pip-installable packages for NLTK data; `nltk.downloader` downloads data in its own format.
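Since pip cannot fetch the NLTK data, one option is to script the two steps together. The following is only a minimal sketch: the function names `build_commands` and `install_everything` are hypothetical, and it assumes requirements.txt sits in the current directory and already lists nltk.

```python
import subprocess
import sys


def build_commands(requirements="requirements.txt", packages=("punkt", "wordnet")):
    """Return the two commands to run: pip install, then the NLTK downloader."""
    python = sys.executable  # use the same interpreter for both steps
    return [
        [python, "-m", "pip", "install", "-r", requirements],
        [python, "-m", "nltk.downloader", *packages],
    ]


def install_everything():
    """Run both steps in order, stopping at the first failure."""
    for command in build_commands():
        subprocess.run(command, check=True)
```

Calling `install_everything()` from a small setup script replaces the bare `pip install -r requirements.txt` step; pip still handles the Python packages, and the second command handles the data.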

There is no way to do this through a requirements.txt file. However, if you need NLTK's wordnet and punkt data, you can use two files: download the NLTK data in one, and import that file into your main file. For example:

nltkmodules.py:

import nltk

nltk.download('wordnet')
nltk.download('punkt')

main.py:

import nltkmodules

# Rest of Code goes here

In your requirements.txt, you can then just include:

nltk==3.5
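Importing nltkmodules.py as written calls `nltk.download` on every run. A possible variation, sketched below, checks for the data first and downloads only what is missing. The lookup paths in `RESOURCES` are assumptions about where these two packages are installed, and the `find` and `download` parameters are hypothetical hooks added so the function can be exercised without network access.

```python
# nltkmodules.py (variation): download NLTK data only when it is missing.

# Assumed mapping from NLTK package id to the path nltk.data.find() looks up.
RESOURCES = {
    "punkt": "tokenizers/punkt",
    "wordnet": "corpora/wordnet",
}


def ensure_nltk_data(resources=RESOURCES, find=None, download=None):
    """Download each missing resource and return the ids that were fetched."""
    if find is None or download is None:
        import nltk  # imported lazily so the hooks can replace it entirely

        find = find or nltk.data.find
        download = download or nltk.download

    fetched = []
    for package, path in resources.items():
        try:
            find(path)  # raises LookupError when the data is not installed
        except LookupError:
            download(package, quiet=True)
            fetched.append(package)
    return fetched
```

With this variation, nltkmodules.py would end with a call to `ensure_nltk_data()` so that `import nltkmodules` in main.py still triggers the check.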
