
Still LookupError in punkt after putting the file into site-packages

I am trying to tokenize text that I fetched from the web, as follows:

import nltk, re, pprint
from nltk import word_tokenize
from urllib import request
# ...getting file from web
tokens = word_tokenize(raw)  # raw is the text from the web

Then a LookupError came up:

Traceback (most recent call last):
  File "<pyshell#56>", line 1, in <module>
    tokens = word_tokenize(raw)
  File "/usr/local/lib/python3.7/site-packages/nltk/tokenize/__init__.py", line 129, in word_tokenize
    sentences = [text] if preserve_line else sent_tokenize(text, language)
  File "/usr/local/lib/python3.7/site-packages/nltk/tokenize/__init__.py", line 106, in sent_tokenize
    tokenizer = load("tokenizers/punkt/{0}.pickle".format(language))
  File "/usr/local/lib/python3.7/site-packages/nltk/data.py", line 752, in load
    opened_resource = _open(resource_url)
  File "/usr/local/lib/python3.7/site-packages/nltk/data.py", line 877, in _open
    return find(path_, path + [""]).open()
  File "/usr/local/lib/python3.7/site-packages/nltk/data.py", line 585, in find
    raise LookupError(resource_not_found)
LookupError: 
**********************************************************************
  Resource punkt not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('punkt')

  For more information see: https://www.nltk.org/data.html

  Attempted to load tokenizers/punkt/PY3/english.pickle
 Searched in:
    - '/Users/ic/nltk_data'
    - '/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/nltk_data'
    - '/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/share/nltk_data'
    - '/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
    - ''
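
The "Searched in" list above is where NLTK looked for the resource, and it contains no site-packages directory. As a rough sketch using only the standard library, the check below reports which of those directories actually holds a given resource folder. The helper name `find_resource` is hypothetical, and the directory list is copied in part from the traceback; with NLTK installed, the list could presumably be taken from `nltk.data.path` instead.

```python
import os

def find_resource(search_dirs, relative_path):
    """Return the directories in search_dirs that contain relative_path."""
    hits = []
    for base in search_dirs:
        if not base:
            continue  # skip the empty entry at the end of the searched list
        candidate = os.path.join(os.path.expanduser(base), relative_path)
        if os.path.exists(candidate):
            hits.append(base)
    return hits

# Some of the directories from the "Searched in" section of the traceback.
SEARCHED = [
    "/Users/ic/nltk_data",
    "/usr/share/nltk_data",
    "/usr/local/share/nltk_data",
    "/usr/lib/nltk_data",
    "/usr/local/lib/nltk_data",
    "",
]

# An empty result means none of the searched directories holds punkt.
print(find_resource(SEARCHED, os.path.join("tokenizers", "punkt")))
```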

I realized this probably happened because I had not downloaded "punkt" yet, so I tried to download it from Python:

nltk.download('punkt')

But the result was:

[nltk_data] Error loading punkt: <urlopen error [Errno 61] Connection
[nltk_data]     refused>
False

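I figured there might be a problem with the internet connection, so I also downloaded the punkt package from the web and put it into the nltk folder in my site-packages. But the problem I had at the beginning is still there. I don't know what to do about this now, LOL! Any suggestions?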
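
Since the lookup only covers the directories listed under "Searched in", a copy inside site-packages would not be found. A minimal sketch of one possible fix, assuming the manual download is a zip archive whose top-level folder is `punkt/`: extract it under `tokenizers/` inside one of the searched directories, such as `~/nltk_data`. The function name `install_punkt_zip` and both paths in the usage comment are illustrative assumptions, not something from the original post.

```python
import os
import zipfile

def install_punkt_zip(zip_path, data_dir):
    """Extract a punkt archive into <data_dir>/tokenizers/ and return that path."""
    target = os.path.join(os.path.expanduser(data_dir), "tokenizers")
    os.makedirs(target, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # Assumes the archive's top-level folder is "punkt/".
        archive.extractall(target)
    return target

# Hypothetical usage; both paths are placeholders:
# install_punkt_zip("punkt.zip", "~/nltk_data")
```

If the data has to live somewhere else, appending that directory to the `nltk.data.path` list before calling `word_tokenize` should work as well.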

I think I kinda solved the problem by simply splitting the text into a list of words! Done!
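
For reference, splitting text into a word list without punkt could look like the sketch below. It is a rough stand-in and not an equivalent of `word_tokenize`; the name `simple_tokenize` and the regular expression are assumptions made for illustration, since the exact splitting used is not shown here.

```python
import re

def simple_tokenize(text):
    """Split text into word tokens and single punctuation marks."""
    # \w+ matches runs of letters, digits, and underscores;
    # [^\w\s] matches any single character that is neither word nor space.
    return re.findall(r"\w+|[^\w\s]", text)

tokens = simple_tokenize("Hello, world! It's 2020.")
print(tokens)
```

Unlike `word_tokenize`, this does not handle contractions or sentence boundaries, so "It's" ends up as three separate tokens.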

