NLTK TweetTokenizer not working (Python)

Question

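I have installed NLTK and run nltk.download(). However, not all of the packages were installed (the download got stuck on panlex_lite).

The problem is that when I try to import TweetTokenizer, I get an error: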

  File "create_docs.py", line 7
    from nltk.tokenize import TweetTokenizer
ImportError: cannot import name TweetTokenizer

How can I deal with this? Cheers!
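
A small check along these lines may help narrow the error down. It is only a sketch: the helper name check_tweet_tokenizer and the sample text are made up for illustration. It only touches the installed nltk package, so it can be run even while the data download is incomplete.

```python
import importlib


def check_tweet_tokenizer(sample="@user this is a test :-) #nltk"):
    """Report whether the installed nltk package exposes TweetTokenizer."""
    try:
        nltk = importlib.import_module("nltk")
    except ImportError:
        return "nltk is not installed in this interpreter"
    version = getattr(nltk, "__version__", "unknown")
    try:
        from nltk.tokenize import TweetTokenizer
    except ImportError:
        # Same failure as in the traceback; upgrading nltk is one thing to try.
        return "nltk {} does not provide TweetTokenizer".format(version)
    # The import worked, so tokenize a sample to confirm the class is usable.
    tokens = TweetTokenizer().tokenize(sample)
    return "nltk {} OK, sample tokens: {}".format(version, tokens)


print(check_tweet_tokenizer())
```

If the message reports that TweetTokenizer is missing, the printed version number is worth including when asking for help.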

Answer 1

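This happens because the packages were not installed correctly. Skip the "panlex_lite" package and it should work.

This is currently an open issue; the workaround suggested there is: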

I guess, we could add something like if id != 'panlex_lite' to the code...

But, as for me, the easiest way looks like this:

1. Get https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml
2. Remove panlex from it
3. Upload it to a public Gist
4. Pass the Gist's URL to the downloader: python -m nltk.downloader -d /usr/local/share/nltk_data -u https://gist.githubusercontent.com/demidovakatya/61dab385d74065ae825c80496a197980/raw/c6ff7fbf44265c7f8c9e961e3e1158cd812d6af1/index.xml all
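
The first two steps above could be scripted roughly as follows. This is a sketch, not part of the quoted workaround: the function names strip_entries and build_filtered_index are hypothetical, and it assumes the index lists packages and collection items as XML elements carrying an id or ref attribute.

```python
import urllib.request
import xml.etree.ElementTree as ET

INDEX_URL = "https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml"


def strip_entries(xml_data, needle="panlex"):
    """Return the index XML without elements whose id/ref contains `needle`."""
    root = ET.fromstring(xml_data)
    # ElementTree keeps no parent pointers, so treat every element as a
    # possible parent and drop its matching children.
    for parent in list(root.iter()):
        for child in list(parent):
            ident = child.get("id", "") + " " + child.get("ref", "")
            if needle in ident:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


def build_filtered_index(out_path="index.xml", url=INDEX_URL):
    """Download the index, drop the panlex entries, save the result locally."""
    with urllib.request.urlopen(url) as resp:
        filtered = strip_entries(resp.read())
    with open(out_path, "w", encoding="utf-8") as fh:
        fh.write(filtered)
    return out_path
```

Calling build_filtered_index() writes a filtered index.xml to the current directory. That file still has to be uploaded somewhere public, and its URL passed to the downloader with -u as in the last step.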

Source: the issue thread for this problem (see the last two comments).

Solution 1
0 2016-12-01 00:27:52