NLTK TweetTokenizer not working (Python)

Question

I have installed NLTK and run the command nltk.download(). However, not all of the packages were installed (the download gets stuck on panlex_lite).

The problem is that when I try to import TweetTokenizer I get this error:

  File "create_docs.py", line 7, in <module>
    from nltk.tokenize import TweetTokenizer
ImportError: cannot import name TweetTokenizer

How can I deal with this? Cheers!

Answer 1

This happens because the NLTK packages were not installed properly. If you skip the "panlex_lite" package, it should work.

There is currently an open issue for this. The workaround suggested there is as follows:

I guess, we could add something like if id != 'panlex_lite' to the code...

But, as for me, the easiest way looks like this:

1. get https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml
2. remove panlex from it
3. upload it to a public Gist
4. pass the gist's url to the downloader:

   python -m nltk.downloader -d /usr/local/share/nltk_data -u https://gist.githubusercontent.com/demidovakatya/61dab385d74065ae825c80496a197980/raw/c6ff7fbf44265c7f8c9e961e3e1158cd812d6af1/index.xml all
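
The first two steps above (fetching index.xml and removing panlex from it) could be scripted roughly as follows. This is only a minimal sketch: it assumes the index describes each package as an element with an id attribute and each collection member as an element with a ref attribute, and the helper names strip_package and save_stripped_index are hypothetical, not part of NLTK.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

INDEX_URL = "https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml"


def strip_package(index_xml, package_id="panlex_lite"):
    """Return the index XML as text, without the given package.

    Assumption: a package is defined by an element whose id attribute
    equals package_id and is referenced by elements whose ref attribute
    equals package_id. Both kinds of element are removed.
    """
    root = ET.fromstring(index_xml)
    # Snapshot all elements first, then drop matching children from each.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get("id") == package_id or child.get("ref") == package_id:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


def save_stripped_index(path="index.xml", url=INDEX_URL):
    """Download the index, strip panlex_lite, and write the result to path."""
    with urlopen(url) as response:
        original = response.read()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(strip_package(original))


# Example (needs network access):
# save_stripped_index("index.xml")
```

The resulting index.xml still has to be uploaded somewhere public and passed to the downloader with -u, as in the last step.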

Here is the link to the issue: look at the last two comments.
