python/ nltk

I'm not able to run the lines below. The error is:

"UnicodeDecodeError: 'ascii' codec can't decode byte 0xcb in position 0: ordinal not in range(128)"

File "D:\\Py 64\\ anaconda\\lib\\site-packages\\nltk\\tag\\__init__.py", line 100, in pos_tag
    tagger = load(_POS_TAGGER)
File "D:\\Py 64\\ anaconda\\lib\\site-packages\\nltk\\data.py", line 779, in load
    resource_val = pickle.load(opened_resource, encoding='iso-8859-1')

The traceback points not only to data.py but also to __init__.py.

Note: I have already changed line 779 of data.py, as mentioned here.


import nltk
from nltk.tokenize import word_tokenize
text = word_tokenize("They refuse to permit us to obtain the refuse permit")
nltk.pos_tag(text)

I believe this problem is fixed by using nltk 3.0.3 together with the latest maxent_treebank_pos_tagger model.
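For context, this kind of error is what Python 3 typically raises when it unpickles byte strings written by Python 2 without a suitable `encoding`. The sketch below is only an illustration under that assumption: the payload is a hand-written protocol 2 pickle of the single byte 0xcb, not data taken from the NLTK model, and `PY2_PICKLE` and `try_load` are names chosen here.

```python
import pickle

# Hand-written protocol-2 pickle of the Python 2 byte string '\xcb'.
# Illustrative payload only; it is not taken from the NLTK model file.
PY2_PICKLE = b"\x80\x02U\x01\xcbq\x00."

def try_load(data, **kwargs):
    """Return the unpickled value, or the error message if decoding fails."""
    try:
        return pickle.loads(data, **kwargs)
    except UnicodeDecodeError as exc:
        return "UnicodeDecodeError: {}".format(exc)

# The default encoding is ASCII, so the non-ASCII byte cannot be decoded.
default_result = try_load(PY2_PICKLE)
# Latin-1 maps every byte to a code point, so decoding always succeeds.
latin1_result = try_load(PY2_PICKLE, encoding="iso-8859-1")
# encoding="bytes" keeps Python 2 str objects as raw bytes.
bytes_result = try_load(PY2_PICKLE, encoding="bytes")

print(default_result)
print(repr(latin1_result))
print(repr(bytes_result))
```

Since the traceback in the question already shows `encoding='iso-8859-1'` being passed, the failing decode there presumably happens somewhere else while the model is loaded, which would explain why replacing the model file helps where patching data.py did not.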

To install nltk, use

pip install -U nltk

Make sure the pip you are calling is the one for Python 3.
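One way to avoid calling the wrong pip is to invoke it through the interpreter itself. A minimal sketch: the helper name `pip_install_command` is made up for illustration, and the command is only printed, not executed.

```python
import sys

def pip_install_command(package):
    """Build a pip command tied to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", "-U", package]

# Confirm which Python this is before installing anything.
print("Python", sys.version_info.major, "at", sys.executable)
print(" ".join(pip_install_command("nltk")))
```

Running the printed command from a shell installs nltk for exactly the interpreter that produced it.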

Once nltk is installed, open the Python 3 interpreter and type:

>>> import nltk
>>> nltk.download()

and use the GUI to install maxent_treebank_pos_tagger. It's located under the models tab:

models > maxent_treebank_pos_tagger
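If the GUI is not available (for example on a headless machine), the same model can probably be fetched by passing its identifier to `nltk.download`, after which the lines from the question can be retried. This is a sketch under several assumptions: nltk is installed, the identifier matches the name shown in the downloader, and the tokenizer data is already present as it was in the question. Newer NLTK releases may rely on a different default tagger model. `version_tuple` and `retry_pos_tag` are hypothetical helper names.

```python
import re

MIN_VERSION = (3, 0, 3)  # release mentioned in the answer

def version_tuple(version):
    """Turn a version string such as '3.0.3' into a tuple of ints."""
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])

def retry_pos_tag():
    """Download the model without the GUI, then rerun the failing lines."""
    import nltk  # imported here so version_tuple works without nltk
    from nltk.tokenize import word_tokenize

    if version_tuple(nltk.__version__) < MIN_VERSION:
        raise RuntimeError("nltk is older than 3.0.3; upgrade it first")
    # Assumed identifier: the name listed under the models tab.
    nltk.download("maxent_treebank_pos_tagger")
    text = word_tokenize("They refuse to permit us to obtain the refuse permit")
    return nltk.pos_tag(text)
```

Calling `retry_pos_tag()` needs network access for the download, so it is left as an explicit call rather than run on import.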


