
Python Spacy errors when nlp is called: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2

Python 3.6: I am using Spacy on a column of text in a pandas DataFrame. The text contains "special characters" and I need to keep them. nlp requires unicode for some reason. I am getting the error below from nlp:

Any help would be very much appreciated.

# -*- coding: utf-8 -*-
import spacy
nlp = spacy.load("en_core_web_sm")

df['TextCol'] = df['TextCol'].str.encode('utf-8')
def function(row):
    # With axis=1 each row is a Series; take the text from its column
    doc = nlp(unicode(row['TextCol']))

df.apply(function, axis=1)

Return from nlp:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 

So I solved my own question. I'm not really sure what changed; I switched IDEs from PyCharm to Eclipse (PyDev). I am still using the same interpreter. Here are the changes, which look like pretty standard usage.

# -*- coding: utf-8 -*-
import spacy
nlp = spacy.load("en_core_web_sm")

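# Removed encode: the column stays as text instead of UTF-8 bytes
# df['TextCol'] = df['TextCol'].str.encode('utf-8')
def function(row):
    # Removed unicode(): pass the row's text straight to nlp
    doc = nlp(row['TextCol'])
    return doc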

df.apply(function, axis=1)
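
The answer does not say why removing the two calls helped, so the following is only a plausible explanation, not a confirmed diagnosis. The `unicode` built-in exists only in Python 2, where calling it on a byte string decodes with the ASCII codec by default. If the column had already been encoded to UTF-8 bytes, any non-ASCII character would then fail in exactly this way. The sketch below reproduces that failure in Python 3 with an explicit ASCII decode; `sample` and `decode_as_ascii` are hypothetical names used for illustration only.

```python
# Hypothetical sample text: U+2019 (right single quotation mark)
# becomes the three bytes e2 80 99 when encoded as UTF-8.
sample = "it\u2019s"

# This mirrors what .str.encode('utf-8') does to each cell: text becomes bytes.
encoded = sample.encode("utf-8")


def decode_as_ascii(data):
    """Try to decode bytes as ASCII; return the error message on failure."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)


# Decoding UTF-8 bytes as ASCII fails on the first byte above 0x7f.
message = decode_as_ascii(encoded)
print(message)

# Decoding with the codec that was used to encode recovers the text unchanged.
roundtrip = encoded.decode("utf-8")
print(roundtrip == sample)
```

Under that assumption, leaving the column as text and passing it straight to `nlp` avoids the implicit ASCII decode altogether, and the special characters are preserved.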


