ascii codec can't encode character, python 2.6

I know this is a common beginner issue, and there are a ton of questions like this here on Stack Exchange. I've been searching through them, but I still can't figure this out. I have some data from a scrape that looks like this (about 1000 items in the list):

inputList = [[u'someplace', u'3901 West Millen Drive', u'Hobbs', u'NH', 
u'88240', u'37.751117', u'-103.187709999'], [u'\u0100lon someplace', u'3120 
S Las Vegas Blvd', u'Las Duman', u'AL', u'89109', u'36.129066', u'-145.168791']]

I'm trying to write it to a csv file like this:

for i in inputList:
    for ii in i:
        ii.replace(" u'\u2019'", "") #just trying to get rid of offending character
        ii.encode("utf-8")

def csvWrite(inList, outFile):
    import csv
    destination = open(outFile, 'w')
    writer = csv.writer(destination, delimiter = ',')
    data = inList   
    writer.writerows(data)
    destination.close()
csvWrite(inputList, output)

but I keep getting this error on writer.writerows(data):

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in  
position 5: ordinal not in range(128)

I've tried a bunch of different things to encode the data in the list, but I still always get the error. I'm open to just ignoring the characters that can't be encoded to ASCII. Can anyone point me in the right direction? I'm using Python 2.6.

This line looks odd: ii.replace(" u'\u2019'", ""). Did you mean ii.replace(u"\u2019", u"")?

If you just want to remove those bad characters, you could use this code instead:

for i in inputList:
    for ii in i:
        ii = "".join(list( filter((lambda x: ord(x) < 128), ii)))
        print ii

Output:

someplace
3901 West Millen Drive
Hobbs
NH
88240
37.751117
-103.187709999
lon someplace
3120 S Las Vegas Blvd
Las Duman
AL
89109
36.129066
-145.168791
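
A shorter way to get the same result, offered here as a sketch rather than as part of the original answer, is to let the codec drop the characters: encode("ascii", "ignore") skips anything outside the ASCII range. The helper name strip_non_ascii is made up for illustration.

```python
def strip_non_ascii(rows):
    """Return a copy of rows with every non-ASCII character removed."""
    cleaned = []
    for row in rows:
        # encode(..., "ignore") drops characters the codec cannot represent;
        # decode turns the resulting bytes back into text.
        cleaned.append([cell.encode("ascii", "ignore").decode("ascii")
                        for cell in row])
    return cleaned

sample = [[u'\u0100lon someplace', u'3120 S Las Vegas Blvd']]
print(strip_non_ascii(sample))
```

Like the filter version, this builds new strings instead of modifying the originals, so the cleaned rows have to be collected and written out in place of inputList.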

The final code will look like this:

inputList = [[u'someplace', u'3901 West Millen Drive', u'Hobbs', u'NH', 
u'88240', u'37.751117', u'-103.187709999'], [u'\u0100lon someplace', u'3120 S Las Vegas Blvd', u'Las Duman', u'AL', u'89109', u'36.129066', u'-145.168791']]

cleared_inputList = []

for i in inputList:
    c_i = []
    for ii in i:
        ii = "".join(list( filter((lambda x: ord(x) < 128), ii)))
        c_i.append(ii)
    cleared_inputList.append(c_i)

def csvWrite(inList, outFile):
    import csv
    # Python 2's csv module expects a binary-mode file; plain 'w' can add blank rows on Windows
    destination = open(outFile, 'wb')
    writer = csv.writer(destination, delimiter = ',')
    writer.writerows(inList)
    destination.close()


csvWrite(cleared_inputList, output)
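
If the characters should be kept rather than dropped, one option under Python 2 is to encode every cell to UTF-8 bytes before handing the rows to csv.writer, since the Python 2 csv module works on byte strings. Note that encode returns a new string instead of changing the original, which is why the loop in the question had no effect. This is a minimal sketch, not part of the original answer, and encode_rows is a hypothetical helper name:

```python
def encode_rows(rows, encoding="utf-8"):
    """Return a new list of rows with each text cell encoded to bytes."""
    # Strings are immutable, so the encoded values are collected into a
    # new structure rather than computed and thrown away.
    return [[cell.encode(encoding) for cell in row] for row in rows]

rows = [[u'\u0100lon someplace', u'AL']]
encoded = encode_rows(rows)
```

Under Python 2, the result could then be passed to csvWrite in place of cleared_inputList.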
