ascii codec can't encode character, python 2.6

I know this is a common beginner issue, and there are a ton of questions like this on Stack Exchange. I've been searching through them, but I still can't figure this out. I have some data from a scrape that looks like this (about 1000 items in the list):

inputList = [[u'someplace', u'3901 West Millen Drive', u'Hobbs', u'NH',
              u'88240', u'37.751117', u'-103.187709999'],
             [u'\u0100lon someplace', u'3120 S Las Vegas Blvd', u'Las Duman',
              u'AL', u'89109', u'36.129066', u'-145.168791']]

I'm trying to write it to a csv file like this:

for i in inputList:
    for ii in i:
        ii.replace(" u'\u2019'", "") #just trying to get rid of offending character
        ii.encode("utf-8")

def csvWrite(inList, outFile):
    import csv
    destination = open(outFile, 'w')
    writer = csv.writer(destination, delimiter = ',')
    data = inList   
    writer.writerows(data)
    destination.close()
csvWrite(inputList, output)

but I keep getting this error on the writer.writerows(data) line:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 5: ordinal not in range(128)

I've tried a bunch of different things to encode the data in the list, but I still always get the error. I'm open to just ignoring the characters that can't be encoded to ASCII. Can anyone point me in the right direction? I'm using Python 2.6.

This line looks odd: ii.replace(" u'\u2019'", ""). Did you mean ii.replace(u"\u2019", u"")?
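A side note on the loop in the question: string methods such as replace() and encode() return a new string and leave the original untouched, so calling them without storing the result changes nothing in inputList. In Python 2, the pattern " u'\u2019'" is also a plain byte-string literal, so it searches for the literal text (space, u, quotes, backslash and all) rather than the character U+2019. A minimal sketch of the difference, using a made-up value (name is hypothetical, not from the scraped data):

```python
name = u'O\u2019Neil'                # hypothetical value containing U+2019

name.replace(u'\u2019', u'')         # result is discarded; name is unchanged
unchanged = name

name = name.replace(u'\u2019', u'')  # rebinding keeps the cleaned string
```

After the last line, unchanged still contains the apostrophe while name does not.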

If you just want to remove those bad characters, you could use this code instead:

for i in inputList:
    for ii in i:
        ii = "".join(list( filter((lambda x: ord(x) < 128), ii)))
        print ii

Output:

someplace
3901 West Millen Drive
Hobbs
NH
88240
37.751117
-103.187709999
lon someplace
3120 S Las Vegas Blvd
Las Duman
AL
89109
36.129066
-145.168791
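A shorter way to get the same effect, as far as I can tell, is to let the codec drop the characters itself with the 'ignore' error handler. This is only a sketch; strip_non_ascii is a hypothetical helper name, not something from the question:

```python
def strip_non_ascii(text):
    # encode(..., 'ignore') silently drops anything outside ASCII;
    # decode() turns the resulting bytes back into a text string.
    return text.encode('ascii', 'ignore').decode('ascii')

row = [u'\u0100lon someplace', u'3120 S Las Vegas Blvd']
cleaned = [strip_non_ascii(field) for field in row]
```

Here cleaned holds the same two fields with the leading \u0100 removed from the first one.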

The final code will look like this:

inputList = [[u'someplace', u'3901 West Millen Drive', u'Hobbs', u'NH',
              u'88240', u'37.751117', u'-103.187709999'],
             [u'\u0100lon someplace', u'3120 S Las Vegas Blvd', u'Las Duman',
              u'AL', u'89109', u'36.129066', u'-145.168791']]

cleared_inputList = []

for i in inputList:
    c_i = []
    for ii in i:
        ii = "".join(list( filter((lambda x: ord(x) < 128), ii)))
        c_i.append(ii)
    cleared_inputList.append(c_i)

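import csv

def csvWrite(inList, outFile):
    # Python 2's csv module expects the file to be opened in binary mode
    destination = open(outFile, 'wb')
    writer = csv.writer(destination, delimiter=',')
    writer.writerows(inList)
    destination.close()

# output is the destination file path, as in the question
csvWrite(cleared_inputList, output)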
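If you would rather keep the characters than drop them, one option on Python 2 is to encode every field to UTF-8 before handing the rows to csv.writer, since that version of the module works on byte strings. The sketch below assumes all fields are unicode objects, as in the sample data; encode_rows and csv_write_utf8 are hypothetical names.

```python
import csv

def encode_rows(rows, encoding='utf-8'):
    # encode() returns a new byte string, so the results are collected
    # into new rows instead of being discarded.
    return [[field.encode(encoding) for field in row] for row in rows]

def csv_write_utf8(rows, out_path):
    # Python 2 only: csv.writer expects byte strings and a file
    # opened in binary mode.
    destination = open(out_path, 'wb')
    try:
        csv.writer(destination).writerows(encode_rows(rows))
    finally:
        destination.close()
```

Calling csv_write_utf8(inputList, output) would then write the original data, with whatever reads the file needing to treat it as UTF-8.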
