
python: getting UnicodeEncodeError when trying to parse a list

I'm trying to pipe a list that I've scraped from http://www.ropeofsilicon.com/roger-eberts-great-movies-list/ through the API at http://www.omdbapi.com/ to grab each movie's IMDb ID.

I'm logging the movies that I can and can't find as follows:

import requests

OMDBPath = "http://www.omdbapi.com/"

movieFile = open("movies.txt")
foundLog = open("log_found.txt", 'w')
notFoundLog = open("log_not_found.txt", 'w')

####

for line in movieFile:
    name = line.split('(')[0].decode('utf8')
    print name
    year = False
    if line.find('(') != -1:
        year = line[line.find('(')+1 : line.find(')')].decode('utf8')
        OMDBQuery = {'t': name, 'y': year}
    else:
        OMDBQuery = {'t': name}

    req = requests.get(OMDBPath, params=OMDBQuery)
    if req.json()[u'Response'] == "False":
        if year:
            notFoundLog.write("Couldn't find " + name + " (" + year + ")" + "\n")
        else:
            notFoundLog.write("Couldn't find " + name + "\n")
    # else:
    #     print req.json()
    #     foundLog.write(req.text.decode('utf8').encode('latin1') + ",")
movieFile.close()
foundLog.close()
notFoundLog.close()

I've been reading a lot about Unicode encoding and decoding, and it looks like this is happening because I'm not encoding the file in the right manner? I'm not sure what's wrong here. The error appears when I get to "Caché":

Caché
Traceback (most recent call last):
  File "app.py", line 34, in <module>
    notFoundLog.write("Couldn't find " + name + " (" + year + ")" + "\n")
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 18: ordinal not in range(128)

Here is a working solution that relies on the codecs module to provide transparent encoding/decoding to/from UTF-8 for the various files you open:

import requests
import codecs

OMDBPath = "http://www.omdbapi.com/"

with codecs.open("movies.txt", encoding='utf-8') as movieFile, \
     codecs.open("log_found.txt", 'w', encoding='utf-8') as foundLog, \
     codecs.open("log_not_found.txt", 'w', encoding='utf-8') as notFoundLog:
    for line in movieFile:
        name = line.split('(')[0].strip()  # drop the trailing space/newline left by the split
        print(name)
        year = False
        if line.find('(') != -1:
            year = line[line.find('(')+1 : line.find(')')]
            OMDBQuery = {'t': name, 'y': year}
        else:
            OMDBQuery = {'t': name}

        req = requests.get(OMDBPath, params=OMDBQuery)
        if req.json()[u'Response'] == "False":
            if year:
                notFoundLog.write(u"Couldn't find {} ({})\n".format(name, year))
            else:
                notFoundLog.write(u"Couldn't find {}\n".format(name))
        #else:
            #print(req.json())
            #foundLog.write(u"{},".format(req.text))
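The original script failed because, in Python 2, a file returned by the plain open function expects byte strings. Handing its write method a unicode object makes Python encode it implicitly with the ascii codec, which cannot represent é. Below is a minimal sketch of that mechanism. The message string is made up for illustration, and the explicit encode call only stands in for the implicit step; it is not the exact code path inside Python 2's file object:

```python
# -*- coding: utf-8 -*-
# Illustrative message; the real script builds it from movies.txt.
message = u"Couldn't find Caché (2005)\n"

failed_at = None
try:
    # Stand-in for the implicit ASCII encoding Python 2 applies when a
    # unicode object is written to a file opened with plain open().
    message.encode("ascii")
except UnicodeEncodeError as exc:
    failed_at = exc.start  # index of the first character ASCII cannot encode
    print("ascii codec failed at position {}".format(failed_at))

# Encoding explicitly as UTF-8 succeeds; é becomes the two bytes C3 A9.
encoded = message.encode("utf-8")
print(repr(encoded))
```

The failing position is the index of é in the message, the same character the traceback in the question complains about. Opening the log with codecs.open(..., encoding='utf-8') makes every write go through the UTF-8 step instead of the ASCII one.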

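Note that the use of the codecs module is only required in Python 2.x. In Python 3.x, the built-in open function accepts an encoding argument and does the decoding/encoding itself, so open("movies.txt", encoding='utf-8') is enough. Without that argument the default encoding typically depends on the platform's locale settings, so it is safer to pass it explicitly.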
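For Python 3, the same file handling can be sketched with the built-in open alone. This is a rough illustration under a few assumptions: parse_line is a hypothetical helper (not part of the original answer), the two sample titles and the temporary directory are invented so the snippet runs on its own, and the OMDb request is left out so that every title is simply logged as not found:

```python
import os
import tempfile


def parse_line(line):
    """Hypothetical helper: split 'Title (Year)' into (title, year).

    year is None when the line has no parenthesised year.
    """
    name = line.split("(")[0].strip()
    year = None
    if "(" in line and ")" in line:
        year = line[line.find("(") + 1 : line.find(")")]
    return name, year


# Sample input written to a temporary directory (illustrative data only).
workdir = tempfile.mkdtemp()
movies_path = os.path.join(workdir, "movies.txt")
log_path = os.path.join(workdir, "log_not_found.txt")

with open(movies_path, "w", encoding="utf-8") as f:
    f.write("Caché (2005)\nCasablanca\n")

# Built-in open with an explicit encoding replaces codecs.open in Python 3.
with open(movies_path, encoding="utf-8") as movie_file, \
     open(log_path, "w", encoding="utf-8") as not_found_log:
    for line in movie_file:
        name, year = parse_line(line)
        # The OMDb lookup is omitted here; each title is logged as not found.
        if year:
            not_found_log.write("Couldn't find {} ({})\n".format(name, year))
        else:
            not_found_log.write("Couldn't find {}\n".format(name))

with open(log_path, encoding="utf-8") as f:
    log_text = f.read()
print(log_text)
```

Passing encoding="utf-8" on every open call keeps the behaviour the same across machines, whatever the locale default happens to be.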
