
unicode method doesn't work in Python 3

def unicode_csv_reader(utf8_data, dialect=csv.excel, **kwargs):
    csv_reader = csv.reader(utf8_data, dialect=dialect, **kwargs)
    for row in csv_reader:
        yield [unicode(cell, 'utf-8') for cell in row]

filename = '/Users/congminmin/Downloads/kg-temp.csv'
reader = unicode_csv_reader(open(filename))

out_filename = '/Users/congminmin/Downloads/kg-temp.out'
#writer = open(out_filename, "w", "utf-8")
for question, answer in reader:
  print(question+ " " + json.loads(answer)[0]['content'])
  #writer.write(question + " " + answer)

reader.close();

This code works in Python 2.7, but it gives an error message in Python 3.6:

Unresolved reference 'unicode'

How can I adapt it to Python 3.6?

Simply ensure your data is str, not a bytestring, to begin with, and just use csv.reader without this decoding stuff.

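import csv, io
data = utf8_data.decode('utf-8')  # bytes -> str
# csv.reader expects an iterable of lines; a bare str would be iterated one character at a time
for row in csv.reader(io.StringIO(data, newline=''), dialect=csv.excel):
    ...  # process each row here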
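
If the data arrives as bytes (for example from a file opened in binary mode), one way to follow this advice is to wrap the byte stream in io.TextIOWrapper so that csv.reader only ever sees str. This is a minimal sketch, not the answerer's code: the helper name rows_from_bytes and the sample data are made up for illustration.

```python
import csv
import io


def rows_from_bytes(raw, encoding="utf-8"):
    """Decode a bytes object and return its CSV rows as lists of str.

    Hypothetical helper for illustration; not part of the csv module.
    """
    # TextIOWrapper decodes lazily; newline="" lets csv handle line endings itself.
    text_stream = io.TextIOWrapper(io.BytesIO(raw), encoding=encoding, newline="")
    return list(csv.reader(text_stream, dialect=csv.excel))


# Made-up sample: a header row plus one row with a quoted, comma-containing cell.
sample = '问题,答案\r\nwhat,"a, b"\r\n'.encode("utf-8")
rows = rows_from_bytes(sample)
print(rows)
```

Every cell in rows is a plain str, so no per-cell unicode(...) call is needed.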

Python 3 already has excellent Unicode support. Any time you open a file in text mode, you can pass a specific encoding or let it default to the platform's preferred encoding, which is often but not always UTF-8, so passing encoding='utf-8' explicitly is the safer choice. There is no longer a difference between str and unicode in Python 3: the latter does not exist, and the former has full Unicode support. This simplifies your job greatly, since you don't need your unicode_csv_reader wrapper at all. You can just iterate over a normal csv.reader.
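
The split between str and bytes can be checked directly in an interpreter. A small sketch (the sample string is arbitrary and the variable names are only illustrative):

```python
text = "naïve 问题"            # str: a sequence of Unicode code points
raw = text.encode("utf-8")     # bytes: the UTF-8 encoded form of the same text

assert isinstance(text, str) and isinstance(raw, bytes)
assert raw.decode("utf-8") == text   # decoding the bytes gives back the str

# The Python 2 name `unicode` is simply not defined in Python 3,
# which is what the "Unresolved reference" message is reporting.
try:
    unicode  # noqa: F821
    unicode_exists = True
except NameError:
    unicode_exists = False

print(unicode_exists)  # prints False on Python 3
```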

As an additional note, you should always open files in a with block, so things get cleaned up if there is any kind of exception. As a bonus, your file will get closed automatically when the block ends:

with open(filename, encoding='utf-8', newline='') as f:  # Mode defaults to 'rt'
    for question, answer in csv.reader(f):
        ...  # Do your thing here. Both question and answer are normal strings

This will only work properly if you are sure that each row contains exactly 2 elements. You may be better off doing something more like

with open(filename, encoding='utf-8', newline='') as f:  # Mode defaults to 'rt'; the encoding is passed explicitly
    for row in csv.reader(f):
        if len(row) != 2:
            continue  # Or handle the anomaly by other means
        question, answer = row
        # Do your thing here as before
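
Putting the pieces together, the script from the question might be ported roughly as follows. This is a sketch under assumptions: the function name iter_question_content is hypothetical, the file is assumed to be UTF-8, and, as in the question's print call, the second column is assumed to hold a JSON list whose first element has a 'content' key. The demo writes its own temporary file so it does not depend on the asker's data.

```python
import csv
import json
import os
import tempfile


def iter_question_content(path, encoding="utf-8"):
    """Yield (question, content) pairs from a two-column CSV file.

    Hypothetical helper; assumes the second column is a JSON list whose
    first element is an object with a 'content' key.
    """
    with open(path, encoding=encoding, newline="") as f:
        for row in csv.reader(f):
            if len(row) != 2:
                continue  # skip malformed rows instead of failing on unpacking
            question, answer = row
            yield question, json.loads(answer)[0]["content"]


# Demo with made-up data written to a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["你好吗?", json.dumps([{"content": "很好"}], ensure_ascii=False)])
        writer.writerow(["only one column"])  # malformed row, should be skipped
    pairs = list(iter_question_content(demo_path))

for question, content in pairs:
    print(question + " " + content)
```

Because the file is opened inside the generator's with block, there is nothing left to close by hand, so the reader.close() call from the original script is not needed.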
