
Python reading unicode from local files

I am trying to read some Unicode files that I have locally. How do I read Unicode files while using a list? I've read the Python docs and a ton of Stack Overflow Q&As, which answered a lot of other questions I had, but I can't find the answer to this one.

Any help is appreciated.

Edit: Sorry, my files are in utf-8.

You can open UTF-8-encoded files by using:

import codecs
with codecs.open("myutf8file.txt", encoding="utf-8-sig") as infile:
    for line in infile:
        # do something with line (each line is already decoded to Unicode)
        pass

Be aware that codecs.open() opens the underlying file in binary mode and does not translate \r\n to \n, so if you're working with Windows files, you need to take that into account.
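If the goal is to end up with the lines in a list, one option is io.open() (the same function as the built-in open() in Python 3), which opens the file in text mode and does translate \r\n to \n by default. The following is a minimal sketch, not the only way to do it; read_lines is a hypothetical helper name, and the file name in the usage comment is just the one from the example above.

```python
import io


def read_lines(path):
    """Return the lines of a UTF-8 file as a list of strings.

    read_lines is a hypothetical helper, not part of any library.
    """
    # utf-8-sig drops a leading BOM if the file has one.
    with io.open(path, encoding="utf-8-sig") as infile:
        # Text mode with the default newline handling turns "\r\n" into "\n",
        # so only "\n" has to be stripped from the end of each line.
        return [line.rstrip("\n") for line in infile]


# Usage (assumes the file exists):
# lines = read_lines("myutf8file.txt")
```

The result is an ordinary list, so it can be indexed, sliced, or passed to anything else that expects a list of strings.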

The utf-8-sig codec will read UTF-8 files with or without a BOM (Byte Order Mark), stripping it if it's there. On writing, you should use utf-8 as the codec because the Unicode standard recommends against writing a BOM in UTF-8 files.
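The read/write asymmetry can be sketched as a pair of small helpers: write with utf-8 so no BOM is produced, and read with utf-8-sig so a BOM written by some other tool is tolerated. The names write_text and read_text are hypothetical, chosen only for this illustration.

```python
import io


def write_text(path, text):
    """Write text as UTF-8 without a BOM (hypothetical helper)."""
    # The plain "utf-8" codec never emits a BOM.
    with io.open(path, "w", encoding="utf-8") as outfile:
        outfile.write(text)


def read_text(path):
    """Read UTF-8 text, with or without a BOM (hypothetical helper)."""
    # "utf-8-sig" skips a leading BOM if present and otherwise
    # behaves like "utf-8".
    with io.open(path, encoding="utf-8-sig") as infile:
        return infile.read()
```

With this pairing, a file that starts with the three BOM bytes EF BB BF and a file that does not should both come back as the same text.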


 