
Reading Unicode filenames using Python

I have a file with a Unicode filename, aγλώσ.txt, in a folder "test". I am running os.walk on this folder and trying to open the file, but I get this error:

IOError: [Errno 22] invalid mode ('rb') or filename: 'C:\\Users\\username\\Documents\\test\\a???s.txt'

Below is the code that I am using.

import os

path = r"C:\Users\username\Documents\test"
for rootFile, dirs, files in os.walk(path):
    for filename in files:
        absolutePath = os.path.abspath(rootFile)
        fullFileName = os.path.join(absolutePath, filename)
        with open(fullFileName, 'rb') as f:
            pass  # do something

I also tried walking the encoded path: for rootFile, dirs, files in os.walk(path.encode('utf-8'))

Update:

I tried for rootFile, dirs, files in os.walk(unicode(path, 'utf-8')): and, before opening the file, I did fullFileName = fullFileName.encode('utf-8'). This gives me the following error:

IOError: [Errno 2] No such file or directory: 'C:\\Users\\username\\Documents\\test\\a\xc3\x8e\xc2\xb3\xc3\x8e\xc2\xbb\xc3\x8f\xc5\xbd\xc3\x8f\xc6\x92.txt'

The actual file name is aγλώσ.txt

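Remove the fullFileName = fullFileName.encode('utf-8') line you added. The Windows file APIs don't understand UTF-8; they use either UTF-16 or a locale-dependent multibyte encoding. When os.walk is given a unicode path, the file names it returns are already unicode, and open() accepts them as they are.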
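
A minimal sketch of that advice, assuming the only change needed is to keep the path as text from os.walk through to open(). The helper name read_files and its dictionary return value are illustrative and not part of the original question.

```python
import os


def read_files(root):
    """Return {full path: file contents as bytes} for every file under root.

    root should be a text string (unicode in Python 2, str in Python 3) so
    that os.walk yields text file names instead of locale-encoded bytes.
    """
    contents = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            # The joined path is still text; do not call .encode() on it.
            full_path = os.path.join(dirpath, name)
            with open(full_path, "rb") as f:
                contents[full_path] = f.read()
    return contents
```

With the folder from the question, the call would look like read_files(u"C:\\Users\\username\\Documents\\test"). The u prefix matters on Python 2, where a plain string literal is a byte string.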


 