Strange `UnicodeEncodeError` using `os.path.exists`

In a web application (using Flask), I get the following error:

Unable to retrieve the thumbnail for u'/var/data/uploads/2012/03/22/12 Gerd\xb4s Banjo Trio 1024.jpg'
Traceback (most recent call last):
 File "/var/www/beta/env/lib/python2.7/site-packages/dblib-1.0dev3-py2.7.egg/dblib/orm/file.py", line 169, in get_thumbnail
   if not exists(filename):
 File "/usr/lib/python2.7/genericpath.py", line 18, in exists
   os.stat(path)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xb4' in position 52: ordinal not in range(128)

Note that I include the repr() of the file name in the logged error. This shows that the file name is passed as a unicode instance. So far, so good.

If I run the offending call in the interactive Python interpreter, it works as expected:

>>> from os.path import exists
>>> exists(u'/var/data/uploads/2012/03/22/12 Gerd\xb4s Banjo Trio 1024.jpg')
True

So apparently, when running in the Flask environment, Python tries to encode the file name using the ASCII codec instead of UTF-8. I deployed the application using mod_wsgi behind Apache httpd.

I assume I have to tell either one of them to use UTF-8 somewhere? But where?
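One way to confirm that the two environments really differ is to print the encoding-related settings in the interactive interpreter and log them from inside the deployed application. The following is only a minimal sketch: `encoding_report` is a hypothetical helper name, not part of Flask or mod_wsgi, and it relies only on standard-library calls.

```python
import locale
import os
import sys


def encoding_report():
    """Collect the settings that influence how unicode paths are encoded.

    Hypothetical helper for diagnosis only.
    """
    return {
        # Codec Python uses to turn unicode file names into bytes for the OS.
        "filesystem_encoding": sys.getfilesystemencoding(),
        # Encoding derived from the process locale (without changing it).
        "preferred_encoding": locale.getpreferredencoding(False),
        # Locale variables inherited from whatever started the process.
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
    }


if __name__ == "__main__":
    for key, value in sorted(encoding_report().items()):
        print("%s = %r" % (key, value))
```

If the report logged under mod_wsgi shows an ASCII-based file system encoding (for example 'ANSI_X3.4-1968') and no LANG or LC_ALL, while the interactive shell shows UTF-8, that would be consistent with the traceback above.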

The Django documentation describes the same issue. Since it is caused by the mod_wsgi deployment rather than by the framework, the same solution should apply here:

https://docs.djangoproject.com/en/dev/howto/deployment/wsgi/modwsgi/#if-you-get-a-unicodeencodeerror

Excerpt from the above linked doc:

[...] you must ensure that the environment used to start Apache is configured to accept non-ASCII file names. If your environment is not correctly configured, you will trigger UnicodeEncodeError exceptions when calling functions like the ones in os.path on filenames that contain non-ASCII characters.

To avoid these problems, the environment used to start Apache should contain settings analogous to the following:

 export LANG='en_US.UTF-8'
 export LC_ALL='en_US.UTF-8'

Consult the documentation for your operating system for the appropriate syntax and location to put these configuration items; /etc/apache2/envvars is a common location on Unix platforms. Once you have added these statements to your environment, restart Apache.
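If the Apache environment cannot be changed, a possible stopgap is to encode the path explicitly before handing it to `os.path`, so that Python does not fall back to the locale-derived codec. This is a sketch under assumptions, not the fix the documentation recommends: `exists_utf8` is a hypothetical helper, and it assumes the file names on disk are stored as UTF-8 bytes.

```python
import os


def exists_utf8(path, encoding="utf-8"):
    """Check for a file using an explicitly encoded byte path.

    Hypothetical helper: assumes names on disk are encoded with `encoding`.
    """
    # Text paths are encoded by hand; byte strings are passed through as-is.
    if not isinstance(path, bytes):
        path = path.encode(encoding)
    return os.path.exists(path)
```

Called with the path from the traceback, `exists_utf8(u'/var/data/uploads/2012/03/22/12 Gerd\xb4s Banjo Trio 1024.jpg')` passes bytes to `os.stat`, so no implicit ASCII encoding takes place. Every other file system call in the application would need the same treatment, which is why fixing the environment is usually the simpler option.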
