Why does the web server complain about Cyrillic letters when the command line does not?

I have a web-server on which I try to submit a form containing Cyrillic letters. As a result I get the following error message:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

This message comes from the following line of the code:

ups = 'rrr {0}'.format(body.replace("'","''"))

(body contains Cyrillic letters.) Strangely, I cannot reproduce this error in the Python command line. The following works fine:

>>> body = 'ппп'
>>> ups = 'rrr {0}'.format(body.replace("'","''"))

It works in the interactive prompt because your terminal uses your locale to determine which encoding to use. Directly from the Python docs:

Whereas the other file-like objects in python always convert to ASCII unless you set them up differently, using print() to output to the terminal will use the user's locale to convert before sending the output to the terminal.
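To see which encoding the locale gives you in a particular environment, a small check like the following can be run both at the interactive prompt and under the web server. It is only a diagnostic sketch, written for Python 3; the variable names are illustrative and the values it prints depend entirely on the machine.

```python
import locale
import sys

# Encoding Python attached to standard output; may be None when
# stdout has been replaced by an object without that attribute.
terminal_encoding = getattr(sys.stdout, 'encoding', None)

# Encoding the user's locale prefers for text data.
preferred_encoding = locale.getpreferredencoding(False)

print('stdout encoding:', terminal_encoding)
print('locale preferred encoding:', preferred_encoding)
```

If the two environments report different encodings, that difference is a likely reason the same code behaves differently in each.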

On the other hand, while your server is running the scripts, there is no such assumption. Whenever Python 2 (which this error points to) has to convert implicitly between a byte str and a unicode object, it uses the ASCII codec unless otherwise specified. The error is a UnicodeEncodeError, which suggests that body reaches your code as a unicode object and that format() on the byte-string template 'rrr {0}' tries to encode it as ASCII. Your Cyrillic characters, presumably encoded as UTF-8, can't be converted; they're far beyond U+007F, the last code point that maps directly between UTF-8 and ASCII. (Unicode code points are written in hex; U+007F is 127 in decimal. ASCII has 128 code points, numbered 0 to 127, because it uses only the least-significant 7 bits of a byte. The most significant bit is always 0.)
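The code point argument can be checked directly. The sketch below, written for Python 3, takes the three-letter body from the question, lists its code points, and shows that UTF-8 can encode it while ASCII cannot. The variable names are illustrative only.

```python
body = u'ппп'  # three Cyrillic letters, as in the question

# Each letter is U+043F, well above U+007F, the last ASCII code point.
code_points = [hex(ord(ch)) for ch in body]

# UTF-8 represents each of these letters with two bytes.
utf8_bytes = body.encode('utf-8')

# ASCII has no representation for them, so encoding fails.
try:
    body.encode('ascii')
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = str(exc)

print(code_points)
print(len(utf8_bytes))
print(ascii_error)
```

The message captured in ascii_error is the same one reported by the web server, including the 0-2 position range for the three characters.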

Back to your problem. If you want to operate on the body of the file, you'll have to specify that it should be opened with a UTF-8 encoding. (Again, I'm assuming it's UTF-8 because it's information submitted from the web. If it's not -- well, it really should be.) The solution has already been given in other StackOverflow answers, so I'll just link to one of them rather than reiterate what's already been answered. The best answer may vary a little bit depending on your version of Python -- if you let me know in a comment I could give you a clearer recommendation.
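As a rough illustration of that advice, one common pattern is to decode incoming bytes explicitly, do all string work on text, and encode only when the result leaves the program. The sketch below assumes the submitted data is UTF-8; build_statement is a hypothetical helper, not something from the original code, and it is written to run on Python 3.

```python
def build_statement(body, encoding='utf-8'):
    """Hypothetical helper: format body with single quotes doubled.

    Accepts either bytes (decoded with the assumed encoding) or text.
    """
    if isinstance(body, bytes):
        body = body.decode(encoding)  # explicit decode, never implicit ASCII
    # Use a text template so no implicit conversion is needed.
    return u'rrr {0}'.format(body.replace(u"'", u"''"))


# Simulate form data arriving as UTF-8 bytes.
raw = u"п'п".encode('utf-8')
ups = build_statement(raw)

# Encode explicitly only at the output boundary.
payload = ups.encode('utf-8')
```

Keeping the template itself as text (the u prefix) is what avoids the implicit ASCII conversion on Python 2; on Python 3 string literals are already text.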
