
Python CGI - UTF-8 doesn't work

For HTML5 and Python CGI:

If I include the UTF-8 meta tag, my code doesn't work. If I leave it out, it works.

The page encoding is UTF-8.

print("Content-type:text/html")
print()
print("""
    <!doctype html>
    <html>
    <head>
        <meta charset="UTF-8">
    </head>
    <body>
        şöğıçü
    </body>
    </html>
""")

This code doesn't work.

print("Content-type:text/html")
print()
print("""
    <!doctype html>
    <html>
    <head></head>
    <body>
        şöğıçü
    </body>
    </html>
""")

But this code works.

For CGI, using print() requires that the correct codec has been set up for output. print() writes to sys.stdout, which is opened with a specific encoding. How that encoding is determined is platform dependent and can differ based on how the script is run. When your script runs as a CGI script, you generally cannot know in advance which encoding will be used.
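To see which encoding a given environment actually picked, a small diagnostic can help. This is a minimal sketch; the helper name `describe_output_encoding` is hypothetical and not part of the original answer.

```python
import locale
import os
import sys


def describe_output_encoding():
    """Collect the settings that influence how print() encodes text."""
    return {
        # Encoding of the text layer that print() writes to (None when
        # sys.stdout has been replaced by an object without one).
        "stdout": getattr(sys.stdout, "encoding", None),
        # Locale-derived default that Python falls back to for text I/O.
        "locale": locale.getpreferredencoding(False),
        # Explicit override, if the server passed one to the script.
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
    }


# Report on stderr so the CGI response itself is not affected.
# Assumption: the web server routes a script's stderr to its error log.
for name, value in describe_output_encoding().items():
    print(f"{name}: {value}", file=sys.stderr)
```

Running the same script once from a shell and once through the web server shows whether the two environments disagree.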

In your case, the web server has set the locale for text output to a fixed encoding other than UTF-8. Python uses that locale setting to produce output in that encoding. Without the <meta> tag, your browser correctly guesses that encoding (or the server has communicated it in the Content-Type header). With the <meta> tag, you are telling the browser to use a different encoding, one that is incorrect for the data produced.
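The mismatch can be reproduced without a web server. The sketch below assumes, purely for illustration, that the server locale selects ISO-8859-9 (a Turkish single-byte encoding); the actual encoding on the asker's server is not known.

```python
text = "şöğıçü"

# Assumption for illustration: the server locale selects ISO-8859-9.
served_bytes = text.encode("iso-8859-9")

# Without <meta>, a browser that guesses ISO-8859-9 recovers the text.
guessed = served_bytes.decode("iso-8859-9")

# With <meta charset="UTF-8">, the browser decodes the same bytes as UTF-8.
# These bytes are not valid UTF-8, so they show up as U+FFFD replacement
# characters instead of the original letters.
as_utf8 = served_bytes.decode("utf-8", errors="replace")

# When the bytes really are UTF-8, the declaration and the data agree.
correct = text.encode("utf-8").decode("utf-8")

print(guessed == text, as_utf8 == text, correct == text)
```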

You can write directly to sys.stdout.buffer after explicitly encoding to UTF-8. A helper function makes this easier:

import sys

def enc_print(string='', encoding='utf8'):
    sys.stdout.buffer.write(string.encode(encoding) + b'\n')

enc_print("Content-type:text/html")
enc_print()
enc_print("""
    <!doctype html>
    <html>
    <head>
        <meta charset="UTF-8">
    </head>
    <body>
        şöğıçü
    </body>
    </html>
""")
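The encoding can also be declared in the HTTP header itself, so the browser does not have to rely on the `<meta>` tag alone. A minimal sketch, assuming a hypothetical helper `build_response` that is not part of the original answer:

```python
import sys


def build_response(body, encoding="utf-8"):
    """Return a complete CGI response as bytes, with the charset declared."""
    header = f"Content-Type: text/html; charset={encoding}\n\n"
    # The body is encoded with the same codec that the header names,
    # so the declared charset always matches the bytes that follow.
    return header.encode("ascii") + body.encode(encoding)


page = """<!doctype html>
<html>
<head>
    <meta charset="UTF-8">
</head>
<body>
    şöğıçü
</body>
</html>
"""

# Send the bytes through the binary layer underneath print().
# Assumption: sys.stdout is a regular text stream that has a .buffer.
if hasattr(sys.stdout, "buffer"):
    sys.stdout.buffer.write(build_response(page))
    sys.stdout.buffer.flush()
```

Building the whole response as bytes in one place keeps the header, the `<meta>` tag, and the actual data from drifting apart.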

Another approach is to replace sys.stdout with a new io.TextIOWrapper() object that uses the codec you need:

import sys
import io

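def set_output_encoding(codec, errors='strict'):
    # Read the setting before detach() makes the old wrapper unusable
    line_buffering = sys.stdout.line_buffering
    sys.stdout = io.TextIOWrapper(
        sys.stdout.detach(), encoding=codec, errors=errors,
        line_buffering=line_buffering)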

set_output_encoding('utf8')

print("Content-type:text/html")
print()
print("""
    <!doctype html>
    <html>
    <head></head>
    <body>
        şöğıçü
    </body>
    </html>
""")

From https://ru.stackoverflow.com/a/352838/11350:

First, don't forget to declare the encoding in the file:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

Then try:

import sys
import codecs

sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach())
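On Python 3.7 and later, text streams also offer `reconfigure()`, which switches the encoding in place without replacing `sys.stdout`. This is an alternative to the approach above, not something the original answer used; the helper name `force_utf8` is hypothetical.

```python
import io


def force_utf8(stream, errors="strict"):
    """Switch a text stream to UTF-8 in place; return False if unsupported."""
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is None:
        # Older Python versions or a replaced stream: nothing was changed.
        return False
    reconfigure(encoding="utf-8", errors=errors)
    return True


# Demonstration on an in-memory stream standing in for sys.stdout.
raw = io.BytesIO()
fake_stdout = io.TextIOWrapper(raw, encoding="iso-8859-9")
force_utf8(fake_stdout)
fake_stdout.write("şöğıçü")
fake_stdout.flush()
```

In a CGI script the call would be `force_utf8(sys.stdout)` before the first `print()`.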

Or, if you use apache2, add this to your conf:

AddDefaultCharset UTF-8    
SetEnv PYTHONIOENCODING utf8
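The effect of the PYTHONIOENCODING variable can be checked locally before touching the server configuration. The sketch below starts a child interpreter with the variable set and reports the codec its stdout ends up using; the function name `child_stdout_encoding` is hypothetical.

```python
import codecs
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Run a child interpreter with PYTHONIOENCODING set; report its stdout codec."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env, stdout=subprocess.PIPE, check=True,
    )
    reported = result.stdout.decode("ascii").strip()
    # Normalize spellings such as "utf8" and "UTF-8" to one canonical name.
    return codecs.lookup(reported).name


print(child_stdout_encoding("utf8"))
```

This only confirms how the interpreter reacts to the variable; whether Apache actually passes it to the script still depends on the server configuration.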
