
Why does parse_qs give different results depending on how the WSGI application is started?

I use Python 3.2.5 on Oracle Linux 6.4. I have written a WSGI application, but I have a problem: the function urllib.parse.parse_qs behaves differently depending on how I start the application (Apache with mod_wsgi or wsgiref.simple_server). My application function contains the following code:

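from urllib.parse import parse_qs

def application(environ, start_response):
    # Raw, still percent-encoded query string supplied by the WSGI server
    print(environ["QUERY_STRING"])
    # Parse it into a dict mapping each parameter name to a list of values
    requestParams = parse_qs(environ["QUERY_STRING"])
    print(requestParams)
    .......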

When I start my program using wsgiref.simple_server and request /query?name=Иван (a Russian name), I get the following output:

name=%D0%98%D0%B2%D0%B0%D0%BD
{'name': ['Иван']}

But the same application running under Apache + mod_wsgi gives me the following:

name=%D0%98%D0%B2%D0%B0%D0%BD
{'name': ['\xd0\x98\xd0\xb2\xd0\xb0\xd0\xbd']}

As you can see, the latter doesn't give me the correct Russian word, although the input to the function is the same. According to https://docs.python.org/3.2/library/urllib.parse.html, parse_qs has the default parameter encoding='utf-8'. The wrong result causes other problems later in my code. I can't understand why the function behaves differently.
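One way to check whether the difference comes from parsing or only from how the result is displayed is to print an ASCII-safe form of the parsed value. The sketch below is only an illustration, not a diagnosis of the mod_wsgi setup: it feeds the same query string to parse_qs twice, once with the default encoding and once with encoding="latin-1" (chosen here purely for comparison), and uses ascii() so the output does not depend on the encoding of the output stream.

```python
from urllib.parse import parse_qs

query = "name=%D0%98%D0%B2%D0%B0%D0%BD"

# Default behaviour: percent-escapes are decoded as UTF-8.
utf8_result = parse_qs(query)

# For comparison only: decoding the same bytes as Latin-1 yields one
# character per byte instead of the Cyrillic name.
latin1_result = parse_qs(query, encoding="latin-1")

# ascii() escapes every non-ASCII character, so these lines print the
# same thing regardless of the locale or the stream encoding.
print(ascii(utf8_result["name"][0]))    # '\u0418\u0432\u0430\u043d'
print(ascii(latin1_result["name"][0]))  # '\xd0\x98\xd0\xb2\xd0\xb0\xd0\xbd'
```

If both servers print the \u0418... form, the parsed string is the same in both cases and only the way it is written to the log differs.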

I have the following Apache virtual host:

<VirtualHost *:80>
    DocumentRoot /var/www/my_project
    <Directory />
        Options FollowSymLinks
        AllowOverride None
    </Directory>
    <Directory /var/www/my_project/>
        Options Indexes FollowSymLinks MultiViews
        AllowOverride None
        Order allow,deny
        allow from all
    </Directory>
    WSGIDaemonProcess my_project processes=8 threads=1 python-path=/var/www/my_project display-name=%{GROUP}
    WSGIProcessGroup my_project
    WSGIScriptAlias /my_project /var/www/my_project/my_project.py
</VirtualHost>

My Apache uses the prefork MPM.

Graham Dumpleton (one of the developers of mod_wsgi) suggested that I add lang=en_US.UTF-8 locale=en_US.UTF-8 to the WSGIDaemonProcess directive. These options aren't described in the docs, but they helped me overcome my problems with the string. Python's print function still doesn't print the string properly, though: the Apache log still shows \\xd0\\x98\\xd0\\xb2\\xd0\\xb0\\xd0\\xbd. Writing to a file works well. I therefore assume the remaining issue is that the environment variable PYTHONIOENCODING is not set for my Python script when it runs under Apache + mod_wsgi. I hope this answer to my own question helps someone.
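As a rough illustration of the "writing to a file works" observation, the sketch below writes the name to a file with an explicit encoding, so the bytes on disk do not depend on the locale or on PYTHONIOENCODING. The temporary directory and the file name names.txt are made up for the example, and the getattr() guard is an assumption that the stdout object may not expose an encoding attribute in every environment.

```python
import os
import sys
import tempfile

name = "Иван"

# Show which encoding print() would use; the value depends on the
# environment the interpreter was started in.
print(getattr(sys.stdout, "encoding", None))

# Hypothetical output location, used only for this example.
path = os.path.join(tempfile.mkdtemp(), "names.txt")

# An explicit encoding makes the result independent of the locale.
with open(path, "w", encoding="utf-8", newline="\n") as f:
    f.write(name + "\n")

# Read the file back as raw bytes to see what was actually stored.
with open(path, "rb") as f:
    raw = f.read()

print(raw)  # b'\xd0\x98\xd0\xb2\xd0\xb0\xd0\xbd\n'
```

The bytes in the file are the UTF-8 encoding of the name, which matches the percent-escapes in the original query string.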
