
Apache, LDAP and WSGI encoding issue

I am using Apache 2.4.7 with mod_wsgi 3.4 on Ubuntu 14.04.2 (x86_64) and Python 3.4.0. My Python app relies on Apache to perform user authentication against our company's LDAP server (MS Active Directory 2008). Apache also passes some additional LDAP data to the Python app using the OS environment. In the Apache config, I query the LDAP server like so:

…
AuthLDAPURL "ldap://server:389/DC=company,DC=lokal?sAMAccountName,sn,givenName,mail,memberOf?sub?(objectClass=*)"
AuthLDAPBindDN …
AuthLDAPBindPassword …
AuthLDAPRemoteUserAttribute sAMAccountName
AuthLDAPAuthorizePrefix AUTHENTICATE_
…

This passes some user data to my WSGI script, where I handle the info as follows:

# Make sure the packages from the virtualenv are found
import site
site.addsitedir('/home/user/.virtualenvs/ispot-cons/lib/python3.4/site-packages')

# Patch path for app (so that libispot can be found)
import sys
sys.path.insert(0, '/var/www/my-app/')

import os
from libispot.web import app as _application

def application(environ, start_response):
    os.environ['REMOTE_USER'] = environ.get('REMOTE_USER', "")
    os.environ['REMOTE_USER_FIRST_NAME'] = environ.get('AUTHENTICATE_GIVENNAME', "")
    os.environ['REMOTE_USER_LAST_NAME'] = environ.get('AUTHENTICATE_SN', "")
    os.environ['REMOTE_USER_EMAIL'] = environ.get('AUTHENTICATE_MAIL', "")
    os.environ['REMOTE_USER_GROUPS'] = environ.get('AUTHENTICATE_MEMBEROF', "")
    return _application(environ, start_response)

I can then access this info in my Python app using os.environ.get(…). (BTW: If you have a more elegant solution, please let me know!)

The problem is that some of the user names contain special characters (German umlauts, e.g., äöüÄÖÜ) that are not encoded correctly. So, for example, the name Tölle arrives in my Python app as TÃ¶lle.

Obviously, this is an encoding problem, because

$ echo "TÃ¶lle" | iconv --from utf-8 --to latin1

gives me the correct Tölle.

Another observation that might help: in my Apache logs I found the character ü represented as \xc3\x83\xc2\xbc.

In /etc/apache2/envvars I told Apache to use LANG=de_DE.UTF-8, and Python 3 is UTF-8 aware as well. I can't seem to specify anything about my LDAP server. So my question is: where is the encoding getting mixed up, and how do I mend it?

It is bad practice to copy the values to os.environ on each request, as this will fail miserably if the WSGI server is running with a multithreaded configuration: os.environ is shared by the whole process, so concurrent requests will interfere with each other. Look at thread locals instead.
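
A minimal sketch of the thread-local approach, assuming the environ keys from the question's configuration. The names `with_ldap_user`, `current_user` and the dictionary keys are hypothetical and not part of mod_wsgi or any library:

```python
import threading

# Per-thread storage: each worker thread only sees the value it stored itself.
_request_state = threading.local()

# Hypothetical attribute names mapped to the WSGI environ keys from the question.
_LDAP_KEYS = {
    'user': 'REMOTE_USER',
    'first_name': 'AUTHENTICATE_GIVENNAME',
    'last_name': 'AUTHENTICATE_SN',
    'email': 'AUTHENTICATE_MAIL',
    'groups': 'AUTHENTICATE_MEMBEROF',
}


def current_user():
    """Return the LDAP attributes stored by the request running on this thread."""
    return getattr(_request_state, 'user', {})


def with_ldap_user(wsgi_app):
    """Wrap a WSGI app so the LDAP attributes are stored per thread, not in os.environ."""
    def wrapper(environ, start_response):
        _request_state.user = {
            name: environ.get(key, '') for name, key in _LDAP_KEYS.items()
        }
        return wsgi_app(environ, start_response)
    return wrapper
```

In the WSGI script this would be used as `application = with_ldap_user(_application)`, and the app would call `current_user()` instead of `os.environ.get(…)`. The stored value stays on the thread until the next request handled by that thread overwrites it.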

As to the issue of encoded data from LDAP, if I understand the problem, you would need to do:

"TÃ¶lle".encode('latin-1').decode('utf-8')

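The symptoms are consistent with UTF-8 bytes being decoded as Latin-1, which is how PEP 3333 says WSGI servers on Python 3 should expose environ values as str. The sketch below wraps the conversion in a helper that leaves a value untouched when the round trip fails. The name `fix_mojibake` is hypothetical, and the fallback behaviour is an assumption about what is wanted here:

```python
def fix_mojibake(value):
    """Undo a Latin-1 decoding of UTF-8 bytes; return the value unchanged on failure."""
    try:
        return value.encode('latin-1').decode('utf-8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in Latin-1, or not valid UTF-8 afterwards:
        # assume the value was not mangled in this way.
        return value


# Reproduce the mangling: UTF-8 bytes read as Latin-1.
mangled = 'Tölle'.encode('utf-8').decode('latin-1')   # 'TÃ¶lle'
repaired = fix_mojibake(mangled)                       # 'Tölle'
untouched = fix_mojibake('Tölle')                      # already correct, stays 'Tölle'

# Encoding the mangled 'ü' as UTF-8 again gives the bytes seen in the Apache log.
log_bytes = 'ü'.encode('utf-8').decode('latin-1').encode('utf-8')  # b'\xc3\x83\xc2\xbc'
```

In the WSGI script, the helper could be applied to each value read from environ, for example `fix_mojibake(environ.get('AUTHENTICATE_SN', ""))`.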