Weird Python behavior while using Dictionary and escape characters
I'm new to Python, and I'm trying to perform simple tasks the way I used to, but I've run into an interesting... feature?
The code below works just how I want it to:
def cleanLDAP(search):
    escChars = {'(':r'\28', ')':r'\29' }
    for ch, val in escChars.items():
        if ch in search:
            search = search.replace(ch, val)
    return search
cleanLDAP('(123)')
The output is '\\28123\\29', as I expect, but when I change escChars as follows:
escChars = {'(':r'\28', ')':r'\29', '\\': '\5c' }
the output becomes a bit weird: '\x05c28123\x05c29'
I understand that I might be missing some implicit encoding changes, but I still want to know why this is happening. Thank you in advance!
5c is the hexadecimal code of the backslash character \ in UTF-8 (and ASCII).
When you write the string literal '\5c' without the r prefix, Python does not keep the backslash. It reads \5 as an octal escape for the character with code 5 and keeps the c as a literal character, so your value becomes '\x05c'.
'\5c'
#'\x05c'
'5c'
#'5c'
escChars
#{'(': '\\28', ')': '\\29', '\\': '\x05c'}
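As a quick check of how the two spellings are parsed, the sketch below compares the non-raw and raw literals character by character. The variable names plain and raw are only illustrative.

```python
# '\5c' without the r prefix: \5 is an octal escape, followed by a literal 'c'.
plain = '\5c'
# r'\5c' keeps the backslash: three characters, '\', '5' and 'c'.
raw = r'\5c'

# Show the length and the code of each character in both strings.
print(len(plain), [hex(ord(c)) for c in plain])  # 2 ['0x5', '0x63']
print(len(raw), [hex(ord(c)) for c in raw])      # 3 ['0x5c', '0x35', '0x63']
```

Only the raw version actually contains a backslash (code 0x5c).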
When you iterate over your keys, the third ch tested is a single \, because the key '\\' was not written as a raw string (with the r prefix).
for ch, value in escChars.items():
    print(ch, value)
#( \28
#) \29
#\ c   (the \x05 character before c is not printable)
Finally, since you modify search every time you find a match during iteration, the check for \ runs after earlier replacements have already inserted backslashes via replace().
So the first replacements are made, and then each \ they inserted into the string is itself replaced with the value stored for the '\\' key, which is '\x05c'.
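To see the chain of replacements step by step, here is a small tracing sketch. It assumes Python 3.7 or later, where dictionaries keep insertion order (in older versions the order may differ), and trace_replacements is a hypothetical helper written for this illustration, not part of the original code.

```python
def trace_replacements(search, esc_chars):
    """Apply each replacement in turn and record the intermediate strings."""
    steps = []
    for ch, val in esc_chars.items():
        if ch in search:
            search = search.replace(ch, val)
        steps.append((ch, search))
    return steps

# Same mapping as in the question; the value '\5c' is really '\x05c'.
esc_chars = {'(': r'\28', ')': r'\29', '\\': '\5c'}

# Print the key being processed and the string after that step.
for ch, current in trace_replacements('(123)', esc_chars):
    print(repr(ch), '->', repr(current))
```

The last step is the one that rewrites the backslashes inserted by the first two.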
The simple fix here is to write your key with the r prefix, so that the code only matches against \\ and not \, and to write your value the same way so that \5 is not interpreted as an escape sequence.
def cleanLDAP(search):
    escChars = {'(':r'\28', ')':r'\29', r'\\': r'\5c' }
    for ch, val in escChars.items():
        if ch in search:
            search = search.replace(ch, val)
    return search
>>> cleanLDAP('(123)')
#'\\28123\\29'
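If the goal is to escape every single backslash in the input as well, one alternative (not what the answer above does) is to translate the string in a single pass, so that characters inserted by one replacement are never examined again. This is a minimal sketch assuming Python 3; the names _LDAP_ESCAPES and clean_ldap_single_pass are hypothetical.

```python
# Translation table: every matching character is replaced exactly once.
_LDAP_ESCAPES = str.maketrans({
    '\\': r'\5c',
    '(': r'\28',
    ')': r'\29',
})

def clean_ldap_single_pass(search):
    """Escape '(', ')' and backslash without re-escaping inserted backslashes."""
    return search.translate(_LDAP_ESCAPES)

print(clean_ldap_single_pass('(123)'))  # \28123\29
print(clean_ldap_single_pass('a\\b'))   # a\5cb
```

Because str.translate walks the input once, the result does not depend on the order of the keys.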
Change to:
escChars = {'(':r'\28', ')':r'\29', '\\': r'\5c' }
You wrote '\5c' where r'\5c' was needed. Without the r prefix, \5 is read as an escape sequence, which Python displays in hexadecimal as \x05.
To understand with an example:
a='\5'
a
ord(a)
These return '\x05' and 5 respectively.