
How to inject escape sequences in Python

I need to put escape sequences in a string for certain characters (using the double quote as an example here). For example, if I have the string abra"cada"bra, I need to generate abra\"cada\"bra. But if the string already has escape characters for the literals I am interested in (i.e., the double quote in this example), as in abra\"cada\"bra, I need to leave it alone. What is the easiest way to do this in Python?

(The idea is to write it to a text file which is read by another utility.)

It is probably easiest to decode the string first, so that nothing is escaped, and then re-escape the resulting string.

You can get it with the appropriate negative lookbehind assertion in a regular expression:

import re

PAT = re.compile(r'(?<!\\)"')
txt1 = '"abra"cada"bra'
txt2 = '\\"abra\\"cada\\"bra'
# In the replacement template, \\ produces a single literal backslash.
print(PAT.sub(r'\\"', txt1))
print(PAT.sub(r'\\"', txt2))

This makes sure it works correctly even if the quote is the first character of the string, as in the example above.
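The requirement to leave already-escaped strings alone means the substitution should be idempotent. Below is a minimal sketch of that check, assuming Python 3; the names `UNESCAPED_QUOTE` and `escape_quotes` are hypothetical and not part of the answer above.

```python
import re

# Matches a double quote that is not immediately preceded by a backslash.
UNESCAPED_QUOTE = re.compile(r'(?<!\\)"')

def escape_quotes(text):
    """Backslash-escape every double quote that is not already escaped."""
    # In the replacement template, \\ stands for one literal backslash.
    return UNESCAPED_QUOTE.sub(r'\\"', text)

once = escape_quotes('"abra"cada"bra')
twice = escape_quotes(once)
print(once)           # \"abra\"cada\"bra
print(once == twice)  # True: a second pass changes nothing
```

One caveat of the lookbehind: it only inspects the single preceding character, so a quote that follows an escaped backslash (two backslashes, then a quote) is also treated as already escaped.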

Something like this:

def esc_string(mystring, delim, esc_char='\\'):
    return (esc_char+delim).join([s[:-1] if s.endswith(esc_char) else s for s in mystring.split(delim)])

Then:

print(esc_string('abra"cada"bra', '"'))
abra\"cada\"bra
print(esc_string('abra\\"cada\\"bra', '"'))
abra\"cada\"bra
print(esc_string('"boundary test"', '"'))
\"boundary test\"
print(esc_string('\\"boundary test\\"', '"'))
\"boundary test\"

Assuming \ has no special meaning except immediately before certain characters (e.g., '"'), @chepner's suggestion to unescape first could be implemented as:

def escape(text, char='"', escape="\\"):
    escaped_char = escape + char
    text = text.replace(escaped_char, char) # unescape
    return text.replace(char, escaped_char) # escape

Input

"abra"cada"bra\"
\"abra\"cada\"bra"
"abra\"cada"bra\"
abra\"cada\\"bra\"
abra\"cada\\\"bra\"

Output

\"abra\"cada\"bra\"
\"abra\"cada\"bra\"
\"abra\"cada\"bra\"
abra\"cada\\"bra\"
abra\"cada\\\"bra\"
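The input/output pairs above can be checked mechanically. The sketch below repeats the `escape` function so that it runs on its own and uses raw strings so the backslashes appear exactly as in the listings; the `cases` list is just a hypothetical test harness, not part of the original answer.

```python
def escape(text, char='"', escape="\\"):
    escaped_char = escape + char
    text = text.replace(escaped_char, char)   # unescape
    return text.replace(char, escaped_char)   # escape

# (input, expected output) pairs copied from the listings above.
cases = [
    (r'"abra"cada"bra\"',    r'\"abra\"cada\"bra\"'),
    (r'\"abra\"cada\"bra"',  r'\"abra\"cada\"bra\"'),
    (r'"abra\"cada"bra\"',   r'\"abra\"cada\"bra\"'),
    (r'abra\"cada\\"bra\"',  r'abra\"cada\\"bra\"'),
    (r'abra\"cada\\\"bra\"', r'abra\"cada\\\"bra\"'),
]

for given, expected in cases:
    result = escape(given)
    assert result == expected, (given, result)
    # Escaping an already-escaped string must leave it unchanged.
    assert escape(result) == result

print("all cases pass")
```

Because the function unescapes before escaping, running it a second time on its own output is a no-op, which is the behaviour the question asks for.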

Regular expressions will do it. This one says to match the " character if it is not preceded by a backslash. I used an 'r' prefix on the strings to tell Python not to treat the '\' character specially, and I had to write the backslash twice to tell the regular expression parser not to treat it specially either. Try help(re) for what the (?

import re
re.sub(r'(?<!\\)"', r'\"', 'abra"cada\\"bra')
# Returns 'abra\\"cada\\"bra'
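The question mentions "certain characters", with the double quote only as an example. A minimal sketch of how the same lookbehind idea might extend to a set of characters follows; the function `escape_chars` and its parameters are hypothetical, and it assumes `chars` is non-empty and `esc` is a single character (a lookbehind needs a fixed-width pattern).

```python
import re

def escape_chars(text, chars='"', esc='\\'):
    """Prefix each character in `chars` with `esc` unless `esc` already precedes it."""
    # re.escape makes both the escape character and the character-class
    # contents safe to embed in the pattern.
    pattern = re.compile(
        '(?<!{})([{}])'.format(re.escape(esc), re.escape(chars))
    )
    # A function replacement sidesteps the template escaping rules.
    return pattern.sub(lambda m: esc + m.group(1), text)

print(escape_chars('abra"cada\\"bra'))            # abra\"cada\"bra
print(escape_chars("it's \"ok\"", chars='"\''))   # it\'s \"ok\"
```

Using a callable as the replacement avoids having to double the backslash in a replacement template, at the cost of a function call per match.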


 