Unescape strings in Python
I have an input file that contains a list of inputs, one per line. Each line of input is enclosed in double quotes. The inputs sometimes contain a backslash or a few double quotes inside the enclosing double quotes (see the examples below).
Sample inputs:
"each line is enclosed in double-quotes"
"Double quotes inside a \"double-quoted\" string!"
"This line contains backslashes \\not so cool\\"
"too many double-quotes in a line \"\"\"too much\"\"\""
"too many backslashes \\\\\\\"horrible\"\\\\\\"
I would like to take the above inputs and simply convert the escaped double quotes in each line to a backtick (`). I assume that there is a straightforward one-line solution to this. I tried the following, but it doesn't work. Any other one-liner solution or a fix to the code below would be greatly appreciated.
import re

def fix(line):
    return re.sub(r'\\"', '`', line)
It fails for input lines 3 and 5:
"each line is enclosed in double-quotes"
"Double quotes inside a `double-quoted` string!"
"This line contains backslashes \\not so cool\`
"too many double-quotes in a line ```too much```"
"too many backslashes \\\\\\`horrible`\\\\\`
Any fix I can think of breaks other lines. Please help!
This is not quite what you asked for, as it replaces the escaped quotes with " rather than `, but I'll mention it anyway: you could always leverage the csv module to do the \\" conversion correctly for you:
>>> import csv
>>> for line in csv.reader(["each line is enclosed in double-quotes",
... "Double quotes inside a \"double-quoted\" string!",
... "This line contains backslashes \\not so cool\\",
... "too many double-quotes in a line \"\"\"too much\"\"\"",
... "too many backslashes \\\\\\\"horrible\"\\\\\\",
... ]):
... print(line)
...
['each line is enclosed in double-quotes']
['Double quotes inside a "double-quoted" string!']
['This line contains backslashes \\not so cool\\']
['too many double-quotes in a line """too much"""']
['too many backslashes \\\\\\"horrible"\\\\\\']
If it is then important that they be actual backticks, you could simply do a replace on the text returned by the csv module.
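As a rough sketch of that idea applied to raw file contents (rather than Python string literals), the snippet below passes escapechar="\\" to csv.reader and then replaces the remaining double quotes with backticks. The helper name unescape_with_csv and the in-memory sample text are assumptions made for illustration, and it assumes each line is a single quoted field.

```python
import csv
import io

def unescape_with_csv(raw_text):
    """Parse each quoted line with csv, treating backslash as the escape character."""
    reader = csv.reader(io.StringIO(raw_text), escapechar="\\")
    # Each sample line is one quoted field, so take the first field of every row,
    # then turn the now-unescaped double quotes into backticks.
    return [row[0].replace('"', "`") for row in reader if row]

# Hypothetical stand-in for the input file: three of the sample lines, verbatim.
raw = r'''"each line is enclosed in double-quotes"
"Double quotes inside a \"double-quoted\" string!"
"This line contains backslashes \\not so cool\\"
'''

for text in unescape_with_csv(raw):
    print(text)
```

Note that csv strips the enclosing double quotes and collapses each escaped backslash pair into a single backslash, so this only fits if that unescaping is acceptable.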
Add a + after the backslash:
return re.sub(r'\\+"', '`', line)
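One caveat worth checking: a pattern that matches any run of backslashes before a quote also matches an escaped backslash followed by the closing quote (the end of sample lines 3 and 5), so the closing quote may get replaced too. A possible alternative, sketched under the assumption that a backslash always escapes exactly the next character, is to consume every backslash together with the character it escapes and only substitute when that character is a double quote. The name fix_escaped_quotes is made up for this example.

```python
import re

def fix_escaped_quotes(line):
    # Match a backslash plus the character it escapes. Because matches do not
    # overlap, a pair like \\ is consumed as one unit, so its second backslash
    # can never combine with a following quote.
    return re.sub(
        r'\\(.)',
        lambda m: '`' if m.group(1) == '"' else m.group(0),
        line,
    )

samples = [
    r'"Double quotes inside a \"double-quoted\" string!"',
    r'"This line contains backslashes \\not so cool\\"',
    r'"too many backslashes \\\\\\\"horrible\"\\\\\\"',
]
for sample in samples:
    print(fix_escaped_quotes(sample))
```

This leaves escaped backslashes and the enclosing quotes untouched and changes only the escaped double quotes.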