Python: encode special JSON characters when a string contains a regex?
Does anyone know of a way to get json.dumps to properly encode a string that contains a regular expression? Or is there an alternative way to encode data for a JSON payload, without json.dumps, that handles this correctly?
For example:
import json
MyString = 'regex "Network\sInformation:[\s\S]+?Workstation\sName:\t+(?<src_host>[^\r]+)"'
data = {}
data['MyString'] = MyString
data['date'] = '2017-09-18T11:28:06'
json_data = json.dumps(data)
print(json_data)
Will generate:
{
"date": "2017-09-18T11:28:06",
"MyString": "regex \"Network\\sInformation:[\\s\\S]+?Workstation\\sName:\t+(?<src_host>[^\r]+)\""
}
However, you'll notice that the [^\r] isn't properly escaped; it should be [^\\r]. When the API processes the payload, this results in a parse error.
In the end, the JSON payload I am building here will be submitted to a web API using requests, similar to this:
requests.post(url, auth=(uname, passwd), data=json_data, headers=headers)
Note: I have considered writing a function that issues a series of replace calls to encode this manually, and that is my plan B at the moment, but I am hoping there is already a solution or module I can use for this.
Your regex definition is flawed, not the JSON output:
>>> MyString = 'regex "Network\sInformation:[\s\S]+?Workstation\sName:\t+(?<src_host>[^\r]+)"'
>>> MyString[-5:-4]
'\r'
>>> len(MyString[-5:-4])
1
>>> print(MyString[-5:-4]) # produces an empty line
You defined a carriage return, not a separate backslash and r character; Python interpreted the two as an escape sequence (the \t in the pattern became a tab in the same way). json.dumps then encoded that carriage return using JSON's own \r escape:
>>> import json
>>> chr(13) # ASCII code 13 is a carriage return
'\r'
>>> print(json.dumps(chr(13)))
"\r"
Use a raw string literal instead:
MyString = r'regex "Network\sInformation:[\s\S]+?Workstation\sName:\t+(?<src_host>[^\r]+)"'
Now you have two separate characters, \ and r:
>>> MyString = r'regex "Network\sInformation:[\s\S]+?Workstation\sName:\t+(?<src_host>[^\r]+)"'
>>> MyString[-6:-4]
'\\r'
>>> len(MyString[-6:-4])
2
>>> print(MyString[-6:-4])
\r
and those two characters produce your expected JSON output:
>>> import json
>>> print(json.dumps(MyString))
"regex \"Network\\sInformation:[\\s\\S]+?Workstation\\sName:\\t+(?<src_host>[^\\r]+)\""
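As a quick end-to-end check, here is a minimal sketch (Python 3, standard library only) that builds the payload from the raw string literal and confirms that decoding the JSON gives back the exact pattern text. The helper find_control_chars is a hypothetical name introduced for illustration, not part of the json module; it only lists tab, carriage-return, and other control characters, which in a regex pattern usually indicate that a non-raw literal consumed an escape.

```python
import json


def find_control_chars(text):
    """Return (index, char) pairs for ASCII control characters in text.

    Hypothetical helper for illustration: a hit usually means Python
    consumed an escape sequence in a non-raw string literal.
    """
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) < 32]


# Non-raw literal: Python turns \t into a tab and \r into a carriage return.
broken = 'Name:\t+([^\r]+)'
# Raw literal: the backslashes survive as ordinary characters.
fixed = r'Name:\t+([^\r]+)'

print(find_control_chars(broken))  # reports the tab and the carriage return
print(find_control_chars(fixed))   # nothing to report

# The pattern from the question, written as a raw string literal.
my_string = r'regex "Network\sInformation:[\s\S]+?Workstation\sName:\t+(?<src_host>[^\r]+)"'
data = {'MyString': my_string, 'date': '2017-09-18T11:28:06'}
json_data = json.dumps(data)

# Round trip: decoding must return the exact pattern, backslashes included.
assert json.loads(json_data) == data
print(json_data)
```

Escape processing only applies to string literals in source code. If the pattern is read from a file or from user input, the backslashes are already ordinary characters and json.dumps needs no special handling.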