
How to insert backslash before a group with regex re.sub() in python

I have a text string that I want to convert to JSON, like:

{text1: text2}

however text2 is filled with illegal characters: "{[]}, so it won't be parsed correctly.

I would like to escape each illegal character by inserting a backslash before it, but I can't get it to work.

The closest I can get is:

In [6]: re.sub('([\[\]\{\},"]{1})', r'\\\1', 'abc[def') 
Out[6]: 'abc\\[def'

But this inserts two backslashes instead of one... I can't get it to insert one.

On second thought, perhaps the problem is with my json.loads()? Here's an example:

In [41]: z
Out[41]: '{"abc": "sdfd\\[sfsdfdf"}'
In [42]: print(z)
Out[42]: {"abc": "sdfd\[sfsdfdf"}

As you can see by the difference between z and print(z), the backslash is properly escaped. But when I execute

json.loads(z)

I still get the Invalid escape error on the backslash.

Any ideas?

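You don't need to escape brackets for JSON. Inside a JSON string, the only characters that must be escaped are the double quote, the backslash, and control characters such as tab and newline; everything else, including [ ] { } and the comma, can appear as-is. A backslash followed by [ is not a valid JSON escape, which is why json.loads() reports an invalid escape. The rest of the confusion comes from how Python handles escape sequences in string literals. Just feed the text as a raw string to json.loads():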

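import json
import re

# Raw strings keep the backslashes exactly as typed.
print(json.loads(r'{"abc": "abc[def"}'))
print(json.loads(r'{"abc": "ab\\cd\"e\tf"}'))
# re.escape doubles the backslash, which is what JSON needs here.
print(json.loads('{"abc": "abc' + re.escape(r'abc\def') + 'def"}'))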

would print (repr shows each literal backslash as \\):

{'abc': 'abc[def'}
{'abc': 'ab\\cd"e\tf'}
{'abc': 'abcabc\\defdef'}
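The doubled backslash in the question's Out[6] is likewise only how the interactive prompt displays the string. A minimal sketch, reusing the question's substitution with the pattern written as a raw string, that checks the result holds a single backslash and that json.loads() rejects a backslash before a bracket:

```python
import json
import re

# Same substitution as in the question, with the pattern as a raw string.
escaped = re.sub(r'([\[\]{},"])', r'\\\1', 'abc[def')

print(repr(escaped))  # repr doubles the backslash for display
print(escaped)        # the actual text contains one backslash
print(len(escaped))   # 'abc[def' is 7 characters, plus one backslash

# A backslash followed by '[' is not a legal JSON escape, so parsing fails.
try:
    json.loads('{"abc": "' + escaped + '"}')
    bracket_escape_ok = True
except json.JSONDecodeError:
    bracket_escape_ok = False
print(bracket_escape_ok)
```

So the regex already inserts exactly one backslash; the failure comes from escaping characters that JSON does not allow to be escaped.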

So you can keep your code but you need to escape the right characters:

import json
import re

json.loads('{"abc": "' + re.sub(r'\\', r'\\\\', r'abc\def') + '"}')

which returns (shown as repr, so the single backslash appears doubled):

{'abc': 'abc\\def'}
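If the goal is to build valid JSON from arbitrary text rather than to patch it with a regex, one option (not part of the answer above) is to let json.dumps() do the escaping. A minimal sketch; the variable names and the sample text are only illustrative:

```python
import json

# Sample text with a quote, a backslash, brackets, braces and a comma.
raw_value = r'sd"fd\[sf]{sd},fdf'

# json.dumps escapes only what JSON requires: here the quote and the backslash.
document = json.dumps({"abc": raw_value})
print(document)

# Round trip: parsing gives back the original text unchanged.
restored = json.loads(document)
print(restored["abc"] == raw_value)
```

The brackets, braces and comma pass through untouched, which matches the point that they need no escaping inside a JSON string.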
