
Python: json.load returns a string without its escape characters

I am parsing a JSON file that contains the following data subset:

"title": "Revert \"testcase for check\""

In my Python script I do the following:

import json

with open('%s/staging_area/pr_info.json' % cwd) as data_file:
    pr_info = json.load(data_file)
pr_title = pr_info["title"]

After the title is read from the JSON object, pr_title contains the following:

Revert "testcase for check"

It seems that the escape characters (\) are not part of the assigned string. Is there any way to retain the entire string, including the escape characters? Thank you so much!

If you really need it, escape the value again with json.dumps and remove the first and last quote:

pr_title = json.dumps(pr_title)[1:-1]

Keep in mind that escape characters exist only for escaping: the raw value of the string is still Revert "testcase for check". Which escaping function to use therefore depends on where the data will be used (DB, HTML, XML, etc.).
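As a rough illustration of that point, the same raw value can be escaped differently for two destinations. This is only a sketch: the variable names are made up for the example, and html.escape stands in for whatever escaping the real target requires.

```python
import html
import json

# The raw value, exactly as json.load returns it
title = 'Revert "testcase for check"'

for_json = json.dumps(title)   # JSON escaping: quotes become \" and the value is wrapped in "
for_html = html.escape(title)  # HTML escaping: quotes become &quot;

print(for_json)
print(for_html)
```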

To explain the [1:-1]: json.dumps escapes the raw string so that it is valid JSON, which adds the backslashes and also wraps the string in quotation marks ("). Those surrounding quotes have to be removed from the result. Python strings can be sliced like lists, so [1:-1] takes every character from the second through the second-to-last, which drops the first and last quote:

>>> print(json.dumps('Revert "testcase for check"'))
"Revert \"testcase for check\""

>>> print(json.dumps('Revert "testcase for check"')[1:-1])
Revert \"testcase for check\"
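The whole round trip can be tried without a file. The following is a minimal sketch that assumes the JSON text is held in a string (raw, quoted and escaped are names chosen for the example); a raw string literal is used so the backslashes reach the parser as they would from disk.

```python
import json

# JSON text as it would appear in the file; r'...' keeps the backslashes
raw = r'{"title": "Revert \"testcase for check\""}'

pr_title = json.loads(raw)["title"]  # parsing removes the JSON escapes
quoted = json.dumps(pr_title)        # re-encoding adds the escapes and surrounding quotes
escaped = quoted[1:-1]               # slice off the first and last quotation mark

print(pr_title)
print(quoted)
print(escaped)
```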

If you really need to keep the escape characters, you have to escape them yourself after reading the file and before parsing the JSON.

with open('%s/staging_area/pr_info.json' % cwd) as data_file:
    # Turn each backslash into three so that one survives JSON decoding
    raw_data_file = data_file.read().replace("\\", "\\\\\\")
    pr_info = json.JSONDecoder().decode(raw_data_file)

pr_info["title"] should then still contain the original escaped quotes. Note that this only preserves \" faithfully; other escapes such as \n are still decoded.


What is happening is:

  1. Each single backslash is replaced with three backslashes: the original escape character (\) plus an escaped escape character (\\).
  2. raw_data_file is now a string rather than a file object, so json.load() cannot be used. The decode method of json.JSONDecoder accepts a string (json.loads() would work as well).
  3. The decoder parses the JSON string and consumes the escaped escape character, leaving the original backslash from your file in the result.
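The steps above can be checked on an in-memory string. This sketch assumes the file content is already in a string called raw (a name invented for the example) and uses json.loads in place of json.JSONDecoder().decode. The second case shows the limitation with other escapes.

```python
import json

# JSON text as it would appear in the file
raw = r'{"title": "Revert \"testcase for check\""}'

# Step 1: every backslash becomes three (original + escaped escape character)
tripled = raw.replace("\\", "\\\\\\")

# Steps 2 and 3: decode the string; one backslash per escape survives
pr_title = json.loads(tripled)["title"]
print(pr_title)

# Limitation: \n turns into a backslash followed by a real newline, not the text \n
other = json.loads(r'{"a": "x\ny"}'.replace("\\", "\\\\\\"))["a"]
print(repr(other))
```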

If your goal is only to print pr_title, you can probably use json.dumps() to print the text with its escapes restored.

>>> import json
>>> j = '{"name": "\"Bob\""}'
>>> print(j)
{"name": ""Bob""}
>>> json.dumps(j)
'"{\\"name\\": \\"\\"Bob\\"\\"}"'
