Remove backslash from decoded json string using python

I have records that contain a list of maps (dicts), each holding base64-encoded data. When we decode the data we get a JSON string with extra backslashes. Now I need to remove the backslashes '\' from the JSON string.

{"records": [{'data': 'eyJ0aW1lIjogMTQzMjgyNjg1NTAwMCwiaG9zdCI6ICJhcm46YXdzOmlhbTo6MTIzNDU2Nzg5MDEyOnJvbGUvRmlyZWhvc2V0b1MzUm9sZSIsInNvdXJjZSI6ICJwdGZlX2F0bGFzX2xvZ19ldmVudHMiLCJzb3VyY2V0eXBlIjoiYXdzOmNsb3Vkd2F0Y2hsb2dzOnB0ZmVfYXRsYXMiLCJpbnN0YW5jZV9pZCI6IjEyMzQ1Njc4OTAxMl9DbG91ZFRyYWlsX3VzLWVhc3QtMSIsImxvZ19maWxlIjoiQ2xvdWRUcmFpbCIsImV2ZW50IjogIntcImV2ZW50VmVyc2lvblwiOlwiMS4wNFwiLFwidXNlcklkZW50aXR5XCI6e1widHlwZVwiOlwiUm9vdFwifSJ9Cgp7InRpbWUiOiAxNDMyODI2ODU1MDAwLCJob3N0IjogImFybjphd3M6aWFtOjoxMjM0NTY3ODkwMTI6cm9sZS9GaXJlaG9zZXRvUzNSb2xlIiwic291cmNlIjogInB0ZmVfYXRsYXNfbG9nX2V2ZW50cyIsInNvdXJjZXR5cGUiOiJhd3M6Y2xvdWR3YXRjaGxvZ3M6cHRmZV9hdGxhcyIsImluc3RhbmNlX2lkIjoiMTIzNDU2Nzg5MDEyX0Nsb3VkVHJhaWxfdXMtZWFzdC0xIiwibG9nX2ZpbGUiOiJDbG91ZFRyYWlsIiwiZXZlbnQiOiAie1wiZXZlbnRWZXJzaW9uXCI6XCIxLjA1XCIsXCJ1c2VySWRlbnRpdHlcIjp7XCJ0eXBlXCI6XCJSb290XCJ9In0KCnsidGltZSI6IDE0MzI4MjY4NTUwMDAsImhvc3QiOiAiYXJuOmF3czppYW06OjEyMzQ1Njc4OTAxMjpyb2xlL0ZpcmVob3NldG9TM1JvbGUiLCJzb3VyY2UiOiAicHRmZV9hdGxhc19sb2dfZXZlbnRzIiwic291cmNldHlwZSI6ImF3czpjbG91ZHdhdGNobG9nczpwdGZlX2F0bGFzIiwiaW5zdGFuY2VfaWQiOiIxMjM0NTY3ODkwMTJfQ2xvdWRUcmFpbF91cy1lYXN0LTEiLCJsb2dfZmlsZSI6IkNsb3VkVHJhaWwiLCJldmVudCI6ICJ7XCJldmVudFZlcnNpb25cIjpcIjEuMDZcIixcInVzZXJJZGVudGl0eVwiOntcInR5cGVcIjpcIlJvb3RcIn0ifQoK', 'result': 'Ok', 'recordId': '12345'}]}

JSON output (rawdata) after decoding the data:

"{\"time\": 1432826855000,\"host\": \"arn:aws:iam::123456789012:role/FirehosetoS3Role\",\"source\": \"ptfe_atlas_log_events\",\"sourcetype\":\"aws:cloudwatchlogs:ptfe_atlas\",\"instance_id\":\"123456789012_CloudTrail_us-east-1\",\"log_file\":\"CloudTrail\",\"event\": \"{\\\"eventVersion\\\":\\\"1.04\\\",\\\"userIdentity\\\":{\\\"type\\\":\\\"Root\\\"}\"}\n\n{\"time\": 1432826855000,\"host\": \"arn:aws:iam::123456789012:role/FirehosetoS3Role\",\"source\": \"ptfe_atlas_log_events\",\"sourcetype\":\"aws:cloudwatchlogs:ptfe_atlas\",\"instance_id\":\"123456789012_CloudTrail_us-east-1\",\"log_file\":\"CloudTrail\",\"event\": \"{\\\"eventVersion\\\":\\\"1.05\\\",\\\"userIdentity\\\":{\\\"type\\\":\\\"Root\\\"}\"}\n\n{\"time\": 1432826855000,\"host\": \"arn:aws:iam::123456789012:role/FirehosetoS3Role\",\"source\": \"ptfe_atlas_log_events\",\"sourcetype\":\"aws:cloudwatchlogs:ptfe_atlas\",\"instance_id\":\"123456789012_CloudTrail_us-east-1\",\"log_file\":\"CloudTrail\",\"event\": \"{\\\"eventVersion\\\":\\\"1.06\\\",\\\"userIdentity\\\":{\\\"type\\\":\\\"Root\\\"}\"}\n\n"

My Python program:

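import base64

# Wrapped in a function so that the return statement is valid.
def decode_records(records):
    for r in records:
        SRC = r["data"]
        # Decode the base64 payload into text
        rawdata = base64.b64decode(SRC).decode('utf-8')
        Data = rawdata.replace('\\', '')
        removenewline = Data.replace('\n', '')
        mydata = removenewline.replace("\'", '"')

    return mydata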

This removes the double backslashes and the newlines (\n), but it leaves the single backslashes. Output:

"{\"time\": 1432826855000,\"host\": \"arn:aws:iam::123456789012:role/FirehosetoS3Role\",\"source\": \"ptfe_atlas_log_events\",\"sourcetype\":\"aws:cloudwatchlogs:ptfe_atlas\",\"instance_id\":\"123456789012_CloudTrail_us-east-1\",\"log_file\":\"CloudTrail\",\"event\": \"{\"eventVersion\":\"1.04\",\"userIdentity\":{\"type\":\"Root\"}\"}{\"time\": 1432826855000,\"host\": \"arn:aws:iam::123456789012:role/FirehosetoS3Role\",\"source\": \"ptfe_atlas_log_events\",\"sourcetype\":\"aws:cloudwatchlogs:ptfe_atlas\",\"instance_id\":\"123456789012_CloudTrail_us-east-1\",\"log_file\":\"CloudTrail\",\"event\": \"{\"eventVersion\":\"1.05\",\"userIdentity\":{\"type\":\"Root\"}\"}{\"time\": 1432826855000,\"host\": \"arn:aws:iam::123456789012:role/FirehosetoS3Role\",\"source\": \"ptfe_atlas_log_events\",\"sourcetype\":\"aws:cloudwatchlogs:ptfe_atlas\",\"instance_id\":\"123456789012_CloudTrail_us-east-1\",\"log_file\":\"CloudTrail\",\"event\": \"{\"eventVersion\":\"1.06\",\"userIdentity\":{\"type\":\"Root\"}\"}"

If Python fails to parse that data, apply one more replace to the variable mydata to get it into a form that json can load. It is somewhat "jank", but an acceptable method.

import json
# str.replace returns a new string, so assign the result back
mydata = mydata.replace('\\"', '"')
parsed = json.loads(mydata)
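
The backslashes in the decoded output look like ordinary JSON string escaping (the "event" value is a JSON document stored inside a string), so an alternative to stripping them may be to let the json module undo the escaping. What follows is only a minimal sketch, not the original poster's code. It assumes each decoded payload is a series of newline-separated JSON objects, as in the sample. The helper name parse_record_data and the shortened sample payload are hypothetical.

```python
import base64
import json


def parse_record_data(encoded):
    """Decode one base64 'data' field into a list of dicts (hypothetical helper)."""
    rawdata = base64.b64decode(encoded).decode("utf-8")
    events = []
    for line in rawdata.splitlines():
        line = line.strip()
        if not line:
            continue  # skip the blank lines between objects
        obj = json.loads(line)  # assumes each non-blank line is a JSON object
        # "event" holds JSON as a string; parse it too, but keep the
        # original string when it is not valid JSON on its own.
        event = obj.get("event")
        if isinstance(event, str):
            try:
                obj["event"] = json.loads(event)
            except json.JSONDecodeError:
                pass
        events.append(obj)
    return events


# Shortened, made-up payload with the same shape as the sample data.
inner = json.dumps({"eventVersion": "1.04", "userIdentity": {"type": "Root"}})
payload = json.dumps({"time": 1432826855000, "log_file": "CloudTrail", "event": inner}) + "\n\n"
encoded = base64.b64encode(payload.encode("utf-8")).decode("ascii")

parsed = parse_record_data(encoded)
print(parsed[0]["event"]["eventVersion"])  # prints 1.04
```

In the sample from the question, the nested "event" string appears to be missing a closing brace, so with this sketch it would stay a plain string there rather than become a dict.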
