
How do I use re.sub to remove a repeated block of text within square brackets?

I have a response from an API that is a pseudo-dictionary with some 'key':'values' pairs, but mostly just a blob of text with 'key:values' strings. I convert it with .json() to this:

{'status': 'done', 'nextLogId': 'AQAAAXb', 'logs': [{'content': {'service': 't2pipeline', 'tags': ['tag1:value1', 'tag2:value2', 'tag3:value3'], 'timestamp': '2021-01-05T05:25:03.416Z', 'host': 'i-00e17', 'attributes': {'caller': 'psignal/state_machine.go:451', 'ts': 1609824303.416246, 'level': 'warn'}, 'message': 'psignal: Ignoring scte35 segmentation_descriptor (type:Program Start eventID:0 refUTC:Jan 5 05:25:02.387626333): there is an active segment with the same event_id'}, 'id': 'AQAAAXb'}, {'content': {'service': 't2pipeline', 'tags': ['tag1:value1', 'tag2:value2', 'tag3:value3'], 'timestamp': '2021-01-05T05:25:03.416Z', 'host': 'i-00e17', 'attributes': {'caller': 'psignal/state_machine.go:713', 't2': {'scte35': {'event_id': 0, 'event_ptr': '0xc009f32b40', 'seg_type_id': 16}}, 'ts': 1609824303.4161847, 'level': 'info'}, 'message': 'psignal: scte35 segdesc eventID:0 type:Program Start'}, 'id': 'AQAAAXb'}], 'requestId': 'OVZRd3hv'}

There are two entries here; in reality there will be more.

I convert it to a string with json.dumps().

Then I use re.sub() to remove the 'tags': [...], section from the response and return the string, like so:

res = re.sub(r'"tags": \[.*"\],\s', "", response_string)

The problem is that it only returns the last entry.

print(res)

{"status": "done", "nextLogId": "AQAAAXb", "logs": [{"content": {"service": "t2pipeline", "timestamp": "2021-01-05T05:25:03.416Z", "host": "i-00e17b8e872ec7d05", "attributes": {"caller": "psignal/state_machine.go:713", "t2": {"scte35": {"event_id": 0, "event_ptr": "0xc009f32b40", "seg_type_id": 16}}, "ts": 1609824303.4161847, "level": "info"}, "message": "psignal: scte35 segdesc eventID:0 type:Program Start"}, "id": "AQAAAXb"}], "requestId": "OVZRd3hv"}

How do I modify the regex so that every instance of 'tags': [...], is removed and the whole string is returned with all entries?

Note: Since I can't del by key, I think the only way to remove the content is to treat the response as a string and remove the tags with a regex.

There is no need for a regex. Delete the key from each log entry directly:

import json
res = {'status': 'done', 'nextLogId': 'AQAAAXb', 'logs': [{'content': {'service': 't2pipeline', 'tags': ['tag1:value1', 'tag2:value2', 'tag3:value3'], 'timestamp': '2021-01-05T05:25:03.416Z', 'host': 'i-00e17', 'attributes': {'caller': 'psignal/state_machine.go:451', 'ts': 1609824303.416246, 'level': 'warn'}, 'message': 'psignal: Ignoring scte35 segmentation_descriptor (type:Program Start eventID:0 refUTC:Jan  5 05:25:02.387626333): there is an active segment with the same event_id'}, 'id': 'AQAAAXb'}, {'content': {'service': 't2pipeline', 'tags': ['tag1:value1', 'tag2:value2', 'tag3:value3'], 'timestamp': '2021-01-05T05:25:03.416Z', 'host': 'i-00e17', 'attributes': {'caller': 'psignal/state_machine.go:713', 't2': {'scte35': {'event_id': 0, 'event_ptr': '0xc009f32b40', 'seg_type_id': 16}}, 'ts': 1609824303.4161847, 'level': 'info'}, 'message': 'psignal: scte35 segdesc eventID:0 type:Program Start'}, 'id': 'AQAAAXb'}], 'requestId': 'OVZRd3hv'}
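# Remove the 'tags' key from each log entry's content, in place
for entry in res['logs']:
    # pop() with a default does not raise KeyError if an entry has no 'tags'
    entry['content'].pop('tags', None)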
print(res)

See Python proof.

Results:

{'status': 'done', 'nextLogId': 'AQAAAXb', 'logs': [{'content': {'service': 't2pipeline', 'timestamp': '2021-01-05T05:25:03.416Z', 'host': 'i-00e17', 'attributes': {'caller': 'psignal/state_machine.go:451', 'ts': 1609824303.416246, 'level': 'warn'}, 'message': 'psignal: Ignoring scte35 segmentation_descriptor (type:Program Start eventID:0 refUTC:Jan  5 05:25:02.387626333): there is an active segment with the same event_id'}, 'id': 'AQAAAXb'}, {'content': {'service': 't2pipeline', 'timestamp': '2021-01-05T05:25:03.416Z', 'host': 'i-00e17', 'attributes': {'caller': 'psignal/state_machine.go:713', 't2': {'scte35': {'event_id': 0, 'event_ptr': '0xc009f32b40', 'seg_type_id': 16}}, 'ts': 1609824303.4161847, 'level': 'info'}, 'message': 'psignal: scte35 segdesc eventID:0 type:Program Start'}, 'id': 'AQAAAXb'}], 'requestId': 'OVZRd3hv'}
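
If a regex is still wanted, it helps to see why the pattern in the question keeps only one entry: `.*` is greedy, so a single match runs from the first `"tags": [` to the last `"], ` in the string and swallows everything in between. The sketch below is only an illustration, not a replacement for deleting the key. It uses a small made-up `data` dict in place of the real response, and swaps `.*` for the negated character class `[^\]]*`, which assumes no tag value ever contains a `]` character.

```python
import json
import re

# Hypothetical stand-in for the API response, shortened for readability
data = {
    "logs": [
        {"content": {"service": "a", "tags": ["tag1:value1", "tag2:value2"], "host": "h1"}},
        {"content": {"service": "b", "tags": ["tag1:value1"], "host": "h2"}},
    ]
}
response_string = json.dumps(data)

# Greedy: .* extends to the LAST '"], ' in the string, so the single match
# removes everything between the first and last tags lists as well
greedy = re.sub(r'"tags": \[.*"\],\s', "", response_string)

# Negated character class: each match stops at the first ']' after '"tags": ['
# (assumes tag values never contain ']')
fixed = re.sub(r'"tags": \[[^\]]*\],\s', "", response_string)

cleaned = json.loads(fixed)
print(len(json.loads(greedy)["logs"]))  # entries left by the greedy pattern
print(len(cleaned["logs"]))             # entries left by the fixed pattern
```

Both patterns also rely on `"tags"` being followed by another key, since they require a trailing comma and whitespace. Working on the parsed dict avoids these assumptions.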
