繁体   English   中英

Python中的正则表达式和替换

[英]Regular expression and substitution in Python

我有一个内容字符串,如:

content =
"""
the patient monitoring system shall perform a daily device check from 1:30 am to 4:30 am (patient local time). if a device malfunction is detected, the daily device check shall send the malfunction to the clinician. if a patient health alarm is detected, the daily device check shall turn into full interrogation as specified in srs-3003. if no device or patient health issue identified, the daily device check shall end without further notification to the clinicians or patient. if a scheduled interrogation happens on the same day, the daily device check shall be skipped. if any device issue detected during the daily device check, the patient monitoring system shall alarm the patient with red urgent light. . if any patient health issue detected during the daily device check, the patient monitoring system shall alarm the patient with yellow warning light. . if a daily device check fails, it should be retried in 15 minutes up to 3 times. if a daily device check still fails after 3 times, the patient monitoring system shall end the interrogation and notify patient of the failed device check at 8 am that morning. there are 3 types of interrogations as below:
1. scheduled interrogation.
2. daily device check
3. patient initiated interrogation. an interrogation could fail due to the following reasons:
1. failed to establish communication.
2. communication lost.
3. failed to obtain a key data from the implanted device.
"""

我想替换像 1. 2. 3. 这样的小标题,但不想像 srs-3003 这样影响实际的内容编号。

如果我使用以下正则表达式: re.findall("\d{1}\.", content)结果是['3.', '1.', '2.', '3.', '1.', '2.', '3.']和 '3.' 在 srs-300 3.将在下一步中替换内容:

num_dot = re.findall("\d+\.", content)
for num in num_dot:
    content = content.replace(num, "")

我该如何进行?

您的正则表达式符合要求。 只是为了不匹配srs-3003.中的3. .。 您可以添加^锚。 就像是:

^\d+\.

上述正则表达式的解释:

  • ^ - 代表行的开始。
  • \d+ - 代表数字 class 出现一次或多次。
  • \. - 比赛. 字面上地。 如果您还想删除每个编号点线前面的空间; 请使用+\s+

您可以在此处找到上述正则表达式的演示。


Python 中的示例实现:

import re

regex = r"^\d+\."

test_str = ("the patient monitoring system shall perform a daily device check from 1:30 am to 4:30 am (patient local time). if a device malfunction is detected, the daily device check shall send the malfunction to the clinician. if a patient health alarm is detected, the daily device check shall turn into full interrogation as specified in srs-3003. if no device or patient health issue identified, the daily device check shall end without further notification to the clinicians or patient. if a scheduled interrogation happens on the same day, the daily device check shall be skipped. if any device issue detected during the daily device check, the patient monitoring system shall alarm the patient with red urgent light. . if any patient health issue detected during the daily device check, the patient monitoring system shall alarm the patient with yellow warning light. . if a daily device check fails, it should be retried in 15 minutes up to 3 times. if a daily device check still fails after 3 times, the patient monitoring system shall end the interrogation and notify patient of the failed device check at 8 am that morning. there are 3 types of interrogations as below:\n"
    "1. scheduled interrogation.\n"
    "2. daily device check\n"
    "3. patient initiated interrogation. an interrogation could fail due to the following reasons:\n"
    "1. failed to establish communication.\n"
    "2. communication lost.\n"
    "3. failed to obtain a key data from the implanted device.")

subst = ""

# You can manually specify the number of replacements by changing the 4th argument
result = re.sub(regex, subst, test_str, 0, re.MULTILINE)

if result:
    print (result)

请在此处找到上述程序的示例运行。

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM