How to extract a substring from a string using regex

I have a string like the one below, and I would like to extract the highlighted part from it using a regex or any other approach, if possible.

description = 'The National Weather Service in Milwaukee/Sullivan has issued a\n\n* Tornado Warning for...\nNorthwestern Columbia County in south central Wisconsin...\nSouthwestern Marquette County in south central Wisconsin...\n\n* Until 945 PM CDT.\n\n* At 911 PM CDT, a severe thunderstorm capable of producing a tornado\nwas located 8 miles east of Wisconsin Dells, moving northeast at 45\nmph.\n\nHAZARD...Tornado.\n\nSOURCE...Radar indicated rotation.\n\nIMPACT...Flying debris will be dangerous to those caught without\nshelter. Mobile homes will be damaged or destroyed.\nDamage to roofs, windows, and vehicles will occur.  Tree\ndamage is likely.\n\n* Locations impacted include...\nPackwaukee, Endeavor and Briggsville.'

# Now I want to match the substring between (Tornado Warning for... *** ...\n\n*)

# I tried this

re.search('Tornado Warning for...(.*)\n\n*', description)

# I get this result

<re.Match object; span=(67, 90), match='Tornado Warning for...\n'>

# Expected result

<re.Match object; span=(any, any), match='Tornado Warning for...\nNorthwestern Columbia County in south central Wisconsin...\nSouthwestern Marquette County in south central Wisconsin...\n\n*'>

It does not match the complete substring; it only matches Tornado Warning for...\n

I want to match Tornado Warning for...\nNorthwestern Columbia County in south central Wisconsin...\nSouthwestern Marquette County in south central Wisconsin...\n\n*

where the substring starts with Tornado Warning for... and ends with \n\n*

Thanks for your help, and apologies for my poor English.

You can match

\bTornado Warning for\.\.\.(?:\n.*)*?\n\n

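The pattern matches:

  • \bTornado Warning for\.\.\. matches Tornado Warning for... preceded by a word boundary; the dots are escaped so that they match literally
  • (?:\n.*)*? matches a newline followed by the rest of that line, repeated as few times as possible
  • \n\n matches 2 newlines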
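
To make the difference from the attempt in the question concrete, here is a minimal sketch that runs both patterns on a shortened stand-in string. The sample text and the variable names (sample, attempt, fixed) are assumptions made for illustration; only the two patterns come from the question and this answer.

```python
import re

# Shortened stand-in for the question's text (assumption: same layout).
sample = (
    "issued a\n\n* Tornado Warning for...\n"
    "County A...\nCounty B...\n\n* Until 945 PM CDT."
)

# Attempt from the question: '.' does not match '\n' by default, so (.*)
# stops at the end of the first line, and the trailing '\n*' means
# "zero or more newlines" rather than a newline followed by '*'.
attempt = re.search('Tornado Warning for...(.*)\n\n*', sample)
print(repr(attempt.group()))

# Pattern from this answer: consume whole lines lazily until a blank line.
fixed = re.search(r'\bTornado Warning for\.\.\.(?:\n.*)*?\n\n', sample)
print(repr(fixed.group()))
```

The first match ends right after the first newline, while the second one runs through both county lines and the blank line that follows them.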

For example

import re

description = 'The National Weather Service in Milwaukee/Sullivan has issued a\n\n* Tornado Warning for...\nNorthwestern Columbia County in south central Wisconsin...\nSouthwestern Marquette County in south central Wisconsin...\n\n* Until 945 PM CDT.\n\n* At 911 PM CDT, a severe thunderstorm capable of producing a tornado\nwas located 8 miles east of Wisconsin Dells, moving northeast at 45\nmph.\n\nHAZARD...Tornado.\n\nSOURCE...Radar indicated rotation.\n\nIMPACT...Flying debris will be dangerous to those caught without\nshelter. Mobile homes will be damaged or destroyed.\nDamage to roofs, windows, and vehicles will occur.  Tree\ndamage is likely.\n\n* Locations impacted include...\nPackwaukee, Endeavor and Briggsville.'

m = re.search(r'\bTornado Warning for\.\.\.(?:\n.*)*?\n\n', description)
if m:
    print(m.group())

Output

Tornado Warning for...
Northwestern Columbia County in south central Wisconsin...
Southwestern Marquette County in south central Wisconsin...
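
If only the affected areas are needed rather than the whole match, one possible variation is to wrap the repeated part in a capturing group and split it into lines. This is a sketch under that assumption; the shortened description and the name areas are illustrative and not part of the original answer.

```python
import re

# Shortened stand-in for the question's description (assumption: same layout).
description = (
    "The National Weather Service in Milwaukee/Sullivan has issued a\n\n"
    "* Tornado Warning for...\n"
    "Northwestern Columbia County in south central Wisconsin...\n"
    "Southwestern Marquette County in south central Wisconsin...\n\n"
    "* Until 945 PM CDT."
)

# Same pattern as above, with a capturing group around the repeated lines.
m = re.search(r'\bTornado Warning for\.\.\.((?:\n.*)*?)\n\n', description)

# group(1) starts with a newline, so strip it before splitting into lines.
areas = m.group(1).strip().splitlines() if m else []
print(areas)
```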

The regex could look like this:

import re
# A raw string avoids invalid-escape warnings; description is the string from the question.
matched_string = re.findall(r"Tornado[a-zA-Z\s.\\*]+\n\n\*", description)
print(matched_string)

By default, . does not match \n (unless re.DOTALL is set). Use [\W\w] instead.

import re
description = 'The National Weather Service in Milwaukee/Sullivan has issued a\n\n* Tornado Warning for...\nNorthwestern Columbia County in south central Wisconsin...\nSouthwestern Marquette County in south central Wisconsin...\n\n* Until 945 PM CDT.\n\n* At 911 PM CDT, a severe thunderstorm capable of producing a tornado\nwas located 8 miles east of Wisconsin Dells, moving northeast at 45\nmph.\n\nHAZARD...Tornado.\n\nSOURCE...Radar indicated rotation.\n\nIMPACT...Flying debris will be dangerous to those caught without\nshelter. Mobile homes will be damaged or destroyed.\nDamage to roofs, windows, and vehicles will occur.  Tree\ndamage is likely.\n\n* Locations impacted include...\nPackwaukee, Endeavor and Briggsville.'

print(re.search(r'Tornado Warning for\.\.\.([\W\w]*?)\n\n\*', description).group())

"""
Tornado Warning for...
Northwestern Columbia County in south central Wisconsin...
Southwestern Marquette County in south central Wisconsin...

*
"""
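
As an alternative to [\W\w], the re.DOTALL flag makes . match newlines as well. The following is a minimal sketch on a shortened stand-in string; the sample text and the names with_flag and without_flag are assumptions made for illustration.

```python
import re

# Shortened stand-in for the question's text (assumption: same layout).
sample = "* Tornado Warning for...\nCounty A...\nCounty B...\n\n* Until 945 PM CDT."

pattern = r'Tornado Warning for\.\.\.(.*?)\n\n\*'

# With re.DOTALL, '.' also matches '\n', so the lazy '.*?' can span lines.
with_flag = re.search(pattern, sample, flags=re.DOTALL)
print(repr(with_flag.group(1)))

# Without the flag, '.*?' cannot cross the first newline, so nothing matches.
without_flag = re.search(pattern, sample)
print(without_flag)
```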
