
Regular Expression with multiple matches - Python

I have searched the web for a similar problem but couldn't find one.

Here is an address:

the fashion potential hq 116 w 23rd st ste 5 5th floor new york ny 10011

Using the following regex in Python, I tried to find all possible main addresses in the line above:

re.findall(r'^(.*)(\b\d+\b)(.+)(\bst\b|\bste\b)(.*)$', 'the fashion potential hq 116 w 23rd st ste 5 5th floor new york ny 10011')

I get this result:

[('the fashion potential hq ', '116', ' w 23rd st ', 'ste', ' 5 5th floor new york ny 10011')]

I also want the result to include this: ('the fash....', '116', 'w 23rd ', 'st', 'ste 5 5th....'). I expected findall to do the trick, but it didn't. Any help is greatly appreciated.

To make clear what I want as output (or something similar that includes all possibilities): [ ('the fashion potential hq ', '116', ' w 23rd ', 'st', 'ste 5 5th floor new york ny 10011'), ('the fashion potential hq ', '116', ' w 23rd st ', 'ste', ' 5 5th floor new york ny 10011')]

Online Python code

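You need to run two regular expressions: one with a greedy dot and one with a lazy dot. A single pattern returns only one way of splitting the line, because findall reports non-overlapping matches and this pattern is anchored to the whole string.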
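As a minimal sketch of the difference between the two dots, the toy string below (made up for illustration, not taken from the question) contains the keyword "st" twice. The greedy group stops at the last occurrence and the lazy group at the first:

```python
import re

# Toy input (an assumption for illustration): the keyword "st" appears twice.
s = "a st b st c"

# Greedy (.+) takes as much as it can, so the LAST "st" ends the group.
greedy = re.search(r'^(.+)\bst\b(.*)$', s).groups()

# Lazy (.+?) takes as little as it can, so the FIRST "st" ends the group.
lazy = re.search(r'^(.+?)\bst\b(.*)$', s).groups()

print(greedy)  # ('a st b ', ' c')
print(lazy)    # ('a ', ' b st c')
```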

The first one uses a greedy dot in the third group:

^(.*?)(\b\d+\b)(.+)\b(ste|st|ave|blvd)\b\s*(.*)$

The second one uses a lazy dot in the third group instead:

^(.*?)(\b\d+\b)(.+?)\b(ste|st|ave|blvd)\b\s*(.*)$
                ^^^    ^^^^^^^^^^^^^^^

See the regex demo

Output (lazy pattern):

the fashion potential hq 
116
 w 23rd 
st
ste 5 5th floor new york ny 10011

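Python sample code:

import re

# p uses a lazy (.+?), so group 3 stops at the first street keyword ("st").
p = re.compile(r'^(.*?)(\b\d+\b)(.+?)\b(ste|st|ave|blvd)\b\s*(.*)$')
# p2 uses a greedy (.+), so group 3 runs up to the last street keyword ("ste").
p2 = re.compile(r'^(.*?)(\b\d+\b)(.+)\b(ste|st|ave|blvd)\b\s*(.*)$')
s = "the fashion potential hq 116 w 23rd st ste 5 5th floor new york ny 10011"
m = p.search(s)
n = p2.search(s)
# Print both splits only when both patterns matched.
if m and n:
    print([m.groups(), n.groups()])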

Results:

[
   ('the fashion potential hq ', '116', ' w 23rd ', 'st', 'ste 5 5th floor new york ny 10011'), 
   ('the fashion potential hq ', '116', ' w 23rd st ', 'ste', '5 5th floor new york ny 10011')
 ]
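The two patterns above return only the first and the last keyword split. If a line could contain three or more keywords, one possible generalization is to locate the number once and then emit one tuple per keyword occurrence. The sketch below is an assumption, not part of the original answer: the helper name `all_splits` is hypothetical, and it only considers the first standalone number, whereas the regexes would backtrack to a later number if no keyword followed the first one.

```python
import re

# Same keyword list as the patterns above; NUMBER mirrors (\b\d+\b).
SUFFIX = re.compile(r'\b(ste|st|ave|blvd)\b')
NUMBER = re.compile(r'\b\d+\b')

def all_splits(address):
    """Return one 5-tuple per street keyword found after the first number.

    Hypothetical helper for illustration only.
    """
    num = NUMBER.search(address)
    if num is None:
        return []
    results = []
    # Scan for keywords starting right after the number.
    for suf in SUFFIX.finditer(address, num.end()):
        middle = address[num.end():suf.start()]
        if not middle:
            # Mirror (.+): require at least one character before the keyword.
            continue
        results.append((
            address[:num.start()],           # text before the number
            num.group(),                     # the number itself
            middle,                          # text between number and keyword
            suf.group(1),                    # the keyword
            address[suf.end():].lstrip(),    # remainder, like \s*(.*)$
        ))
    return results

s = "the fashion potential hq 116 w 23rd st ste 5 5th floor new york ny 10011"
splits = all_splits(s)
for t in splits:
    print(t)
```

For the sample address this yields the same two tuples as the results above, in the same order.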
