Matching multiple words with a regex in Python

I need to match a pattern in a string. The string varies, so the pattern has to allow for some variability.
What I need to do is extract the words that occur with "layout". They occur in 4 different forms:

1 word -- layout, e.g. hsr layout

2 words -- layout, e.g. golden garden layout

digit-word -- layout, e.g. 19th layout

digit-word word -- layout, e.g. 20th garden layout

As you can see, the digits part needs to be optional, and a single regex must handle all four forms. Here's what I did:

import re
p = re.compile(r'(?:\d*)?\w+\s(?:\d*)?\w+l[ayout]*')
text = "opp when the 19th hsr layut towards"
q = re.findall(p,text)

I need "19th hsr layout" from this text, but the code above returns an empty list. What is wrong with my code?

Some string examples are:

str1 = " 25/4 16th june road ,watertank layout ,blr"  #extract watertank layout 
str2 = " jacob circle 16th rusthumbagh layout , 5th cross" #extract 16th rusthumbagh layout
str3 = " oberoi splendor garden blossoms layout , 5th main road"  #extract garden blossoms layout
str4 = " belvedia heights , 15th layout near Jaffrey gym" #extract 15th layout

Use r'(?:\w+\s+){1,2}layout', as I commented:

>>> import re
>>> p = re.compile(r'(?:\w+\s+){1,2}layout')
>>> p.findall(" 25/4 16th june road ,watertank layout ,blr")
['watertank layout']
>>> p.findall(" jacob circle 16th rusthumbagh layout , 5th cross")
['16th rusthumbagh layout']
>>> p.findall(" oberoi splendor garden blossoms layout , 5th main road")
['garden blossoms layout']
>>> p.findall(" belvedia heights , 15th layout near Jaffrey gym")
['15th layout']

{1,2} makes the group match one or two words before "layout", i.e. at most 2 words.
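To see why the pattern in the question finds nothing, it may help to run both patterns side by side. The sketch below assumes Python 3, and `extract_layouts` is a hypothetical helper name, not part of the original answer. In the question's pattern, `\w+l` requires at least one word character immediately before the `l`, so a word that starts with `l` right after a space can never satisfy it.

```python
import re

# Pattern from the question: "\w+l" needs a word character directly before "l".
question_pattern = re.compile(r'(?:\d*)?\w+\s(?:\d*)?\w+l[ayout]*')
# Pattern from this answer: one or two "word + whitespace" groups, then "layout".
answer_pattern = re.compile(r'(?:\w+\s+){1,2}layout')


def extract_layouts(text):
    """Return every 'one or two words + layout' phrase found in text."""
    return answer_pattern.findall(text)


samples = [
    " 25/4 16th june road ,watertank layout ,blr",
    " jacob circle 16th rusthumbagh layout , 5th cross",
    " oberoi splendor garden blossoms layout , 5th main road",
    " belvedia heights , 15th layout near Jaffrey gym",
]

for s in samples:
    print(extract_layouts(s))

# The question's pattern finds no match, even with "layout" spelled correctly.
print(question_pattern.findall("opp when the 19th hsr layout towards"))
```

Because the group is limited to two repetitions, a phrase with three words before "layout" yields only the last two.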

This seems to work:

import re

l = [" 25/4 16th june road ,watertank layout ,blr",
" jacob circle, 16th rusthumbagh layout , 5th cross",
" oberoi splendor , garden blossoms layout , 5th main road",
" belvedia heights , 15th layout near Jaffrey gym",]

for ll in l:
    print(re.search(r'\,([\w\s]+)layout', ll).groups())

Output:

('watertank ',)
(' 16th rusthumbagh ',)
(' garden blossoms ',)
(' 15th ',)
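Note that this approach relies on a comma appearing somewhere before the phrase, which is why the sample strings above carry extra commas compared with the ones in the question. A minimal Python 3 sketch of the same idea follows, assuming you want the full phrase back and want to avoid an `AttributeError` when nothing matches. `layout_after_comma` is a hypothetical helper name.

```python
import re

# A comma, then a run of word characters and whitespace, then "layout".
pattern = re.compile(r',([\w\s]+)layout')


def layout_after_comma(text):
    """Return the phrase ending in 'layout' that follows a comma, or None."""
    match = pattern.search(text)
    if match is None:
        return None
    # Trim surrounding whitespace from the captured words and re-attach "layout".
    return match.group(1).strip() + " layout"


# A comma precedes the phrase, so it is found.
print(layout_after_comma(" jacob circle, 16th rusthumbagh layout , 5th cross"))
# Original string from the question: no comma before the phrase, so no match.
print(layout_after_comma(" jacob circle 16th rusthumbagh layout , 5th cross"))
```

If the input cannot be guaranteed to have a comma before the phrase, the `{1,2}` pattern from the other answer is probably the safer choice.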
