Matching multiple words with a regex in Python
I need to match a pattern in a string. The string varies, so the pattern has to allow for some variability.
What I need to do is extract the words that occur together with "layout". They occur in 4 different forms:
1 word + layout, e.g. hsr layout
2 words + layout, e.g. golden garden layout
digit-word + layout, e.g. 19th layout
digit-word word + layout, e.g. 20th garden layout
As you can see, the digits part needs to be optional, and a single regex must handle all four forms. Here's what I did:
import re
p = re.compile(r'(?:\d*)?\w+\s(?:\d*)?\w+l[ayout]*')
text = "opp when the 19th hsr layut towards"
q = re.findall(p,text)
I need to extract "19th hsr layout" from this text, but the code above returns nothing. What is the problem with my code?
Some example strings:
str1 = " 25/4 16th june road ,watertank layout ,blr" #extract watertank layout
str2 = " jacob circle 16th rusthumbagh layout , 5th cross" #extract 16th rusthumbagh layout
str3 = " oberoi splendor garden blossoms layout , 5th main road" #extract garden blossoms layout
str4 = " belvedia heights , 15th layout near Jaffrey gym" #extract 15th layout
Use r'(?:\w+\s+){1,2}layout', as I commented:
>>> import re
>>> p = re.compile(r'(?:\w+\s+){1,2}layout')
>>> p.findall(" 25/4 16th june road ,watertank layout ,blr")
['watertank layout']
>>> p.findall(" jacob circle 16th rusthumbagh layout , 5th cross")
['16th rusthumbagh layout']
>>> p.findall(" oberoi splendor garden blossoms layout , 5th main road")
['garden blossoms layout']
>>> p.findall(" belvedia heights , 15th layout near Jaffrey gym")
['15th layout']
{1,2} repeats the "word plus whitespace" group once or twice, so at most 2 words are matched before layout.
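As a rough sketch of why the pattern from the question finds nothing: its second half, \w+l[ayout]*, requires at least one word character immediately before the "l", but "layout" normally follows whitespace. The snippet below compares the two patterns. The helper name extract_layouts and the TOLERANT_RE variant (which loosely accepts misspellings such as "layut") are assumptions added for illustration; they are not part of the original answer, and the tolerant variant would also accept other short words built from the same letters.

```python
import re

# Pattern from the answer: one or two "word + whitespace" groups, then "layout".
LAYOUT_RE = re.compile(r'(?:\w+\s+){1,2}layout')

# Pattern from the question, kept only for comparison.
QUESTION_RE = re.compile(r'(?:\d*)?\w+\s(?:\d*)?\w+l[ayout]*')

# Hypothetical tolerant variant: "l" followed by letters from "ayout",
# so a misspelling like "layut" still matches. Not from the original answer.
TOLERANT_RE = re.compile(r'(?:\w+\s+){1,2}l[ayout]+\b')


def extract_layouts(text, pattern=LAYOUT_RE):
    """Return every 'word(s) + layout' phrase found in text (illustrative helper)."""
    return pattern.findall(text)


correct = "opp when the 19th hsr layout towards"
misspelled = "opp when the 19th hsr layut towards"

print(extract_layouts(correct))                   # answer's pattern
print(QUESTION_RE.findall(correct))               # question's pattern finds nothing
print(extract_layouts(misspelled, TOLERANT_RE))   # tolerant variant
```

Whether tolerating misspellings is worth the extra false positives depends on how noisy the input is; the strict pattern is the safer default.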
This seems to work:
import re

l = [" 25/4 16th june road ,watertank layout ,blr",
     " jacob circle, 16th rusthumbagh layout , 5th cross",
     " oberoi splendor , garden blossoms layout , 5th main road",
     " belvedia heights , 15th layout near Jaffrey gym",]

# Capture everything between a comma and the word "layout".
for ll in l:
    print(re.search(r'\,([\w\s]+)layout', ll).groups())
Output:
('watertank ',)
(' 16th rusthumbagh ',)
(' garden blossoms ',)
(' 15th ',)
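Note that this approach relies on a comma appearing before the phrase, and the captured group keeps its surrounding whitespace and leaves out the word "layout" itself. A minimal sketch of one way to tidy that up follows; the function name layout_name and the choice to return None when nothing matches are assumptions for illustration, not part of the answer.

```python
import re

# Same idea as above: capture the words between a comma and "layout".
COMMA_RE = re.compile(r',([\w\s]+)layout')


def layout_name(text):
    """Return the cleaned 'words layout' phrase, or None if no match (illustrative)."""
    match = COMMA_RE.search(text)
    if match is None:
        # re.search returns None when nothing matches; calling .groups()
        # on it directly would raise AttributeError.
        return None
    return match.group(1).strip() + " layout"


samples = [
    " 25/4 16th june road ,watertank layout ,blr",
    " jacob circle, 16th rusthumbagh layout , 5th cross",
    " oberoi splendor , garden blossoms layout , 5th main road",
    " belvedia heights , 15th layout near Jaffrey gym",
]

for sample in samples:
    print(layout_name(sample))
```

A string with no comma before the phrase, such as "19th hsr layout", yields None here, which is the main limitation compared with a pattern that does not depend on punctuation.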