Python regex to get certain pattern
['/allstar/NBA-allstar-career-stats.html', '/allstar/NBA_2022.html', '/allstar/NBA_2022.html', '/allstar/NBA_2021.html', '/allstar/NBA_2021.html', '/allstar/NBA_2021_voting.html', '/allstar/NBA_2021.html', '/allstar/NBA_2020.html', '/allstar/NBA_2020.html', '/allstar/NBA_2020_voting.html', '/allstar/NBA_2020.html', '/allstar/NBA_2019.html', '/allstar/NBA_2019.html', '/allstar/NBA_2019_voting.html', '/allstar/NBA_2019.html', '/allstar/NBA_2018.html', '/allstar/NBA_2018.html', '/allstar/NBA_2018_voting.html', '/allstar/NBA_2018.html', '/allstar/NBA_2017.html', '/allstar/NBA_2017.html']
I want to get only /allstar/NBA_2017.html, /allstar/NBA_2018.html, and /allstar/NBA_2019.html using re.compile().
Does anyone have an idea?
I'm no expert in regex, but this works.
import re
li = [
'/allstar/NBA-allstar-career-stats.html', '/allstar/NBA_2022.html',
'/allstar/NBA_2022.html', '/allstar/NBA_2021.html',
'/allstar/NBA_2021.html', '/allstar/NBA_2021_voting.html',
'/allstar/NBA_2021.html', '/allstar/NBA_2020.html',
'/allstar/NBA_2020.html', '/allstar/NBA_2020_voting.html',
'/allstar/NBA_2020.html', '/allstar/NBA_2019.html',
'/allstar/NBA_2019.html', '/allstar/NBA_2019_voting.html',
'/allstar/NBA_2019.html', '/allstar/NBA_2018.html',
'/allstar/NBA_2018.html', '/allstar/NBA_2018_voting.html',
'/allstar/NBA_2018.html', '/allstar/NBA_2017.html',
'/allstar/NBA_2017.html'
]
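# Compile once; the dots are escaped so they match a literal "."
prog = re.compile(r'.*201[789]\.html')

def match(x):
    # Returns a Match object (truthy) when x matches from the start
    return prog.match(x)

res = list(filter(match, li))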
print(res)
And this yields the following:
[
'/allstar/NBA_2019.html', '/allstar/NBA_2019.html',
'/allstar/NBA_2019.html', '/allstar/NBA_2018.html',
'/allstar/NBA_2018.html', '/allstar/NBA_2018.html',
'/allstar/NBA_2017.html', '/allstar/NBA_2017.html'
]
Hope this is what you want!
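A slightly stricter variant, offered only as a sketch: match() anchors at the start of the string but not at the end, so the pattern above would also accept a path with extra text after ".html". Assuming every wanted path has exactly the form /allstar/NBA_<year>.html, fullmatch() with escaped dots requires the whole string to match. The names paths, pattern, and wanted are illustrative, and the sample list is a shortened, hypothetical version of the one in the question.

```python
import re

# Hypothetical sample data: a shortened version of the question's list,
# plus one made-up entry with trailing text to show the difference
paths = [
    '/allstar/NBA-allstar-career-stats.html',
    '/allstar/NBA_2020.html',
    '/allstar/NBA_2019.html',
    '/allstar/NBA_2019_voting.html',
    '/allstar/NBA_2018.html',
    '/allstar/NBA_2017.html',
    '/allstar/NBA_2017.html.bak',
]

# Escaped dot matches a literal "."; [789] limits the year to 2017-2019
pattern = re.compile(r'/allstar/NBA_201[789]\.html')

# fullmatch() succeeds only if the entire string matches the pattern
wanted = [p for p in paths if pattern.fullmatch(p)]
print(wanted)
# ['/allstar/NBA_2019.html', '/allstar/NBA_2018.html', '/allstar/NBA_2017.html']
```

The list comprehension keeps the input order and any duplicates, just like filter() does.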
Compiling regular expressions in Python is usually unnecessary: the re module caches recently used patterns, so explicit compilation mainly matters when a program uses a very large number of different expressions. However, since it seems that you have to compile the expression, you could do this:
li = ['/allstar/NBA-allstar-career-stats.html', '/allstar/NBA_2022.html', '/allstar/NBA_2022.html', '/allstar/NBA_2021.html', '/allstar/NBA_2021.html', '/allstar/NBA_2021_voting.html', '/allstar/NBA_2021.html', '/allstar/NBA_2020.html', '/allstar/NBA_2020.html', '/allstar/NBA_2020_voting.html',
'/allstar/NBA_2020.html', '/allstar/NBA_2019.html', '/allstar/NBA_2019.html', '/allstar/NBA_2019_voting.html', '/allstar/NBA_2019.html', '/allstar/NBA_2018.html', '/allstar/NBA_2018.html', '/allstar/NBA_2018_voting.html', '/allstar/NBA_2018.html', '/allstar/NBA_2017.html', '/allstar/NBA_2017.html']
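import re
# Escape the dots so they match a literal "." rather than any character
m = re.compile(r'.*NBA_201[789]\.html')
# set() removes the duplicates, so the order of the result is not guaranteed
print(list(set(filter(m.match, li))))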
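Because set() does not preserve order, the printed list can come out in any order. If the order matters, one possible approach (a sketch, not the only way) is to deduplicate with dict.fromkeys(), which keeps the first occurrence of each item in input order. The shortened li below and the name unique are assumptions made for illustration.

```python
import re

# Hypothetical shortened version of the list from the question
li = [
    '/allstar/NBA_2019.html', '/allstar/NBA_2019_voting.html',
    '/allstar/NBA_2019.html', '/allstar/NBA_2018.html',
    '/allstar/NBA_2018.html', '/allstar/NBA_2017.html',
]

m = re.compile(r'.*NBA_201[789]\.html')

# dict keys are unique and keep insertion order (Python 3.7+),
# so this drops duplicates while preserving the original order
unique = list(dict.fromkeys(filter(m.match, li)))
print(unique)
# ['/allstar/NBA_2019.html', '/allstar/NBA_2018.html', '/allstar/NBA_2017.html']
```

If ascending order is wanted instead, sorted(set(filter(m.match, li))) would also work for these paths, since they differ only in the year.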