
Regular expression in Python between two words

I am trying to extract a value from each string in this list:

l1 = [u'/worldcup/archive/southafrica2010/index.html', u'/worldcup/archive/germany2006/index.html', u'/worldcup/archive/edition=4395/index.html', u'/worldcup/archive/edition=1013/index.html', u'/worldcup/archive/edition=84/index.html', u'/worldcup/archive/edition=76/index.html', u'/worldcup/archive/edition=68/index.html', u'/worldcup/archive/edition=59/index.html', u'/worldcup/archive/edition=50/index.html', u'/worldcup/archive/edition=39/index.html', u'/worldcup/archive/edition=32/index.html', u'/worldcup/archive/edition=26/index.html', u'/worldcup/archive/edition=21/index.html', u'/worldcup/archive/edition=15/index.html', u'/worldcup/archive/edition=9/index.html', u'/worldcup/archive/edition=7/index.html', u'/worldcup/archive/edition=5/index.html', u'/worldcup/archive/edition=3/index.html', u'/worldcup/archive/edition=1/index.html']

I started with a regular expression like the one below:

m = re.search(r"\d+", l)
print m.group()

but I want the value between "archive/" and "/index.html".
I Googled and tried something like (?<=archive/\\/index.html).*(?=\\/index.html:)

but it didn't work for me. How can I get a result list like this?

result = ['germany2006','edition=4395','edition=1013' , ...]

If you know for sure that the pattern will always match, you can use this:

import re
print([re.search("archive/(.*?)/index.html", l).group(1) for l in l1])
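
If some strings might not match, calling .group(1) directly raises an AttributeError, because re.search returns None when there is no match. A minimal Python 3 sketch of a guarded version follows. The helper name extract_segment and the third sample URL are made up for illustration, and the dot in "index\.html" is escaped here so that it only matches a literal period:

```python
import re

# Compile once and reuse; the escaped dot matches a literal "." only.
PATTERN = re.compile(r"archive/(.*?)/index\.html")

def extract_segment(url):
    """Hypothetical helper: text between 'archive/' and '/index.html', or None."""
    match = PATTERN.search(url)
    # re.search returns None on failure, so check before calling .group()
    return match.group(1) if match else None

urls = [
    "/worldcup/archive/germany2006/index.html",
    "/worldcup/archive/edition=84/index.html",
    "/worldcup/news/index.html",  # made-up URL without 'archive/': no match
]
segments = [extract_segment(u) for u in urls]
print(segments)  # ['germany2006', 'edition=84', None]
```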

Or you can simply split, like this:

print([l.rsplit("/", 2)[-2] for l in l1])
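
To see why the split works, here is a small Python 3 sketch of what rsplit returns for one URL. Note the assumption it relies on: the wanted value is always the second-to-last path component. It never checks for "archive", so the made-up second URL below also yields a value:

```python
url = "/worldcup/archive/edition=4395/index.html"

# Split from the right, at most twice: only the last two "/" act as cut points.
parts = url.rsplit("/", 2)
print(parts)    # ['/worldcup/archive', 'edition=4395', 'index.html']

segment = parts[-2]
print(segment)  # edition=4395

# Made-up URL with no 'archive/' part: the split still returns something.
other = "/worldcup/news/index.html".rsplit("/", 2)[-2]
print(other)    # news
```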

Look-arounds are what you need. Use them like this:

>>> [re.search(r"(?<=archive/).*?(?=/index.html)", s).group() for s in l1]
[u'southafrica2010', u'germany2006', u'edition=4395', u'edition=1013', u'edition=84', u'edition=76', u'edition=68', u'edition=59', u'edition=50', u'edition=39', u'edition=32', u'edition=26', u'edition=21', u'edition=15', u'edition=9', u'edition=7', u'edition=5', u'edition=3', u'edition=1']
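
The difference from a capturing group is that look-arounds check the surrounding text without consuming it, so the whole match is already the value you want. A short Python 3 sketch comparing the two (the variable names are mine). Keep in mind that Python's re module only accepts look-behinds that match a fixed length, which "archive/" does:

```python
import re

url = "/worldcup/archive/southafrica2010/index.html"

# Look-arounds: 'archive/' and '/index.html' are asserted, not part of the match.
lookaround = re.search(r"(?<=archive/).*?(?=/index\.html)", url)
print(lookaround.group())  # southafrica2010

# Capturing group: the delimiters are part of the match; group(1) holds the value.
capture = re.search(r"archive/(.*?)/index\.html", url)
print(capture.group(0))    # archive/southafrica2010/index.html
print(capture.group(1))    # southafrica2010

# The match span covers only the value itself.
start, end = lookaround.span()
print(url[start:end])      # southafrica2010
```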

The regular expression

m = re.search(r'(?<=archive\/).+(?=\/index.html)', s)

can solve this, assuming s is a string from your list.
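
One thing to be aware of: .+ is greedy, so it runs up to the last "/index.html" in the string. For the URLs in the question that makes no difference, but it would if the suffix appeared twice. A small Python 3 sketch with a made-up string to show the difference between .+ and the lazy .+?:

```python
import re

# Made-up string in which '/index.html' occurs twice.
s = "/worldcup/archive/edition=1/index.html/extra/index.html"

# Greedy: matches as much as possible, up to the LAST '/index.html'.
greedy = re.search(r"(?<=archive/).+(?=/index\.html)", s).group()
print(greedy)  # edition=1/index.html/extra

# Lazy: matches as little as possible, up to the FIRST '/index.html'.
lazy = re.search(r"(?<=archive/).+?(?=/index\.html)", s).group()
print(lazy)    # edition=1
```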

The code below should solve your problem.

>>> import re
>>> p = '/worldcup/archive/southafrica2010/index.html'
>>> r = re.compile('archive/(.*?)/index.html')
>>> m = r.search(p)
>>> m.group(1)
'southafrica2010'
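
If the two surrounding words are not known in advance, the same idea can be wrapped in a function. This is only a sketch in Python 3: between is a hypothetical helper name, and re.escape is used so that characters such as "." in the delimiters are treated literally:

```python
import re

def between(text, start, end):
    """Hypothetical helper: shortest text between `start` and `end`, or None."""
    # re.escape makes regex metacharacters in the delimiters match literally.
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    match = re.search(pattern, text)
    return match.group(1) if match else None

p = "/worldcup/archive/southafrica2010/index.html"
value = between(p, "archive/", "/index.html")
print(value)  # southafrica2010
```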

