Extract a substring from a line in Python if another keyword is found

Question

I am trying to use regex in Python to extract a small substring from a large string if another keyword is found in that string.

e.g.:

s = "1  0001    1   UG  science,ee;YEAR=onefour;standard->2;district->9"

if "YEAR" in s:
    print("The year is = ", VALUE_OF_YEAR)  # <--- here I hope to somehow get the year substring from the above string and print it

i.e. the output should look like:

The year is = onefour

Please note: the value will change if it denotes a different number, such as onethree, oneseven, etc.

I basically want to copy whatever comes after 'YEAR=' up to the next ';' if I find YEAR in the string, and print it out.

I am not too sure how to do this.

I tried using string manipulation methods in Python, but so far I haven't found a way to copy exactly the words up to the ';' in the string.

Any help will be appreciated. Any other method is also welcome.

Answer 1

You can also use a capturing group to capture the year value:

>>> import re
>>> 
>>> pattern = re.compile(r"YEAR=(\w+);")
>>> s = "1  0001    1   UG  science,ee;YEAR=onefour;standard->2;district->9"
>>> pattern.search(s).group(1)
'onefour'

You may also need to handle the case where there is no match. For example, return None:

import re

def get_year_value(s):
    pattern = re.compile(r"YEAR=(\w+);")
    match = pattern.search(s)

    return match.group(1) if match else None
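
As a rough sketch of how the same idea might be applied to several lines at once, the following compiles the pattern once and builds the message from the question only for lines that contain a YEAR= field. The function name year_messages and the second input line are assumptions made up for illustration; they are not part of the question.

```python
import re

YEAR_PATTERN = re.compile(r"YEAR=(\w+);")  # compiled once, reused for every line

def year_messages(lines):
    """Build 'The year is = ...' for each line that has a YEAR= field; skip the rest."""
    messages = []
    for line in lines:
        match = YEAR_PATTERN.search(line)
        if match:  # search() returns None when the keyword is absent
            messages.append("The year is = " + match.group(1))
    return messages

# Hypothetical input: the line from the question plus one line without YEAR=.
lines = [
    "1  0001    1   UG  science,ee;YEAR=onefour;standard->2;district->9",
    "2  0002    1   UG  arts,ee;standard->3;district->4",
]
for message in year_messages(lines):
    print(message)
```

Only the first line produces a message; the second is skipped silently because search() finds nothing.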

Answer 2

You can use a regex to grab that value:

(?<=\bYEAR=)[^;]+

The regex matches:

(?<=\bYEAR=) - only match if the text we are looking for is preceded by the whole word YEAR=
[^;]+ - match 1 or more characters other than ;
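
The two parts described above can also be written out with re.VERBOSE, which lets each piece of the pattern carry its own comment. This is only a sketch of the same expression; the variable name year_value is an assumption for illustration.

```python
import re

year_value = re.compile(
    r"""
    (?<=\bYEAR=)   # only match right after the whole word YEAR=
    [^;]+          # then take one or more characters that are not a semicolon
    """,
    re.VERBOSE,
)

s = "1  0001    1   UG  science,ee;YEAR=onefour;standard->2;district->9"
match = year_value.search(s)
print(match.group(0) if match else None)  # onefour
```

Because the lookbehind consumes nothing, group(0) is already the bare value and no capturing group is needed.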

Here is sample Python code:

import re
p = re.compile(r'(?<=\bYEAR=)[^;]+')
test_str = "1  0001    1   UG  science,ee;YEAR=onefour;standard->2;district->9"
robj = re.search(p, test_str)
if robj:
    print(robj.group(0))

If everyone is so fond of capturing groups, here is the same expression with the lookbehind replaced with a capturing group:

\bYEAR=([^;]+)

And in Python:

import re
p = re.compile(r'\bYEAR=([^;]+)')
test_str = "1  0001    1   UG  science,ee;YEAR=onefour;standard->2;district->9"
robj = re.search(p, test_str)
if robj:
    print(robj.group(1))

Note that if your YEAR value has hyphens or other non-word characters in it, \w will not help you. The negated character class is your best friend here.
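
To make that difference concrete, here is a small comparison of the two patterns from the answers on a value that contains a hyphen. The input string is hypothetical: it is the question's line with the value changed to one-four.

```python
import re

word_pattern = re.compile(r"YEAR=(\w+);")        # value limited to word characters
negated_pattern = re.compile(r"\bYEAR=([^;]+)")  # value is anything up to ';'

# Hypothetical input: same layout as the question, but with a hyphen in the value.
s = "1  0001    1   UG  science,ee;YEAR=one-four;standard->2;district->9"

word_match = word_pattern.search(s)
negated_match = negated_pattern.search(s)

print(word_match)              # None: \w+ stops at '-', so the ';' never lines up
print(negated_match.group(1))  # one-four
```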

Answer 3

This is what I use:

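s = "1  0001    1   UG  science,ee;YEAR=onefour;standard->2;district->9"

if "YEAR=" in s:  # test for the exact text that split() relies on
    year = s.split('YEAR=')[1].split(';')[0]
    print("The year is = " + year)
# Output:
# The year is = onefour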

Basically, it splits the line on YEAR= and then on ;. The [1] takes the part to the right of the substring YEAR=, and the [0] takes the part to the left of the first ; that follows.
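
A variation on the same no-regex idea, sketched with str.partition instead of split: partition reports whether the separator was found, so the lookup and the existence check happen in one step and there is no index that can fail. The function name year_from_line is an assumption for illustration, and this is a swapped-in technique rather than the code from the answer.

```python
def year_from_line(line):
    """Return the text between 'YEAR=' and the next ';', or None if 'YEAR=' is absent."""
    _, found, rest = line.partition("YEAR=")
    if not found:  # partition returns an empty separator when 'YEAR=' is missing
        return None
    return rest.partition(";")[0]  # everything before the first ';' (or all of rest)

s = "1  0001    1   UG  science,ee;YEAR=onefour;standard->2;district->9"
year = year_from_line(s)
if year is not None:
    print("The year is = " + year)
```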

Answer 4

YEAR=(?P<year>\w+);

This should work.
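
For reference, a minimal sketch of how a named group like this could be read back in Python, assuming the pattern is used with re.search on the string from the question:

```python
import re

s = "1  0001    1   UG  science,ee;YEAR=onefour;standard->2;district->9"

match = re.search(r"YEAR=(?P<year>\w+);", s)
if match:
    print("The year is = " + match.group("year"))  # look the value up by name
    print(match.groupdict())                        # all named groups as a dict
```

The name makes the call site self-describing, which helps once a pattern has more than one group.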

Answer 5

Try this regex:

".*(?=YEAR).*YEAR=(.*?);.*"g

with the substitution \1
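
In Python this would roughly correspond to re.sub, assuming the substitution is meant to be a backreference to the first group (written r"\1" in Python). The trailing g is not Python syntax; sub() already replaces every match by default. A sketch:

```python
import re

pattern = re.compile(r".*(?=YEAR).*YEAR=(.*?);.*")
s = "1  0001    1   UG  science,ee;YEAR=onefour;standard->2;district->9"

# The pattern spans the whole line, so replacing the match with group 1
# leaves only the value.
result = pattern.sub(r"\1", s)
print(result)  # onefour

# With no YEAR= in the input nothing matches and the string comes back unchanged,
# so the return value alone does not tell you whether the keyword was found.
unchanged = pattern.sub(r"\1", "no keyword here")
print(unchanged)  # no keyword here
```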


5 answers

Answer 1
4 2015-07-29 21:10:59

Answer 2
2 accepted 2015-07-29 21:09:29

Answer 3
1 2015-07-29 21:48:19

Answer 4
0 2015-07-29 21:12:32

Answer 5
0 2015-07-29 21:13:17
