Using a regular expression to find the word immediately after a character

Question

I am trying to find the word that immediately follows '%' in lines like these:

RP/0/RP0/CPU0:Feb 26 20:04:01.869 UTC: esd[361]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :SWITCH_LINK_ERR_E :DECLARE :0/RP0/CPU0/7:

LC/0/9/CPU0:Feb 26 20:00:25.560 UTC: npu_drvr[253]: %PLATFORM-OFA-6-INFO : NPU #1 Initialization Completed

To start, I used the following Python code, and it works.

result = re.search(r"\%.* \: ", txt)
result.group()

And here is the result:

However, my regex fails on lines like this:

LC/0/9/CPU0:Feb 27 15:33:58.509 UTC: npu_drvr[253]: %FABRIC-NPU_DRVR-1-PACIFIC_ERROR : [5821] : [PACIFIC A0]: For asic 0 : A0 Errata: Observed RX CODE errors on link 120 , This is expected if you have A0 asic versions in the system and do triggers like OIR, reload etc.

Answer 1

Repetition operators (* and +) in regular expressions are "greedy" by default: they match as much text as possible. In the failing line you provided, " : " appears again later in the message, after the word you want, so the greedy .* runs all the way to the last " : " in the line.

You can switch to "lazy" (or "non-greedy") matching by adding a question mark (?) after the repetition operator:

result = re.search(r"\%.*? \: ", txt)

Check out the results here. For more information, consider reading this article.
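To make the difference concrete, here is a minimal sketch that runs both patterns against a shortened copy of the failing line from the question. The variable names (line, greedy, lazy) are only for illustration, and the unnecessary backslashes before % and : are dropped. Note that even the lazy match still includes the leading % and the trailing " : ".

```python
import re

# Shortened copy of the failing line from the question.
line = ("LC/0/9/CPU0:Feb 27 15:33:58.509 UTC: npu_drvr[253]: "
        "%FABRIC-NPU_DRVR-1-PACIFIC_ERROR : [5821] : [PACIFIC A0]: For asic 0 : "
        "A0 Errata: Observed RX CODE errors on link 120")

# Greedy: .* extends to the LAST " : " in the line.
greedy = re.search(r"%.* : ", line).group()

# Lazy: .*? stops at the FIRST " : " after the percent sign.
lazy = re.search(r"%.*? : ", line).group()

print(repr(greedy))  # runs through "For asic 0 : "
print(repr(lazy))    # '%FABRIC-NPU_DRVR-1-PACIFIC_ERROR : '
```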

Answer 2

What you want is a percent sign followed by one or more non-whitespace characters:

re.search(r"%\S+", s)
#<_sre.SRE_Match object; span=(52, 84), match='%FABRIC-NPU_DRVR-1-PACIFIC_ERROR'>

Answer 3

You could use:

re.search(r'%([^\s]+)', s).group(1)

Output (tested against the line for which your regex fails):

FABRIC-NPU_DRVR-1-PACIFIC_ERROR

Or you can use:

re.search(r'%(\S+)', s).group(1)  # \S is equivalent to [^\s]
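One caveat with calling .group(1) directly: re.search returns None when a line contains no '%', and the call then raises AttributeError. A minimal sketch of a guarded version follows; the helper name word_after_percent and the constant PERCENT_WORD are hypothetical, introduced here only for illustration.

```python
import re
from typing import Optional

# Precompiled pattern: a percent sign, then capture the run of non-whitespace.
PERCENT_WORD = re.compile(r"%(\S+)")

def word_after_percent(line: str) -> Optional[str]:
    """Return the word right after the first '%', or None if there is none."""
    match = PERCENT_WORD.search(line)
    return match.group(1) if match else None

sample = ("LC/0/9/CPU0:Feb 26 20:00:25.560 UTC: npu_drvr[253]: "
          "%PLATFORM-OFA-6-INFO : NPU #1 Initialization Completed")

print(word_after_percent(sample))             # PLATFORM-OFA-6-INFO
print(word_after_percent("no marker here"))   # None
```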

Answer 4

Try:

import re

x="LC/0/9/CPU0:Feb 27 15:33:58.509 UTC: npu_drvr[253]: %FABRIC-NPU_DRVR-1-PACIFIC_ERROR : [5821] : [PACIFIC A0]: For asic 0 : A0 Errata: Observed RX CODE errors on link 120 , This is expected if you have A0 asic versions in the system and do triggers like OIR, reload etc."

res=re.findall(r"(?<=%)[^\s]+", x)

Outputs:

>>> res

['FABRIC-NPU_DRVR-1-PACIFIC_ERROR']

(?<=%)[^\s]+ has two parts. The first, (?<=%), is a positive lookbehind: it succeeds only at a position that is immediately preceded by %, without including % in the match. The second, [^\s]+, matches the word itself, meaning a run of one or more characters that are not whitespace.
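As a quick check, the lookbehind form and the capture-group form from the other answers can be compared on the first two sample lines from the question. This is only a sketch; the names lines and results are illustrative. With a single capture group, re.findall returns the group contents, so both patterns should yield the same lists here.

```python
import re

# The two working sample lines from the question.
lines = [
    "RP/0/RP0/CPU0:Feb 26 20:04:01.869 UTC: esd[361]: "
    "%PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :SWITCH_LINK_ERR_E :DECLARE :0/RP0/CPU0/7:",
    "LC/0/9/CPU0:Feb 26 20:00:25.560 UTC: npu_drvr[253]: "
    "%PLATFORM-OFA-6-INFO : NPU #1 Initialization Completed",
]

results = []
for line in lines:
    with_lookbehind = re.findall(r"(?<=%)\S+", line)  # '%' asserted, not consumed
    with_group = re.findall(r"%(\S+)", line)          # '%' consumed, group returned
    results.append((with_lookbehind, with_group))
    print(with_lookbehind, with_group)
```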

4 answers

Answer 1
2 (accepted) 2020-03-15 19:46:26

Answer 2
1 2020-03-15 19:47:52

Answer 3
1 2020-03-15 19:49:10

Answer 4
0 2020-03-15 20:02:39
