Take first word after a regex match
I am trying to extract a substring from a string using a regex. My function takes a word as a parameter, and the goal is to extract the very next word (by my definition of a word) after that match. I have tried a lookbehind and some other approaches, but I could not get the expected results, so any help is welcome.
For example, in the first case, my function receives **THttpServer** as input:
23:25:04.805: INFO: THttpServer: transportTCPChanged(state: DISCONNECTED 2)
23:25:13.120: INFO: THttpServer: transportUDPOpened(state: Port 54)
Expected result: transportTCPChanged and transportUDPOpened, respectively.
In another case, the input is CurrentUserConnection:
23:25:16.622: INFO: CurrentUserConnection#1:RQ : subscribed(userID: 1)
23:25:16.622: INFO: CurrentUserConnection#8:RP : disconnected
Expected result: subscribed and disconnected, respectively.
Things I have tried in Notepad++ (the lookbehind changes depending on the example):
(?<=THttpServer)(\w+) : No matches
(?<=THttpServer)(.*) : Obviously returns the whole rest of the line, which is not the expected match
I am a bit confused. Maybe it's not even possible? Or do I need some pre-processing?
You need to match the : after THttpServer and any non-word chars up to the word, and then match and capture that word with (\w+).
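To see why the lookbehind attempt from the question finds nothing, here is a minimal check in Python (the language of the demo in this answer). The variable names are illustrative only, and the second pattern assumes that exactly one colon and one space separate the keyword from the word, which holds for the first sample line but is an assumption about the log format in general.

```python
import re

line = "23:25:04.805: INFO: THttpServer: transportTCPChanged(state: DISCONNECTED 2)"

# Attempt from the question: the lookbehind is satisfied only right after
# "THttpServer", but the next character there is ":", which \w cannot match.
no_match = re.search(r"(?<=THttpServer)(\w+)", line)
print(no_match)  # None

# Putting the separator inside the (fixed-width) lookbehind lets \w+
# start at the word itself. Assumes the separator is exactly ": ".
fixed = re.search(r"(?<=THttpServer: )\w+", line)
print(fixed.group(0))  # transportTCPChanged
```

A lookbehind only asserts what precedes the match and consumes nothing, so the separator has to be accounted for somewhere: either inside the lookbehind, as here, or consumed in the pattern itself with a capturing group for the word.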
E.g. you may use
THttpServer:\W*(\w+)
See the regex demo.
Details
THttpServer: - a literal substring
\W* - any 0+ non-word chars
(\w+) - Capturing group 1 (later accessible via m.group(1)): 1 or more word chars.
See the Python demo:
import re

strs = ['23:25:04.805: INFO: THttpServer: transportTCPChanged(state: DISCONNECTED 2)',
        '23:25:13.120: INFO: THttpServer: transportUDPOpened(state: Port 54)']
# Group 1 captures the first word after "THttpServer:" and any non-word chars
rx = re.compile(r'THttpServer:\W*(\w+)')
for s in strs:
    m = rx.search(s)
    if m:
        print("Found '{}' in '{}'.".format(m.group(1), s))
Output:
Found 'transportTCPChanged' in '23:25:04.805: INFO: THttpServer: transportTCPChanged(state: DISCONNECTED 2)'.
Found 'transportUDPOpened' in '23:25:13.120: INFO: THttpServer: transportUDPOpened(state: Port 54)'.
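The second case in the question (CurrentUserConnection#1:RQ : subscribed) has a suffix between the keyword and the colon, so \W*(\w+) directly after the keyword would capture 1 rather than subscribed. Below is a sketch of a parameterised helper that covers both cases. The function name next_word is hypothetical, and the pattern assumes the layout "keyword, optional non-space suffix, optional spaces, colon, word", which matches the four sample lines but may not fit other log formats.

```python
import re

def next_word(keyword, line):
    """Return the first word after `keyword` and its tag suffix, or None.

    Hypothetical helper; assumes '<keyword><optional suffix> : <word>'.
    """
    # re.escape protects against regex metacharacters in the keyword.
    # \S* skips an optional suffix such as '#1:RQ'; backtracking lets it
    # shrink to nothing when the colon follows the keyword directly.
    rx = re.compile(re.escape(keyword) + r"\S*\s*:\s*(\w+)")
    m = rx.search(line)
    return m.group(1) if m else None

samples = [
    ("THttpServer", "23:25:04.805: INFO: THttpServer: transportTCPChanged(state: DISCONNECTED 2)"),
    ("THttpServer", "23:25:13.120: INFO: THttpServer: transportUDPOpened(state: Port 54)"),
    ("CurrentUserConnection", "23:25:16.622: INFO: CurrentUserConnection#1:RQ : subscribed(userID: 1)"),
    ("CurrentUserConnection", "23:25:16.622: INFO: CurrentUserConnection#8:RP : disconnected"),
]
results = [next_word(keyword, line) for keyword, line in samples]
print(results)
# ['transportTCPChanged', 'transportUDPOpened', 'subscribed', 'disconnected']
```

Building the pattern from the parameter with re.escape keeps the keyword literal, and returning None for a missing match leaves the handling of non-matching lines to the caller.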