Python RegExp: retrieving a value from a matched string

Question

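Hi all,

I've run into a bit of trouble while parsing a log.

I need to go through a file and check whether each line matches a pattern; if it does, I need to get the ClientID given in that line.

The line looks like this: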

17.02.09 10:42:31.242 TRACE [1245] GDS:     someText(SomeText).ClientID: '' -> '99071901'

So I need to get 99071901.

I tried to build a regex search pattern, but it is incomplete; I'm stuck at "TRACE":

regex = '(^[(\d\.)]+) ([(\d\:)]+) ([\bTRACE\b]+) ([(\d)]+) ([\bGDS\b:)]+) ([\ClientID\b])'

The script code is:

log=open('t.log','r')
for i in log:
    key=re.search(regex,i)
    print(key.group()) #print string matching 
    for g in key:
        client_id=re.seach(????,g) # find ClientIt    
log.close()

I'd really appreciate any hint on how to solve this.

Thanks.

Answer 1

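You don't need to be that specific. You can just capture the individual sections and analyze them separately.

Let's start with one of your lines, for example:

line = "17.02.09 10:42:31.242 TRACE [1245] GDS:     someText(SomeText).ClientID: '' -> '99071901'"

Then add a first regex that captures all of the sections: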

import re
line_regex = re.compile(r'(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+):\s+(.+)')
# now extract each section
date, time, level, thread, module, message = line_regex.match(line).groups()
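To make the result of that unpacking concrete, here is a small sketch that prints each captured section for the sample line. The `thread_id` conversion is only an illustration of further processing and is not part of the original answer.

```python
import re

line = "17.02.09 10:42:31.242 TRACE [1245] GDS:     someText(SomeText).ClientID: '' -> '99071901'"
line_regex = re.compile(r'(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+):\s+(.+)')

# Each (\S+) grabs one whitespace-delimited field; (.+) takes the rest of the line.
date, time, level, thread, module, message = line_regex.match(line).groups()

# Illustrative post-processing: drop the square brackets around the thread field.
thread_id = int(thread.strip('[]'))

print(date, time, level, thread_id, module)
print(message)
```

The fifth field is captured without its trailing colon because the pattern requires a literal `:` right after it.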

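Now, if we look at the individual sections, they hold all the information needed to make further decisions or to parse them further. Next, when the right kind of message shows up, let's get the client ID.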

client_id_regex = re.compile(r".*ClientID: '' -> '(\d+)'")

if 'ClientID' in message:
    client_id = client_id_regex.match(message).group(1)

Now we have client_id.

Just work this logic into a loop and you're all set.

line_regex = re.compile(r'(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+):\s+(.+)')
client_id_regex = re.compile(r".*ClientID: '' -> '(\d+)'")

with open('t.log','r') as f:  # use with context manager to auto close the file
    for line in f:  # lets iterate over the lines
        sections = line_regex.match(line)  # make a match object for sections
        if not sections:
            continue  # probably you want to handle this case
        date, time, level, thread, module, message = sections.groups()
        if 'ClientID' in message:  # should we even look here for a client id?
            found = client_id_regex.match(message)
            if found:  # the message may mention ClientID in a different form
                client_id = found.group(1)
                # now do what you wanted to do
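One possible way to package the loop so it can be tried without a log file is a generator that accepts any iterable of lines. This is a sketch, not the answerer's code: the name `iter_client_ids` and the second and third sample lines are made up for illustration, and `search` is used here so the leading `.*` is not needed.

```python
import re

line_regex = re.compile(r'(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+):\s+(.+)')
client_id_regex = re.compile(r"ClientID: '' -> '(\d+)'")

def iter_client_ids(lines):
    """Yield the client ID from every line that carries one (hypothetical helper)."""
    for line in lines:
        sections = line_regex.match(line)
        if not sections:
            continue  # line does not have the expected layout
        message = sections.group(6)
        found = client_id_regex.search(message)
        if found:  # skip messages that mention ClientID in some other form
            yield found.group(1)

# Hypothetical sample input; only the first line comes from the question.
sample = [
    "17.02.09 10:42:31.242 TRACE [1245] GDS:     someText(SomeText).ClientID: '' -> '99071901'\n",
    "17.02.09 10:42:31.300 TRACE [1245] GDS:     someText(SomeText).ClientID is unchanged\n",
    "not a log line\n",
]
client_ids = list(iter_client_ids(sample))
print(client_ids)
```

Because an open file is itself an iterable of lines, the same function can be fed the file object from `with open('t.log') as f:`.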

Answer 2

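You can put capturing parentheses around the parts of the pattern you need, and then access those parts with group(n), where n is the corresponding group ID: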

import re
s = "17.02.09 10:42:31.242 TRACE [1245] GDS:     someText(SomeText).ClientID: '' -> '99071901'"
regex = r"^([\d.]+)\s+([\d.:]+)\s+(TRACE)\s+\[(\d+)] GDS:.*?ClientID:\s*''\s*->\s*'(\d+)'$"
m = re.search(regex, s)
if m:
    print(m.group(1))
    print(m.group(2))
    print(m.group(3))
    print(m.group(4))
    print(m.group(5))

See the Python demo online.

The pattern is

^([\d.]+)\s+([\d.:]+)\s+(TRACE)\s+\[(\d+)] GDS:.*?ClientID:\s*''\s*->\s*'(\d+)'$

See its online regex demo.
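As a variation on the same pattern, named groups can make the numbered `group(n)` calls easier to read. The sketch below is an assumption-laden rewrite, not part of the original answer: the group names (`date`, `time`, `level`, `thread`, `client_id`) are chosen here purely for illustration.

```python
import re

# Same pattern as above, with (?P<name>...) in place of plain capturing groups.
log_pattern = re.compile(
    r"^(?P<date>[\d.]+)\s+(?P<time>[\d.:]+)\s+(?P<level>TRACE)\s+"
    r"\[(?P<thread>\d+)] GDS:.*?ClientID:\s*''\s*->\s*'(?P<client_id>\d+)'$"
)

s = "17.02.09 10:42:31.242 TRACE [1245] GDS:     someText(SomeText).ClientID: '' -> '99071901'"
m = log_pattern.search(s)

# groupdict() returns a mapping from group name to captured text.
fields = m.groupdict() if m else {}
if fields:
    print(fields["client_id"])
```

When lines are read from a file they end with a newline; in Python `$` also matches just before a trailing newline, so the anchor still works on such lines.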

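Note that you have mixed up character classes with groups: (...) groups subpatterns and captures them, while [...] defines a character class that matches a single character.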
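The difference can be checked with a short experiment. In the sketch below, `[\bTRACE\b]+` is taken from the question's pattern; inside a character class `\b` means a backspace character rather than a word boundary, so the class simply matches any run of the letters T, R, A, C, E. The test words are chosen for illustration.

```python
import re

char_class = re.compile(r'[\bTRACE\b]+')   # any run of T, R, A, C, E (or backspace)
literal_word = re.compile(r'\bTRACE\b')    # the exact word TRACE

results = {}
for text in ('TRACE', 'CRATE', 'REACT'):
    results[text] = (
        bool(char_class.fullmatch(text)),    # True for all three words
        bool(literal_word.fullmatch(text)),  # True only for 'TRACE'
    )
    print(text, results[text])
```

This is why the answer's pattern writes the literal as `(TRACE)`, a group, instead of a bracketed class.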

Answer 1
2 2017-02-13 11:09:18

Answer 2
1 accepted 2017-02-13 12:13:42
