Pattern matching with regular expressions in Python

Question

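I hope you are all doing well. I have a question. I have this string: "First, after preprocessing, citation data are carefully analysed for the analysis tasks (T1, T2, T3, T4) to establish a new ranking model." How can I match the 3 or 4 words before and after the parenthesized part (T1, T2, T3, T4)?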

Answer 1

I think the OP wants to get a certain number of "words" before or after the parenthesized block, i.e. the part matched by \(.*\).

import re
string = "First, after preprocessing, citation data are carefully analysed for the analysis tasks (T1, T2, T3, T4) to establish a new ranking model."
# Split on the parenthesized block and trim the whitespace around each half.
before, after = [i.strip() for i in re.split(r'\(.*\)', string)]
print(before)
print(after)

Output

First, after preprocessing, citation data are carefully analysed for the analysis tasks
to establish a new ranking model.

print(before.split(' ')[-3:])
print(after.split(' ')[:3])

Output

['the', 'analysis', 'tasks']
['to', 'establish', 'a']
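Since the question asks for either 3 or 4 words, the two steps above can be wrapped into one function with the word count as a parameter. This is only a sketch: the name words_around_parens and the parameter n are made up for illustration, and it assumes the text contains at most one parenthesized block that matters (it splits on the first one, using a non-greedy match).

```python
import re

def words_around_parens(text, n=3):
    """Return (last n words before, first n words after) the first '(...)' block."""
    # Non-greedy match, so a second parenthesized block later in the text is left alone.
    parts = re.split(r'\(.*?\)', text, maxsplit=1)
    if len(parts) != 2:
        return [], []  # no parenthesized block found
    before, after = (part.split() for part in parts)
    return before[-n:], after[:n]

sentence = ("First, after preprocessing, citation data are carefully analysed "
            "for the analysis tasks (T1, T2, T3, T4) to establish a new ranking model.")

print(words_around_parens(sentence, 3))
# (['the', 'analysis', 'tasks'], ['to', 'establish', 'a'])
print(words_around_parens(sentence, 4))
# (['for', 'the', 'analysis', 'tasks'], ['to', 'establish', 'a', 'new'])
```

Note that split() with no argument splits on any run of whitespace, so double spaces do not produce empty "words" the way split(' ') would.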

Answer 2

Use the regular expression \w, which matches any word character:

import re

line = 'First, after preprocessing, citation data are carefully analysed for the analysis tasks (T1, T2, T3, T4) to establish a new ranking model based on citation context.'

reg = r'(\w+\s){3}\('
re.compile(reg).search(line).group()[:-1]
>>> 'the analysis tasks '

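The pattern looks for a word followed by a whitespace character, three times in a row (hence three words), followed by an opening parenthesis \(. The slice at the end drops that parenthesis: it is part of the match, but you are only interested in the three words, not the parenthesis. Since the character before it is a space, you could also slice off the last two characters instead ([:-2]).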
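If you would rather not slice the parenthesis off afterwards, one alternative (not part of the original answer, just a sketch) is to use a lookahead, so the parenthesis has to be there but is not included in the match. The variable name words_before is made up for this example.

```python
import re

line = ('First, after preprocessing, citation data are carefully analysed for the '
        'analysis tasks (T1, T2, T3, T4) to establish a new ranking model based on citation context.')

# (?:...) groups without capturing; (?=\() requires a "(" next but does not consume it.
match = re.search(r'(?:\w+\s){3}(?=\()', line)

# Guard against "no match" instead of calling .group() on None.
words_before = match.group().split() if match else []
print(words_before)
# ['the', 'analysis', 'tasks']
```

Changing {3} to {4} gives the four words before the parenthesis in the same way.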

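If one of the words before the parenthesis is followed by a comma (a non-word character), you can rewrite the pattern so that each word may be followed by any run of non-word characters: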

reg = r'(\w+\W+){3}\('
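To see the difference between the two patterns, here is a small comparison on a made-up sentence (the sentence is hypothetical, chosen so that commas appear among the last three words). The \s version finds nothing because "carefully" is followed by a comma, while the \W+ version accepts the comma and the space together.

```python
import re

# Hypothetical sentence with commas among the words before the parenthesis.
text = "The data are analysed, carefully, for tasks (T1, T2) to build a model."

strict = re.search(r'(\w+\s){3}\(', text)   # each word must be followed by one whitespace character
loose = re.search(r'(\w+\W+){3}\(', text)   # each word may be followed by any non-word characters

print(strict)
# None

# Pull the bare words out of the looser match, dropping commas, spaces and "(".
words = re.findall(r'\w+', loose.group()) if loose else []
print(words)
# ['carefully', 'for', 'tasks']
```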
