Getting strings in between two keywords from a file in Python
I am trying to extract the string between two keywords from a large text file that follows the pattern below, searching line by line, printing each match, and storing the matches in another text file.
'Event_WheelMonitorReleased' (253) 'Event_WheelMonitorPressed' (252) 'Event_WheelMonitorPressed' (252) 'Event_WheelMonitorPressed' (252)
Here I would like to extract only the string inside EVENT().
In this case I need:
X_0_Gui_Menu_610_Menu_Status_System
I tried the following code:
def get_navigated_pages():
    os.chdir('log_file')
    log_file = open('messages','r')
    data = log_file.read()
    navigated_pages = re.findall(r'EVENT(X(.*?)) ',data,re.DOTALL|re.MULTILINE)
    with open('navigated_page_file', 'w') as navigated_page_file:
        navigated_page_file.write(navigated_pages)
I expected the output in the text file to be something like this:
X_0_Gui_Menu_650_Menu_Status_Version
X_0_Gui_Menu_610_Menu_Status_System
X_0_Gui_Menu_670_Menu_Status_Media
As mentioned above, I only want the output that starts with X_0 and want to ignore entries that start with other keywords.
Try escaping your outermost pair of parentheses.
navigated_pages = re.findall(r'EVENT\(X(.*?)\) ',data,re.DOTALL|re.MULTILINE)
This appears to make it match properly, at least for my little sample input:
>>> s = "EVENT(X_HELLO) ... EVENT(X_HOW_ARE_YOU_DOING_TODAY)... EVENT(this one shouldn't appear because it doesn't start with X)"
>>> re.findall(r"EVENT\(X(.*?)\)", s)
['_HELLO', '_HOW_ARE_YOU_DOING_TODAY']
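For context on why the escaping matters, here is a small sketch comparing the two patterns. In a regular expression, unescaped parentheses only create capture groups, so the original pattern searches for the literal text EVENTX rather than EVENT(X. The sample string and the variable names `unescaped` and `escaped` are made up for illustration.

```python
import re

# Illustrative input: one real EVENT(...) entry and one decoy without parentheses.
s = "EVENT(X_HELLO) EVENTX_BYE "

# Unescaped: "(" and ")" only create groups, so the pattern looks for the
# literal text "EVENTX" and never matches "EVENT(X_HELLO)".
unescaped = re.findall(r"EVENT(X(.*?)) ", s)

# Escaped: "\(" and "\)" match the literal parentheses in the log line.
escaped = re.findall(r"EVENT\(X(.*?)\) ", s)

print(unescaped)
print(escaped)
```

Because the unescaped pattern contains two groups, `re.findall` returns a list of tuples for it instead of a list of strings.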
If you want the starting X too, nudge the inner opening parenthesis to the left by one. Don't worry, the *? still applies to the . in front of it, so the match remains non-greedy.
>>> re.findall(r"EVENT\((X.*?)\)", s)
['X_HELLO', 'X_HOW_ARE_YOU_DOING_TODAY']
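Combining the escaped pattern with the file handling from the question, a minimal sketch might look like the following. It also joins the matches with newlines before writing, since `write()` expects a string and `re.findall` returns a list. The function names `extract_navigated_pages` and `save_navigated_pages` and their parameters are hypothetical, and the pattern assumes the wanted names start with X_0 and contain no closing parenthesis.

```python
import re

# Matches EVENT(...) where the content starts with X_0; the outer
# parentheses are escaped so they are treated as literal characters.
EVENT_PATTERN = re.compile(r"EVENT\((X_0.*?)\)")


def extract_navigated_pages(text):
    """Return every X_0... name found inside EVENT(...) in text."""
    return EVENT_PATTERN.findall(text)


def save_navigated_pages(log_path, out_path):
    """Read log_path, write one match per line to out_path, return the matches."""
    with open(log_path, "r") as log_file:
        pages = extract_navigated_pages(log_file.read())
    with open(out_path, "w") as out_file:
        # write() needs a string, so join the list of matches with newlines
        out_file.write("\n".join(pages) + "\n")
    return pages
```

With the file names from the question, this would be called as `save_navigated_pages('messages', 'navigated_page_file')`.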
Alternatively, you might get away with using split:
s = "Jan 01 08:11:13 AMIRA-134500021 user.notice gui-monitor[770]: ACTION:401b0836:8:EVENT(X_0_Gui_Menu_610_Menu_Status_System) 'Event_WheelMonitorReleased' (253)"
print(s.split("EVENT(")[1].rsplit(") ",1)[0])
X_0_Gui_Menu_610_Menu_Status_System
with open('message','r') as log_file:
    for line in log_file:
        print(line.split("EVENT(")[1].rsplit(") ",1)[0])
X_0_Gui_Menu_610_Menu_Status_System
X_0_Gui_Menu_610_Menu_Status_System
global_ExportActive_Popup
global_FileOverwrite_Confirm_Popup
global_Global_Reactions
To get only the X_ lines:
with open('message','r') as log_file:
    for line in log_file:
        chk = line.split("EVENT(")[1].rsplit(") ",1)[0]
        if chk.startswith("X_"):
            print(chk)
X_0_Gui_Menu_610_Menu_Status_System
X_0_Gui_Menu_610_Menu_Status_System
If you are confident that X_ only appears in the lines you want:
for line in log_file:
    if "X_" in line:
        chk = line.split("EVENT(")[1].rsplit(") ",1)[0]
        print(chk)
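The split-based loops above assume that every line they index contains EVENT(; a line without it would raise IndexError at the [1] lookup. A more defensive sketch using `str.partition` is shown here. The helper names `event_name` and `x_names` are hypothetical, and the code assumes the name inside EVENT(...) never contains a closing parenthesis, so it stops at the first ")" rather than splitting on the last ") " as the loops above do.

```python
def event_name(line):
    """Return the text inside EVENT(...) on this line, or None if absent."""
    _, found, rest = line.partition("EVENT(")
    if not found:
        return None  # no EVENT( on this line
    name, closed, _ = rest.partition(")")
    # Treat an unclosed EVENT( as "no match" rather than guessing the name
    return name if closed else None


def x_names(lines, prefix="X_"):
    """Yield EVENT names that start with prefix, skipping other lines."""
    for line in lines:
        name = event_name(line)
        if name is not None and name.startswith(prefix):
            yield name
```

Since a file object iterates over its lines, `x_names(log_file)` can be used directly inside a `with open(...)` block.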