
Python: check if a line (two keywords) exists, ignoring the whitespace between them

My example file has the following structure:

Library      Collections
Library      XML    use_lxml=True
Library      ute_wtssim
Library      libraries.ta_infomodel
Library      ute_tshark
Library      PDMLChecker.py
Library    libraries.ta_ue_configuration
Library      libraries.ta_tshark
Library    libraries.ta_file_system
Resource     environment_setup.robot
Variables    ../tshark_filters.py
Variables    ../global_variables.py
Variables    tracing_tshark_filters.py

I would like to find the line Library libraries.ta_infomodel in the file, ignoring whitespace (the column width is not constant).

Could you give me some advice?

EDIT: I would like to check whether the line exists, that is, a line which contains exactly these two keywords, ignoring the whitespace between them.

Just use \s*. \s stands for a whitespace character, and * means any number of them (including zero):

import re
s = '''Library      Collections
Library      XML    use_lxml=True
Library      ute_wtssim
Library      libraries.ta_infomodel
Library      ute_tshark
Library      PDMLChecker.py
Library      resources.DevWro1.pdml_validation.PdmlValidation
Library    libraries.ta_ue_configuration
Library      libraries.ta_tshark
Library    libraries.ta_file_system
Resource     environment_setup.robot
Variables    ../tshark_filters.py
Variables    ../global_variables.py
Variables    tracing_tshark_filters.py'''

# Raw string, so that \s and \. reach the regex engine unchanged
matches = re.findall(r'Library\s*libraries\.ta_infomodel', s)

for match in matches:
    print(match)
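
Note that \s* also accepts zero whitespace characters, and that an unanchored pattern will still match a line carrying extra arguments after the second keyword. For the "exactly two keywords" requirement from the EDIT, one possible sketch is to anchor the pattern to the whole line. The helper name has_library_line and the sample strings are made up for illustration, and [ \t] is used instead of \s so that a match cannot span a line break:

```python
import re

# Anchored with ^ and $ (per line, thanks to re.MULTILINE) so the whole
# line must consist of the two keywords separated by spaces or tabs.
PATTERN = re.compile(
    r'^[ \t]*Library[ \t]+libraries\.ta_infomodel[ \t]*$',
    re.MULTILINE,
)

def has_library_line(text):
    """Return True if some line of text is exactly the two keywords."""
    return PATTERN.search(text) is not None

sample = (
    "Library      Collections\n"
    "Library      libraries.ta_infomodel\n"
    "Resource     environment_setup.robot"
)

print(has_library_line(sample))                                            # True
print(has_library_line("Library    libraries.ta_infomodel    extra_arg"))  # False
print(has_library_line("Librarylibraries.ta_infomodel"))                   # False
```

Whether leading and trailing blanks should be tolerated is an assumption here; drop the outer [ \t]* parts if they should not be.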

Just try this:

content = """
Library      Collections
Library      XML    use_lxml=True
Library      ute_wtssim
Library      libraries.ta_infomodel
Library      ute_tshark
Library      PDMLChecker.py
Library      resources.DevWro1.pdml_validation.PdmlValidation
Library    libraries.ta_ue_configuration
Library      libraries.ta_tshark
Library    libraries.ta_file_system
Resource     environment_setup.robot
Variables    ../tshark_filters.py
Variables    ../global_variables.py
Variables    tracing_tshark_filters.py
"""

import re
regex = re.compile(r'Library\s+libraries\.ta_infomodel')
line = regex.search(content)

Some explanations. To find a single line with a regex, use re.search; to find all matching lines, use re.findall.

\s+ matches one or more whitespace characters.

\. matches a literal dot. You have to add the backslash because a bare dot matches any character.
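
To make these points concrete, here is a small sketch that compares re.search with re.findall and shows why the dot is escaped. The text string and the librariesXta_infomodel probe are invented for the demonstration:

```python
import re

text = (
    "Library    libraries.ta_infomodel\n"
    "Library      libraries.ta_infomodel\n"
    "Library      ute_tshark"
)

regex = re.compile(r'Library\s+libraries\.ta_infomodel')

first = regex.search(text)      # a match object for the first hit, or None
all_hits = regex.findall(text)  # a list with every matching substring

print(first is not None)  # True
print(len(all_hits))      # 2

# Without the backslash, the dot matches any character, not just ".".
unescaped = re.compile(r'Library\s+libraries.ta_infomodel')
probe = "Library  librariesXta_infomodel"
print(unescaped.search(probe) is not None)  # True
print(regex.search(probe) is not None)      # False
```

For a plain existence check, testing the result of re.search against None is enough; re.findall is only needed when the number of matching lines matters.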

It can be done with a list comprehension too:

import re
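# fname is the path to the file being checked
with open(fname, 'r') as f:
    lines = f.readlines()
# re.search finds the pattern anywhere in the line, including at its start
found = [s for s in lines if re.search(r"Library\s+libraries\.ta_infomodel", s)]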
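
If a regular expression is not required, a different technique that may be worth considering is str.split(): called without arguments it splits on any run of whitespace, so comparing the resulting tokens checks for exactly two keywords regardless of column width. This is only a sketch; the function names and the sample text below are hypothetical:

```python
def line_has_keywords(line, keywords=("Library", "libraries.ta_infomodel")):
    """Return True if the line consists of exactly the given keywords."""
    # split() with no arguments collapses any amount of whitespace
    return line.split() == list(keywords)

def text_has_line(text, keywords=("Library", "libraries.ta_infomodel")):
    """Return True if any line of text consists of exactly the keywords."""
    return any(line_has_keywords(line, keywords) for line in text.splitlines())

sample = (
    "Library      XML    use_lxml=True\n"
    "Library      libraries.ta_infomodel\n"
    "Variables    ../tshark_filters.py"
)

print(text_has_line(sample))                                 # True
print(line_has_keywords("Library      XML    use_lxml=True"))  # False
```

The same check can be applied to a file by iterating over the open file object instead of text.splitlines().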
