Regex to find substring starting with [ ]
Below is a sample substring from a much larger string (detaildesc_final) that I have obtained. I need to run a regex search across the string so that I can retrieve all the lines that begin with "[]" (two square brackets) from the [Data] section. All such lines in the [Data] section should be retrieved until the [Logs] line is encountered.
[Data]
[] some text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[Logs]
I'm using Python, and I've tried the following command (which is clearly incorrect):
re.findall(r'\b\\[\\]\w*', detaildesc_final)
I need the result to be in the following format:
some text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
I have already searched a lot online, but I could only figure out how to find lines starting with a single character, not two ([] in this case). Any help would be greatly appreciated. Thank you.
Don't over-complicate things.
for line in detaildesc_final.split('\n'):
    if line.startswith('[]'):
        do_something()
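The loop above checks every line of the string, whereas the question asks only for lines between [Data] and [Logs]. A minimal sketch of one way to add that restriction follows. The function name lines_in_data_section and the extra [Header] section in the sample are hypothetical, added for illustration only, and the sketch assumes the section markers sit on their own lines.

```python
def lines_in_data_section(text):
    """Collect the text of "[] ..." lines found between [Data] and [Logs]."""
    results = []
    in_data = False  # True only while we are inside the [Data] section
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == '[Data]':
            in_data = True
        elif stripped == '[Logs]':
            in_data = False
        elif in_data and stripped.startswith('[]'):
            # Drop the leading "[]" and any surrounding whitespace.
            results.append(stripped[2:].strip())
    return results


# Hypothetical sample: lines outside [Data] ... [Logs] should be ignored.
detaildesc_final = (
    "[Header]\n[] ignored\n"
    "[Data]\n[] some text\n[] some_other_text\n"
    "[Logs]\n[] also ignored\n"
)
print(lines_in_data_section(detaildesc_final))
```

A plain flag keeps the logic readable and avoids regular expressions entirely, which fits the "don't over-complicate" advice.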
import re
text = """
[Data]
[] some text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[Logs]
"""
# Remove every bracketed tag ("[Data]", "[]", "[Logs]") plus one optional trailing space.
print(re.sub(r"([\[a-zA-Z ]*\] ?)", '', text))
Output:
some text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
You need a positive lookbehind:
import re
pattern=r'(?<=\[\])(.\w.+)'
string_1="""[Data]
[] some text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[Logs]"""
match=re.finditer(pattern,string_1,re.M)
for item in match:
    # group(1) starts with the space that follows "[]"
    print(item.group(1))
Output:
some text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
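As a small check of the lookbehind pattern, the sketch below runs it on a shortened sample and compares it with a variant that moves the space into the lookbehind. The variant pattern and the variable names are assumptions for illustration, not part of the answer. Python requires lookbehinds to have a fixed width, which both patterns satisfy.

```python
import re

# Shortened, hypothetical sample in the same shape as the question's data.
sample = "[Data]\n[] some text\n[] some_other_text\n[Logs]"

# Pattern from the answer: the captured group begins with the space after "[]".
with_space = re.findall(r'(?<=\[\])(.\w.+)', sample)

# Variant (an assumption): include the space in the fixed-width lookbehind,
# so the captured text starts at the first visible character.
without_space = re.findall(r'(?<=\[\] )(.+)', sample)

print(with_space)
print(without_space)
```

Calling .strip() on each match from the first pattern is another way to drop the leading space.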
Regex explanation:
Positive lookbehind (?<=\[\])
It tells the regex engine to temporarily step backwards in the string to check whether the text inside the lookbehind can be matched there.
\[ matches the character [ literally
\] matches the character ] literally
. matches any character (except for line terminators)
\w matches any word character (equivalent to [a-zA-Z0-9_] for ASCII text)
+ Quantifier: matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
import re
# Note: the trailing \n\n assumes each "[] ..." line is followed by a blank line.
re.findall(r'\[\] (.*)\n\n', detaildesc_final)
Output:
['some text',
'some_other_text',
'some_other_text',
'some_other_text',
'some_other_text',
'some_other_text',
'some_other_text',
'some_other_text',
'some_other_text',
'some_other_text',
'some_other_text',
'some_other_text']
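The findall call above scans the whole string and relies on blank lines between entries. A hedged sketch of a two-step regex approach that first isolates the [Data] section and then collects the "[]" lines within it is shown below. The function name extract_data_lines is hypothetical, and the sketch assumes [Data] and [Logs] each start a line and that [Logs] follows [Data].

```python
import re


def extract_data_lines(text):
    """Return the text of "[] ..." lines between [Data] and [Logs]."""
    # Step 1: capture everything between the two section headers.
    # re.S lets "." span newlines; re.M lets "^" match at each line start.
    section = re.search(r'^\[Data\][ \t]*\n(.*?)^\[Logs\]', text, re.S | re.M)
    if section is None:
        return []  # no [Data] ... [Logs] block found
    # Step 2: within that block, keep the text after each leading "[]".
    # Blank lines between entries are skipped because they do not start with "[]".
    return re.findall(r'^\[\][ \t]*(.*?)[ \t]*$', section.group(1), re.M)


# Hypothetical sample: the "[] d" line after [Logs] should not be returned.
print(extract_data_lines("[Data]\n[] a b\n\n[] c\n[Logs]\n[] d\n"))
```

Splitting the work into two patterns keeps each one short and makes the section boundary explicit.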