Below is a sample substring from a much larger string (detaildesc_final) that I have obtained. I need a regex search across the string that retrieves every line beginning with "[]" (the two square brackets) in the [Data] section. All such lines in the [Data] section should be retrieved until the [Logs] line is encountered.
[Data]
[] some text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[Logs]
I'm using Python, and I tried the following command (which is clearly incorrect):
re.findall(r'\b\\[\\]\w*', detaildesc_final)
I need the result to be in the following format:
some text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
I have already searched a lot online, but I could only figure out how to find lines starting with a single character rather than two ("[]" in this case). Any help would be greatly appreciated. Thank you.
Don't over-complicate things.
for line in detaildesc_final.split('\n'):
    if line.startswith('[]'):
        do_something()
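The loop above visits every "[]" line in the whole string. A minimal sketch of how it could be restricted to the [Data] section is shown here, assuming the [Data] and [Logs] headers each sit on their own line. The function name lines_in_data_section and the sample string are hypothetical and only serve as illustration.

```python
def lines_in_data_section(text):
    """Return the text of each '[]' line found between [Data] and [Logs]."""
    results = []
    in_data = False  # True only while we are inside the [Data] section
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == '[Data]':
            in_data = True
        elif stripped == '[Logs]':
            in_data = False
        elif in_data and stripped.startswith('[]'):
            # Drop the leading '[]' marker and any surrounding whitespace.
            results.append(stripped[2:].strip())
    return results


# Hypothetical sample: one '[]' line after [Logs] that should be ignored.
sample = "[Data]\n[] some text\n[] some_other_text\n[Logs]\n[] not data\n"
print(lines_in_data_section(sample))
```

The flag keeps the logic to one pass over the lines and avoids any regex, which matches the spirit of this answer.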
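import re
# Named "text" rather than "str" so the built-in str type is not shadowed.
text = """
[Data]
[] some text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[Logs]
"""
# Remove every bracketed tag ("[]", "[Data]", "[Logs]") and one trailing space.
print(re.sub(r"\[[a-zA-Z ]*\] ?", '', text))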
output:
some text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
You need a positive lookbehind:
import re
pattern=r'(?<=\[\])(.\w.+)'
string_1="""[Data]
[] some text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[] some_other_text
[Logs]"""
match=re.finditer(pattern,string_1,re.M)
for item in match:
    print(item.group(1))
output:
some text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
some_other_text
Regex explanation:
(?<=\[\]) is a positive lookbehind. It tells the regex engine to temporarily step backwards in the string to check whether the text inside the lookbehind can be matched there.
\[ matches the character [ literally.
\] matches the character ] literally.
. matches any character (except for line terminators).
\w matches any word character (equivalent to [a-zA-Z0-9_]).
+ is a quantifier: it matches between one and unlimited times, as many times as possible, giving back as needed (greedy).
import re
re.findall(r'\[\] (.*)\n\n', detaildesc_final)
Output:
['some text',
'some_other_text',
'some_other_text',
'some_other_text',
'some_other_text',
'some_other_text',
'some_other_text',
'some_other_text',
'some_other_text',
'some_other_text',
'some_other_text',
'some_other_text']
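The regex answers above match "[]" lines anywhere in the string. The following is a hedged sketch of a two-step variant that first isolates the [Data] section and then collects its "[]" lines, so it does not depend on blank lines between entries. It assumes the [Data] and [Logs] headers each occupy a full line; the function name extract_data_items and the sample string are hypothetical.

```python
import re


def extract_data_items(text):
    """Return the text of '[]' lines located between [Data] and [Logs]."""
    # Step 1: capture everything between the two header lines.
    # re.M lets ^ and $ match at line boundaries; re.S lets . span newlines.
    section = re.search(r'^\[Data\]\s*$(.*?)^\[Logs\]\s*$', text, re.M | re.S)
    if section is None:
        return []
    # Step 2: within that section, take the rest of each line after '[]',
    # trimming leading spaces/tabs and trailing whitespace.
    return re.findall(r'^\[\][ \t]*(.*\S)', section.group(1), re.M)


# Hypothetical sample with '[]' lines outside the [Data] section.
sample = (
    "[Intro]\n[] not wanted\n"
    "[Data]\n[] some text\n\n[] some_other_text\n"
    "[Logs]\n[] also not wanted\n"
)
print(extract_data_items(sample))
```

Splitting the work into two patterns keeps each one short, at the cost of scanning the section text twice.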