I would like to write a regex that captures only the "my_code" line and the two lines that are indented under it:
//abs[matches(@class,"her")]
//abs[matches(@class,"him")]
I was using my_code\n\s\s(.+) on this input:
my_code
  //abs[matches(@class,"her")]
  //abs[matches(@class,"him")]
xxxx //time
xxxxx //h1
Note that \s matches any whitespace character, so it matches a space and also a newline.
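A quick check of that claim, as a minimal sketch using Python's re module (the variable names are only illustrative):

```python
import re

# \s is shorthand for any whitespace character, newlines included.
space_is_ws = bool(re.fullmatch(r"\s", " "))
newline_is_ws = bool(re.fullmatch(r"\s", "\n"))

# A character class holding only tab and space does not match a newline.
newline_in_class = bool(re.fullmatch(r"[\t ]", "\n"))

print(space_is_ws, newline_is_ws, newline_in_class)  # True True False
```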
To make sure the lines are indented, you can match, two times, a newline followed by 1 or more spaces or tabs using the character class [\t ]+:
^my_code\r?\n[\t ]+.+\r?\n[ \t]+.+
^ Start of string
my_code\r?\n Match my_code literally, followed by a newline
[\t ]+ Match 1+ spaces or tabs
.+ Match 1+ times any char except a newline
\r?\n[ \t]+.+ Again match a newline, 1+ spaces or tabs and 1+ times any char except a newline

To match the indented part 1 or more times, you could repeat a non-capturing group with the quantifier +:
^my_code(?:\r?\n[\t ]+.+)+
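Both patterns can be tried against the sample from the question. This is a minimal sketch that assumes the indentation is two spaces and adds re.MULTILINE (not part of the patterns above) so that ^ also matches when my_code is not the very first line of the text:

```python
import re

# Sample input from the question, assuming two-space indentation.
text = (
    "my_code\n"
    '  //abs[matches(@class,"her")]\n'
    '  //abs[matches(@class,"him")]\n'
    "xxxx //time\n"
    "xxxxx //h1\n"
)

# Exactly two indented lines after my_code.
two_lines = re.compile(r"^my_code\r?\n[\t ]+.+\r?\n[ \t]+.+", re.MULTILINE)
# One or more indented lines after my_code.
any_lines = re.compile(r"^my_code(?:\r?\n[\t ]+.+)+", re.MULTILINE)

m2 = two_lines.search(text)
m_any = any_lines.search(text)

print(m2.group())
# For this input both patterns stop before the unindented "xxxx //time" line.
print(m_any.group() == m2.group())
```

The second pattern is the more flexible choice if the number of indented lines can vary.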
I managed to get it working like this:
import re

test_str = """
my_code
  //abs[matches(@class,"her")]
  //abs[matches(@class,"him")]
xxxx //time
xxxxx //h1
"""
# Raw string so the regex escapes reach the re module unchanged
pattern = re.compile(r'my_code\n\s+[^\n]+\n\s+[^\n]+')
res = pattern.search(test_str)
print(res.group())
The [^\n]+ means: match any character except a newline, 1 or more times. This produces output like:
my_code
  //abs[matches(@class,"her")]
  //abs[matches(@class,"him")]
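One caveat with \s+ in this pattern: because \s also matches a newline, it can step over a blank line and accept lines that are not indented at all. The sketch below uses a made-up input (not from the question) to compare it with the [\t ]+ pattern from the other answer:

```python
import re

# Hypothetical input: blank lines, and the following lines are NOT indented.
text = "my_code\n\nfoo\n\nbar\n"

# \s+ may consume the blank line's newline, so no real indentation is required.
loose = re.compile(r"my_code\n\s+[^\n]+\n\s+[^\n]+")
# [\t ]+ requires at least one space or tab right after the newline.
strict = re.compile(r"^my_code\r?\n[\t ]+.+\r?\n[ \t]+.+", re.MULTILINE)

loose_found = loose.search(text) is not None
strict_found = strict.search(text) is not None

print(loose_found, strict_found)  # True False
```

If the input never contains blank lines this difference does not matter, but the stricter character class is safer when it might.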