
How do I capture the text between a certain character and a string in a multi-line string? (Python)

Let's say we have a string:

string="This is a test code [asdf -wer -a2 asdf] >(ascd asdfas -were)\
 test \
(testing test) test >asdf  \
       test"

I need to get the text between the character > and the string "test".

I tried:

re.findall(r'>[^)](.*)test',string, re.MULTILINE )

However, I get:

(ascd asdfas -were)\ test \ (testing test) test >asdf

What I need is:

(ascd asdfas -were)\ 

AND

asdf

How can I get those two strings?

What about:

import re

s="""This is a test code [asdf -wer -a2 asdf] >(ascd asdfas -were)
test
(testing test) test >asdf
test"""

print(re.findall(r'>(.*?)\btest\b', s, re.DOTALL))

Output:

['(ascd asdfas -were)\n', 'asdf\n']
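Each captured group above still ends with the newline that precedes test. If only the bare text is wanted, one option (a minimal sketch, not part of the original answer; the name matches is just illustrative) is to strip surrounding whitespace from every result:

```python
import re

s = """This is a test code [asdf -wer -a2 asdf] >(ascd asdfas -were)
test
(testing test) test >asdf
test"""

# Same pattern as above; str.strip() drops the trailing newline
# (and any other surrounding whitespace) from each captured group.
matches = [m.strip() for m in re.findall(r'>(.*?)\btest\b', s, re.DOTALL)]
print(matches)
```

This assumes leading and trailing whitespace is never significant in the captured text; if it is, use rstrip('\n') instead.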

The only somewhat interesting parts of this pattern are:

  • .*? , where ? makes the .* non-greedy ("ungreedy"); otherwise you'd get a single, long match instead of two.
  • Using \btest\b as the "ending" identifier (see Jan's comment below) instead of plain test , where, quoting the re documentation,

    \b Matches the empty string, but only at the beginning or end of a word....
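To see what those two points change in practice, here is a small comparison. It is only a sketch: the second sample string t is made up for illustration and does not come from the question.

```python
import re

s = """This is a test code [asdf -wer -a2 asdf] >(ascd asdfas -were)
test
(testing test) test >asdf
test"""

# Greedy .* runs to the LAST "test", so everything collapses into one match.
greedy = re.findall(r'>(.*)\btest\b', s, re.DOTALL)
# Lazy .*? stops at the FIRST "test" after each ">", giving two matches.
lazy = re.findall(r'>(.*?)\btest\b', s, re.DOTALL)
print(len(greedy), len(lazy))

# Hypothetical sample where "testing" appears between ">" and "test".
t = ">abc testing more test"
# Without \b, the "test" inside "testing" ends the match early.
plain = re.findall(r'>(.*?)test', t)
# With \b, only the standalone word "test" ends the match.
bounded = re.findall(r'>(.*?)\btest\b', t)
print(plain, bounded)
```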

Note: it may be worth reading up on re.DOTALL , as I think that's really what you want. DOTALL lets . match newline characters as well, while MULTILINE lets the anchors ( ^ , $ ) match at the start and end of each line instead of only at the start and end of the entire string. Since your pattern doesn't use anchors, DOTALL is the more appropriate flag.
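The difference between the two flags can be checked on a tiny input. This is a sketch with a made-up two-line string, not the string from the question:

```python
import re

s = ">first\ntest"

# No flag: "." does not match "\n", so the pattern cannot reach "test".
no_flag = re.findall(r'>(.*?)\btest\b', s)
# MULTILINE only changes ^ and $; "." still stops at the newline.
multiline = re.findall(r'>(.*?)\btest\b', s, re.MULTILINE)
# DOTALL lets "." cross the newline, so the match succeeds.
dotall = re.findall(r'>(.*?)\btest\b', s, re.DOTALL)
print(no_flag, multiline, dotall)

# MULTILINE matters once anchors are involved:
# ^ and $ then match at every line boundary, not just the string's ends.
anchored_plain = re.findall(r'^test$', s)
anchored_multi = re.findall(r'^test$', s, re.MULTILINE)
print(anchored_plain, anchored_multi)
```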
