How do I capture the string between a certain character and a string in a multi-line string? (Python)
Let's say we have a string:
string="This is a test code [asdf -wer -a2 asdf] >(ascd asdfas -were)\
test \
(testing test) test >asdf \
test"
I need to get the string between the character > and the string "test".
I tried:
re.findall(r'>[^)](.*)test',string, re.MULTILINE )
However, I get:
(ascd asdfas -were)\ test \ (testing test) test >asdf.
But I need:
(ascd asdfas -were)\
and
asdf
How can I get those two strings?
What about:
import re
s="""This is a test code [asdf -wer -a2 asdf] >(ascd asdfas -were)
test
(testing test) test >asdf
test"""
print(re.findall(r'>(.*?)\btest\b', s, re.DOTALL))
Output:
['(ascd asdfas -were)\n', 'asdf\n']
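Note that both matches keep the newline that precedes test. If you want the bare values from the question, here is a minimal sketch of two ways to drop that whitespace. The variable names (matches, cleaned, trimmed) are only illustrative, and the \s* variant is my own assumption about what you want, not part of the pattern above.

```python
import re

s = """This is a test code [asdf -wer -a2 asdf] >(ascd asdfas -were)
test
(testing test) test >asdf
test"""

# Same pattern as above: each captured group ends with the newline before "test".
matches = re.findall(r'>(.*?)\btest\b', s, re.DOTALL)

# Option 1: strip surrounding whitespace after matching.
cleaned = [m.strip() for m in matches]

# Option 2: let \s* consume the whitespace outside the capture group.
trimmed = re.findall(r'>(.*?)\s*\btest\b', s, re.DOTALL)

print(cleaned)  # ['(ascd asdfas -were)', 'asdf']
print(trimmed)  # ['(ascd asdfas -were)', 'asdf']
```

Option 1 also removes leading whitespace from each match, which may or may not be what you need.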
The only somewhat interesting parts of this pattern are:
.*? , where the ? makes the .* "ungreedy"; otherwise you'd have a single, long match instead of two.
\btest\b as the "ending" identifier (see Jan's comment below) instead of test. Here, \b "matches the empty string, but only at the beginning or end of a word".
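To see both points in action, here is a small sketch that compares the greedy and ungreedy versions on the same string, and then shows what \b changes. The second string (sample) is a made-up example of mine, not taken from the question.

```python
import re

s = """This is a test code [asdf -wer -a2 asdf] >(ascd asdfas -were)
test
(testing test) test >asdf
test"""

# Greedy: .* runs as far as it can, so everything up to the last "test"
# ends up in one single, long match.
greedy = re.findall(r'>(.*)\btest\b', s, re.DOTALL)

# Ungreedy: .*? stops at the first "test" after each ">", giving two matches.
lazy = re.findall(r'>(.*?)\btest\b', s, re.DOTALL)

print(len(greedy))  # 1
print(lazy)         # ['(ascd asdfas -were)\n', 'asdf\n']

# Hypothetical input where "test" also appears inside a longer word.
sample = ">abc testing test"

# Without \b the match ends at the "test" inside "testing".
without_boundary = re.findall(r'>(.*?)test', sample)

# With \b only the standalone word "test" ends the match.
with_boundary = re.findall(r'>(.*?)\btest\b', sample)

print(without_boundary)  # ['abc ']
print(with_boundary)     # ['abc testing ']
```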
Note, it may be worth reading up on re.DOTALL, as I think that's really what you want. DOTALL lets . characters match newlines as well, while MULTILINE lets the anchors ( ^ , $ ) match at the start and end of each line instead of only at the start and end of the entire string. Considering you don't use anchors, I'm thinking DOTALL is more appropriate.
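A quick sketch of the difference between the two flags, using a short made-up string (text is my own example, not the string from the question):

```python
import re

# Hypothetical three-line input: ">" and "test" sit on different lines.
text = "first >one\ntest\nsecond"

pattern = r'>(.*?)\btest\b'

# No flags: "." cannot cross the newline, so nothing matches.
no_flag = re.findall(pattern, text)

# MULTILINE alone does not help, because the pattern has no ^ or $ anchors.
with_multiline = re.findall(pattern, text, re.MULTILINE)

# DOTALL lets "." match the newline, so the match can span lines.
with_dotall = re.findall(pattern, text, re.DOTALL)

print(no_flag)         # []
print(with_multiline)  # []
print(with_dotall)     # ['one\n']

# What MULTILINE does change: ^ matches at the start of every line.
string_start = re.findall(r'^\w+', text)
line_starts = re.findall(r'^\w+', text, re.MULTILINE)

print(string_start)  # ['first']
print(line_starts)   # ['first', 'test', 'second']
```

If you ever need both behaviours, the flags can be combined with re.DOTALL | re.MULTILINE.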