How do I capture the string between a certain character and a string in a multi-line string? Python

Let's say we have a string:

string="This is a test code [asdf -wer -a2 asdf] >(ascd asdfas -were)\
 test \
(testing test) test >asdf  \
       test"

I need to get the string between the character > and the string "test".

I tried:

re.findall(r'>[^)](.*)test',string, re.MULTILINE )

However, I get:

(ascd asdfas -were)\ test \ (testing test) test >asdf.

However, I need:

(ascd asdfas -were)\ 

AND

asdf

How can I get those two strings?

What about:

import re

s="""This is a test code [asdf -wer -a2 asdf] >(ascd asdfas -were)
test
(testing test) test >asdf
test"""

print(re.findall(r'>(.*?)\btest\b', s, re.DOTALL))

Output:

['(ascd asdfas -were)\n', 'asdf\n']

The only somewhat interesting parts of this pattern are:

  • .*? , where ? makes the .* "ungreedy"; otherwise you'd have a single, long match instead of two.
  • Using \btest\b as the "ending" identifier (see Jan's comment below) instead of test, where

    \b Matches the empty string, but only at the beginning or end of a word....
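To make both points concrete, here is a small sketch comparing the greedy and ungreedy forms on the same string, and the pattern with and without \b. The second string, t, is a made-up example (not from the question) chosen so that the word boundary changes the result.

```python
import re

s = """This is a test code [asdf -wer -a2 asdf] >(ascd asdfas -were)
test
(testing test) test >asdf
test"""

# Greedy: .* runs to the LAST standalone "test", so there is one long match.
greedy = re.findall(r'>(.*)\btest\b', s, re.DOTALL)

# Ungreedy: .*? stops at the FIRST standalone "test" after each ">".
lazy = re.findall(r'>(.*?)\btest\b', s, re.DOTALL)

# The captures keep the newline before "test"; strip it if unwanted.
stripped = [m.strip() for m in lazy]

# Hypothetical string where \b matters: "testing" starts with "test".
t = ">abc testing test"
plain = re.findall(r'>(.*?)test', t)          # stops inside "testing"
bounded = re.findall(r'>(.*?)\btest\b', t)    # skips "testing"

print(len(greedy))   # 1
print(lazy)          # ['(ascd asdfas -were)\n', 'asdf\n']
print(stripped)      # ['(ascd asdfas -were)', 'asdf']
print(plain)         # ['abc ']
print(bounded)       # ['abc testing ']
```

The strip() step is optional; it is only there because the desired output in the question has no trailing newline.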

Note, it may be worth reading up on re.DOTALL, as I think that's really what you want. DOTALL lets . characters include newlines, while MULTILINE lets anchors ( ^ , $ ) match start and end of lines instead of the entire string. Considering you don't use anchors, I'm thinking DOTALL is more appropriate.
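A quick way to see the difference between the two flags is to run the same pattern under each. The strings below are illustrative only and not taken from the question.

```python
import re

s = ">first line\nsecond test"
pattern = r'>(.*?)\btest\b'

# Default: "." does not match "\n", so the match cannot cross the line break.
no_flag = re.findall(pattern, s)

# MULTILINE only affects ^ and $; the pattern has neither, so nothing changes.
multiline = re.findall(pattern, s, re.MULTILINE)

# DOTALL lets "." match "\n", so the capture can span both lines.
dotall = re.findall(pattern, s, re.DOTALL)

# What MULTILINE is for: making ^ match at the start of every line.
text = "alpha one\nbeta two"
first_only = re.findall(r'^\w+', text)
every_line = re.findall(r'^\w+', text, re.MULTILINE)

print(no_flag)     # []
print(multiline)   # []
print(dotall)      # ['first line\nsecond ']
print(first_only)  # ['alpha']
print(every_line)  # ['alpha', 'beta']
```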
