
Find a specific sentence with a given word in Python

I would like to find the sentence that contains a given word, string1, in the following passage:

HEAD,Content,11005,{A:1,json:{B:0,C:5,D:-1,E:false,F:Failure},suffix:_A}DC0DHEAD,Content,11005,{A:1,json:{BC:true,DE:2,FG:0,HI:0,JK:0,string1:Error},suffix:_A}D646HEAD,Content,11005,{A:1,json:{Z:{Y:false,X:0,Q:1},suffix:}3AA8

So the expected result would be:

HEAD,Content,11005,{A:1,json:{BC:true,DE:2,FG:0,HI:0,JK:0,string1:Error},suffix:_A}D646

So far, I have used the following regular expression to extract the desired sentence:

([^.]*?string1[^.]*)

However, it does not capture the whole sentence. I only get the following:

A:1,json:{BC:true,DE:2,FG:0,HI:0,JK:0,string1:Error},suffix:_A}D646

I hope someone can help me solve this small issue. Thanks!

If all sentences begin with HEAD, you can do the following:

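# s holds the passage from the question
temp = s.split('HEAD')  # split() removes the 'HEAD' delimiter
# Take the first piece containing 'string1' and put 'HEAD' back in front
res = 'HEAD' + [i for i in temp if 'string1' in i][0]

>>> print(res)
HEAD,Content,11005,{A:1,json:{BC:true,DE:2,FG:0,HI:0,JK:0,string1:Error},suffix:_A}D646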
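Note that the snippet above raises an IndexError when no piece contains the word, and it returns only the first hit. A minimal sketch of a more defensive variant follows; the function name find_records and its marker parameter are made up for illustration, and it assumes that any text before the first HEAD can be discarded:

```python
def find_records(text, word, marker="HEAD"):
    """Return every marker-delimited record in `text` that contains `word`.

    Hypothetical helper: the name and the `marker` parameter are assumptions.
    """
    # split() drops the marker; the first piece is whatever precedes the
    # first marker (an empty string here), so it is skipped.
    pieces = [marker + piece for piece in text.split(marker)[1:]]
    return [piece for piece in pieces if word in piece]


s = (
    "HEAD,Content,11005,{A:1,json:{B:0,C:5,D:-1,E:false,F:Failure},suffix:_A}DC0D"
    "HEAD,Content,11005,{A:1,json:{BC:true,DE:2,FG:0,HI:0,JK:0,string1:Error},suffix:_A}D646"
    "HEAD,Content,11005,{A:1,json:{Z:{Y:false,X:0,Q:1},suffix:}3AA8"
)

print(find_records(s, "string1"))  # list with the one matching record
print(find_records(s, "missing"))  # empty list instead of an IndexError
```

Returning a list leaves it to the caller to decide what to do when there are zero or several matches.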

If you want to use a regex, first match HEAD. Then match any character, zero or more times, as long as that character is not the start of another HEAD.

Then match string1, followed again by any run of characters in which no character starts another HEAD:

HEAD(?:(?!HEAD).)*string1(?:(?!HEAD).)*


import re

pattern = r"HEAD(?:(?!HEAD).)*string1(?:(?!HEAD).)*"
s = ("HEAD,Content,11005,{A:1,json:{B:0,C:5,D:-1,E:false,F:Failure},suffix:_A}DC0DHEAD,Content,11005,{A:1,json:{BC:true,DE:2,FG:0,HI:0,JK:0,string1:Error},suffix:_A}D646HEAD,Content,11005,{A:1,json:{Z:{Y:false,X:0,Q:1},suffix:}3AA8\n")
matches = re.findall(pattern, s)

print(matches)

Output

['HEAD,Content,11005,{A:1,json:{BC:true,DE:2,FG:0,HI:0,JK:0,string1:Error},suffix:_A}D646']
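The pattern above hard-codes both HEAD and string1. If the search word comes from a variable, something like the following sketch could build the same pattern; build_pattern is a hypothetical helper name, and re.escape is used so that characters such as . or { in the word are matched literally:

```python
import re


def build_pattern(word, marker="HEAD"):
    """Compile marker(?:(?!marker).)*word(?:(?!marker).)* for the given strings.

    Hypothetical helper: the name and the `marker` parameter are assumptions.
    """
    m = re.escape(marker)
    w = re.escape(word)
    # Any single character, provided the marker does not start at it.
    filler = rf"(?:(?!{m}).)*"
    # As in the original pattern, "." does not match a newline here.
    return re.compile(rf"{m}{filler}{w}{filler}")


s = (
    "HEAD,Content,11005,{A:1,json:{B:0,C:5,D:-1,E:false,F:Failure},suffix:_A}DC0D"
    "HEAD,Content,11005,{A:1,json:{BC:true,DE:2,FG:0,HI:0,JK:0,string1:Error},suffix:_A}D646"
    "HEAD,Content,11005,{A:1,json:{Z:{Y:false,X:0,Q:1},suffix:}3AA8\n"
)

print(build_pattern("string1").findall(s))
```

If a record can span several lines, passing re.DOTALL to re.compile would let the filler cross newlines, at the cost of also capturing a trailing newline in the last record.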
