
Get string until a particular type of string is encountered

I am fairly new to Python and to programming as a whole; I'm just learning my ABCs. Let's say I have a string like this:

s = "DEALER:'S up, Bubbless? BUBBLES: Hey. DEALER: Well, there you go. JUNKIE: Well, what you got?DEALER: I got some starters."

I want the string to end whenever an all-uppercase word followed by a colon (:) is encountered, and a new string to start there that holds the next piece. For the string above, I would get:

s1 = "DEALER:'S up, Bubbless?"
s2 = "BUBBLES: Hey."
s3 = "DEALER: Well, there you go."

and so on.

This is my regex code for getting such words.

p = re.compile('([A-Z]*):')
s = set(p.findall(l))

I have been stuck on this for a while. I tried googling it, but to no avail. Any help would be greatly appreciated. Thanks.

This is the regex you need:

[A-Z]+:.*?(?=[A-Z]+:|$)

An explanation of the parts:

  • [A-Z]+: matches the speaker (one or more uppercase letters followed by a colon)
  • .*? matches the line they say; the ? makes it non-greedy, so it only matches up to the next speaker
  • (?=[A-Z]+:|$) asserts that the speaker's line is followed by either the next speaker or the end of the string ((?=...) is a positive lookahead, which only makes an assertion and does not put the text it checks into your match)
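
Here is a minimal sketch of applying that pattern to the string from the question with re.findall. The names pattern and lines are only illustrative, and the strip() call is an addition to drop the space left in front of each following speaker tag.

```python
import re

s = ("DEALER:'S up, Bubbless? BUBBLES: Hey. DEALER: Well, there you go. "
     "JUNKIE: Well, what you got?DEALER: I got some starters.")

# Speaker tag, then (lazily) everything up to the next speaker tag
# or the end of the string. The lookahead is not part of the match.
pattern = re.compile(r"[A-Z]+:.*?(?=[A-Z]+:|$)")

# findall returns the full match for each speaker's line;
# strip() removes the trailing space before the next tag.
lines = [m.strip() for m in pattern.findall(s)]

for line in lines:
    print(line)
```

Note that . does not match newlines by default, so if the text spans several lines you would probably want to pass re.DOTALL when compiling the pattern.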

