Shortest string between two substrings

I am trying to parse IRC logs, like this one:

2013-09-26T01:52:40  <Shan-x> some stuff

I want to extract the nickname, so I use re:

re.search('%s(.*)%s' % ('<', '>'), s).group(1)

But if the log line looks like this:

2013-09-26T01:52:40  <Shan-x> some stuff > foo bar

Then I get this: Shan-x> some stuff . How can I parse the line to get only the nickname?

You need to make the .* non-greedy by adding a ? after the * quantifier:

re.search('%s(.*?)%s' % ('<', '>'), s).group(1)

Now .*? matches the minimum number of characters that satisfies the pattern, rather than the maximum, which is the default.
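To see the difference side by side, here is a small sketch that runs both quantifiers on the log line from the question (the variable names are only illustrative):

```python
import re

s = '2013-09-26T01:52:40  <Shan-x> some stuff > foo bar'

# Greedy: .* extends to the last '>' on the line.
greedy = re.search('<(.*)>', s).group(1)

# Non-greedy: .*? stops at the first '>' that lets the pattern match.
lazy = re.search('<(.*?)>', s).group(1)

print(repr(greedy))  # 'Shan-x> some stuff '
print(repr(lazy))    # 'Shan-x'
```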

Not sure why you use string interpolation here, though; for static characters, just use:

re.search('<(.*?)>', s).group(1)

You could also capture all characters that do not match the end character:

re.search('<([^>]*)>', s).group(1)

Here [^>] is a negated character class: it matches any single character that is not listed after the ^, so any character other than > qualifies.
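If some log lines may contain no <nick> part at all, re.search returns None and calling .group(1) on it raises AttributeError. A minimal sketch of a wrapper that guards against this, assuming you want None back in that case (the names NICK_RE and extract_nick, and the second sample line, are hypothetical and not from the question):

```python
import re

# Hypothetical helper: NICK_RE and extract_nick are illustrative names.
# The negated class [^>]* can never run past the first '>'.
NICK_RE = re.compile(r'<([^>]*)>')

def extract_nick(line):
    """Return the text between the first '<' and the next '>', or None."""
    match = NICK_RE.search(line)
    return match.group(1) if match else None

print(extract_nick('2013-09-26T01:52:40  <Shan-x> some stuff > foo bar'))  # Shan-x
print(extract_nick('2013-09-26T01:52:40  no nickname here'))  # None
```

Compiling the pattern once is optional; it simply avoids repeating the pattern string when the function is called for every line of a log.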

