
Regex (Python): match or return the string after '?', but on a new line, until the end of that line

Here is what I'm trying to do. I have a string structured like this:

stringparts.bst? (carriage return) 765945559287eghc1bg60aa26e4c9ccf8ac425725622f65a6lsa6ahskchksyttsutcuan99 (carriage return) SPAM /198975/

I need it to match or return this:

765945559287eghc1bg60aa26e4c9ccf8ac425725622f65a6lsa6ahskchksyttsutcuan99

What RegEx will do the trick?

I have tried this, but to no avail :(

bst\?(.*)\n

Thanks in advance.

I tried this, assuming the newline is only one character:

>>> import re
>>> s
'stringparts.bst?\n765945559287eghc1bg60aa26e4c9ccf8ac425725622f65a6lsa6ahskchksyttsutcuan99\nSPAM /198975/'
>>> m = re.match(r'.*bst\?\s(.+)\s', s)
>>> print(m.group(1))
765945559287eghc1bg60aa26e4c9ccf8ac425725622f65a6lsa6ahskchksyttsutcuan99

Your regex will match everything between the bst? and the first newline, which is nothing. I think you want to match everything between the first two newlines.

bst\?\n(.*)\n

will work, but you could also use

\n(.*)\n

although it may not work in some more specific cases.
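To make the difference concrete, here is a small sketch that runs the question's pattern and the two patterns above against the sample string. It assumes the line break is a single "\n"; the variable names are only for illustration.

```python
import re

# Sample input from the question, assuming "\n" as the line break.
s = (
    "stringparts.bst?\n"
    "765945559287eghc1bg60aa26e4c9ccf8ac425725622f65a6lsa6ahskchksyttsutcuan99\n"
    "SPAM /198975/"
)

# The question's pattern: the group sits between "bst?" and the first
# newline, so it captures an empty string.
original = re.search(r"bst\?(.*)\n", s)
print(repr(original.group(1)))

# Moving the newline in front of the group captures the next line instead.
anchored = re.search(r"bst\?\n(.*)\n", s)
print(anchored.group(1))

# The looser pattern captures whatever lies between the first two newlines.
loose = re.search(r"\n(.*)\n", s)
print(loose.group(1))
```

The looser pattern gives the same result here only because the wanted line happens to be the second line of the string.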

The following is more robust against different kinds of line breaks, and works if you have a whole list of such strings. With re.MULTILINE, ^ and $ match at the beginning and end of a line, but they do not consume the actual line break character (hence the \s+ sequence).

import re

BST_RE = re.compile(
    r"bst\?.*$\s+^(.*)$",
    re.MULTILINE
)

INPUT_STR = r"""
stringparts.bst?
765945559287eghc1bg60aa26e4c9ccf8ac425725622f65a6lsa6ahskchksyttsutcuan99
SPAM /198975/

stringparts.bst?
another
SPAM /.../
"""

occurrences = BST_RE.findall(INPUT_STR)

# Print the captured line from each match.
for occurrence in occurrences:
    print(occurrence)
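One detail that may be worth checking with this approach: as far as I can tell, with Windows-style "\r\n" line breaks the pattern still matches, but the captured group keeps a trailing "\r", because "." matches "\r" and $ only stops before "\n". The sketch below uses a hypothetical CRLF input built from the question's sample and strips the leftover character afterwards.

```python
import re

BST_RE = re.compile(r"bst\?.*$\s+^(.*)$", re.MULTILINE)

# Hypothetical input: the question's sample with "\r\n" line breaks.
crlf_input = (
    "stringparts.bst?\r\n"
    "765945559287eghc1bg60aa26e4c9ccf8ac425725622f65a6lsa6ahskchksyttsutcuan99\r\n"
    "SPAM /198975/\r\n"
)

raw = BST_RE.findall(crlf_input)

# Each captured line still ends with "\r", so strip it explicitly.
cleaned = [value.rstrip("\r") for value in raw]

print(raw)
print(cleaned)
```

Calling str.splitlines() on the input first and rejoining with "\n" would be another way to normalize the line breaks before matching.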

The following pattern allows additional whitespace before the \n:

r'bst\?\s*\n(.*?)\s*\n'

If you don't expect any whitespace within the string to be captured, you could use a simpler one, where \s+ consumes whitespace, including the \n, and (\S+) captures all the consecutive non-whitespace characters:

r'bst\?\s+(\S+)'
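As a rough comparison of the two patterns, the sketch below runs them on the question's sample with trailing spaces added, and on a made-up string whose wanted line contains a space. Both inputs are assumptions for illustration only.

```python
import re

TOLERANT = re.compile(r"bst\?\s*\n(.*?)\s*\n")
SIMPLE = re.compile(r"bst\?\s+(\S+)")

# The question's sample with trailing spaces added before each line break.
padded = (
    "stringparts.bst?  \n"
    "765945559287eghc1bg60aa26e4c9ccf8ac425725622f65a6lsa6ahskchksyttsutcuan99  \n"
    "SPAM /198975/"
)

# Both patterns capture the line without the trailing spaces.
padded_tolerant = TOLERANT.search(padded).group(1)
padded_simple = SIMPLE.search(padded).group(1)
print(padded_tolerant == padded_simple)

# Made-up input whose wanted line contains a space.
spaced = "stringparts.bst?\nfoo bar\nSPAM /198975/"

# The first pattern keeps the whole line; the simpler one stops at the space.
spaced_tolerant = TOLERANT.search(spaced).group(1)
spaced_simple = SIMPLE.search(spaced).group(1)
print(repr(spaced_tolerant), repr(spaced_simple))
```

So the simpler pattern is only a safe choice when the captured value is a single token without internal whitespace.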
