
Match string between special characters

I've messed around with regex a little, but I'm pretty unfamiliar with it for the most part. The string will be in the format:

\n\n*text here, can be any spaces, etc. etc.*

The string that I will get will have two line breaks, followed by an asterisk, followed by text, and then ending with another asterisk.

I want to exclude the leading \n\n from the returned text. This is the pattern that I've come up with so far, and it seems to work:

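import re

string = "\n\n*text here, can be any spaces, etc. etc.*"
pattern = r"(?<=\n\n)\*(.*)(\*)"  # raw string, so the backslashes reach the regex engine unchanged

match = re.search(pattern, string)
if match:
    text = match.group()  # the whole match, asterisks included
    print(text)
else:
    print("Nothing")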

I'm wondering if there is a better way to go about matching this pattern or if the way I'm handling it is okay.

Thanks.

You can avoid capturing groups and get the whole match as the result by using lookarounds:

pattern = r'(?<=\n\n\*)[^*]*(?=\*)'

Example:

import re
# findall returns a list of every match: ['text here, can be any spaces, etc. etc.']
print(re.findall(r'(?<=\n\n\*)[^*]*(?=\*)', '\n\n*text here, can be any spaces, etc. etc.*'))
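The switch from `.*` to `[^*]*` matters in two situations, which the following sketch tries to show. The sample inputs and the `first_match` helper are made up for illustration and are not from the question.

```python
import re

# Both patterns use the same lookarounds; only the middle part differs.
GREEDY = r'(?<=\n\n\*).*(?=\*)'
NEGATED = r'(?<=\n\n\*)[^*]*(?=\*)'

# Hypothetical inputs, not taken from the question.
two_segments = "\n\n*first* and *second*"
multi_line = "\n\n*line one\nline two*"

def first_match(pattern, text):
    """Return the first match as a string, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group() if m else None

# '.*' is greedy, so it runs on to the last asterisk on the line.
print(repr(first_match(GREEDY, two_segments)))   # 'first* and *second'
# '[^*]*' cannot cross an asterisk, so it stops at the first closing one.
print(repr(first_match(NEGATED, two_segments)))  # 'first'
# '.' does not match a newline by default, so the greedy pattern finds nothing.
print(repr(first_match(GREEDY, multi_line)))     # None
# '[^*]' does match a newline, so multi-line text is captured.
print(repr(first_match(NEGATED, multi_line)))    # 'line one\nline two'
```

If the text between the asterisks never contains another asterisk or a line break, the two patterns should behave the same.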

If you want to include the asterisks in the result, you can use this instead:

pattern = r'(?<=\n\n)\*[^*]*\*'
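As a rough sketch, the two patterns can be wrapped in one small helper that mirrors the if/else from the question. The name `extract` and the `keep_asterisks` flag are hypothetical, and returning `None` on a miss is just one possible convention.

```python
import re

# Precompiled versions of the two patterns from this answer.
INNER_ONLY = re.compile(r'(?<=\n\n\*)[^*]*(?=\*)')
WITH_ASTERISKS = re.compile(r'(?<=\n\n)\*[^*]*\*')

def extract(text, keep_asterisks=False):
    """Return the delimited text, or None when the pattern is absent."""
    pattern = WITH_ASTERISKS if keep_asterisks else INNER_ONLY
    m = pattern.search(text)
    return m.group() if m else None

s = "\n\n*text here, can be any spaces, etc. etc.*"
print(extract(s))                       # text here, can be any spaces, etc. etc.
print(extract(s, keep_asterisks=True))  # *text here, can be any spaces, etc. etc.*
print(extract("no delimiters"))         # None
```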

Regular expressions are overkill in a case like this, provided the delimiters are always fixed and sit at the head and tail of the string:

>>> s = "\n\n*text here, can be any spaces, etc. etc.*"
>>> def CheckString(s):
...     if s.startswith("\n\n*") and s.endswith("*"):
...         return s[3:-1]
...     else:
...         return "(nothing)"
...
>>> CheckString(s)
'text here, can be any spaces, etc. etc.'
>>> CheckString("no delimiters")
'(nothing)'

Adjust the slice indexes as needed: it wasn't clear to me whether you want to keep the leading and trailing '*' characters. If you want to keep them, change the return statement to:

return s[2:]
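Both slices can be folded into one function, sketched below under a few assumptions: the name `check_string`, the `keep_asterisks` flag, and the `None` return value are illustrative choices, not part of the original answer. The sketch also adds a length check, because the bare string "\n\n*" passes both `startswith` and `endswith` with a single asterisk.

```python
def check_string(s, keep_asterisks=False):
    """Return the delimited text, or None if the delimiters are missing."""
    prefix, suffix = "\n\n*", "*"
    # Without this check, the lone asterisk in "\n\n*" would count as
    # both the opening and the closing delimiter.
    if len(s) < len(prefix) + len(suffix):
        return None
    if not (s.startswith(prefix) and s.endswith(suffix)):
        return None
    # s[2:] drops only the two newlines; s[3:-1] also drops both asterisks.
    return s[2:] if keep_asterisks else s[3:-1]

s = "\n\n*text here, can be any spaces, etc. etc.*"
print(check_string(s))                       # text here, can be any spaces, etc. etc.
print(check_string(s, keep_asterisks=True))  # *text here, can be any spaces, etc. etc.*
print(check_string("\n\n*"))                 # None
```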
