
Regex to capture string if other string present within brackets

I am trying to create a Python regex to capture a file name, but only if the text "external=true" appears within the square brackets that follow the candidate file name.

I believe I am nearly there, but am missing a specific use-case. Essentially, I want to capture the text between qrcode: and the first [ , but only if the text external=true appears between the two square brackets.

I have created the regex qrcode:([^:].*?)\[.*?external=true.*?\] , which does not work for the second line below: it incorrectly returns vcard3.txt and does not return vcard4.txt.

qrcode:vcard1.txt[external=true] qrcode:vcard2.txt[xdim=2,ydim=2]
qrcode:vcard3.txt[xdim=2,ydim=2] qrcode:vcard4.txt[xdim=2,ydim=2,external=true]
qrcode:vcard5.txt[xdim=2,ydim=2,external=true,foreground=red,background=white]
qrcode:https://www.github.com[foreground=blue]

https://regex101.com/r/bh3IMb/3

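Use a positive look-behind (for qrcode:) and a positive look-ahead (for [ followed by external=true), with lazy matching to capture the shortest such span. Because [^\]]*? inside the look-ahead cannot cross a closing bracket, external=true has to sit in the brackets directly after the file name.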

Regex101 explanation: https://regex101.com/r/bOezIm/1

A complete Python example:

import re

pattern = r"(?<=qrcode:)[^:]*?(?=\[[^\]]*?external=true)"
string = """
qrcode:vcard1.txt[external=true] qrcode:vcard2.txt[xdim=2,ydim=2]
qrcode:vcard3.txt[xdim=2,ydim=2] qrcode:vcard4.txt[xdim=2,ydim=2,external=true]
qrcode:vcard5.txt[xdim=2,ydim=2,external=true,foreground=red,background=white]
qrcode:https://www.github.com[foreground=blue]
"""
print(re.findall(pattern, string))  # ['vcard1.txt', 'vcard4.txt', 'vcard5.txt']
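
To see why the pattern from the question misfires, it may help to run both patterns side by side on the first two sample lines. This is only a sketch; the names sample, original and fixed are chosen here for illustration. In the question's pattern, .*? after \[ is free to run past the closing ] into the next qrcode: token, whereas [^\]]*? is not.

```python
import re

sample = (
    "qrcode:vcard1.txt[external=true] qrcode:vcard2.txt[xdim=2,ydim=2]\n"
    "qrcode:vcard3.txt[xdim=2,ydim=2] qrcode:vcard4.txt[xdim=2,ydim=2,external=true]\n"
)

# Pattern from the question: ".*?" may run past "]" into the next token.
original = re.compile(r"qrcode:([^:].*?)\[.*?external=true.*?\]")
# Pattern from this answer: "[^\]]*?" must stay inside one pair of brackets.
fixed = re.compile(r"(?<=qrcode:)[^:]*?(?=\[[^\]]*?external=true)")

original_hits = original.findall(sample)
fixed_hits = fixed.findall(sample)

print(original_hits)  # ['vcard1.txt', 'vcard3.txt']
print(fixed_hits)     # ['vcard1.txt', 'vcard4.txt']
```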

As an alternative, you can use:

qrcode:([\w\.]+)(?=\[[\w\=,]*external=true[^\]]*)

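Here [\w\.]+ captures the file name (word characters and dots only, so the URL on the last sample line is never a candidate), and the look-ahead requires external=true inside the brackets that immediately follow, preceded only by word characters, = and commas.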

Python demo:

import re

regex = re.compile(r"qrcode:([\w\.]+)(?=\[[\w\=,]*external=true[^\]]*)")

sample = """
qrcode:vcard1.txt[external=true] qrcode:vcard2.txt[xdim=2,ydim=2]
qrcode:vcard3.txt[xdim=2,ydim=2] qrcode:vcard4.txt[xdim=2,ydim=2,external=true]
qrcode:vcard5.txt[xdim=2,ydim=2,external=true,foreground=red,background=white]
qrcode:https://www.github.com[foreground=blue]
"""

print(regex.findall(sample))

Output:

['vcard1.txt', 'vcard4.txt', 'vcard5.txt']
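
If look-arounds are not required, a plain capturing group can probably do the same job. The sketch below is one possible variant, not taken from either answer: it stays close to the question's original pattern but replaces each .*? with [^\]]* so the match cannot leave the brackets, and it adds \b around external=true on the assumption that keys such as myexternal=true should not count. The line with vcard6.txt is a hypothetical input added only to show that guard.

```python
import re

# Assumption: file names contain no whitespace, colons or square brackets.
regex = re.compile(r"qrcode:([^\s:\[\]]+)\[[^\]]*\bexternal=true\b[^\]]*\]")

sample = """
qrcode:vcard1.txt[external=true] qrcode:vcard2.txt[xdim=2,ydim=2]
qrcode:vcard3.txt[xdim=2,ydim=2] qrcode:vcard4.txt[xdim=2,ydim=2,external=true]
qrcode:vcard5.txt[xdim=2,ydim=2,external=true,foreground=red,background=white]
qrcode:https://www.github.com[foreground=blue]
qrcode:vcard6.txt[myexternal=true]
"""

matches = regex.findall(sample)
print(matches)  # ['vcard1.txt', 'vcard4.txt', 'vcard5.txt']
```

Unlike the look-around versions, this pattern consumes the whole bracketed section, which makes no difference to findall here because the qrcode: tokens do not overlap.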
