
Python Regex to replace whole word including some special characters

I am new to regex and was wondering how the following could be implemented. For example, I have a CSS file containing url('Inter.ttf') and my Python program should convert this to url('/user/Inter.ttf') .

However, I run into a problem when I try to avoid double replacement. How can I use regex to tell Python the difference between url('Inter.ttf') and url('/hello/Inter.ttf') when using re.sub to replace them?

I have tried re.sub(r"\boriginalurl.ttf\b", "/user/" + originalurl.ttf, file) , but this does not seem to work.

So how would I tell Python to replace the whole word 'Inter.ttf' with '/user/Inter.ttf' and '/hello/Inter.ttf' with '/user/hello/Inter.ttf' ?

You can use lookaround assertions to insert /user/ dynamically:

(?<=url\(')/?(?=(?:.*?Inter\.ttf)'\))

And then use re.sub to replace with /user/ :

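import re

strings = ["url('Inter.ttf')", "url('/hello/Inter.ttf')"]

# Match the position right after url(' plus an optional leading slash,
# but only when the quoted path ends with Inter.ttf
p = re.compile(r"(?<=url\(')/?(?=(?:.*?Inter\.ttf)'\))")

for s in strings:
    s = re.sub(p, "/user/", s)
    print(s)

Output:

url('/user/Inter.ttf')
url('/user/hello/Inter.ttf')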

Pattern Explanation

(?<=url\(') : Positive lookbehind; asserts that the current position is immediately preceded by url(' without consuming it.

/? : Matches zero or one forward slash. This matters for paths like /hello/Inter.ttf that start with / : the leading slash is consumed by the match and replaced by /user/ , whose trailing slash takes its place. For a path with no leading slash, the match is empty and /user/ is simply inserted.

(?=(?:.*?Inter\.ttf)'\)) : Positive lookahead; asserts that the text that follows, up to a closing ') , ends with Inter.ttf .

I suggest playing around with it on https://regex101.com , selecting the Substitution function on the left-hand side.
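Since the question is about avoiding double replacement, note that running the pattern above a second time would prefix an already converted path again. A minimal sketch of one way to guard against that, assuming converted paths always start with user/ or /user/ : the extra negative lookahead (?!/?user/) and the helper name add_prefix are my own additions, not part of the original pattern.

```python
import re

# Hypothetical variant of the pattern above: (?!/?user/) refuses to match
# when the path already starts with user/ or /user/.
PATTERN = re.compile(r"(?<=url\(')(?!/?user/)/?(?=(?:.*?Inter\.ttf)'\))")


def add_prefix(css: str) -> str:
    """Insert /user/ in front of each Inter.ttf path that lacks it."""
    return PATTERN.sub("/user/", css)


once = add_prefix("url('Inter.ttf') url('/hello/Inter.ttf')")
twice = add_prefix(once)  # second pass should leave the text unchanged

print(once)
print(twice)
```

With this guard the substitution can be re-run on the same file without stacking prefixes.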

Edit

If you want to match multiple fonts, you can just remove the Inter.ttf part of the regex:

(?<=url\(')/?(?=(?:.*?)'\))
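One thing to be aware of: this generic pattern still only sees single-quoted URLs, because the lookbehind is fixed to url(' . Python's re module requires fixed-width lookbehinds, so an optional quote cannot go there. A rough sketch of an alternative that uses capture groups instead, assuming you also want to handle url("...") and unquoted url(...) ; the ANY_QUOTE pattern is my own illustration and would also prefix things like data: URIs or absolute URLs, so treat it as a starting point only.

```python
import re

css = """url('Inter.ttf') url("/a/Bold.woff2") url(Mono.ttf)"""

# Pattern from the answer: only matches after url('
single_quoted = re.compile(r"(?<=url\(')/?(?=(?:.*?)'\))")

# Hypothetical alternative: group 1 is the optional quote, group 2 the path
# without its leading slash; \1 requires the same closing quote.
ANY_QUOTE = re.compile(r"""url\((['"]?)/?([^'")]+)\1\)""")

only_single = single_quoted.sub("/user/", css)
all_styles = ANY_QUOTE.sub(r"url(\1/user/\2\1)", css)

print(only_single)
print(all_styles)
```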

Alternatively, if you want to prepend /user/ only to paths that have a file extension, you can replace Inter\.ttf with \.\w{3} , which matches a literal dot followed by exactly three word characters ( [a-zA-Z0-9_] for ASCII input):

(?<=url\(')/?(?=(?:.*?\.\w{3})'\))
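Because \w{3} requires exactly three characters, extensions such as .woff2 are skipped. A small comparison, assuming you would rather accept an extension of any length: the \w+ variant is my own tweak, not part of the original answer.

```python
import re

css = "url('Inter.ttf') url('/fonts/Inter.woff2') url('README')"

# Pattern from the answer: exactly three word characters after the dot.
three = re.compile(r"(?<=url\(')/?(?=(?:.*?\.\w{3})'\))")
# Assumed variant: one or more word characters after the dot.
any_ext = re.compile(r"(?<=url\(')/?(?=(?:.*?\.\w+)'\))")

with_three = three.sub("/user/", css)
with_any = any_ext.sub("/user/", css)

print(with_three)
print(with_any)
```

In both cases the extension-less README path is left alone; only the \w+ version picks up the .woff2 file.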

A simple way to do this without regex:

# The with block closes both files even if an error occurs
with open("input.css", "rt") as fin, open("out.css", "wt") as fout:
    for line in fin:
        # Including the quotes keeps 'Inter.ttf' from matching inside '/hello/Inter.ttf'
        if "'Inter.ttf'" in line:
            fout.write(line.replace("'Inter.ttf'", "'/user/Inter.ttf'"))
        elif "'/hello/Inter.ttf'" in line:
            fout.write(line.replace("'/hello/Inter.ttf'", "'/user/hello/Inter.ttf'"))
        else:
            fout.write(line)
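If the list of paths grows, the same read-and-write loop can be combined with the regex from the other answer so that each new path does not need its own branch. A minimal sketch, assuming the whole stylesheet fits in memory; the function name rewrite_css and the temporary directory used for the demo are placeholders of mine.

```python
import re
import tempfile
from pathlib import Path

# Pattern taken from the regex answer above.
PATTERN = re.compile(r"(?<=url\(')/?(?=(?:.*?Inter\.ttf)'\))")


def rewrite_css(src: Path, dst: Path) -> None:
    """Read src, prefix matching url() paths with /user/, and write dst."""
    text = src.read_text(encoding="utf-8")
    dst.write_text(PATTERN.sub("/user/", text), encoding="utf-8")


# Demo in a throwaway directory so no real file is touched.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "input.css"
    dst = Path(tmp) / "out.css"
    src.write_text(
        "a { src: url('Inter.ttf'); }\nb { src: url('/hello/Inter.ttf'); }\n",
        encoding="utf-8",
    )
    rewrite_css(src, dst)
    result = dst.read_text(encoding="utf-8")

print(result)
```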


 