Python Regex to replace whole word including some special characters

I am new to regex and was wondering how the following could be implemented. For example, I have a CSS file containing url('Inter.ttf') , and my Python program should convert this to url('user/Inter.ttf') .

However, I run into a problem when I try to avoid double replacement. How can I use a regex to tell Python the difference between url('Inter.ttf') and url('/hello/Inter.ttf') when replacing them with re.sub ?

I have tried re.sub(r"\boriginalurl.ttf\b", "/user/" + originalurl.ttf, file) , but this does not seem to work.

So how would I tell Python to replace the whole word 'Inter.ttf' with '/user/Inter.ttf' and '/hello/Inter.ttf' with '/user/hello/Inter.ttf' ?

You can use lookarounds to insert /user/ dynamically:

(?<=url\(')/?(?=(?:.*?Inter\.ttf)'\))

Then use re.sub to replace the match with /user/ :

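import re

strings = ["url('Inter.ttf')", "url('/hello/Inter.ttf')"]

# Match the position right after url(' (plus an optional leading slash),
# but only when the path ends with Inter.ttf')
p = re.compile(r"(?<=url\(')/?(?=(?:.*?Inter\.ttf)'\))")

for s in strings:
    s = p.sub("/user/", s)
    print(s)

Output:

url('/user/Inter.ttf')
url('/user/hello/Inter.ttf')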
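One caveat about double replacement: the pattern above is not idempotent, so running it again on text that was already converted adds the prefix a second time. The following is a minimal sketch of one possible guard, assuming the prefix is always the literal /user/ . The negative lookahead (?!/?user/) and the helper name add_user_prefix are illustrative additions, not part of the answer above.

```python
import re

# Same pattern as above, plus a negative lookahead (?!/?user/) that skips
# paths already starting with "user/" or "/user/". This guard is an
# assumption added for illustration.
PATTERN = re.compile(r"(?<=url\(')(?!/?user/)/?(?=.*?Inter\.ttf'\))")


def add_user_prefix(css: str) -> str:
    """Insert /user/ at the start of each matching url('...') path."""
    return PATTERN.sub("/user/", css)


once = add_user_prefix("url('/hello/Inter.ttf')")
twice = add_user_prefix(once)  # second pass should leave the text alone
print(once)
print(twice)
```

With this guard a second pass leaves the text unchanged, at the cost of also skipping any path that genuinely starts with a directory named user.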

Pattern Explanation

(?<=url\(') : Positive lookbehind; asserts that the match position is immediately preceded by url(' .

/? : Matches zero or one forward slash / . This matters for paths like /hello/Inter.ttf , which start with a / . That leading slash is consumed by the match and replaced with /user/ , so the result is /user/hello/Inter.ttf rather than /user//hello/Inter.ttf .

(?=(?:.*?Inter\.ttf)'\)) : Positive lookahead; asserts that the match position is followed by text ending in Inter.ttf') .
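To make the three pieces concrete, here is a small sketch that reports what the pattern actually consumes in each input. The helper matched_span and the src('Inter.ttf') input are hypothetical, added only for illustration.

```python
import re

pattern = re.compile(r"(?<=url\(')/?(?=(?:.*?Inter\.ttf)'\))")


def matched_span(text):
    """Return (matched_text, (start, end)) or None when nothing matches."""
    m = pattern.search(text)
    return None if m is None else (m.group(), m.span())


# No leading slash: the match is empty, a pure insertion point at index 5.
print(matched_span("url('Inter.ttf')"))
# Leading slash: the single "/" is consumed, so the replacement overwrites it.
print(matched_span("url('/hello/Inter.ttf')"))
# The lookbehind fails because the text before the path is not url('
print(matched_span("src('Inter.ttf')"))
```

Because the lookarounds are zero-width, only the optional slash is ever part of the match, which is why re.sub can insert the prefix without touching the rest of the path.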

I suggest playing around with it on https://regex101.com , selecting the Substitution function on the left-hand side.

Edit

If you want to match multiple fonts, you can just remove the Inter.ttf part of the regex:

(?<=url\(')/?(?=(?:.*?)'\))

Alternatively, if you want to prepend /user/ only to paths that have a file extension, you can replace Inter\.ttf with \.\w{3} , which matches a dot followed by exactly three word characters (for ASCII input, any of [a-zA-Z0-9_] ):

(?<=url\(')/?(?=(?:.*?\.\w{3})'\))
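As a rough sketch of how the extension-based variant behaves, the function below takes the extension pattern as a parameter. The name prefix_urls , its parameters and the sample CSS string are hypothetical. It also swaps .*? for [^']* so the lookahead cannot run past the closing quote into a later url() on the same line; that is a deliberate change from the patterns above, not something the answer states.

```python
import re


def prefix_urls(css, prefix="/user/", ext=r"\w{3}"):
    """Prepend `prefix` to url('...') paths whose extension matches `ext`.

    Hypothetical helper: [^']* replaces the answer's .*? so that the
    lookahead stays inside a single quoted path.
    """
    pattern = re.compile(r"(?<=url\(')/?(?=[^']*\." + ext + r"'\))")
    return pattern.sub(prefix, css)


css = "url('Inter.ttf') url('/fonts/Inter.woff2') url('noext')"

# \w{3} requires exactly three characters, so .woff2 is left untouched.
print(prefix_urls(css))
# \w+ accepts extensions of any length.
print(prefix_urls(css, ext=r"\w+"))
```

Whether to use \w{3} or \w+ depends on which font formats the stylesheet references; longer extensions such as .woff2 only match with the latter.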

A simple way to do this without regex:

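# Stream the stylesheet line by line; the with block closes both files.
with open("input.css", "r") as fin, open("out.css", "w") as fout:
    for line in fin:
        # The surrounding quotes keep 'Inter.ttf' from matching inside '/hello/Inter.ttf'.
        if "'Inter.ttf'" in line:
            fout.write(line.replace("'Inter.ttf'", "'/user/Inter.ttf'"))
        elif "'/hello/Inter.ttf'" in line:
            fout.write(line.replace("'/hello/Inter.ttf'", "'/user/hello/Inter.ttf'"))
        else:
            fout.write(line)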
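The same idea can be sketched as a small function driven by a lookup table, which is easier to test without touching files. The names REPLACEMENTS and rewrite_line are hypothetical; the two entries are the ones from the question.

```python
# Quoted search strings, as in the loop above: including the quotes means
# 'Inter.ttf' cannot match inside '/hello/Inter.ttf' or '/user/Inter.ttf'.
REPLACEMENTS = {
    "'Inter.ttf'": "'/user/Inter.ttf'",
    "'/hello/Inter.ttf'": "'/user/hello/Inter.ttf'",
}


def rewrite_line(line):
    """Apply every known replacement to a single line of CSS."""
    for old, new in REPLACEMENTS.items():
        line = line.replace(old, new)
    return line


print(rewrite_line("src: url('Inter.ttf');"))
print(rewrite_line("src: url('/hello/Inter.ttf');"))
# A line that was already converted passes through unchanged.
print(rewrite_line("src: url('/user/Inter.ttf');"))
```

Since each search string includes the opening quote, running the function twice does not add the prefix again, which addresses the double-replacement concern for a fixed, known set of paths.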
