
How to find and replace in a regex code

I am trying to find and replace part of a string using a regex.

<div class="gallery-image-container">
    <div jstcache="1116"
         class="gallery-image-high-res loaded"
         style="width: 396px;
                height: 264px;
                background-image: url(&quot;https://lh5.googleusercontent.com/p/AF1QipMcTfMPZj_d5iip9WKtN2SQB9Je5U4rRB0nT_t8=s396-k-no&quot;);
                background-size: 396px 264px;"
         jsan="7.gallery-image-high-res,7.loaded,5.width,5.height,5.background-image,5.background-size">
    </div>
</div>

In the code above I used this regex:

(https:\/\/[^&]*)

to extract this URL:

https://lh5.googleusercontent.com/p/AF1QipMcTfMPZj_d5iip9WKtN2SQB9Je5U4rRB0nT_t8=s396-k-no

I then used the regex s\\d{3} to get s396.

Now I want to replace s396 with s1000 in the URL.

I am stuck and don't know how to go about it.

Is there any way all of this can be done with a single regex instead of several?

I would suggest using an HTML parser, but I understand that sometimes that is not possible. Here is a small example in Python.

import re

data = '''
<div class="gallery-image-container">
    <div jstcache="1116"
         class="gallery-image-high-res loaded"
         style="width: 396px;
            height: 264px;
            background-image: url(&quot;https://lh5.googleusercontent.com/p/AF1QipMcTfMPZj_d5iip9WKtN2SQB9Je5U4rRB0nT_t8=s396-k-no&quot;);
            background-size: 396px 264px;"
         jsan="7.gallery-image-high-res,7.loaded,5.width,5.height,5.background-image,5.background-size">
    </div>
</div>
'''
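# Capture everything from http(s):// up to the next '&' (the start of &quot;)
match = re.search(r"(https?://[^&]+)", data)
url = match.group(1)
# Replace the size token (s followed by three digits) with s1000
url = re.sub(r"s\d{3}", "s1000", url)
print(url)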
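
As an aside, if an HTML parser is available, the extraction step could look roughly like the sketch below. It relies only on Python's standard-library html.parser. The class name StyleUrlCollector and the shortened html_snippet are made up for illustration, and the url(...) pattern assumes the URL itself contains no quote or closing parenthesis.

```python
import re
from html.parser import HTMLParser


class StyleUrlCollector(HTMLParser):
    """Hypothetical helper: collects URLs found in url(...) inside style attributes."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "style" and value:
                # The parser hands over attribute values with entities such as
                # &quot; already decoded, so the URL is wrapped in plain quotes.
                self.urls.extend(re.findall(r'url\("?([^")]+)"?\)', value))


# Shortened version of the markup from the question (assumption: one element is enough).
html_snippet = (
    '<div class="gallery-image-high-res loaded" '
    'style="width: 396px; background-image: '
    'url(&quot;https://lh5.googleusercontent.com/p/'
    'AF1QipMcTfMPZj_d5iip9WKtN2SQB9Je5U4rRB0nT_t8=s396-k-no&quot;);"></div>'
)

collector = StyleUrlCollector()
collector.feed(html_snippet)
print(collector.urls)
```

Because the parser decodes &quot; itself, the URL pattern no longer has to stop at an ampersand.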

The key part of the re.search call is this pattern:

(https?://[^&]+)

It uses a negated character class. It says: look for http with an optional s, followed by ://, and then every character that is not an &. You can use this site to play around with regexes:

https://regex101.com/r/b0APFA/1
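
To see where the negated class stops, here is a minimal sketch. The sample strings and the example.com URLs are invented for the demonstration.

```python
import re

pattern = re.compile(r"https?://[^&]+")

# Hypothetical inputs: one HTML-escaped style value, one plain-text URL.
samples = [
    "url(&quot;https://example.com/a=s396-k-no&quot;)",
    "see http://example.com/page&more",
]

# [^&]+ consumes characters until it reaches the first '&', then stops.
matches = [pattern.search(text).group(0) for text in samples]
for found in matches:
    print(found)
# https://example.com/a=s396-k-no
# http://example.com/page
```

One consequence worth keeping in mind: a URL whose query string contains & (written as &amp; in HTML) would be cut off at that point.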

I'm sure you could write a clever one-line regex that finds and replaces everything at once, but it is going to be easier to troubleshoot if you keep it to a few separate lines.
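
For reference, one way such a single-pattern version might look is sketched below. It assumes the size token always appears as =s followed by digits. The shortened data string and the variable names are illustrative only.

```python
import re

# Shortened style attribute from the question (assumption: rest of the markup omitted).
data = (
    'style="background-image: url(&quot;https://lh5.googleusercontent.com/p/'
    'AF1QipMcTfMPZj_d5iip9WKtN2SQB9Je5U4rRB0nT_t8=s396-k-no&quot;);"'
)

# Group 1: the URL up to and including "=s"; \d+: the old size; group 2: the rest of the URL.
pattern = re.compile(r"(https?://[^&]+?=s)\d+([^&]*)")

match = pattern.search(data)
# \g<1> is used instead of \1 so the following digits are not read as part of the group number.
url = match.expand(r"\g<1>1000\2")
print(url)
# https://lh5.googleusercontent.com/p/AF1QipMcTfMPZj_d5iip9WKtN2SQB9Je5U4rRB0nT_t8=s1000-k-no

# The same pattern can also rewrite the URL in place inside the markup.
updated_markup = pattern.sub(r"\g<1>1000\2", data)
```

The pattern is harder to read than the two-step version, which is the trade-off mentioned above.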
