How to modify a file to replace strings matching this pattern

Question

I have a JSON file like this:

{
    "title": "Pilot",
    "image": [
        {
            "resource": "http://images2.nokk.nocookie.net/__cb20110227141960/notr/images/8/8b/pilot.jpg",
            "description": "not yet implemented"
        }
    ],
    "content": "<p>The pilot ...</p>"
},
{
    "title": "Special Christmas (Part 1)",
    "image": [
        {
            "resource": "http://images1.nat.nocookie.net/__cb20090519172121/obli/images/e/ed/SpecialChristmas.jpg",
            "description": "not yet implemented"
        }
    ],
    "content": "<p>Last comment...</p>"
}

I need to replace the content of every resource value in the file, so if a string has the following format:

"http://images1.nat.nocookie.net/__cb20090519172121/obli/images/e/ed/SpecialChristmas.jpg"

the result should be:

"../img/SpecialChristmas.jpg"

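Can someone tell me how to match this pattern so that I can modify the file?

I tried a suggestion like this one:

https://stackoverflow.com/a/4128192/521728

but I don't know how to adapt it to my case.

Thanks in advance!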

Answer 1

If they should all become "../img" images, I believe you can do this:

resourceVal = "http://images1.nat.nocookie.net/__cb20090519172121/obli/images/e/ed/SpecialChristmas.jpg"
lastSlash = resourceVal.rfind('/')
result = "../img" + resourceVal[lastSlash:]

If there are other kinds of resources, this could get a bit more complicated. Let me know and I'll try to edit this answer to help.
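To apply the same "keep only what follows the last slash" idea to every resource value at once, one option is to parse the file with the json module instead of working on raw text. The following is only a sketch: it assumes the objects from the question are wrapped in [ ... ] so that the file is valid JSON, and the helper names to_local_path and rewrite_resources are made up for illustration.

```python
import json
import posixpath
from urllib.parse import urlparse


def to_local_path(url, prefix="../img"):
    """Return prefix + "/" + the file name at the end of the URL path."""
    filename = posixpath.basename(urlparse(url).path)
    return prefix + "/" + filename


def rewrite_resources(node):
    """Recursively replace every "resource" string value, in place."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "resource" and isinstance(value, str):
                node[key] = to_local_path(value)
            else:
                rewrite_resources(value)
    elif isinstance(node, list):
        for item in node:
            rewrite_resources(item)


# Assumption: the objects are wrapped in [ ... ] so the text is valid JSON.
sample = """[
  {"title": "Pilot",
   "image": [{"resource": "http://images2.nokk.nocookie.net/__cb20110227141960/notr/images/8/8b/pilot.jpg",
              "description": "not yet implemented"}],
   "content": "<p>The pilot ...</p>"}
]"""

episodes = json.loads(sample)
rewrite_resources(episodes)
print(json.dumps(episodes, indent=4))
```

For a real file, json.load on the opened input and json.dump on the output would take the place of the inline sample string. Unlike a text-based replacement, this approach only touches values stored under the "resource" key.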

Answer 2

Here is my answer. It's not very concise, but you can adjust the regular expression passed to re.search to whatever pattern you need.

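import re

# Copy test.json to new.json, rewriting every line that contains ".jpg".
with open("test.json") as src, open("new.json", "wt") as out:
    for line in src:
        # Escape the dot so it matches a literal "." rather than any character.
        match = re.search(r"\.jpg", line)
        if match:
            # The last "/"-separated piece is the file name plus the rest of the line.
            sp_str = line.split("/")
            new_line = '\t"resource": "../img/' + sp_str[-1]
            out.write(new_line)
        else:
            out.write(line)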

Answer 3

I would use a regular expression with a group:

from io import StringIO
import re

reader = StringIO("""{
    "title": "Pilot",
    "image": [
        {
            "resource": "http://images2.nokk.nocookie.net/__cb20110227141960/notr/images/8/8b/pilot.jpg",
            "description": "not yet implemented"
        }
    ],
    "content": "<p>The pilot ...</p>"
},
{
    "title": "Special Christmas (Part 1)",
    "image": [
        {
            "resource": "http://images1.nat.nocookie.net/__cb20090519172121/obli/images/e/ed/SpecialChristmas.jpg",
            "description": "not yet implemented"
        }
    ],
    "content": "<p>Last comment...</p>"
}""")

# to open a file just use reader = open(filename)

text = reader.read()
pattern = r'"resource": ".+/(.+)\.jpg"'  # \. matches a literal dot
replacement = r'"resource": "../img/\g<1>.jpg"'  # raw string keeps \g<1> intact
text = re.sub(pattern, replacement, text)

print(text)

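Explanation of the pattern: it looks for any text that starts with "resource": ", followed by one or more characters up to a forward slash, followed by one or more characters before .jpg". Because .+ is greedy, the slash it stops at is the last one on the line, so only the file name is left for the part after it. The parentheses () mean that I want whatever is matched inside them captured as a group. Since there is only one set of parentheses, I can refer to it in the replacement as '\g<1>'. (Note that '\g<0>' refers to the entire match, that is, the whole "resource": "..." string.)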
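The difference between group 0 and group 1 is easiest to see on a single line. This is a small sketch using one of the URLs from the question; the variable names are chosen only for illustration, and the dot before jpg is escaped so that it matches a literal ".".

```python
import re

line = '"resource": "http://images2.nokk.nocookie.net/__cb20110227141960/notr/images/8/8b/pilot.jpg",'

# The greedy ".+/" runs up to the last slash; "(.+)" captures the file name stem.
pattern = r'"resource": ".+/(.+)\.jpg"'

match = re.search(pattern, line)
whole_match = match.group(0)   # the entire "resource": "..." text
file_stem = match.group(1)     # only what the parentheses captured

# In the replacement, \g<1> inserts the captured stem.
rewritten = re.sub(pattern, r'"resource": "../img/\g<1>.jpg"', line)

print(file_stem)
print(rewritten)
```

The trailing comma on the line is not part of the match, so re.sub leaves it where it was.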

Solution 1
1 2013-10-11 00:08:01

Solution 2
1 2013-10-11 00:15:28

Solution 3
1 Accepted 2013-10-11 02:09:25
