Removing a middle substring from a string with a regular expression in Python

Question

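I am defining a function in Python 3 that manipulates strings with regular expressions.

I am having trouble finding a regex that extracts part of a string. Consider the following input strings: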

str1 = "http://99.199.9.90:22/some/path/here/id_type_51549851/read"
str2 = "http://99.199.9.90:22/some/path/here/myid_31654/read"

For the strings above, I want to get the following strings as output:

output_str1: "http://99.199.9.90:22/some/path/here/id_type_/read"
output_str2: "http://99.199.9.90:22/some/path/here/myid_/read"

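The last underscore in the output strings is not required.

More generally, it would be nice if it also worked with a string like the following (if possible):

str3 = "http://99.199.9.90:22/some/path/here/myid_alphaBeta/read"

Output: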

"http://99.199.9.90:22/some/path/here/myid_/read"

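Note that the IP, port, and path are made up, but the structure is the same.

I want to remove the part of the string that comes after the last underscore and before /read, bearing in mind that there may be other underscores earlier in the string.

So basically my output should contain the first part and the last part of the original string, and the regex should match the middle part that is left out of the output. In other words, it should cut out the matched middle part of the string.

I started with a regex that matches the whole string: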

"(.+?)/some/path/here/(.+?)/read"

I tried (.+?)/some/path/here/(.+?)_[.+?]/read

but it did not work.

The current function is this (in part):

import re

def cutURL(url):
    # Default: return the input unchanged when no pattern matches.
    res = url
    match = re.search(r"(.+?)&someMatch=[0-9]+", url)
    if match:
        res = match.group()
    else:
        match = re.search(r"(.+?)/some/path/here/(.+?)/read", url)
        if match:
            # Still returns the whole match; the id part is not removed yet.
            res = match.group()
    return res

Answer 1

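Based on the examples above, you can replace

_\w+/read$

with

_/read

Note that \w also matches an underscore, so for id_type_51549851 this pattern matches from the first underscore and yields id_/read. To cut only after the last underscore, use _[^_/]+/read$ instead.

See the demo on regex101.com.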
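
A minimal sketch of this substitution with re.sub, run against the three sample URLs from the question. The helper name strip_last_id is hypothetical, and the character class [^_/] is an assumption swapped in for \w so that only the text after the last underscore is removed.

```python
import re

# Hypothetical helper; the name is not from the original answer.
# [^_/]+ (an assumption replacing \w+) cannot cross an underscore or a slash,
# so the match can only start at the last underscore of the segment before /read.
def strip_last_id(url):
    return re.sub(r"_[^_/]+/read$", "_/read", url)

str1 = "http://99.199.9.90:22/some/path/here/id_type_51549851/read"
str2 = "http://99.199.9.90:22/some/path/here/myid_31654/read"
str3 = "http://99.199.9.90:22/some/path/here/myid_alphaBeta/read"

for s in (str1, str2, str3):
    print(strip_last_id(s))
```

A URL that does not end in an underscore-separated id followed by /read is returned unchanged, because re.sub leaves the string alone when nothing matches.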

Answer 2

Use this:

import re

str2 = "http://99.199.9.90:22/some/path/here/myid_31654/read"
str2 = re.sub(r"myid_[0-9]+", "myid_", str2)

See the documentation for re.sub for details and more uses of the sub method.
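
The substitution above is tied to the myid_ prefix and to numeric ids. As a rough sketch of a more general variant, capture groups can keep the first and last parts of the URL and drop the middle, which is what the question describes. The name cut_url and the compiled pattern are hypothetical, the path /some/path/here/ is taken from the question, and only the /read branch of the original function is covered here.

```python
import re

# Group 1 keeps everything up to and including the last underscore of the
# path segment before /read; group 2 keeps the trailing /read.
# The middle part, [^_/]+, is matched but not captured, so it is dropped.
ID_PATTERN = re.compile(r"^(.+?/some/path/here/[^/]*_)[^_/]+(/read)$")

def cut_url(url):
    """Return url without the text after the last underscore.

    Assumed behavior: a url that does not match is returned unchanged.
    """
    return ID_PATTERN.sub(r"\1\2", url)

print(cut_url("http://99.199.9.90:22/some/path/here/id_type_51549851/read"))
print(cut_url("http://99.199.9.90:22/some/path/here/myid_alphaBeta/read"))
```

Using backreferences in the replacement avoids hard-coding the prefix, so id_type_ and myid_ are handled by the same pattern.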
