Most Pythonic way to parse specific substrings out of a string?

Question

I have the following log and want to extract the second "DDD-xxxxx" ID from each entry (if a second DDD ID exists):

cs:444 - br:/main/j_DDD-50535/DDD-68009
cs:445 - br:/main/j_DDD-50535/j_DDD-70220
cs:446 - br:/main/j_DDD-50535/j_DDD-70117
cs:447-Merge from branch: /main/j_DDD-50544/j_DDD-61183
Requested by: Smith, John (UserID1)
cs:448-Merge from branch: /main/j_DDD-4822
Requested by: Grant, Huge (userID2)
cs:449-Daily automated release of 3.5.5.4

Using regex I found a workaround that gets them, but I think there should be a much easier way:

import re

def read_log():
    log_file_name = "log"
    with open(log_file_name, "r") as file:
        log_file = file.read().split("cs:")
    return log_file

def key_creator():
    log_data = read_log()

    keys = []
    for line in log_data:
        # print(line)
        if line[:5].isdigit():
            search = re.search('/j_(.*)\n', line)
            if hasattr(search, "group"):
                search = search.group(1).split('/j_')

                if 1 < len(search) and search[1][:3] == "DDD":
                    keys.append(search[1])
                    print(line)
    return keys

key_creator()

Edit: Just to clarify: the string DDD can be followed by any number of digits (DDD-23, DDD-342, DDD-4842, DDD-44332... would all be possible entries as well).

Answer 1

def key_creator():
    log_data = read_log()
    keys = []
    for line in log_data:
        # findall returns every DDD id in the entry, in order of appearance
        ids = re.findall(r'DDD-\d+', line)
        if len(ids) > 1:
            keys.append(ids[1])
    return keys
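
As a rough, self-contained sketch of the same findall idea, the snippet below runs it against the sample entries from the question. The names SAMPLE_LOG, ID_PATTERN and second_ids are made up for illustration, and the log is held in a string here instead of being read from the "log" file.

```python
import re

# Sample entries copied from the question; a real run would read them from the log file.
SAMPLE_LOG = """\
cs:444 - br:/main/j_DDD-50535/DDD-68009
cs:445 - br:/main/j_DDD-50535/j_DDD-70220
cs:446 - br:/main/j_DDD-50535/j_DDD-70117
cs:447-Merge from branch: /main/j_DDD-50544/j_DDD-61183
Requested by: Smith, John (UserID1)
cs:448-Merge from branch: /main/j_DDD-4822
Requested by: Grant, Huge (userID2)
cs:449-Daily automated release of 3.5.5.4
"""

# "DDD-" followed by one or more digits (hypothetical helper name).
ID_PATTERN = re.compile(r"DDD-\d+")


def second_ids(log_text):
    """Return the second DDD id of every entry that has at least two."""
    keys = []
    # Entries are delimited by "cs:", as in read_log() from the question.
    for entry in log_text.split("cs:"):
        ids = ID_PATTERN.findall(entry)
        if len(ids) > 1:
            keys.append(ids[1])
    return keys


print(second_ids(SAMPLE_LOG))
# ['DDD-68009', 'DDD-70220', 'DDD-70117', 'DDD-61183']
```

Entries 448 and 449 are skipped because they contain fewer than two IDs.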

Answer 2

You can use a proper regex pattern to match what you need:

def key_creator():
    log_data = read_log()

    keys = []
    for line in log_data:
        # print(line)
        search = re.search(r'/j_(DDD-\d+)\n', line)
        if search is not None:
            keys.append(search.group(1))
            print(line)
    return keys

The pattern requires the string DDD followed by a hyphen and one or more digits, as in the sample log. re.search returns None if the pattern is not found; otherwise it returns a match object with two groups: the whole match (group(0)) and the content of the parentheses only (group(1)), which is already what you are looking for. Note that the pattern only matches an ID prefixed with /j_ at the end of a line and does not check that another ID precedes it.
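
To make the difference between None, group(0) and group(1) concrete, here is a small sketch. The pattern is an assumption chosen to fit the sample log (a hyphen and one or more digits after DDD), and the names pattern, hit and miss are only for illustration.

```python
import re

# Assumed pattern: "/j_" followed by a captured "DDD-<digits>" id and a newline.
pattern = re.compile(r"/j_(DDD-\d+)\n")

hit = pattern.search("447-Merge from branch: /main/j_DDD-50544/j_DDD-61183\n")
miss = pattern.search("449-Daily automated release of 3.5.5.4\n")

print(repr(hit.group(0)))  # whole match, including "/j_" and the newline
print(hit.group(1))        # only the parenthesised part: DDD-61183
print(miss)                # None, because the pattern does not occur
```

Calling group() on the result without checking for None first would raise AttributeError for the last entry, which is why the answer tests the result before using it.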

Answer 1
1 2020-02-20 14:53:39

Answer 2
1 2020-02-20 14:54:43
