How do I capture the following group with a regular expression?

Question

Hello, I have the following two strings:

txt = '/path/to/photo/file.jpg'
txt = '/path/to/photo/file_crXXX.jpg'

In the second string, XXX is a long, variable part of the name that carries information about how the file was processed.

I want to extract the name "file" from both paths.

To do that, I tried the following code, one expression per string:

re.search(r".*/(.*)\.jpg", txt).group(1)
re.search(r".*/(.*)_cr.*", txt).group(1)

But when I try to combine them into a single expression with the following code

re.search(r".*/(.*)(_cr.*)|(\.jpg)*", txt).group(1)
re.search(r".*/(.*)(\.jpg)|(_cr.*)", txt).group(1)

it does not work correctly. What should I do instead?

Thanks

Answer 1

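The problem is that you are capturing groups you do not need, and the | alternation is not grouped, so it splits the entire pattern into two alternatives. That said, .*/(.*)(\.jpg)|(_cr.*) is the closer of the two attempts. Use this regular expression instead to capture only the file name or its prefix: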

([^/]*?)(?:\.jpg|_cr.*)$

See also the regex demo.

import re

paths = ["/path/to/photo/file.jpg", "/path/to/photo/file_crXXX.jpg"]
for path in paths:
    print(re.search(r"([^/]*?)(?:\.jpg|_cr.*)$", path).group(1))
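To make the difference concrete, here is a small sketch comparing the second combined attempt from the question with the grouped pattern above. The helper name first_group and the variable names are illustrative only, not part of the original answer.

```python
import re

paths = ["/path/to/photo/file.jpg", "/path/to/photo/file_crXXX.jpg"]

# Attempt from the question: "|" splits the WHOLE pattern into two alternatives.
ungrouped = r".*/(.*)(\.jpg)|(_cr.*)"
# Pattern from this answer: the alternatives sit inside a non-capturing group.
grouped = r"([^/]*?)(?:\.jpg|_cr.*)$"


def first_group(pattern, text):
    """Return group 1 of the first match, or None if nothing matches (hypothetical helper)."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


ungrouped_results = [first_group(ungrouped, p) for p in paths]
grouped_results = [first_group(grouped, p) for p in paths]

print(ungrouped_results)  # ['file', 'file_crXXX']
print(grouped_results)    # ['file', 'file']
```

With the ungrouped pattern, the greedy (.*) swallows the _crXXX suffix because the first alternative only requires the name to end in .jpg. The lazy [^/]*? in the grouped pattern stops at the first point where either .jpg or _cr can follow.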

Answer 2

Since you are dealing with paths, why not use pathlib?

For example:

import pathlib

files = [
    "/path/to/photo/abc1.jpg",
    "/path/to/photo/def2.jpg",
    "/path/to/photo/ghi3.jpg",
    "/path/to/photo/file1_cr.jpg",
    "/path/to/photo/file2_cr2.jpg",
    "/path/to/photo/file3_crY.jpg",
    ]

stubs = []

for f in files:
    stem = pathlib.Path(f).stem
    try:
        stub, _ = stem.split("_", maxsplit=1)
    except ValueError:
        stub = stem
    stubs.append(stub)

print(stubs)  # ['abc1', 'def2', 'ghi3', 'file1', 'file2', 'file3']
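Note that the loop above splits on the first underscore, so a name such as my_file_cr2.jpg would be cut down to "my". A possible variant, assuming the processed suffix always starts with the literal text _cr, is to split on that marker instead. The function name base_name is made up for this sketch.

```python
from pathlib import PurePosixPath


def base_name(path: str) -> str:
    """Return the file stem with any '_cr...' suffix removed (illustrative helper)."""
    stem = PurePosixPath(path).stem  # drops the directory part and the final extension
    # str.partition returns (stem, "", "") when "_cr" is absent,
    # so unprocessed names pass through unchanged.
    head, _, _ = stem.partition("_cr")
    return head


print(base_name("/path/to/photo/file.jpg"))         # file
print(base_name("/path/to/photo/file_crXXX.jpg"))   # file
print(base_name("/path/to/photo/my_file_cr2.jpg"))  # my_file
```

This still cuts at the first occurrence of _cr, so a name that contains _cr earlier (for example my_crop_cr2.jpg) would need a stricter rule, such as the anchored regular expression from Answer 1.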

2 answers

Answer 1: 2 votes, accepted, 2022-08-15 15:07:05

Answer 2: 1 vote, 2022-08-15 15:11:23
