Extracting a substring from a Python string

Question

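I want to extract the part of the string that comes before the 9-digit number below:

tmp = 'place1_128017000_gw_cl_mask.tif'

The output should be place1.

I can do this with tmp.split('_')[0], but I also want the solution to work for:

tmp = 'place1_place2_128017000_gw_cl_mask.tif'

where the result should be place1_place2.

You can assume the number will always be 9 digits long.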

Answer 1

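Assuming we can phrase your problem as wanting the substring up to, but not including, the underscore that is followed by an all-digit segment, we can try: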

import re

tmp = "place1_place2_128017000_gw_cl_mask.tif"
# Capture one or more underscore-separated parts, then require "_<digits>_" after them.
m = re.search(r'^([^_]+(?:_[^_]+)*)_\d+_', tmp)
print(m.group(1))  # place1_place2
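A small variation on the pattern above, offered as a sketch rather than as part of the original answer: since the question guarantees a 9-digit number, \d+ can be narrowed to \d{9}, and wrapping the search in a helper avoids an AttributeError when a filename has no such number. The helper name prefix_before_number is hypothetical.

```python
import re

# Same idea as the pattern above, with \d+ narrowed to \d{9} (assumption:
# the number is always exactly 9 digits, as stated in the question).
_PREFIX_RE = re.compile(r'^([^_]+(?:_[^_]+)*)_\d{9}_')

def prefix_before_number(name):
    """Return the text before a '_<9 digits>_' segment, or None if absent.

    The capture group is greedy, so if several 9-digit segments exist,
    the text before the last one is returned.
    """
    m = _PREFIX_RE.search(name)
    return m.group(1) if m else None

print(prefix_before_number("place1_128017000_gw_cl_mask.tif"))
print(prefix_before_number("place1_place2_128017000_gw_cl_mask.tif"))
print(prefix_before_number("no_number_here.tif"))
```

Returning None instead of raising leaves it to the caller to decide how to treat filenames that do not follow the naming scheme.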

Answer 2

Using a regular expression with a lookahead, here is a simple solution:

import re

tmp = "place1_place2_128017000_gw_cl_mask.tif"
# Match everything that is immediately followed by "_<9 digits>_".
m = re.search(r'.+(?=_\d{9}_)', tmp)
print(m.group())

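Result:

place1_place2

Note that \d{9} matches exactly 9 digits. The (?= ... ) part of the regular expression is a lookahead: it is not part of the actual match, but the match only succeeds if the lookahead pattern follows it.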
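To make the lookahead behaviour concrete, here is a minimal sketch (the variable names prefix and rest are illustrative only). It shows that the lookahead consumes nothing: the match ends right before the underscore, so the remainder of the string still begins with the number.

```python
import re

tmp = "place1_place2_128017000_gw_cl_mask.tif"
m = re.search(r'.+(?=_\d{9}_)', tmp)

if m is not None:
    prefix = m.group()      # only the text matched by .+
    rest = tmp[m.end():]    # the lookahead text was not consumed
else:
    # No "_<9 digits>_" segment: nothing to extract.
    prefix, rest = None, tmp

print(prefix)  # place1_place2
print(rest)    # _128017000_gw_cl_mask.tif
```

Checking for None first matters because re.search returns None when the lookahead never succeeds.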

Answer 3

Using a regular expression:

import re

places = (
    "place1_128017000_gw_cl_mask.tif",
    "place1_place2_128017000_gw_cl_mask.tif",
)
# Raw string so that \d is not treated as an (invalid) string escape.
pattern = re.compile(r"(place\d+(?:_place\d+)*)_\d{9}")
for p in places:
    matched = pattern.match(p)
    if matched:
        print(matched.group(1))

Prints:

place1

place1_place2

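The regular expression works as follows (adjust as needed, e.g. for fewer than 9 digits or a variable number of digits):

( starts the capture
place\d+ matches "place" plus one or more digits
(?: starts a group but does not capture it (no need to capture)
_place\d+ matches further "places"
) closes the group
* means zero or more repetitions of the preceding group
) closes the capture
_\d{9} matches an underscore followed by 9 digits

The result is in the first (and only) capture group.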
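The breakdown above can also be kept next to the pattern itself. The following sketch uses the standard re.VERBOSE flag, which ignores whitespace and allows # comments inside the pattern. The pattern is otherwise the same, and the results list is only there for illustration.

```python
import re

# Same pattern as above, spread out with re.VERBOSE so each piece is commented.
pattern = re.compile(r"""
    (                   # start the capture
        place\d+        # "place" plus one or more digits
        (?:             # start a non-capturing group
            _place\d+   # further "place" segments
        )*              # zero or more of the preceding group
    )                   # close the capture
    _\d{9}              # an underscore followed by 9 digits
""", re.VERBOSE)

names = (
    "place1_128017000_gw_cl_mask.tif",
    "place1_place2_128017000_gw_cl_mask.tif",
)
results = [pattern.match(name).group(1) for name in names]
print(results)  # ['place1', 'place1_place2']
```

To allow a variable number of digits, as the answer suggests, \d{9} could be changed to \d+ in the last line of the pattern.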

Answer 4

Here is a possible solution without regular expressions (not optimized!):

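def extract(s):
    parts = s.split('_')
    for i, part in enumerate(parts):
        # The first segment made of exactly 9 digits marks the cut point.
        # Checking the string directly keeps leading zeros and avoids
        # mixing int and str values when a shorter number appears earlier.
        if len(part) == 9 and part.isdigit():
            return '_'.join(parts[:i])
    return None  # no 9-digit segment found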

tmp = 'place1_128017000_gw_cl_mask.tif'
tmp2 = 'place1_place2_128017000_gw_cl_mask.tif'

print(extract(tmp))   # place1
print(extract(tmp2))  # place1_place2
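Another regex-free way to express the same idea, as a sketch only: itertools.takewhile keeps the leading segments until it meets the first one that is exactly 9 digits. The names is_nine_digits and extract_takewhile are hypothetical. Note that this version returns the whole string unchanged when no 9-digit segment is present.

```python
from itertools import takewhile

def is_nine_digits(part):
    """True if the segment consists of exactly 9 digit characters."""
    return len(part) == 9 and part.isdigit()

def extract_takewhile(s):
    # Keep segments until the first 9-digit one, then join them back together.
    parts = takewhile(lambda p: not is_nine_digits(p), s.split('_'))
    return '_'.join(parts)

print(extract_takewhile('place1_128017000_gw_cl_mask.tif'))         # place1
print(extract_takewhile('place1_place2_128017000_gw_cl_mask.tif'))  # place1_place2
```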

4 answers

Answer 1
3 2022-07-13 03:47:00

Answer 2
3 Accepted 2022-07-13 03:51:58

Answer 3
1 2022-07-13 03:50:47

Answer 4
1 2022-07-13 04:45:24
