
Python grab substring between two specific characters

I have a folder containing hundreds of files named like this:

"2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"

Convention: year_month_ID_zone_date_0_L2A_B01.tif ("_0_L2A_B01.tif" and "zone" never change)

What I need is to iterate through every file and build a path from its name so that I can download it. For example:

name = "2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"
path = "2017/5/S2B_7VEG_20170528_0_L2A/B01.tif"

The path convention needs to be: path = year/month/year_month_ID_zone_date_0_L2A/B08.tif

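I thought of writing a loop that "cuts" my string into several parts every time it encounters a "_" character, then stitches the parts back together in the right order to create my path name. I tried this, but it didn't work: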

import re

filename = "2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"

try:
    found = re.search('_(.+?)_', filename).group(1)
except AttributeError:
    # _ not found in the original string
    found = '' # apply your error handling

How can I achieve this in Python?

Since you only have one separator, you may as well simply use Python's built-in split function:

import os

items = filename.split('_')
year, month = items[:2]
new_filename = '_'.join(items[2:])

path = os.path.join(year, month, new_filename)
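Note that the snippet above keeps the band file attached to the rest of the name: on a POSIX system it yields 2017/05/S2B_7VEG_20170528_0_L2A_B01.tif. A minimal sketch of one way to split off the last component as well, using a limited split from the left and rsplit from the right. The function name build_path is hypothetical, and posixpath.join is used on the assumption that forward slashes are wanted regardless of the operating system:

```python
import posixpath

def build_path(filename):
    # Split at the first two underscores only: year, month, and the remainder
    year, month, rest = filename.split("_", 2)
    # Peel the band file (e.g. "B01.tif") off the right-hand end
    folder, band = rest.rsplit("_", 1)
    # posixpath.join always uses "/" (assumption: the path is used for a download URL)
    return posixpath.join(year, month, folder, band)

print(build_path("2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"))
# prints 2017/05/S2B_7VEG_20170528_0_L2A/B01.tif
```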

Try the following code snippet:

import re

filename = "2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"
# Groups: year, month, everything up to the last "_", and the band name
found = re.sub(r'(\d+)_(\d+)_(.*)_(.*)\.tif', r'\1/\2/\3/\4.tif', filename)
print(found) # prints 2017/05/S2B_7VEG_20170528_0_L2A/B01.tif
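One caveat with re.sub is that a name that does not match the pattern is returned unchanged, so a stray file can pass through unnoticed. A rough sketch of a stricter variant, assuming a four-digit year and a two-digit month; the names PATTERN and path_from_name are made up for illustration:

```python
import re

# Assumed layout: 4-digit year, 2-digit month, folder part, band file without "_"
PATTERN = re.compile(
    r"(?P<year>\d{4})_(?P<month>\d{2})_(?P<folder>.+)_(?P<band>[^_]+\.tif)"
)

def path_from_name(filename):
    # fullmatch requires the whole name to fit the pattern
    match = PATTERN.fullmatch(filename)
    if match is None:
        return None  # the caller decides how to handle unexpected names
    return "{year}/{month}/{folder}/{band}".format(**match.groupdict())

print(path_from_name("2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"))
# prints 2017/05/S2B_7VEG_20170528_0_L2A/B01.tif
print(path_from_name("notes.txt"))
# prints None
```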

No need for a regex: you can just use split().

filename = "2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"
parts = filename.split("_")

year = parts[0]
month = parts[1]

filename = "2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"
temp = filename.split('_')
result = "/".join(temp)
print(result)

The result is 2017/05/S2B/7VEG/20170528/0/L2A/B01.tif
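Joining every piece with "/" also breaks the folder name apart, which is not what the path convention in the question asks for. A small sketch that regroups the pieces with list slices instead, and drops the leading zero from the month to match the example path in the question ("2017/5/..."). The function name build_path_unpadded is hypothetical, and whether the month should be unpadded is an assumption based on that one example:

```python
def build_path_unpadded(filename):
    parts = filename.split("_")
    year = parts[0]
    # int() drops the leading zero: "05" -> 5 -> "5"
    month = str(int(parts[1]))
    # Everything between the month and the band file stays joined by "_"
    folder = "_".join(parts[2:-1])
    band = parts[-1]
    return "/".join([year, month, folder, band])

print(build_path_unpadded("2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"))
# prints 2017/5/S2B_7VEG_20170528_0_L2A/B01.tif
```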

Maybe you can do it like this:

from os import listdir, makedirs
from os.path import isfile, join

my_path = 'your_source_dir'

# Names of the regular files directly inside my_path
files_name = [f for f in listdir(my_path) if isfile(join(my_path, f))]

def create_dir(files_name):
    for file in files_name:
        # The first two "_"-separated fields are the year and the month
        year, month = file.split('_', 2)[:2]
        # Create the year/month directory if it does not exist yet
        makedirs(join(year, month), exist_ok=True)
        ### your download code
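Putting the pieces together, here is a hedged sketch of the whole loop using pathlib. It assumes that only .tif files following the naming convention should be processed and that the directories are created under a separate destination root; prepare_targets, source_dir and dest_root are hypothetical names, and the download step itself is left as a placeholder:

```python
from pathlib import Path

def prepare_targets(source_dir, dest_root):
    """Create year/month/folder directories and return {file name: target path}."""
    targets = {}
    for entry in sorted(Path(source_dir).iterdir()):
        parts = entry.name.split("_")
        # Skip sub-directories and names that do not fit the convention
        if not entry.is_file() or entry.suffix != ".tif" or len(parts) < 4:
            continue
        year, month = parts[0], parts[1]
        folder = "_".join(parts[2:-1])
        target = Path(dest_root) / year / month / folder / parts[-1]
        # parents=True builds the whole chain; exist_ok=True tolerates reruns
        target.parent.mkdir(parents=True, exist_ok=True)
        targets[entry.name] = target
        # ... download or copy the file to `target` here ...
    return targets
```

Calling prepare_targets('your_source_dir', 'downloads') would then return one target path per matching file, with the directory tree already in place.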
