Extracting parts of a string based on a template in Python

Question

I want to use Python to read a list of directories and store the data in variables based on a template such as /home/user/Music/%artist%/[%year%] %album%.

An example would be:

artist, year, album = None, None, None

template = "/home/user/Music/%artist%/[%year%] %album%"
path = "/home/user/Music/3 Doors Down/[2002] Away From The Sun"

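# Pseudocode: "text" stands for a placeholder in the template and
# "key" for the matching part of the path; neither is defined here.
if text == "%artist%":
    artist = key

if text == "%year%":
    year = key

if text == "%album%":
    album = key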

print(artist)
# 3 Doors Down

print(year)
# 2002

print(album)
# Away From The Sun

I can easily do the reverse with str.replace("%artist%", artist), but how can I extract the data?

Answer 1

If your folder structure follows the template reliably, the following works without needing regular expressions.

path = "/home/user/Music/3 Doors Down/[2002] Away From The Sun"

path_parts = path.split("/")  # split the path into a list on slashes

print(path_parts)

artist = path_parts[4]  # element at index 4

year = path_parts[5][1:5]  # characters 1 through 4 of the element at index 5 (skips the "[")

album = path_parts[5][7:]  # everything from index 7 on (skips "[2002] ")

print(artist)
# 3 Doors Down

print(year)
# 2002

print(album)
# Away From The Sun

# Put the path back together with an f-string (no need for str.replace)
reconstructed_path = f"/home/user/Music/{artist}/[{year}] {album}"

print(reconstructed_path)

Output:

['', 'home', 'user', 'Music', '3 Doors Down', '[2002] Away From The Sun']
3 Doors Down
2002
Away From The Sun
/home/user/Music/3 Doors Down/[2002] Away From The Sun
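
If the directory depth or the width of the year can vary, the fixed indexes above stop working. As a minimal sketch of one alternative (not part of the original answer), the template itself can be turned into a regular expression with named groups. The helper names template_to_regex and extract_by_template are made up for this illustration, and the sketch assumes that placeholder names are unique valid Python identifiers wrapped in % and that no extracted value contains a slash.

```python
import re

def template_to_regex(template):
    # re.split with a capture group returns literal text and placeholder
    # names alternately: literal, name, literal, name, ..., literal
    parts = re.split(r"%(\w+)%", template)
    pieces = []
    for index, part in enumerate(parts):
        if index % 2 == 0:
            # Literal text: escape characters such as "[" and "]"
            pieces.append(re.escape(part))
        else:
            # Placeholder: named group matching anything except "/"
            # (assumption: values never contain a slash)
            pieces.append(f"(?P<{part}>[^/]+?)")
    return re.compile("".join(pieces))

def extract_by_template(template, path):
    # fullmatch forces the whole path to fit the template
    match = template_to_regex(template).fullmatch(path)
    return match.groupdict() if match else None

template = "/home/user/Music/%artist%/[%year%] %album%"
path = "/home/user/Music/3 Doors Down/[2002] Away From The Sun"

print(extract_by_template(template, path))
# {'artist': '3 Doors Down', 'year': '2002', 'album': 'Away From The Sun'}
```

A path that does not fit the template yields None instead of raising an IndexError, which may be easier to handle when scanning many directories.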

Answer 2

The following works for me:

from difflib import SequenceMatcher

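def extract(template, text):
    # Diff the template against the text; each "replace" opcode marks a
    # span of the text that stands in for a placeholder in the template.
    seq = SequenceMatcher(None, template, text, autojunk=True)
    return [text[c:d] for tag, a, b, c, d in seq.get_opcodes() if tag == 'replace']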

template = "home/user/Music/%/[%] %"
path = "home/user/Music/3 Doors Down/[2002] Away From The Sun"

artist, year, album = extract(template, path)

print(artist)
print(year)
print(album)

Output:

3 Doors Down
2002
Away From The Sun

Each template placeholder can be any single character, as long as that character does not appear in the values to be returned.
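
To recover the named placeholders from the question, one option is to collapse each %name% to a single % before diffing and then pair the names with the extracted values. This is a rough sketch rather than part of the original answer: extract_named is a hypothetical helper, and it assumes that % never appears in the values and that the diff lines up the literal parts of the template the way it does for the example path.

```python
import re
from difflib import SequenceMatcher

def extract_named(template, text):
    # Collect placeholder names in order, e.g. ['artist', 'year', 'album']
    names = re.findall(r"%(\w+)%", template)
    # Collapse every %name% to a single "%" so each placeholder is one character
    collapsed = re.sub(r"%\w+%", "%", template)
    seq = SequenceMatcher(None, collapsed, text)
    # Each "replace" opcode covers the text that took the place of one "%"
    values = [text[c:d] for tag, a, b, c, d in seq.get_opcodes() if tag == "replace"]
    return dict(zip(names, values))

template = "/home/user/Music/%artist%/[%year%] %album%"
path = "/home/user/Music/3 Doors Down/[2002] Away From The Sun"

info = extract_named(template, path)
print(info["artist"])
print(info["year"])
print(info["album"])
```

Because SequenceMatcher picks the longest common blocks heuristically, a value that happens to share a long run of characters with the template could shift the result, so it is worth checking that the number of values matches the number of names before trusting the dictionary.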
