
Extract part of a string based on a template in Python

I want to use Python to read a directory listing and store the data in variables based on a template such as /home/user/Music/%artist%/[%year%] %album%.

An example would be:

artist, year, album = None, None, None

template = "/home/user/Music/%artist%/[%year%] %album%"
path = "/home/user/Music/3 Doors Down/[2002] Away From The Sun"

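# Pseudocode: `text` stands for a placeholder in the template and
# `key` for the matching part of the path; neither is defined here.
if text == "%artist%":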
    artist = key

if text == "%year%":
    year = key

if text == "%album%":
    album = key

print(artist)
# 3 Doors Down

print(year)
# 2002

print(album)
# Away From The Sun

I can easily do the reverse with str.replace("%artist%", artist), but how do I extract the data?

If your folder structure reliably follows the template, you can use the following without needing regular expressions.

path = "/home/user/Music/3 Doors Down/[2002] Away From The Sun"

path_parts = path.split("/")  # split the path into a list on slashes

print(path_parts)

artist = path_parts[4]  # element at index 4 (index 0 is the empty string before the leading slash)

year = path_parts[5][1:5]  # characters at indices 1-4 of element 5 (the slice end is exclusive)

album = path_parts[5][7:]  # everything from index 7 onward, skipping "[2002] "

print(artist)
# 3 Doors Down

print(year)
# 2002
    
print(album)
# Away From The Sun
    
# to put the path back together again using an F-string (No need for str.replace)
reconstructed_path = f"/home/user/Music/{artist}/[{year}] {album}"
    
print(reconstructed_path)

Output:

['', 'home', 'user', 'Music', '3 Doors Down', '[2002] Away From The Sun']
3 Doors Down
2002
Away From The Sun
/home/user/Music/3 Doors Down/[2002] Away From The Sun
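
The indices above are tied to one specific layout. As a minimal sketch of a more general route, the template from the question can be turned into a regular expression with one named group per placeholder. Note that this swaps in regular expressions, which the answer above deliberately avoids. The helper names `template_to_regex` and `extract_fields` are hypothetical, and the sketch assumes that placeholder names are valid Python identifiers and that no extracted value contains a slash.

```python
import re

def template_to_regex(template):
    # re.split with a capture group keeps the placeholder names:
    # literal text lands at even indices, names at odd indices.
    parts = re.split(r"%(\w+)%", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2:
            # Assumption: a value never spans a directory separator.
            pattern += f"(?P<{part}>[^/]+?)"
        else:
            # Escape literal text so "[" and "]" match themselves.
            pattern += re.escape(part)
    return re.compile(pattern)

def extract_fields(template, path):
    # fullmatch requires the whole path to fit the template.
    match = template_to_regex(template).fullmatch(path)
    return match.groupdict() if match else None

template = "/home/user/Music/%artist%/[%year%] %album%"
path = "/home/user/Music/3 Doors Down/[2002] Away From The Sun"

fields = extract_fields(template, path)
print(fields)
```

A path that does not fit the template yields None rather than wrong values, which index-based slicing cannot detect.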

The following worked for me:

from difflib import SequenceMatcher

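def extract(template, text):
    # Align the template with the text; the positional arguments are isjunk, a, b, autojunk
    seq = SequenceMatcher(None, template, text, True)
    # Each 'replace' opcode marks a placeholder; text[c:d] is the value that took its place
    return [text[c:d] for tag, a, b, c, d in seq.get_opcodes() if tag == 'replace']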

template = "home/user/Music/%/[%] %"
path = "home/user/Music/3 Doors Down/[2002] Away From The Sun"

artist, year, album = extract(template, path)

print(artist)
print(year)
print(album)

Output:

3 Doors Down
2002
Away From The Sun

Each template placeholder can be any single character, as long as that character does not appear in the values to be returned.
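
To connect this back to the named placeholders in the question, here is a rough sketch that wraps the same SequenceMatcher idea. It collapses each %name% placeholder to a single character and pairs the extracted values with the names. The function name `extract_named` and the choice of a NUL character as the placeholder are assumptions made for illustration. The sketch relies on difflib's alignment in the same way as the code above, so it has the same restriction: the placeholder character must not appear in any value.

```python
import re
from difflib import SequenceMatcher

def extract_named(template, text, placeholder="\x00"):
    # Remember the placeholder names in order of appearance.
    names = re.findall(r"%(\w+)%", template)
    # Collapse every %name% to one character that should never occur in a path.
    collapsed = re.sub(r"%\w+%", placeholder, template)
    matcher = SequenceMatcher(None, collapsed, text, autojunk=False)
    # Each 'replace' opcode covers the text that stands in for a placeholder.
    values = [text[j1:j2]
              for tag, i1, i2, j1, j2 in matcher.get_opcodes()
              if tag == "replace"]
    if len(values) != len(names):
        # The alignment did not yield one value per placeholder.
        raise ValueError("text does not line up with the template")
    return dict(zip(names, values))

template = "/home/user/Music/%artist%/[%year%] %album%"
path = "/home/user/Music/3 Doors Down/[2002] Away From The Sun"

result = extract_named(template, path)
print(result["artist"])
print(result["year"])
print(result["album"])
```

Returning a dict keyed by placeholder name means the caller does not have to rely on the order of placeholders in the template.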


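Notice: the technical posts on this site follow the CC BY-SA 4.0 license. If you republish this content, please cite this site's URL or the original address. For any questions, contact yoyou2525@163.com.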

 