Creating a match between string and file name in lists

Let's say I have a list containing strings of the form "{N} word" (without quotation marks), where N is some integer and word is some string. In some folder D:\path\folder, I have plenty of files with names of the form "{N}name.filetype". Given the aforementioned list as input (elements being "{N} word"), how would I get an output list in which every element has the following form: "{N} word D:\path\folder\{N}name.filetype"?
For example:

InputList = [{75} Hello, {823} World, ...]  

OutputList = [{75} Hello D:\path\folder\{75}Stuff.docx, {823} World D:\path\folder\{823}Things.docx, ...]  

if the folder at D:\path\folder contains, among other files, {75}Stuff.docx and {823}Things.docx.

To generalize, my question is fundamentally:
How do I get Python to read a folder, take the absolute path of every file whose name contains some part of an element in the list (in this case, we look for {N} in the file names and disregard the word), and append that path to the corresponding element to build the output list?

I understand this is a somewhat long question that combines a couple of concepts, so I thank in advance anyone willing to help!

The important step is to convert your InputList to a dict of {number: word}, which makes it significantly easier to work with. After that, it's just a matter of looping through the files in the folder, extracting the number from each file name, and looking it up in the dict:

from pathlib import Path

InputList = ['{75} Hello', '{823} World']
folder_path = r'D:\path\folder'

# define a function to extract the number between curly braces
def extract_number(text):
    end = text.find('}')
    # return None if the text does not start with "{N}"
    if not text.startswith('{') or end == -1:
        return None
    return text[1:end]

# convert the InputList to a dict for easy and efficient lookup
words = {extract_number(name): name for name in InputList}

OutputList = []
# iterate through the folder to find matching files
for path in Path(folder_path).iterdir():
    # extract the file name from the path, e.g. "{75}Stuff.docx"
    name = path.name

    # extract the number from the file name; skip names without "{N}"
    number = extract_number(name)
    if number is None:
        continue

    # find the matching word
    try:
        word = words[number]
    except KeyError:  # if no matching word exists, skip this file
        continue

    # put the word and the path together and add them to the output list
    OutputList.append('{} {}'.format(word, path))

print(OutputList)
# output (order depends on the directory listing):
# ['{75} Hello D:\\path\\folder\\{75}Stuff.docx', '{823} World D:\\path\\folder\\{823}Things.docx']
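
If you need this in more than one place, the same idea can be wrapped in a function. The following is only a sketch under a few assumptions that are not in the question: N is taken to be a run of digits at the very start of the string, sub-folders are skipped, and the directory listing is sorted so the result order is reproducible. The names NUMBER_PATTERN and match_files are made up for illustration.

```python
import re
from pathlib import Path

# Assumed format: the string starts with "{N}", where N is one or more digits.
NUMBER_PATTERN = re.compile(r'^\{(\d+)\}')


def extract_number(text):
    """Return N from a leading "{N}", or None if the text does not start with one."""
    match = NUMBER_PATTERN.match(text)
    return match.group(1) if match else None


def match_files(input_list, folder):
    """Pair each "{N} word" entry with every file in folder whose name starts with "{N}"."""
    # Map number -> original entry; entries without a leading "{N}" are ignored.
    words = {}
    for entry in input_list:
        number = extract_number(entry)
        if number is not None:
            words[number] = entry

    output = []
    # Sort the listing so the output order does not depend on the file system.
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # skip sub-folders
        number = extract_number(path.name)
        if number in words:
            output.append('{} {}'.format(words[number], path))
    return output
```

Calling match_files(InputList, r'D:\path\folder') should give entries of the same form as the output shown above. If several files share the same {N}, each of them produces its own entry.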
