
Checking if filename prefixes match parent directory prefix recursively with pathlib

I've written a script that uses pathlib to compare a list of files provided by the user to what is actually in a target directory. It then returns lists of files that were expected but not found, and files that were found but were not expected. It works just fine.

My issue now is that I want to verify that filename prefixes match the prefix of their parent directory, and return an error when they don't. So a folder named abc2022_001 should contain files that start with abc2022_ and not abc2023_. This is what I have so far:

from pathlib import Path

fileList = open("fileList.txt", "r")
data = fileList.read()
fileList_reformatted = data.replace('\n', '').split(",")
print(fileList_reformatted)

p = Path('C:/Users/Common/Downloads/compare').rglob('*')
filePaths = [x for x in p if x.is_file()]
filePaths_string = [str(x) for x in filePaths]
print(filePaths_string)

differences1 = []
for element in fileList_reformatted:
    if element not in filePaths_string:
        differences1.append(element)

print("The following files from the provided list were not found:",differences1)

differences2 = []
for element in filePaths_string:
    if element not in fileList_reformatted:
        differences2.append(element)

print("The following unexpected files were found:",differences2)

wrong_location = []
for element in p:
    if element.Path.name.split("_")[0:1] != element.Path.parent.split("_")[0:1]:
        wrong_location.append(element)
    
print("Following files may be in the wrong location:",wrong_location)

The script runs, but returns no errors on a test directory. Where am I going wrong here? Thanks!

You could try just picking the first element from the splits in this line:

if element.Path.name.split("_")[0:1] != element.Path.parent.split("_")[0:1]:

like so:

if element.Path.name.split("_")[0] != element.Path.parent.split("_")[0]:

The first version compares two lists, ['abc22'] == ['abc23'], and not the actual values, 'abc22' == 'abc23'. That might be the cause.
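As a quick check of that idea, here is a minimal sketch that runs both forms on made-up names (the folder and file names below are hypothetical, chosen only for illustration). Because str.split always returns at least one element, the one-element slices compare equal exactly when their first elements do, so the two forms should flag the same files. Indexing with [0] is tidier, but it may not by itself explain the missing output.

```python
def prefix_slice(name):
    # Form used in the question: a one-element list such as ['abc2022']
    return name.split("_")[0:1]

def prefix_item(name):
    # Form suggested in the answer: the plain string 'abc2022'
    return name.split("_")[0]

# Hypothetical folder and file names, for illustration only
folder = "abc2022_001"
names = ["abc2022_a.txt", "abc2023_b.txt", "noprefix.txt"]

slice_mismatch = [n for n in names if prefix_slice(n) != prefix_slice(folder)]
item_mismatch = [n for n in names if prefix_item(n) != prefix_item(folder)]

print(slice_mismatch)
print(item_mismatch)
```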

The answer turned out to be:

for element in filePaths:
    if element.parts[-1].split("_")[0] != element.parent.parts[-1].split("_")[0]:
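One likely reason the original loop reported nothing: rglob returns a generator, and p had already been consumed when filePaths was built, so a later "for element in p" has nothing left to iterate. The sketch below illustrates this on a throwaway temporary directory. The helper name find_misplaced and the test file names are assumptions made up for the example; it uses path.name, which for these paths gives the same string as path.parts[-1].

```python
import tempfile
from pathlib import Path

def find_misplaced(root):
    """Return files whose prefix (text before the first "_") differs
    from the prefix of their immediate parent directory."""
    misplaced = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.name.split("_")[0] != path.parent.name.split("_")[0]:
            misplaced.append(path)
    return misplaced

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: one folder with one matching and one mismatched file
    folder = Path(tmp) / "abc2022_001"
    folder.mkdir()
    (folder / "abc2022_a.txt").touch()
    (folder / "abc2023_b.txt").touch()

    p = Path(tmp).rglob("*")
    file_paths = [x for x in p if x.is_file()]  # this consumes the generator
    leftover = list(p)                          # nothing remains in p

    misplaced_names = sorted(x.name for x in find_misplaced(tmp))
    print(leftover)
    print(misplaced_names)
```

Iterating the saved list (filePaths in the question) instead of p, as in the accepted fix, avoids the exhausted generator.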

Thanks for helping, folks.
