How to modify iteration list?
Consider the following scenario of traversing a directory structure:
"Build a complete directory tree with files, but if the files in a single directory have similar names, list them as a single entity."
Example tree (let's assume the entries are not sorted):
    - rootDir
      -dirA
        fileA_01
        fileA_03
        fileA_05
        fileA_06
        fileA_04
        fileA_02
        fileA_...
        fileAB
        fileAC
      -dirB
        fileBA
        fileBB
        fileBC
Expected output:
    - rootDir
      -dirA
        fileA_01 - fileA_06 ...
        fileAB
        fileAC
      -dirB
        fileBA
        fileBB
        fileBC
So I have already written a simple function, findSimilarNames, that for fileA_01 (or any fileA_ file) returns the list [ fileA_01 ... fileA_06 ].
Now I'm inside os.walk, looping over files, so every file gets checked against similar filenames. For example, for fileA_03 I get the rest of them [ fileA_01 - fileA_06 ], and now I want to modify the list I'm iterating over so that it skips the items returned by findSimilarNames, without needing another loop or ifs inside.
I searched here and people suggest not modifying the list being iterated over, but if I could modify it, I would avoid iterating over every file.
Pseudo code:
    for root,dirs,files in os.walk( path ):
        for file in files:
            similarList = findSimilarNames( file )
            #OVERWRITE ITERATION LIST SOMEHOW
            files = (set(files)-set(similarList))
            #DEAL WITH ELEMENT
What I'm trying to avoid is the code below, which checks each file even though it may already have been found by findSimilarNames.
    for root,dirs,files in os.walk( path ):
        filteredbysimilar = files[:]
        for file in files:
            similar = findSimilarNames( file )
            filteredbysimilar = list(set(filteredbysimilar)-set(similar))
        #--
        for filteredFile in filteredbysimilar:
            #DEAL WITH ELEMENT
            #OVERWRITE ITERATION LIST SOMEHOW
You can get this effect by using a while-loop style iteration. Since you want to do set subtraction to remove the similar groups anyway, the natural approach is to start with a set of all the filenames and repeatedly remove groups until nothing is left. Thus:
    unprocessed = set(files)
    while unprocessed:
        f = unprocessed.pop() # removes and returns an arbitrary element
        group = findSimilarNames(f)
        # findSimilarNames returns a list in the question, so convert it before subtracting.
        unprocessed -= set(group) # it is not an error that `f` has already been removed.
        doSomethingWith(group) # i.e., "DEAL WITH ELEMENT" :)
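To make the loop above concrete, here is a minimal self-contained sketch. It assumes a hypothetical find_similar_names helper that treats names sharing the prefix before a trailing _<digits> as similar; the question's real findSimilarNames is not shown and takes only the filename, so this stand-in (which also receives the directory listing) is purely illustrative, as is group_files.

```python
import re


def find_similar_names(name, names):
    """Hypothetical stand-in for findSimilarNames.

    Returns the set of entries in `names` that share the prefix of `name`
    up to a trailing "_<digits>" suffix; names without such a suffix are
    only similar to themselves.
    """
    match = re.fullmatch(r"(.+_)\d+", name)
    if match is None:
        return {name}
    pattern = re.escape(match.group(1)) + r"\d+"
    return {n for n in names if re.fullmatch(pattern, n)}


def group_files(files):
    """Split `files` into groups of similar names, one lookup per group."""
    unprocessed = set(files)
    groups = []
    while unprocessed:
        f = unprocessed.pop()  # arbitrary element, removed from the set
        group = find_similar_names(f, files)
        unprocessed.difference_update(group)  # drop the rest of the group
        groups.append(sorted(group))
    return sorted(groups)  # sort so the result does not depend on pop() order


files = ["fileA_01", "fileA_03", "fileA_05", "fileA_06",
         "fileA_04", "fileA_02", "fileAB", "fileAC"]
for group in group_files(files):
    # Print a single name, or "first - last" for a group of similar names.
    print(group[0] if len(group) == 1 else f"{group[0]} - {group[-1]}")
```

Inside os.walk, group_files(files) would be called once per directory. Because each iteration removes a whole group from the set, the similarity lookup runs once per group rather than once per file.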
How about building up a list of files that aren't similar?
    unsimilar = set()
    for f in files:
        # keep f only if no file similar to it has been kept already
        if findSimilarNames(f).isdisjoint(unsimilar):
            unsimilar.add(f)
This assumes findSimilarNames returns a set.
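As a rough runnable illustration of this approach, the sketch below uses a hypothetical similar_by_prefix function in place of findSimilarNames. Its rule (names sharing the text before the last underscore) and its extra names argument are assumptions made only for this example.

```python
def similar_by_prefix(name, names):
    """Hypothetical similarity rule, standing in for findSimilarNames.

    Returns the set of entries in `names` that share the text of `name`
    up to and including its last underscore.
    """
    if "_" not in name:
        return {name}
    prefix = name.rsplit("_", 1)[0] + "_"
    return {n for n in names if n.startswith(prefix)}


def unsimilar_files(files):
    """Keep one representative per group of similar names."""
    unsimilar = set()
    for f in files:
        # Add f only if nothing similar to it has been kept so far.
        if similar_by_prefix(f, files).isdisjoint(unsimilar):
            unsimilar.add(f)
    return unsimilar


files = ["fileA_03", "fileA_01", "fileA_02", "fileAB", "fileAC"]
print(sorted(unsimilar_files(files)))
```

Note that this version still runs the similarity lookup for every file; it only avoids keeping more than one file per group. The representative kept for a group is whichever member comes first in files.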