How do I modify the list I am iterating over?

Question

Consider the following scenario of traversing a directory structure.

"Build a complete directory tree with files, but if files in a single directory have similar names, list them as a single entity."

Example tree (let's assume the entries are not sorted):

    - rootDir
        -dirA
            fileA_01
            fileA_03
            fileA_05
            fileA_06
            fileA_04
            fileA_02
            fileA_...
            fileAB
            fileAC
        -dirB
            fileBA
            fileBB
            fileBC

Expected output:

    - rootDir
        -dirA
            fileA_01 - fileA_06 ...
            fileAB
            fileAC
        -dirB
            fileBA
            fileBB
            fileBC

So I have already written a simple function, findSimilarNames, that for fileA_01 (or any fileA_ file) returns the list [ fileA_01 ... fileA_06 ].

Now I'm inside os.walk, looping over files, so every file is checked for similar filenames. For example, for fileA_03 I get the rest of the group [ fileA_01 - fileA_06 ]. I now want to modify the list I'm iterating over so that it simply skips the items returned by findSimilarNames, without needing another loop or if statements inside.

I searched here and people suggest avoiding modification of the list being iterated over, but modifying it would let me avoid iterating over every file.

Pseudo code:

for root,dirs,files in os.walk( path ):
    for file in files:
        similarList = findSimilarNames( file )

        #OVERWRITE ITERATION LIST SOMEHOW
        files = (set(files)-set(similarList))

        #DEAL WITH ELEMENT

What I'm trying to avoid is the approach below: checking each file, even though it may already have been found by findSimilarNames.

for root,dirs,files in os.walk( path ):
    filteredbysimilar = files[:]
    for file in files:
        similar = findSimilarNames( file )
        filteredbysimilar = list(set(filteredbysimilar)-set(similar))
    #--
    for filteredFile in filteredbysimilar:
        #DEAL WITH ELEMENT

Answer 1

#OVERWRITE ITERATION LIST SOMEHOW

You can get this effect by using a while-loop style iteration. Since you want to do set subtraction to remove the similar groups anyway, the natural approach is to start with a set of all the filenames and repeatedly remove groups until nothing is left. Thus:

unprocessed = set(files)
while unprocessed:
    f = unprocessed.pop()  # removes and returns an arbitrary element
    group = findSimilarNames(f)
    # difference_update accepts any iterable, so this works whether
    # findSimilarNames returns a list or a set; it is not an error
    # that `f` has already been removed.
    unprocessed.difference_update(group)
    doSomethingWith(group)  # i.e., "DEAL WITH ELEMENT" :)
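
To try the loop above end to end, here is a minimal self-contained sketch. The helper `find_similar_names` is a hypothetical stand-in for the asker's findSimilarNames: it assumes that "similar" means sharing the prefix before a trailing `_<digits>` suffix, and it takes the file list as a second argument. The real function may behave differently.

```python
import re


def find_similar_names(name, names):
    """Hypothetical stand-in for findSimilarNames.

    Returns the set of entries in `names` that share `name`'s prefix
    before a trailing _<digits> suffix (assumed notion of "similar").
    """
    match = re.fullmatch(r"(.+_)\d+", name)
    if match is None:
        return {name}  # no numeric suffix: the name is its own group
    pattern = re.compile(re.escape(match.group(1)) + r"\d+")
    return {n for n in names if pattern.fullmatch(n)}


def group_files(files):
    """Consume the set of unprocessed names one group at a time."""
    unprocessed = set(files)
    groups = []
    while unprocessed:
        f = unprocessed.pop()  # arbitrary element of the set
        group = find_similar_names(f, files)
        unprocessed -= group  # drop the rest of the group in one step
        groups.append(sorted(group))
    return sorted(groups)  # sorted only to make the result deterministic


files = ["fileA_01", "fileA_03", "fileA_02", "fileAB", "fileAC"]
for group in group_files(files):
    print(group)
```

With this approach the helper is called once per group rather than once per file, which is the saving the question asks for.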

Answer 2

How about building up a list of files that aren't similar?

unsimilar = set()
for f in files:
    # keep f only if none of its similar names has been kept already
    if findSimilarNames(f).isdisjoint(unsimilar):
        unsimilar.add(f)

This assumes findSimilarNames returns a set.
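
As a rough illustration of this idea, the sketch below runs it on the question's example names. The `find_similar_names` helper is hypothetical: it assumes that similar names share the prefix before a trailing `_<digits>` suffix, takes the file list as a second argument, and returns a set as required above.

```python
import re


def find_similar_names(name, names):
    """Hypothetical stand-in for findSimilarNames; returns a set."""
    match = re.fullmatch(r"(.+_)\d+", name)
    if match is None:
        return {name}  # no numeric suffix: the name is its own group
    pattern = re.compile(re.escape(match.group(1)) + r"\d+")
    return {n for n in names if pattern.fullmatch(n)}


def pick_representatives(files):
    """Keep the first file seen from each group of similar names."""
    unsimilar = set()
    for f in files:
        # f is kept only if no member of its group was kept before
        if find_similar_names(f, files).isdisjoint(unsimilar):
            unsimilar.add(f)
    return unsimilar


files = ["fileA_03", "fileA_01", "fileA_02", "fileAB", "fileAC"]
print(sorted(pick_representatives(files)))
```

Note that this variant still calls the helper once for every file; it avoids mutating the list being iterated over rather than reducing the number of lookups.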
