
Python: How to segregate files by name?

I'm trying to segregate files by their name to perform an operation on each group. For example, if I have the following files:

name_a_1
name_a_2
name_a_3
name_b_4
name_b_5
name_b_6

I would like to work with group a first. When that operation is done, do the same operation on group b, and so on. Any suggestions on how to approach this?

You can group the filenames in a temporary dictionary and then run your operation on each group. For example:

filenames = [
    'name_a_1',
    'name_a_2',
    'name_a_3',
    'name_b_4',
    'name_b_5',
    'name_b_6'
]

# group the filenames
groups = {}
for f in filenames:
    g = f.split('_')[1]
    groups.setdefault(g, []).append(f)

# groups is now:
# {'a': ['name_a_1', 'name_a_2', 'name_a_3'],
#  'b': ['name_b_4', 'name_b_5', 'name_b_6']}

for grp, items in groups.items():
    # your operation on files from group `grp`;
    # `work` is a placeholder for whatever you need to do with each file
    for f in items:
        work(f)
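
If group a really has to be finished before group b starts, note that a plain dict keeps insertion order (Python 3.7+), so the loop above visits the groups in the order they first appear in the list, not alphabetically. A minimal sketch that iterates over the sorted keys instead; group_by_field and process_group are hypothetical names used only for illustration, and process_group stands in for the real operation:

```python
def group_by_field(filenames, index=1, sep="_"):
    """Group filenames by the field at position `index` after splitting on `sep`."""
    groups = {}
    for name in filenames:
        key = name.split(sep)[index]
        groups.setdefault(key, []).append(name)
    return groups


def process_group(key, names):
    """Hypothetical stand-in for the real per-group operation."""
    return f"{key}: {len(names)} file(s)"


# Deliberately listed out of order to show that the key order is what matters.
filenames = ["name_b_4", "name_a_1", "name_b_5", "name_a_2"]
groups = group_by_field(filenames)

# Sorting the keys means group 'a' is handled completely before group 'b',
# regardless of the order in which the filenames were listed.
results = [process_group(key, groups[key]) for key in sorted(groups)]
print(results)
```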

Define a function that extracts the group from a filename, then pass it as the key= argument to sorted and as the key function to itertools.groupby:

import itertools

filenames = ["name_a_1", "name_a_2", "name_a_3", "name_b_4", "name_b_5", "name_b_6"]

def get_group(filename):
    return filename.split("_")[1]

for group_name, group in itertools.groupby(sorted(filenames, key=get_group), get_group):
    for filename in group:
        print(group_name, filename)  # "a name_a_1" and so on
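
The sorted call is not optional here: itertools.groupby only merges consecutive items that share a key, so unsorted input produces the same group several times. A small sketch of the difference, using a shortened, shuffled list that is assumed for illustration:

```python
import itertools


def get_group(filename):
    return filename.split("_")[1]


unsorted_names = ["name_a_1", "name_b_4", "name_a_2", "name_b_5"]

# Without sorting, every run of consecutive equal keys becomes its own group,
# so 'a' and 'b' each show up twice.
fragmented = [
    (key, list(items))
    for key, items in itertools.groupby(unsorted_names, get_group)
]

# Sorting by the same key first brings each group's members together.
grouped = [
    (key, list(items))
    for key, items in itertools.groupby(sorted(unsorted_names, key=get_group), get_group)
]

print(fragmented)
print(grouped)
```

Because sorted is stable, the files inside each group keep their original relative order.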

A solution using defaultdict:

#!/usr/local/cpython-3.9/bin/python3

"""Split strings and aggregate."""

import collections

list_ = [
    'name_b_5',
    'name_a_1',
    'name_b_4',
    'name_b_6',
    'name_a_2',
    'name_a_3',
]

dict_ = collections.defaultdict(list)


def get_2nd(string):
    """Get the second _ delimited field."""
    return string.split('_')[1]


for string in list_:
    dict_[get_2nd(string)].append(string)

for key, value in sorted(dict_.items()):
    print(key, ' '.join(sorted(value)))
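
The answers above work on a hard-coded list of names. If the files actually live in a directory, the same defaultdict grouping could be applied to paths found with pathlib. This is only a sketch under assumptions: group_files is a hypothetical helper, the temporary directory and the README file exist purely for the demonstration, and names that do not follow the name_<group>_<n> pattern are simply skipped:

```python
import collections
import tempfile
from pathlib import Path


def group_files(directory):
    """Map each group key to the sorted list of file paths in `directory`."""
    groups = collections.defaultdict(list)
    for path in Path(directory).iterdir():
        if not path.is_file():
            continue
        # Use the stem so an extension such as ".txt" does not end up in a field.
        parts = path.stem.split("_")
        if len(parts) < 2:
            continue  # skip names that do not follow the assumed pattern
        groups[parts[1]].append(path)
    # Return a plain dict with keys and paths in sorted order.
    return {key: sorted(paths) for key, paths in sorted(groups.items())}


# Demonstration with throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["name_a_1", "name_a_2", "name_b_4", "README"]:
        (Path(tmp) / name).touch()
    found = group_files(tmp)
    summary = {key: [p.name for p in paths] for key, paths in found.items()}

print(summary)
```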
