Creating multiple files from a column's unique values using Python's built-in libraries

Question

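I've started learning Python and was wondering whether there is a way to create multiple files from the unique values of a column. I know there are 100 ways to do it with pandas, but I'd like to do it with built-in libraries only. I couldn't find a single example that does it that way.

Here is the sample CSV file data: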

uniquevalue|count
a|123
b|345
c|567
d|789
a|123
b|345
c|567

Sample output files:

a.csv
    uniquevalue|count
    a|123
    a|123

b.csv
    b|345
    b|345

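I'm struggling to loop over the unique values in the column and then write them out. Could someone explain the logic for doing this? It would be much appreciated. Thanks.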

Answer 1

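import csv

with open('sample.csv', newline='') as csvfile:
    # the sample data is pipe-delimited, so pass the delimiter explicitly
    reader = csv.reader(csvfile, delimiter='|')
    next(reader, None)  # skip the header row
    for row in reader:
        if not row:
            continue  # ignore blank lines
        # this creates an (empty) file named after the unique value,
        # i.e. the first column of each row
        open(row[0] + '.csv', 'a').close()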
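
The snippet above only creates one empty file per value. A minimal sketch of also copying the rows in the same pass follows. It assumes a `|` delimiter as in the sample data and that the number of distinct values is small enough to keep one output file open per value. The function name `split_rows` and its parameters are hypothetical, chosen for illustration.

```python
import csv
import os
from contextlib import ExitStack


def split_rows(path, out_dir=".", delimiter="|"):
    """Copy each data row into <first column>.csv, header included."""
    writers = {}  # first-column value -> csv writer for its output file
    with ExitStack() as stack, open(path, newline="") as csvfile:
        reader = csv.reader(csvfile, delimiter=delimiter)
        header = next(reader, None)
        if header is None:
            return []  # empty input, nothing to write
        for row in reader:
            if not row:
                continue  # skip blank lines
            key = row[0]
            if key not in writers:
                # First time this value is seen: open its file and write the header.
                out = stack.enter_context(
                    open(os.path.join(out_dir, key + ".csv"), "w", newline="")
                )
                writers[key] = csv.writer(out, delimiter=delimiter)
                writers[key].writerow(header)
            writers[key].writerow(row)
    # Leaving the with-block closes every output file via the ExitStack.
    return list(writers)
```

Calling `split_rows("sample.csv")` on the sample data should produce `a.csv`, `b.csv`, `c.csv` and `d.csv` in the current directory, each starting with the header row.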

Answer 2

import csv
from collections import defaultdict

header = []
data = defaultdict(list)
DELIMITER = "|"

with open("inputfile.csv", newline="") as csvfile:
    reader = csv.reader(csvfile, delimiter=DELIMITER)
    for i, row in enumerate(reader):
        if i == 0:
            header = row
        else:
            key = row[0]
            data[key].append(row)

for key, value in data.items():
    filename = f"{key}.csv"
    with open(filename, "w", newline="") as f:
        writer = csv.writer(f, delimiter=DELIMITER)
        rows = [header] + value
        writer.writerows(rows)

Answer 3

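The task can also be done without the csv module. Read the file's lines with read_file.read().splitlines()[1:], which strips the newline characters and skips the CSV file's header row. A set then gives the collection of unique input lines, which is used both to count the repetitions and to create the output files.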

with open("unique_sample.csv", "r") as read_file:
    items = read_file.read().splitlines()[1:]
    lines_set = set(items)
    for line in lines_set:
        with open(line[:line.index('|')] + '.csv', 'w') as output:
            output.write((line + '\n') * items.count(line))
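
The snippet above treats each whole line as the unique item, so it relies on every row with the same first-column value being identical, and it does not write the header row. A minimal sketch that groups on the first field instead, still without the csv module, might look like the following. The function name `split_by_first_field` and its parameters are hypothetical, and plain string splitting assumes the fields contain no quoted `|` characters.

```python
import os


def split_by_first_field(path, out_dir=".", sep="|"):
    """Write one <key>.csv per distinct first-field value; return the keys."""
    with open(path, "r") as read_file:
        lines = read_file.read().splitlines()
    if not lines:
        return []  # empty input, nothing to write
    header, rows = lines[0], lines[1:]

    groups = {}  # key -> list of raw lines, kept in input order
    for line in rows:
        if not line:
            continue  # ignore blank lines
        key = line.split(sep, 1)[0]
        groups.setdefault(key, []).append(line)

    for key, group in groups.items():
        with open(os.path.join(out_dir, key + ".csv"), "w") as output:
            output.write("\n".join([header] + group) + "\n")
    return list(groups)
```

Because rows are grouped by key rather than by full line, two rows such as `a|123` and `a|999` would both end up in `a.csv`.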

3 answers

Answer 1
0 2022-08-14 15:34:26

Answer 2
0 2022-08-14 16:29:19

Answer 3
0 2022-08-15 15:00:47
