Getting unique values from csv file, output to new file

I am trying to get the unique values from each column of a csv file. Here is a sample of the file:

12,life,car,good,exellent
10,gift,truck,great,great
11,time,car,great,perfect

The desired output in the new file is:

12,10,11
life,gift,time
car,truck
good,great
exellent,great,perfect

Here is my code:

def attribute_values(in_file, out_file):
    fname = open(in_file)
    fout = open(out_file, 'w')

    # get the header line
    header = fname.readline()
    # get the attribute names
    attrs = header.strip().split(',')

    # get the distinct values for each attribute
    values = []
    
    for i in range(len(attrs)):
        values.append(set())

    # read the data
    for line in fname:
        cols = line.strip().split(',')
        
        for i in range(len(attrs)):
            values[i].add(cols[i])

        # write the distinct values to the file
        for i in range(len(attrs)):
            fout.write(attrs[i] + ',' + ','.join(list(values[i])) + '\n')

    fout.close()
    fname.close()

The code currently outputs the following:

12,10
life,gift
car,truck
good,great
exellent,great
12,10,11
life,gift,time
car,car,truck
good,great
exellent,great,perfect

How can I fix this?

You can try using zip to iterate over the columns of the input file, then eliminate the duplicates in each column:

import csv

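def attribute_values(in_file, out_file):
    # newline="" lets the csv module handle line endings itself
    with open(in_file, "r", newline="") as fin, open(out_file, "w") as fout:
        # zip(*rows) transposes the rows, yielding one tuple per column
        for column in zip(*csv.reader(fin)):
            # items tracks values already seen; row keeps first-seen order
            items, row = set(), []
            for item in column:
                if item not in items:
                    items.add(item)
                    row.append(item)
            fout.write(",".join(row) + "\n")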

Result for the sample file:

12,10,11
life,gift,time
car,truck
good,great
exellent,great,perfect
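
For comparison, here is a minimal sketch of how the loop-based code from the question could be repaired while keeping its overall structure. It assumes, as the sample suggests, that the file has no header row, and it swaps the sets for dicts so that values come out in first-seen order (dicts keep insertion order in Python 3.7+). Two things change: the first line is treated as data rather than as attribute names, and the write loop runs once after the whole file has been read instead of once per row.

```python
def attribute_values(in_file, out_file):
    with open(in_file) as fin, open(out_file, "w") as fout:
        # One dict per column; the keys hold the distinct values
        # in order of first appearance.
        values = []

        # Assumption: no header row, so every line (including the first) is data.
        for line in fin:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            cols = line.split(",")
            # Grow the list if this row has more columns than seen so far.
            while len(values) < len(cols):
                values.append({})
            for i, col in enumerate(cols):
                values[i][col] = None

        # Write once, after all rows have been read.
        for column in values:
            fout.write(",".join(column) + "\n")
```

Unlike the zip version, this does not truncate columns to the length of the shortest row, but it splits on commas naively and so will not handle quoted fields.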
