Joining all rows of a CSV file that have the same 1st column value in Python

I have a CSV file that goes something like this:

['Name1', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '+']
['Name1', '', '', '', '', '', 'b', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
['Name2', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'a', '']
['Name3', '', '', '', '', '+', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']

Now, I need a way to join all of the rows that have the same 1st column name into one row, for instance:

['Name1', '', '', '', '', '', 'b', '', '', '', '', '', '', '', '', '', '', '', '', '', '+']
['Name2', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'a', '']
['Name3', '', '', '', '', '+', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']

I can think of a way to do this by sorting the CSV and then going through each row and column and comparing each value, but there is probably an easier way to do it.

Any ideas?

You should use itertools.groupby:

t = [ 
['Name1', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '+'],
['Name1', '', '', '', '', '', 'b', '', '', '', '', '', '', '', '', '', '', '', '', '', ''],
['Name2', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'a', ''],
['Name3', '', '', '', '', '+', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''] 
]

from itertools import groupby

# TODO: if you need to speed things up you can use operator.itemgetter
# for both sorting and grouping
for name, rows in groupby(sorted(t), lambda x: x[0]):
    print(join_rows(rows))

You would implement the merging in a separate function, for example like this:

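def join_rows(rows):
    def join_tuple(tup):
        # Return the first non-empty value in this column, or '' if all are empty
        for x in tup:
            if x:
                return x
        return ''
    # zip(*rows) transposes the group, yielding one tuple per column
    return [join_tuple(x) for x in zip(*rows)]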
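
Since the question is about a CSV file rather than an in-memory list, here is a minimal sketch that puts the groupby approach together with the csv module. The merge_csv name and the shortened four-column SAMPLE data are made up for illustration, and it assumes the input has no blank lines and that the first non-empty value in a column should win. It also uses operator.itemgetter as the key for both sorting and grouping, as the TODO comment suggests.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter


def join_rows(rows):
    """Column-wise merge: keep the first non-empty value in each column."""
    return [next((value for value in column if value), '')
            for column in zip(*rows)]


# Hypothetical sample standing in for the real file (shortened to 4 columns).
SAMPLE = "Name1,,,+\nName1,,b,\nName2,a,,\nName3,,+,\n"


def merge_csv(infile, outfile):
    """Read rows from infile, merge rows sharing a first column, write to outfile."""
    first_column = itemgetter(0)
    # groupby only groups adjacent items, so sort by the same key first.
    rows = sorted(csv.reader(infile), key=first_column)
    writer = csv.writer(outfile, lineterminator='\n')
    for _name, group in groupby(rows, key=first_column):
        writer.writerow(join_rows(group))


source = io.StringIO(SAMPLE)
target = io.StringIO()
merge_csv(source, target)
print(target.getvalue())
```

With real files you would pass file objects opened with newline='' instead of the StringIO buffers. Sorting only by the first column keeps rows with the same name in their original order, because sorted is stable.
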

def merge_rows(row1, row2):
    # Merge two rows with the same name (the merging logic is left open here)
    merged_row = ...
    return merged_row

r1 = ['Name1', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '+']
r2 = ['Name1', '', '', '', '', '', 'b', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
r3 = ['Name2', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'a', '']
r4 = ['Name3', '', '', '', '', '+', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
rows = [r1, r2, r3, r4]
data = {}
for row in rows:
    name = row[0]
    if name in data:
        data[name] = merge_rows(row, data[name])
    else:
        data[name] = row

You now have all the rows in data, where each key of the dictionary is a name and the corresponding value is the merged row for that name. You can now write this data to a CSV file.
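
The body of merge_rows is left open above. One possible way to fill it in, assuming each column should simply keep whichever value is non-empty (preferring the row already stored in data when both are filled), followed by the CSV write. The four-column rows are shortened sample data, and the output goes to an in-memory buffer here rather than a real file.

```python
import csv
import io


def merge_rows(row1, row2):
    # Assumption: per column, keep the non-empty value;
    # if both are filled, the value from row2 wins.
    return [b or a for a, b in zip(row1, row2)]


# Shortened sample rows (4 columns instead of 21), for illustration only.
rows = [
    ['Name1', '', '', '+'],
    ['Name1', '', 'b', ''],
    ['Name2', 'a', '', ''],
    ['Name3', '', '+', ''],
]

data = {}
for row in rows:
    name = row[0]
    data[name] = merge_rows(row, data[name]) if name in data else row

# dicts preserve insertion order (Python 3.7+), so names come out
# in the order they first appeared.
out = io.StringIO()
csv.writer(out, lineterminator='\n').writerows(data.values())
print(out.getvalue())
```

Unlike the groupby version, this does not require sorting the input first, since the dictionary lookup finds earlier rows with the same name wherever they occurred.
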

You can also use defaultdict:

>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> _ = [d[i[0]].append(z) for i in t for z in i[1:]]
>>> d['Name1']
['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '+', '', '', '', '', '', 'b', '', '', '', '', '', '', '', '', '', '', '', '', '', '']

Then join the columns.
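
The column joining step is not spelled out, so here is one way it could look. This is a sketch that assumes every row has the same number of columns and that the first non-empty value in a column wins; the width and merged names and the shortened sample t are introduced only for this example. Because the defaultdict stores all values for a name in one flat list, a slice with a step of width recovers each original column.

```python
from collections import defaultdict

# Shortened sample rows (4 columns instead of 21), for illustration only.
t = [
    ['Name1', '', '', '+'],
    ['Name1', '', 'b', ''],
    ['Name2', 'a', '', ''],
    ['Name3', '', '+', ''],
]

d = defaultdict(list)
for row in t:
    # Same flat layout as the comprehension above: values of all rows
    # with the same name end up in one list, one row after another.
    d[row[0]].extend(row[1:])

width = len(t[0]) - 1  # assumption: all rows have the same length

merged = []
for name, flat in d.items():
    # flat[i::width] picks column i out of every original row for this name.
    columns = [flat[i::width] for i in range(width)]
    merged.append([name] + [next((v for v in col if v), '') for col in columns])

print(merged)
```
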
