
How do I compare two rows from one column in a CSV file and create a new column accordingly in Python?

I have a CSV file that looks like this:

id   data 
1    abc
1    htf
2    kji
3    wdc
3    vnc
3    acd
4    mef
5    klm
5    def
... and so on 

What I want to do is compare the id of the current row with the id of the previous row. If they are the same, I want to put the data from that row into a new column in a new CSV file. Here is how I want the output CSV file to look:

id   data1  data2  data3
1    abc    htf
2    kji
3    wdc    vnc    acd
4    mef
5    klm    def

Is this possible? Or is it better to do it in the same CSV file?

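This should help. It assumes sample.csv is comma-separated, collects the data values for each id in a dictionary, and writes one row per id to output.csv: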

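import csv
from collections import defaultdict


def manipulate_file():
    # Map each id to the list of data values seen for it, in file order.
    dictionary = defaultdict(list)
    # Assumes sample.csv is comma-separated and starts with an "id,data" header row.
    with open("sample.csv", "r", newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row
        for row in reader:
            if len(row) < 2:
                continue  # skip blank or incomplete lines
            id_, data_ = row[0].strip(), row[1].strip()
            dictionary[id_].append(data_)
    return dictionary


def rewrite(dictionary):
    # Build one output row per id: the id followed by all of its data values.
    rows = [[id_] + words for id_, words in dictionary.items()]
    # The header needs as many data columns as the longest row has values.
    width = max((len(row) - 1 for row in rows), default=0)
    header = ["id"] + [f"data{n}" for n in range(1, width + 1)]
    return [header] + rows


def main():
    dictionary = manipulate_file()
    rows = rewrite(dictionary)
    with open("output.csv", "w", newline="") as f:
        csv.writer(f).writerows(rows)


if __name__ == "__main__":
    main()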
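
If you literally want to compare each row's id with the previous row's id, as the question describes, itertools.groupby from the standard library does that comparison for you. The following is only a sketch under a few assumptions: the input is comma-separated with an "id,data" header, rows with the same id are adjacent, and the helper names widen and with_header and the in-memory sample are made up for illustration.

```python
import csv
import io
from itertools import groupby


def widen(rows):
    """Collapse consecutive (id, data) rows that share an id into one wide row."""
    wide = []
    # groupby starts a new group whenever the id differs from the previous row's id.
    for id_, group in groupby(rows, key=lambda row: row[0]):
        wide.append([id_] + [row[1] for row in group])
    return wide


def with_header(wide):
    """Prepend an id, data1..dataN header sized to the longest row."""
    width = max((len(row) - 1 for row in wide), default=0)
    header = ["id"] + [f"data{n}" for n in range(1, width + 1)]
    return [header] + wide


# Hypothetical in-memory stand-in for the input file (comma-separated, with header).
sample = io.StringIO(
    "id,data\n1,abc\n1,htf\n2,kji\n3,wdc\n3,vnc\n3,acd\n4,mef\n5,klm\n5,def\n"
)
reader = csv.reader(sample)
next(reader, None)  # skip the header row
rows = [row for row in reader if len(row) >= 2]

result = with_header(widen(rows))
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(result)
print(out.getvalue())
```

Unlike the dictionary version, groupby only merges adjacent rows, so an id that reappears later in the file would start a second output row. If the ids may be out of order, sort the rows by id first or stick with the defaultdict approach. To work on real files, replace the io.StringIO objects with open("sample.csv", newline="") and open("output.csv", "w", newline="").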


 