
Appending a new column to a CSV file in Python

I am trying to create a clean CSV file by merging some of the variables from an old file and appending them to a new CSV file.

I have no problem running the data the first time: I get the output I want. But whenever I try to append a new variable (i.e. a new column), it is appended to the bottom of the file and the output is wonky.

I have basically been running the same code for each variable, changing only the groupvariables list to my desired variables and opening the file with f2 = open('outputfile.csv', "ab"), using "ab" for append. Any help would be appreciated.

groupvariables=['x','y']

f2  = open('outputfile.csv', "wb")
writer = csv.writer(f2, delimiter=",")
writer.writerow(("ID","Diagnosis"))

for line in csv_f:
    line = line.rstrip('\n')
    columns  = line.split(",")
    tempname = columns[0]
    tempindvar = columns[1:]

    templist = []

    for j in groupvariables:
        tempvar=tempindvar[headers.index(j)]
        if tempvar != ".":
            templist.append(tempvar)

    newList = list(set(templist))

    if len(newList) > 1:
        output = 'nomatch'
    elif len(newList) == 0:
        output = "."
    else:
        output = newList[0]

    tempoutrow = (tempname,output)
    writer.writerow(tempoutrow)

f2.close()

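CSV is a line-based file format, so the only way to add a column to an existing CSV file is to read it and rewrite it entirely, adding the new column's value to each line. Opening the file in append mode can only add data after the last line, which is why your new variable ends up at the bottom.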
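A minimal Python 3 sketch of that read-and-rewrite approach follows. The function name add_column, the values_by_id mapping, and the use of "." for a missing value are assumptions made for illustration (the "." mirrors the placeholder in the question's data); it also assumes the first row is a header and the first field of each row is an ID.

```python
import csv


def add_column(path, header, values_by_id):
    """Rewrite the CSV at `path`, adding one column named `header`.

    `values_by_id` maps the first field of each row (assumed to be an ID)
    to the new value; rows without an entry get "." (an assumed placeholder).
    """
    # Read every existing row into memory first, skipping blank lines.
    with open(path, newline="") as f:
        rows = [row for row in csv.reader(f) if row]
    if not rows:
        return  # nothing to extend

    # Overwrite the file, extending each row by one field.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(rows[0] + [header])
        for row in rows[1:]:
            writer.writerow(row + [values_by_id.get(row[0], ".")])
```

Calling add_column('outputfile.csv', 'Diagnosis2', {...}) once per new variable would replace the repeated "ab" writes. For files too large to hold in memory, the same idea works by streaming rows into a second file and renaming it afterwards.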

If all you want to do is add lines, though, appending will work fine.
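For the row-appending case, a short Python 3 sketch (append_rows is a hypothetical helper name, not something from the question):

```python
import csv


def append_rows(path, rows):
    """Append `rows` (an iterable of sequences) to the end of the CSV at `path`."""
    # Mode "a" adds to the end of the file without touching existing lines.
    with open(path, "a", newline="") as f:
        csv.writer(f).writerows(rows)
```

In Python 2, which the question's "wb" mode suggests, the equivalent is to open the file with "ab" and drop the newline argument.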

Here is something that might help. I assumed the first field on each row in each CSV file is a primary key for the record and can be used to match rows between the two files. The code below reads the records from one file, stores them in a dictionary, then reads the records from another file, appends their values to the matching dictionary entries, and writes out a new file. You can adapt this example to better fit your actual problem.

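import csv
# using python3

db = {}  # maps each key (first field) to the list of remaining values

# Read the first file and store its values by key.
with open('t1.csv', 'r', newline='') as f:
    for row in csv.reader(f):
        if not row:
            continue  # skip blank lines
        key, *values = row
        db[key] = values

# Read the second file and append its values to any matching key.
with open('t2.csv', 'r', newline='') as f:
    for row in csv.reader(f):
        if not row:
            continue
        key, *values = row
        db.setdefault(key, []).extend(values)

# Write the merged rows sorted by key; csv.writer takes care of quoting.
with open('combo.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for key in sorted(db):
        writer.writerow([key] + db[key])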
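One caveat with the example above: a key that appears in only one of the two files produces a shorter row, so the columns no longer line up. The variant below is a sketch of one way to handle that, padding absent values with "." (an assumption borrowed from the placeholder in the question's data). The helper names read_keyed and merge_keyed are hypothetical.

```python
import csv


def read_keyed(path):
    """Return (width, {key: values}) for a CSV whose first field is the key."""
    table = {}
    width = 0
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if not row:
                continue  # skip blank lines
            key, *values = row
            table[key] = values
            width = max(width, len(values))
    return width, table


def merge_keyed(left_path, right_path, out_path, missing="."):
    """Join two keyed CSVs, padding absent values with `missing`."""
    left_width, left = read_keyed(left_path)
    right_width, right = read_keyed(right_path)
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        for key in sorted(set(left) | set(right)):
            lvals = left.get(key, [])
            rvals = right.get(key, [])
            # Pad each side to its file's width so columns stay aligned.
            lvals = lvals + [missing] * (left_width - len(lvals))
            rvals = rvals + [missing] * (right_width - len(rvals))
            writer.writerow([key] + lvals + rvals)
```

Calling merge_keyed('t1.csv', 't2.csv', 'combo.csv') writes one row per key found in either file.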

