How to replace a column in a CSV file in Python?

I have two CSV files. I need to replace a column in one file with a column from the other file, but the rows have to stay sorted according to an ID column.

Here's an example:

file1:

ID, transect, 90mdist                                      
1, a, 10,                                                  
2, b, 20,                                                
3, c, 30,     

file2:

ID, transect, 90mdist                                
1, a, 50                                                   
2, b, 70                                                     
3, c, 90          

Basically, I created a new file with the correct 90mdist values and I need to insert them into the old file, but each value has to line up with the same ID.

It's my understanding that Python treats CSV files as strings, so I can either use a dictionary or convert the data into a list and then change it? Which way is best?

Any help would be greatly appreciated!

The csv module in the Python standard library is what you need here.

It allows you to read and write CSV files, treating lines as tuples or lists of items.

Just read in the file with the corrected values and store them in a dictionary keyed by each line's ID.

Then read in the second file, replacing the relevant column with the data from the dict, and write the result out to a third file.

Done.
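The steps above could be sketched roughly as follows. This is only a minimal sketch, not a definitive implementation: the function name `replace_column`, its parameters, and the choice to keep the old value when an ID has no corrected entry are all assumptions made for illustration. It also assumes Python 3 and that both files have a header row containing `ID` and `90mdist`.

```python
import csv

def replace_column(old_path, corrected_path, out_path, key="ID", column="90mdist"):
    """Copy old_path to out_path, taking `column` from corrected_path where IDs match.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Pass 1: map each ID to its corrected value.
    with open(corrected_path, newline="") as f:
        corrected = {row[key]: row[column]
                     for row in csv.DictReader(f, skipinitialspace=True)}

    # Pass 2: stream the old file, swapping in the corrected value by ID.
    with open(old_path, newline="") as fin, open(out_path, "w", newline="") as fout:
        reader = csv.DictReader(fin, skipinitialspace=True)
        # extrasaction="ignore" drops the surplus empty field that a
        # trailing comma produces in the old file's data rows.
        writer = csv.DictWriter(fout, fieldnames=reader.fieldnames,
                                extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            # Assumption: rows without a corrected value keep their old one.
            row[column] = corrected.get(row[key], row[column])
            writer.writerow(row)
```

Calling `replace_column('file1', 'file2', 'file3')` would write the merged result to a third file. Because the lookup is by ID, the two files do not need to list their rows in the same order.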

Try this:

import csv

def twiddle_csv(file1, file2):
    def mess_with_record(record):
        # Example transformation; substitute whatever correction you need
        record['90mdist'] = 2 * int(record['90mdist']) + 30

    fields = ['ID', 'transect', '90mdist']
    # newline='' lets the csv module handle line endings itself (Python 3)
    with open(file1, newline='') as fin, open(file2, 'w', newline='') as fout:
        reader = csv.DictReader(fin, fieldnames=fields)
        writer = csv.DictWriter(fout, fieldnames=fields)
        writer.writeheader()   # DictWriter does not emit the header by itself
        next(reader)           # Skip the column header in the input
        for record in reader:
            mess_with_record(record)
            writer.writerow(record)

if __name__ == '__main__':
    twiddle_csv('file1', 'file2')

A couple of caveats:

  • When fieldnames is passed explicitly, DictReader treats the first row as data, even if it matches the field names. Skip it with next(reader) (reader.next() in Python 2).
  • Data rows cannot have trailing commas. They will be interpreted as an extra, empty column.
  • DictWriter does not write out the column headers on its own. Write them yourself, or call writer.writeheader() (available since Python 2.7).
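The trailing-comma caveat can be seen directly with a small in-memory example. This is a sketch under assumptions: the `sample` string is made-up data in the shape of file1, and dropping the surplus field is just one possible way of coping with it.

```python
import csv
import io

# Hypothetical sample shaped like file1, with trailing commas on data rows.
sample = "ID,transect,90mdist\n1,a,10,\n2,b,20,\n"

# A plain reader shows the extra empty field created by the trailing comma.
rows = list(csv.reader(io.StringIO(sample)))
print(rows[1])  # ['1', 'a', '10', '']

# DictReader collects surplus fields under `restkey` (None by default),
# so removing that key leaves only the named columns.
records = []
for record in csv.DictReader(io.StringIO(sample)):
    record.pop(None, None)
    records.append(record)
print(records[0])
```

With the surplus key removed, each record can be passed to a DictWriter that only knows the three named fields.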

Once you have your CSV data as lists, one easy way to replace a column in one matrix with a column from another is to transpose the matrices, replace the row, and then transpose the edited matrix back. Note that this only lines the values up correctly if both files list the IDs in the same order. Here is an example with your data:

csv1 = [['1', 'a', '10'], ['2', 'b', '20'], ['3', 'c', '30']]
csv2 = [['1', 'a', '50'], ['2', 'b', '70'], ['3', 'c', '90']]

# transpose in Python is zip(*myData); zip yields tuples, so convert back to lists
transposedCSV1 = [list(col) for col in zip(*csv1)]
transposedCSV2 = [list(col) for col in zip(*csv2)]
print(transposedCSV1)
# [['1', '2', '3'], ['a', 'b', 'c'], ['10', '20', '30']]

# keep the first two columns of csv1, take the third from csv2
csv1 = transposedCSV1[:2] + [transposedCSV2[2]]
print(csv1)
# [['1', '2', '3'], ['a', 'b', 'c'], ['50', '70', '90']]

# transpose back to rows
csv1 = [list(row) for row in zip(*csv1)]
print(csv1)
# [['1', 'a', '50'], ['2', 'b', '70'], ['3', 'c', '90']]

If you're only doing this as a one-off, why bother with Python at all? Excel or OpenOffice Calc will open the two CSV files for you, and then you can just cut and paste the column from one to the other.

If the two lists of IDs are not exactly the same, then a simple VB macro would do it for you.
