Python read CSV file, and write to another skipping columns

I have a CSV input file with 18 columns. I need to create a new CSV file with all columns from the input except columns 4 and 5.

My function currently looks like this:

import csv


def modify_csv_report(input_csv, output_csv):
    begin = 0
    end = 3

    with open(input_csv, "r") as file_in:
        with open(output_csv, "w") as file_out:
            writer = csv.writer(file_out)
            for row in csv.reader(file_in):
                writer.writerow(row[begin:end])
    return output_csv

So it reads and writes columns 0 - 3, but I don't know how to skip columns 4 and 5 and continue from there.

You can add the rest of the row using slicing, just as you did for the first part:

writer.writerow(row[:4] + row[6:])

Note that a slice's stop index is exclusive, so to include column 3 the first slice must stop at 4; row[0:3] writes only columns 0 to 2. The start index 0 can also be omitted.
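To make the index arithmetic concrete, here is a minimal sketch on a made-up 18-column row (the `col0` ... `col17` values are placeholders, not data from the question):

```python
# Hypothetical sample row with 18 columns, indices 0..17
row = [f"col{i}" for i in range(18)]

# Keep indices 0-3, skip 4 and 5, keep index 6 through the end
kept = row[:4] + row[6:]

print(len(kept))   # 16
print(kept[3:5])   # ['col3', 'col6']
```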

A more general approach uses a list comprehension with enumerate:

exclude = (4, 5)
writer.writerow([r for i, r in enumerate(row) if i not in exclude])
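As a rough end-to-end sketch of the same idea, the snippet below runs the comprehension over in-memory streams instead of real files. The helper name `drop_columns` and the sample data are assumptions made for illustration only.

```python
import csv
import io


def drop_columns(rows, exclude):
    """Yield each row without the columns whose 0-based index is in exclude."""
    exclude = set(exclude)  # a set gives fast membership tests
    for row in rows:
        yield [value for i, value in enumerate(row) if i not in exclude]


# Hypothetical sample input; io.StringIO stands in for the opened files
source = io.StringIO("a,b,c,d,e,f,g\n1,2,3,4,5,6,7\n")
target = io.StringIO()

writer = csv.writer(target, lineterminator="\n")
writer.writerows(drop_columns(csv.reader(source), (4, 5)))

result = target.getvalue()
print(result)
```

With real files, the two StringIO objects would be replaced by the handles from open(), ideally opened with newline=''.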

If your CSV has meaningful headers, an alternative to slicing rows by index is to use the DictReader and DictWriter classes.

#!/usr/bin/env python3
from csv import DictReader, DictWriter

data = '''A,B,C
1,2,3
4,5,6
6,7,8'''

reader = DictReader(data.splitlines())

# Keep the field names in a list so the output column order is preserved
fieldnames = ['A', 'C']
# Also keep them in a set for efficient membership tests
fieldnames_set = set(fieldnames)

# newline='' lets the csv module control line endings itself
with open('outfile.csv', 'w', newline='') as outfile:
    writer = DictWriter(outfile, fieldnames)
    writer.writeheader()
    for row in reader:
        # Use a dictionary comprehension over the key/value pairs,
        # discarding pairs whose key is not in the set
        filtered_row = {k: v for k, v in row.items() if k in fieldnames_set}
        writer.writerow(filtered_row)
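The filtering step above can probably be left to DictWriter itself: its extrasaction parameter defaults to 'raise', and setting it to 'ignore' makes the writer silently drop keys that are not in fieldnames. A minimal sketch, again using in-memory streams and made-up data:

```python
import csv
import io

# Hypothetical sample input; io.StringIO stands in for real files
source = io.StringIO("A,B,C\n1,2,3\n4,5,6\n")
target = io.StringIO()

reader = csv.DictReader(source)
writer = csv.DictWriter(
    target,
    fieldnames=["A", "C"],      # columns to keep, in output order
    extrasaction="ignore",      # drop keys not listed in fieldnames
    lineterminator="\n",
)
writer.writeheader()
writer.writerows(reader)

dict_result = target.getvalue()
print(dict_result)
```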

This is what you want:

import csv


def remove_csv_columns(input_csv, output_csv, exclude_column_indices):
    # newline='' is recommended by the csv module to avoid extra blank lines
    with open(input_csv, newline='') as file_in, \
            open(output_csv, 'w', newline='') as file_out:
        reader = csv.reader(file_in)
        writer = csv.writer(file_out)
        writer.writerows(
            [col for idx, col in enumerate(row)
             if idx not in exclude_column_indices]
            for row in reader)

# Indices are 0-based, so this drops columns 4 and 5 as in the question
remove_csv_columns('in.csv', 'out.csv', (4, 5))
