Python read CSV file, and write to another skipping columns
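I have a CSV input file with 18 columns, and I need to create a new CSV file containing all of the input's columns except columns 4 and 5.
My function currently looks like this: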
    import csv

    def modify_csv_report(input_csv, output_csv):
        begin = 0
        end = 3
        with open(input_csv, "r") as file_in:
            with open(output_csv, "w") as file_out:
                writer = csv.writer(file_out)
                for row in csv.reader(file_in):
                    writer.writerow(row[begin:end])
        return output_csv
So it reads and writes columns 0-3, but I don't know how to skip columns 4 and 5 and continue from there.
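You can add the other part of the row using slicing, just as you did for the first part:
    writer.writerow(row[:4] + row[6:])
Note that to include column 3, the stop index of the first slice must be 4, because the stop index of a slice is exclusive. It is also usually unnecessary to specify a start index of 0.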
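To see which positions the two slices keep, here is a small sketch on a made-up 18-column row. The row contents are an assumption chosen for illustration: each value is simply its own zero-based index.

```python
# Hypothetical 18-column row whose values equal their zero-based indices.
row = [str(i) for i in range(18)]

# row[:4] keeps indices 0-3; row[6:] keeps indices 6-17.
kept = row[:4] + row[6:]

print(len(kept))  # 16 values remain
print(kept[:6])   # the values at indices 4 and 5 are gone
```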
A more general approach is to use a list comprehension with enumerate:
    exclude = (4, 5)
    writer.writerow([r for i, r in enumerate(row) if i not in exclude])
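Put together, the enumerate approach might look like the sketch below. The function name `copy_without_columns` and the sample data are assumptions made for illustration. It takes open text streams rather than file names so it can be tried in memory; with real files, open both with `newline=""` as the csv module documentation recommends.

```python
import csv
import io


def copy_without_columns(file_in, file_out, exclude=frozenset({4, 5})):
    """Copy CSV rows from file_in to file_out, skipping the zero-based
    column indices listed in exclude."""
    writer = csv.writer(file_out, lineterminator="\n")
    for row in csv.reader(file_in):
        writer.writerow(
            [value for i, value in enumerate(row) if i not in exclude]
        )


# In-memory demonstration with a made-up 7-column file.
source = io.StringIO("a,b,c,d,e,f,g\n0,1,2,3,4,5,6\n")
target = io.StringIO()
copy_without_columns(source, target)
result = target.getvalue()
print(result)
```

Using a set for `exclude` keeps the membership test cheap, although for two indices a tuple works just as well.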
If your CSV has meaningful headers, an alternative to slicing rows by index is to use the DictReader and DictWriter classes and select columns by name:
    #!/usr/bin/env python3
    from csv import DictReader, DictWriter

    data = '''A,B,C
    1,2,3
    4,5,6
    6,7,8'''

    reader = DictReader(data.split('\n'))

    # You'll need your fieldnames first in a list to ensure order
    fieldnames = ['A', 'C']
    # We'll also use a set for efficient lookup
    fieldnames_set = set(fieldnames)

    # newline='' lets the csv module control line endings
    with open('outfile.csv', 'w', newline='') as outfile:
        writer = DictWriter(outfile, fieldnames)
        writer.writeheader()
        for row in reader:
            # Use a dictionary comprehension to iterate over the key, value pairs,
            # discarding those pairs whose key is not in the set
            # (row.items() replaces Python 2's row.iteritems())
            filtered_row = {k: v for k, v in row.items() if k in fieldnames_set}
            writer.writerow(filtered_row)
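As a possible shortcut, DictWriter accepts an `extrasaction` argument. With `extrasaction="ignore"`, keys that are not in `fieldnames` are silently dropped, so the manual filtering step is not needed. The sketch below reuses the sample data from the answer and writes to an in-memory stream, which is an assumption made to keep the example self-contained.

```python
import csv
import io

data = "A,B,C\n1,2,3\n4,5,6\n6,7,8\n"

reader = csv.DictReader(io.StringIO(data))
out = io.StringIO()

# extrasaction="ignore" drops any key not listed in fieldnames
# instead of raising ValueError (the default is "raise").
writer = csv.DictWriter(
    out, fieldnames=["A", "C"], extrasaction="ignore", lineterminator="\n"
)
writer.writeheader()
writer.writerows(reader)

output = out.getvalue()
print(output)
```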
This is what you want:
    import csv

    def remove_csv_columns(input_csv, output_csv, exclude_column_indices):
        with open(input_csv, newline='') as file_in, \
                open(output_csv, 'w', newline='') as file_out:
            reader = csv.reader(file_in)
            writer = csv.writer(file_out)
            writer.writerows(
                [col for idx, col in enumerate(row)
                 if idx not in exclude_column_indices]
                for row in reader)

    # Indices are zero-based, so (3, 4) drops the fourth and fifth columns
    remove_csv_columns('in.csv', 'out.csv', (3, 4))