
Beginner deleting columns from CSV (no pandas)

I've just started coding and I'm trying to remove certain columns from a CSV for a project; we aren't supposed to use pandas. For instance, one of the fields I have to delete is called DwTm, but there are about 15 columns I have to get rid of; I only want the first few. Here's what I've gotten so far:

import csv
FTemp = "D:/tempfile.csv"
FOut = "D:/NewFile.csv"


with open(FTemp, 'r') as csv_file:
    csv_reader = csv.reader(csv_file)

    with open(FOut, 'w') as new_file:
        fieldnames = ['Stn_Name', 'Lat', 'Long', 'Prov', 'Tm']
        csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames)

        for line in csv_reader:
            del line['DwTm']
            csv_writer.writerow(line)

When I run this, I get the error:

del line['DwTm']
TypeError: list indices must be integers or slices, not str

This is the only method I've found that almost works without using pandas. Any ideas?

The easiest way around this is to use a DictReader to read the file. csv.reader returns each row as a list, which is why indexing it with 'DwTm' raises the TypeError. Like DictWriter, which you are using to write the file, DictReader uses dictionaries for rows, so your approach of deleting keys from the old row and then writing to the new file will work as you expect.

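import csv

FTemp = "D:/tempfile.csv"
FOut = "D:/NewFile.csv"

# newline='' is what the csv module expects; without it, the output
# gets a blank line after every row on Windows
with open(FTemp, 'r', newline='') as csv_file:

    # List every column in the input file, in file order.
    # Because fieldnames is given, the header row is read as an
    # ordinary row and copied through to the output.
    old_fieldnames = ['Stn_Name', 'Lat', 'Long', 'Prov', 'Tm', 'DwTm']
    csv_reader = csv.DictReader(csv_file, fieldnames=old_fieldnames)

    with open(FOut, 'w', newline='') as new_file:
        fieldnames = ['Stn_Name', 'Lat', 'Long', 'Prov', 'Tm']
        csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames)

        for line in csv_reader:
            # Each row is a dict, so deleting by column name works
            del line['DwTm']
            csv_writer.writerow(line)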
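
With about 15 columns to drop, listing every input column by hand gets tedious. A possible variation (a sketch, not part of the original answer) is to let DictReader take the names from the header row and pass extrasaction='ignore' to DictWriter, so any key that is not in fieldnames is dropped without a del. The sample rows, the extra column D, and the helper name keep_columns are made up for illustration, and io.StringIO stands in for the real files:

```python
import csv
import io

# Hypothetical sample data standing in for D:/tempfile.csv
sample = (
    "Stn_Name,Lat,Long,Prov,Tm,DwTm,D\n"
    "Station A,48.9,-123.7,BC,8.2,0,M\n"
    "Station B,48.8,-124.1,BC,6.8,13,\n"
)

def keep_columns(src, dst, fieldnames):
    """Copy only the named columns from src to dst (open text files)."""
    # No fieldnames argument: the names come from the header row
    reader = csv.DictReader(src)
    # extrasaction='ignore' drops keys that are not in fieldnames
    writer = csv.DictWriter(dst, fieldnames=fieldnames, extrasaction='ignore')
    writer.writeheader()
    for row in reader:
        writer.writerow(row)

dst = io.StringIO()
keep_columns(io.StringIO(sample), dst, ['Stn_Name', 'Lat', 'Long', 'Prov', 'Tm'])
result = dst.getvalue()
print(result)
```

With real files, the same function can be called with the two handles opened using newline=''.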

Below is an example that deletes columns by index:

import csv

# We only want to keep the 'department' field
# We are not interested in 'name' and 'birthday month'

# Keep the indices in descending order, so that deleting one
# column does not shift the position of the ones still to delete
NON_INTERESTING_FIELDS_IDX = [2, 0]
rows = []
with open('example.csv', newline='') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    for row in csv_reader:
        for idx in NON_INTERESTING_FIELDS_IDX:
            del row[idx]
        rows.append(row)
with open('example_out.csv', 'w', newline='') as out:
    # csv.writer quotes any field that itself contains a comma
    csv.writer(out).writerows(rows)

example.csv

name,department,birthday month
John Smith,Accounting,November
Erica Meyers,IT,March

example_out.csv

department
Accounting
IT

It's possible to simultaneously open the file to read from and the file to write to. Let's say you know the indices of the columns you want to keep, say, 0, 2, and 4:

FTemp = "D:/tempfile.csv"   # paths as defined in the question
FOut = "D:/NewFile.csv"

good_cols = (0, 2, 4)
with open(FTemp, 'r') as fin, open(FOut, 'w') as fout:
    for line in fin:
        line = line.rstrip('\n')    # clean up newlines
        temp = line.split(',')      # make a list from the line (assumes no quoted commas)
        data = [value for i, value in enumerate(temp) if i in good_cols]
        fout.write(','.join(data) + '\n')

The list comprehension (data) pulls only the columns you want to keep out of each row, and each row is written to the new file immediately using the join method (plus a newline tacked on for each row).

If you only know the names of the fields you want to keep or remove, it's a bit more involved: you have to extract the indices from the first line of the CSV file, but it's not much more difficult.
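
A minimal sketch of that name-based variant, assuming the first line of the file is a header: look each wanted name up in the header with list.index, then reuse the resulting indices for every row. It uses csv.reader and csv.writer instead of split and join so that quoted fields containing commas survive. The sample data and the helper name select_by_name are hypothetical, and header.index raises ValueError if a name is missing from the header:

```python
import csv
import io

# Hypothetical sample data; the column names follow the question
sample = (
    "Stn_Name,Lat,Long,Prov,Tm,DwTm\n"
    '"Station A, North",48.9,-123.7,BC,8.2,0\n'
    "Station B,48.8,-124.1,BC,6.8,13\n"
)

def select_by_name(src, dst, wanted):
    """Write only the columns named in `wanted` from src to dst."""
    reader = csv.reader(src)
    writer = csv.writer(dst)
    header = next(reader)  # the first line holds the column names
    # Translate names into positions once, then reuse them for each row
    good_cols = [header.index(name) for name in wanted]
    writer.writerow([header[i] for i in good_cols])
    for row in reader:
        writer.writerow([row[i] for i in good_cols])

dst = io.StringIO()
select_by_name(io.StringIO(sample), dst, ['Stn_Name', 'Prov', 'Tm'])
result = dst.getvalue()
print(result)
```

To remove columns by name instead, build good_cols from every header position whose name is not in the unwanted list.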


 