Python script to skip a specific column in a CSV file

Question

I have a Python script that filters data by a specific column and creates multiple CSV files.

Here is my main CSV file:

Name,    City,      Email
john     cty_1      a@g.com
jack     cty_1      b@g.com
...
Ross     cty_2      c@g.com
Rachel   cty_2      d@g.com
...

My Python logic currently creates a separate CSV file for each city. The existing logic is:

from itertools import groupby
import csv

with open('filtered_final.csv') as csv_file:
    reader = csv.reader(csv_file)
    next(reader) #skip header
    
    #Group by column (city)
    lst = sorted(reader, key=lambda x : x[1])
    groups = groupby(lst, key=lambda x : x[1])

    #Write file for each city
    for k,g in groups:
        filename = k[21:] + '.csv'
        with open(filename, 'w', newline='') as fout:
            csv_output = csv.writer(fout)

            csv_output.writerow(["Name","City","Email"])  #header
            for line in g:
                csv_output.writerow(line)

Now I want to remove the "City" column from each new CSV file.

Answer 1

Then try importing only the columns you need:

import pandas as pd

df = pd.read_csv('filtered_final.csv', usecols=['Name','Email'])

Answer 2

If your data is small enough to fit in RAM, you can read the whole file in and do a groupby:

import pandas as pd

df = pd.read_csv('filtered_final.csv')

for city, data in df[['Name','Email']].groupby(df['City']):
    data.to_csv(f'{city}_data.csv', index=False)
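The same idea can be sketched with only the standard `csv` module, which stays closer to the code in the question. This is a minimal sketch, not the asker's exact script: the function name `split_by_city` and the `out_dir` parameter are hypothetical, the input is assumed to be comma-delimited with a `Name,City,Email` header, and each output file is named after the full city value rather than the `k[21:]` slice used in the question.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_by_city(src, out_dir="."):
    """Write one CSV per city, keeping only the Name and Email columns.

    Hypothetical helper: assumes `src` has a header row with the
    columns Name, City and Email.
    """
    rows_by_city = defaultdict(list)

    with open(src, newline="") as csv_file:
        # skipinitialspace tolerates padding after commas ("Name,    City")
        reader = csv.DictReader(csv_file, skipinitialspace=True)
        for row in reader:
            # Group on City, but store only the columns we want to keep
            rows_by_city[row["City"]].append((row["Name"], row["Email"]))

    written = []
    for city, rows in rows_by_city.items():
        # Assumption: file name is the full city value plus ".csv"
        path = Path(out_dir) / f"{city}.csv"
        with open(path, "w", newline="") as fout:
            writer = csv.writer(fout)
            writer.writerow(["Name", "Email"])  # header without City
            writer.writerows(rows)
        written.append(path)
    return written
```

Calling `split_by_city('filtered_final.csv')` would write the per-city files to the current directory. Grouping into a dictionary avoids the sort that `itertools.groupby` needs, at the cost of holding all rows in memory, which is the same trade-off as the pandas version.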


Answer 1
1 2020-10-06 03:09:57

Answer 2
1 accepted 2020-10-06 03:16:00
