How to extract rows from a dataframe into multiple CSV files based on column values?

Question

I have the following dataframe:

import pandas as pd

data = {'participant_id': [1, 100, 125, 125, 1, 100],
        'test_day': ['Day_1', 'Day_1', 'Day_12', 'Day_14', 'Day_4', 'Day_4'],
        'favorite_color': ['blue', 'red', 'yellow', 'green', 'yellow', 'green'],
        'grade': [88, 92, 95, 70, 80, 30]}
df = pd.DataFrame(data, columns=['participant_id', 'test_day', 'favorite_color', 'grade'])

It has 10000 rows and contains data for 400 test participants, each labelled with a unique, completely random ID stored in the 'participant_id' column. My task is to create a dataframe for each individual (per 'participant_id') and then save each one to a separate CSV file (400 in total).

I've been trying to figure out how to do it for a couple of days now, but with no luck.

Can you please help me?

I am still learning how to program and am trying to apply knowledge from a data science course. I am using Pandas, and I normally access data for an individual participant with df.loc. I have also created a list of all the participant_id values, but I don't know how to combine the two to achieve the desired result automatically.

Answer 1

`groupby` + `to_csv`

You can group by a particular field and iterate:

for part_id, df_id in df.groupby('participant_id'):
    df_id.to_csv(f'{part_id}.csv')
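To make it clearer what that loop iterates over, here is a minimal sketch using the sample data from the question. Iterating over `df.groupby('participant_id')` yields pairs of (group key, sub-dataframe); the dictionary name `groups` is only an illustrative choice and not part of the original answer.

```python
import pandas as pd

data = {'participant_id': [1, 100, 125, 125, 1, 100],
        'test_day': ['Day_1', 'Day_1', 'Day_12', 'Day_14', 'Day_4', 'Day_4'],
        'favorite_color': ['blue', 'red', 'yellow', 'green', 'yellow', 'green'],
        'grade': [88, 92, 95, 70, 80, 30]}
df = pd.DataFrame(data)

# Each iteration gives the participant ID and the rows belonging to that ID.
# "groups" is a hypothetical helper dict, used here only for inspection.
groups = {part_id: df_id for part_id, df_id in df.groupby('participant_id')}

for part_id, df_id in groups.items():
    print(part_id, len(df_id))  # participant ID and its number of rows
```

Each `df_id` is an ordinary dataframe, so any dataframe method, including `to_csv`, can be called on it inside the loop.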

Answer 2

The solution by @jpp is great. My adaptation based on that solution is:

import pandas as pd

data = {'participant_id': [1, 100, 125, 125, 1, 100], 
        'test_day':['Day_1', 'Day_1', 'Day_12', 'Day_14', 'Day_4', 'Day_4'], 
        'favorite_color': ['blue', 'red', 'yellow', 'green', 'yellow', 'green'],  
        'grade': [88, 92, 95, 70, 80, 30]
       }

col = list(data.keys())
df = pd.DataFrame(data, columns = col)

# index=False keeps the dataframe's row index out of each CSV file
for part_id, df_id in df.groupby('participant_id'):
    df_id.to_csv(f'{part_id}.csv', index=False)
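Since the question mentions df.loc and a list of IDs, the same result can also be sketched that way, as an alternative to groupby rather than the method from either answer. This assumes the files should go into a dedicated folder; `out_dir` is a hypothetical name, and a temporary directory is used here only so the sketch runs anywhere.

```python
import tempfile
from pathlib import Path

import pandas as pd

data = {'participant_id': [1, 100, 125, 125, 1, 100],
        'test_day': ['Day_1', 'Day_1', 'Day_12', 'Day_14', 'Day_4', 'Day_4'],
        'favorite_color': ['blue', 'red', 'yellow', 'green', 'yellow', 'green'],
        'grade': [88, 92, 95, 70, 80, 30]}
df = pd.DataFrame(data)

# Hypothetical output folder; replace with a real path as needed.
out_dir = Path(tempfile.mkdtemp())

# Loop over the distinct IDs and select each participant's rows with df.loc.
for part_id in df['participant_id'].unique():
    subset = df.loc[df['participant_id'] == part_id]
    subset.to_csv(out_dir / f'{part_id}.csv', index=False)

# Names of the files that were written, one per participant.
written = sorted(p.name for p in out_dir.glob('*.csv'))
```

This scans the whole dataframe once per ID, so with many participants the groupby loop is likely the better choice; the df.loc version is mainly useful for seeing how the two pieces fit together.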

2 answers

Answer 1: score 2, accepted, 2018-11-25 01:00:14

Answer 2: score 1, 2018-11-25 01:53:42
