
Pandas Split Dataframe by Unique Column Value

I have a DataFrame that is being output to a spreadsheet called 'All Data'. Let's say this data contains business addresses (columns for street, city, zip, state). However, I also want to create a worksheet for each unique state, containing the exact same columns.

My basic idea was to iterate over every row using df.iterrows() and split the dataframe by appending each row to a new dataframe, but that seems extremely inefficient. Is there a better way to do this?

I found this answer, but it only uses a boolean index.

The groupby answers on the other question will work for you too. In your case, something like:

df_list = [d for _, d in df.groupby(['state'])]

This uses a list comprehension to return a list of dataframes, with one dataframe for each state.
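To connect this back to the worksheet-per-state goal in the question, here is a minimal sketch of how the groupby split might feed an Excel workbook. The sample data, the helper names split_by_state and write_state_sheets, and the output path are all hypothetical; writing the workbook also assumes an Excel engine such as openpyxl is installed.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question's description.
df = pd.DataFrame(
    {
        "street": ["1 Main St", "2 Oak Ave", "3 Pine Rd", "4 Elm St"],
        "city": ["Albany", "Austin", "Buffalo", "Dallas"],
        "zip": ["12207", "73301", "14201", "75201"],
        "state": ["NY", "TX", "NY", "TX"],
    }
)


def split_by_state(frame):
    """Return a dict mapping each state to the sub-frame of its rows."""
    # Grouping by a single column name yields (key, sub-frame) pairs.
    return {state: group for state, group in frame.groupby("state")}


def write_state_sheets(frame, path):
    """Write an 'All Data' sheet plus one worksheet per state to one workbook."""
    # Assumption: an Excel engine (for example openpyxl) is available.
    with pd.ExcelWriter(path) as writer:
        frame.to_excel(writer, sheet_name="All Data", index=False)
        for state, group in split_by_state(frame).items():
            # Excel limits sheet names to 31 characters.
            group.to_excel(writer, sheet_name=str(state)[:31], index=False)


frames = split_by_state(df)
print(sorted(frames))
```

Calling write_state_sheets(df, "addresses.xlsx") would then produce the combined sheet and the per-state sheets in one pass. A dict keyed by state is used here instead of a plain list so that each sub-frame keeps its label.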

A simple way to do this is to get the unique states, filter the dataframe for each one, and then save each subset as its own CSV file or perform any other operation on it.

Here's an example:

# df[column].unique() returns an array of the unique values in that column
for state in df['state'].unique():
    # Keep only the rows for this state and write them to their own file
    # (the '<state>.csv' file name is just an example)
    df[df['state'] == state].to_csv(f'{state}.csv', index=False)
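As a slightly fuller sketch of the same filtering idea, the loop could be wrapped in a helper that writes every state's rows into a chosen directory and returns the paths it created. The helper name export_states_to_csv, the sample data, and the temporary output directory are assumptions made for illustration only.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data; the column names follow the question's description.
df = pd.DataFrame(
    {
        "street": ["1 Main St", "2 Oak Ave", "3 Pine Rd"],
        "city": ["Albany", "Austin", "Buffalo"],
        "zip": ["12207", "73301", "14201"],
        "state": ["NY", "TX", "NY"],
    }
)


def export_states_to_csv(frame, out_dir):
    """Write one CSV per unique state into out_dir and return the file paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for state in frame["state"].unique():
        # The boolean mask keeps only the rows belonging to this state.
        subset = frame[frame["state"] == state]
        path = out_dir / f"{state}.csv"
        subset.to_csv(path, index=False)
        paths.append(path)
    return paths


# A temporary directory stands in for wherever the files should really go.
csv_paths = export_states_to_csv(df, tempfile.mkdtemp())
print([p.name for p in csv_paths])
```

Each pass through the loop scans the whole dataframe again, so with many distinct states the groupby approach is likely to be the cheaper option.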
