Pandas Split Dataframe by Unique Column Value

I have a DataFrame that is being output to a spreadsheet called 'All Data'. Let's say this data contains business addresses (columns for street, city, zip, state). I also want to create a worksheet for each unique state containing the exact same columns.

My basic idea was to iterate over every row using df.iterrows() and split the dataframe by appending each row to a new per-state dataframe, but that seems extremely inefficient. Is there a better way to do this?

I found this answer but that is just a boolean index.

The groupby answers on the other question will work for you too. In your case, something like:

df_list = [d for _, d in df.groupby('state')]

This uses a list comprehension to return a list of dataframes, with one dataframe for each state.
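
If you also need to know which state each piece belongs to (for example, to name a worksheet), a dict keyed by state may be more convenient than a plain list. The following is only a minimal sketch: the sample rows are hypothetical, the name `frames_by_state` is made up for illustration, and the column names are assumed from the question.

```python
import pandas as pd

# Hypothetical sample data; the column names are assumed from the question.
df = pd.DataFrame({
    "street": ["1 Main St", "2 Oak Ave", "3 Pine Rd", "4 Elm St"],
    "city": ["Austin", "Dallas", "Albany", "Buffalo"],
    "zip": ["73301", "75201", "12207", "14201"],
    "state": ["TX", "TX", "NY", "NY"],
})

# Iterating over a groupby on one column name yields (state, sub-frame) pairs,
# so a dict comprehension keeps each state's label next to its rows.
frames_by_state = {state: group for state, group in df.groupby("state")}

for state, group in frames_by_state.items():
    print(state, len(group))
```

Each value keeps all of the original columns, so it could then be passed to DataFrame.to_excel with sheet_name set to the state, inside a pd.ExcelWriter context, to get one worksheet per state.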

A simple way to do it would be to get the unique states, filter the dataframe on each one, and save each result as its own CSV (or do any other operation on it).

Here's an example:

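# df[column].unique() returns a NumPy array of the unique values in that column
for state in df['state'].unique():
    # Filter the dataframe down to the rows for this state and write them out.
    # The file name is only illustrative; without a path, to_csv() just returns a string.
    df[df['state'] == state].to_csv(f'{state}.csv', index=False)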
