Pandas Split Dataframe by Unique Column Value

Question

I have a DataFrame that is being output to a spreadsheet called 'All Data'. Let's say this data contains business addresses (columns for street, city, zip, and state). However, I also want to create a worksheet for each unique state containing the exact same columns.

My basic idea was to iterate over every row using df.iterrows() and split the dataframe by appending each row to a new dataframe, but that seems extremely inefficient. Is there a better way to do this?

I found this answer, but it just uses a boolean index.

Answer 1

The groupby answers on the other question will work for you too. In your case, something like:

df_list = [d for _, d in df.groupby('state')]

This uses a list comprehension to return a list of dataframes, with one dataframe for each state.
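If you need to look a state's dataframe up by name later (for example, to name a worksheet after it), a dict keyed by state may be more convenient than a list. The following is a minimal sketch; the sample data and the name `frames_by_state` are made up for illustration, and only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "street": ["1 Main St", "2 Oak Ave", "3 Pine Rd"],
    "city": ["Austin", "Dallas", "Denver"],
    "zip": ["73301", "75201", "80201"],
    "state": ["TX", "TX", "CO"],
})

# Grouping by a single column name (a string, not a list) yields scalar keys,
# so each key is the state value itself.
frames_by_state = {state: group for state, group in df.groupby("state")}

for state, group in frames_by_state.items():
    print(state, len(group))
```

Each value is a dataframe with the same columns as `df`. Note that `groupby` skips rows whose key is missing (NaN) by default.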

Answer 2

A simple way to do it is to get the unique states, filter the dataframe on each one, and save each subset as its own CSV (or do any other operation on it).

Here's an example:

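# df[column].unique() returns an array (not a list) of the unique values in that column
for state in df['state'].unique():
    # Select the rows for this state with a boolean mask.
    # to_csv() with no path only returns a string, so pass a file name
    # (the '<state>.csv' naming here is just an example).
    df[df['state'] == state].to_csv(f'{state}.csv', index=False)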
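As a fuller sketch of the same loop, the version below wraps it in a helper and writes into a temporary directory. The helper name `split_to_csv`, the sample data, and the file naming scheme are all assumptions made for illustration, not part of the original answer.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "street": ["1 Main St", "2 Oak Ave", "3 Pine Rd"],
    "city": ["Austin", "Dallas", "Denver"],
    "zip": ["73301", "75201", "80201"],
    "state": ["TX", "TX", "CO"],
})


def split_to_csv(frame, column, out_dir):
    """Write one CSV per unique value of `column`; return {value: path}."""
    out_dir = Path(out_dir)
    paths = {}
    for value in frame[column].unique():
        # Boolean mask: keep only the rows whose `column` equals this value.
        subset = frame[frame[column] == value]
        path = out_dir / f"{value}.csv"  # assumed naming scheme
        subset.to_csv(path, index=False)
        paths[value] = path
    return paths


out_dir = tempfile.mkdtemp()
written = split_to_csv(df, "state", out_dir)
print(sorted(written))
```

For worksheets rather than CSV files, the `to_csv` call could presumably be swapped for `DataFrame.to_excel` with a shared `pd.ExcelWriter` and a `sheet_name` per state. Rows with a missing state never match the equality mask, so they would not land in any file.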

2 answers

solution1
3 ACCEPTED 2020-06-11 18:23:27

solution2
2 2020-06-11 18:21:42
