
Create New Dataframes from Original for values in a column (Need to change name for every new dataframe)

I'm not sure if this question has been asked before. I have a dataframe with more than 2M rows, and one column identifies the location at which each transaction occurred. I am trying to filter it down and create a new dataframe for each location code. I can filter the dataframe, but the problem I'm running into is writing a function that gives each new dataframe a distinct name. Here is the code I have so far:

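import pandas as pd

df = pd.DataFrame({'location': [1, 2, 3, 4, 5], 'col2': [234.34, 34.80, 23.65, 24.23, 12.00]})
# Location codes to filter on (assumed here to be every distinct code in the column)
filter_array = df['location'].unique().tolist()

def new_df_for_columns(df, column, filter_array):
    for value in filter_array:
        # newdf is overwritten on every pass, so only the last filtered dataframe survives
        newdf = df[df[column] == value]
    return newdf.head()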

So in this case, I need a different name in place of "newdf" for each newly created dataframe.

If the location codes are consecutive, ordered numbers, you can select rows through the dataframe's index by typing:

df.reindex(index_labels)  # index_labels: a list of index labels that correspond to the location codes

For example, if your data is:

df = pd.DataFrame({'location': [1, 2, 3, 4, 5], 'col2': [234.34, 34.80, 23.65, 24.23, 12.00]}, index=range(5))

And you want to filter locations 3 and 4, then type df.reindex([2, 3]). This does not modify your data: reindex returns a new dataframe containing only those rows, and df itself stays the same.
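For the original goal of one dataframe per location code, a common alternative to inventing a new variable name each time is to store the pieces in a dictionary keyed by the code. The following is only a minimal sketch of that idea: the helper name split_by_column and the variable frames_by_location are made up for illustration, and it assumes the column holds hashable values such as integers.

```python
import pandas as pd

df = pd.DataFrame({
    'location': [1, 2, 3, 4, 5],
    'col2': [234.34, 34.80, 23.65, 24.23, 12.00],
})

def split_by_column(df, column):
    """Hypothetical helper: map each distinct value in `column` to its sub-dataframe."""
    # groupby yields (value, sub-dataframe) pairs, one per distinct value
    return {value: group for value, group in df.groupby(column)}

frames_by_location = split_by_column(df, 'location')

# Look up the dataframe for a given location code instead of using a distinct variable name
print(frames_by_location[3])
```

Each entry is then reachable by its location code, for example frames_by_location[3], so no dynamic variable names are needed.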
