Pandas: create multiple dataframes based on groups from another dataframe

I have a pandas dataframe:

import pandas as pd

df = pd.DataFrame({'Name': ['Jhon', 'Andy', 'Jenny', 'Joan', 'Paul', 'Rosa'],
                   'Position': ['Programmer', 'Designer', 'Programmer', 'Designer', 'Analyst', 'Analyst']})

I want to create several other dataframes based on the Position column, and name each one "Job_as_" followed by the position.

The expected output would be:

Job_as_Programmer=['Jhon','Jenny']
Job_as_Designer=['Andy','Joan']

You could create a dictionary:

{"Job_as_"+ x : df.loc[df.Position==x, "Name"].to_list() for x in df.Position.unique()}

Output:

{
 'Job_as_Programmer': ['Jhon', 'Jenny'],
 'Job_as_Designer': ['Andy', 'Joan'],
 'Job_as_Analyst': ['Paul', 'Rosa']
}
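Since the question asks for dataframes rather than lists, the same dictionary idea can hold one sub-dataframe per position. This is only a minimal sketch: the name `frames` and the `reset_index(drop=True)` call are my own choices, not something from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jhon", "Andy", "Jenny", "Joan", "Paul", "Rosa"],
    "Position": ["Programmer", "Designer", "Programmer",
                 "Designer", "Analyst", "Analyst"],
})

# Iterating over a groupby object yields (key, sub-DataFrame) pairs.
# `frames` is a hypothetical name used only for this sketch.
frames = {
    "Job_as_" + position: group.reset_index(drop=True)
    for position, group in df.groupby("Position")
}

# Each value is a regular DataFrame holding only the rows of one position.
print(frames["Job_as_Programmer"])
```

Keeping the frames in a dictionary avoids creating variable names dynamically, and each one is still reachable by its "Job_as_..." key.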

Use pandas.DataFrame.groupby with pandas.Series.add_prefix:

df2 = df.groupby("Position")["Name"].apply(list)
df2.add_prefix("Job_as_").to_dict()

Output:

{'Job_as_Analyst': ['Paul', 'Rosa'],
 'Job_as_Designer': ['Andy', 'Joan'],
 'Job_as_Programmer': ['Jhon', 'Jenny']}
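Note that the keys above come out alphabetically, because groupby sorts the group keys by default. If the order of first appearance in the dataframe matters, passing sort=False should keep it. A small sketch, where the name `names_by_job` is my own:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jhon", "Andy", "Jenny", "Joan", "Paul", "Rosa"],
    "Position": ["Programmer", "Designer", "Programmer",
                 "Designer", "Analyst", "Analyst"],
})

# sort=False keeps groups in the order they first appear in df
# instead of sorting the Position values alphabetically.
names_by_job = (
    df.groupby("Position", sort=False)["Name"]
      .apply(list)              # one list of names per position
      .add_prefix("Job_as_")    # prefix the index labels of the Series
      .to_dict()
)

print(list(names_by_job))
```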

You could also just use groupby on both columns, as below:

import pandas as pd

df = pd.DataFrame({'Name': ['Jhon', 'Andy', 'Jenny', 'Joan', 'Paul', 'Rosa'],
                   'Position': ['Programmer', 'Designer', 'Programmer', 'Designer', 'Analyst', 'Analyst']})
newDf = df.groupby(["Position", "Name"]).first()
newDf  # displays the table, e.g. in a notebook

Output:

Position    Name
Analyst     Paul
            Rosa

Designer    Andy
            Joan

Programmer  Jenny
            Jhon
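A result grouped on both columns is indexed by (Position, Name) pairs, so the names for one position can be read back from that index. The sketch below swaps in .size() for .first() just to get a Series with the same two-level index; the names `grouped`, `programmers` and `positions` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jhon", "Andy", "Jenny", "Joan", "Paul", "Rosa"],
    "Position": ["Programmer", "Designer", "Programmer",
                 "Designer", "Analyst", "Analyst"],
})

# Assumption: .size() replaces .first() here; it returns a Series whose
# index is the sorted (Position, Name) MultiIndex.
grouped = df.groupby(["Position", "Name"]).size()

# Selecting on the first index level leaves the Name level behind.
programmers = grouped.loc["Programmer"].index.tolist()

# The distinct positions are the unique values of the first level.
positions = grouped.index.get_level_values("Position").unique().tolist()

print(positions)
print(programmers)
```

The names come back sorted within each position (Jenny before Jhon), matching the table above rather than the original row order.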


 