import pandas as pd

data = {"Team": ["Red Sox", "Red Sox", "Red Sox", "Red Sox", "Red Sox", "Red Sox", "Yankees",
                 "Yankees", "Yankees", "Yankees", "Yankees", "Yankees"],
        "Pos": ["Pitcher", "Pitcher", "Pitcher", "Not Pitcher", "Not Pitcher", "Not Pitcher",
                "Pitcher", "Pitcher", "Pitcher", "Not Pitcher", "Not Pitcher", "Not Pitcher"],
        "Age": [24, 28, 40, 22, 29, 33, 31, 26, 21, 36, 25, 31]}
df1 = pd.DataFrame(data)
Now I am grouping by two columns using the following code:
grouped_multiple = df1.groupby(['Team', 'Pos']).agg({'Age': ['mean', 'min', 'max']})
grouped_multiple.columns = ['age_mean', 'age_min', 'age_max']
grouped_multiple = grouped_multiple.reset_index()
Now I create a second dataframe, also with 3 columns and the same length, but containing only numbers. Imagine each cell of dataframe 1 is linked to the cell at the same position in dataframe 2. When I group dataframe 1, I want to get the corresponding values of dataframe 2.
So grouping df1 by column 1
["Red Sox", "Red Sox", "Red Sox", "Red Sox", "Red Sox", "Red Sox", "Yankees",
"Yankees", "Yankees", "Yankees", "Yankees", "Yankees"]
results in
["Red Sox", "Yankees"]
Let's say column 1 of df2 looks like
[1,2,4,3,2,3,4,5,3,5,6,7]
So I want the values of df2 column 1 collected into one list per group, taken from the row positions that belong to "Red Sox" and to "Yankees" in df1, like:
[[1,2,4,3,2,3], [4,5,3,5,6,7]]
I am a bit unclear as to what you are trying to do, but if you concatenate the two dataframes thus:
newdf = pd.concat([df1, df2], axis=1)
then you can do your groupby on the combined dataframe and aggregate the last three columns (the ones that came from df2) however you need.
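The concat-then-group idea above can be sketched as follows. This is only an illustration: df2 here has a single column named col1 (borrowed from the values in the question), and the name lists_by_team is made up for the example.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Team": ["Red Sox"] * 6 + ["Yankees"] * 6,
    "Pos": (["Pitcher"] * 3 + ["Not Pitcher"] * 3) * 2,
    "Age": [24, 28, 40, 22, 29, 33, 31, 26, 21, 36, 25, 31],
})
# Assumption: a one-column df2 using the values given in the question
df2 = pd.DataFrame({"col1": [1, 2, 4, 3, 2, 3, 4, 5, 3, 5, 6, 7]})

# Put the two frames side by side, then group on the df1 column
newdf = pd.concat([df1, df2], axis=1)
lists_by_team = newdf.groupby("Team")["col1"].agg(list)

result = lists_by_team.tolist()
print(result)  # [[1, 2, 4, 3, 2, 3], [4, 5, 3, 5, 6, 7]]
```

Note that pd.concat with axis=1 pairs rows by index label, so this assumes both frames use the same index (here the default 0..11).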
I'm not sure where grouped_multiple comes into your problem, but if df1 and df2 have the same length and share the same index, I think you can do:
df2 = pd.DataFrame({'col1':[1,2,4,3,2,3,4,5,3,5,6,7]})
s = df2['col1'].groupby(df1['Team']).agg(list)
and you get
print (s)
Team
Red Sox    [1, 2, 4, 3, 2, 3]
Yankees    [4, 5, 3, 5, 6, 7]
Name: col1, dtype: object
Or, if you want a list of lists, then
l = s.tolist()
print (l)
[[1, 2, 4, 3, 2, 3], [4, 5, 3, 5, 6, 7]]
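One caveat on the "same length" condition: as far as I know, when the grouping key is a Series, pandas matches it to the grouped object by index label rather than by position. Here is a minimal sketch for the case where df2 carries a different index (the 100..111 index is made up for illustration). It forces positional pairing by passing the key as a NumPy array:

```python
import pandas as pd

df1 = pd.DataFrame({"Team": ["Red Sox"] * 6 + ["Yankees"] * 6})
# Hypothetical: df2 has the same length as df1 but a non-default index
df2 = pd.DataFrame({"col1": [1, 2, 4, 3, 2, 3, 4, 5, 3, 5, 6, 7]},
                   index=range(100, 112))

# A plain array carries no index, so rows are paired purely by position
positional = df2["col1"].groupby(df1["Team"].to_numpy()).agg(list)
print(positional.tolist())  # [[1, 2, 4, 3, 2, 3], [4, 5, 3, 5, 6, 7]]
```

If both frames already use the default index, this gives the same result as grouping by df1['Team'] directly.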
And if you want to group by both columns from df1, then you can do
df2['col1'].groupby([df1['Team'], df1['Pos']]).agg(list)
Team     Pos
Red Sox  Not Pitcher    [3, 2, 3]
         Pitcher        [1, 2, 4]
Yankees  Not Pitcher    [5, 6, 7]
         Pitcher        [4, 5, 3]
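Since the question mentions that df2 has three columns, the same idea should extend to the whole frame. A rough sketch, assuming column names col2 and col3 and made-up values for them (only col1 comes from the question):

```python
import pandas as pd

df1 = pd.DataFrame({"Team": ["Red Sox"] * 6 + ["Yankees"] * 6})
df2 = pd.DataFrame({
    "col1": [1, 2, 4, 3, 2, 3, 4, 5, 3, 5, 6, 7],  # values from the question
    "col2": list(range(12)),                        # made-up values
    "col3": list(range(12, 24)),                    # made-up values
})

# Group every column of df2 by the Team column of df1, one list per cell
per_team = df2.groupby(df1["Team"]).agg(list)

print(per_team.loc["Red Sox", "col1"])  # [1, 2, 4, 3, 2, 3]
print(per_team.loc["Yankees", "col2"])  # [6, 7, 8, 9, 10, 11]
```

The result is a dataframe indexed by Team with one column per df2 column, so per_team["col1"].tolist() gives the list of lists asked for in the question.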